fix(security): add authenticated reverse proxy for local Ollama#1922
brandonpelfrey merged 16 commits into `main`
Conversation
Ollama has no built-in auth, and binding to 0.0.0.0 exposes it to the network (CWE-668, #1140). This adds an authenticated reverse proxy so Ollama stays on localhost while containers can still reach it.

- Add `scripts/ollama-auth-proxy.js`: a Node.js proxy on `0.0.0.0:11435` that validates a per-instance Bearer token before forwarding to Ollama on `127.0.0.1:11434`. The health check (`GET /api/tags`) is exempt. Uses `crypto.timingSafeEqual` for timing-safe token comparison.
- Bind Ollama to `127.0.0.1` instead of `0.0.0.0` during onboard
- Start the auth proxy after Ollama, with stale-proxy cleanup and startup verification
- Route sandbox inference through the proxy port (11435) with the generated token as the OpenAI API key credential
- Gate the macOS hint on `process.platform === "darwin"`
- Add `OLLAMA_PROXY_PORT` (11435) to `ports.ts`
- Add 7 e2e tests and a CI job for the proxy
- Update unit tests for the new port and error messages

Reimplements the approach from #679 (closed in favor of #1104) against the current TypeScript codebase, addressing CodeRabbit findings from the original PR (timing-safe comparison, stale-proxy cleanup, startup verification).

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 Walkthrough

Adds an Ollama authentication reverse proxy, integrates it into onboarding and runtime to start/ensure/use the proxy and token, routes container reachability checks through the proxy, updates tests, and adds an e2e proxy test plus a CI job to run it.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Container as Client Container
    participant Proxy as Ollama Auth Proxy\n(Port 11435)
    participant Backend as Ollama Backend\n(Port 11434)
    rect rgba(0, 200, 100, 0.5)
        Note over Container,Backend: Authenticated Request Flow
        Container->>Proxy: POST /api/generate\nAuthorization: Bearer {token}
        Proxy->>Proxy: Validate token (timing-safe)
        Proxy->>Backend: Forward request (strip auth/host)
        Backend->>Proxy: Response stream
        Proxy->>Container: Response stream
    end
    rect rgba(200, 100, 0, 0.5)
        Note over Container,Proxy: Unauthorized Request Flow
        Container->>Proxy: POST /api/generate (no/invalid token)
        Proxy->>Container: HTTP 401 Unauthorized
    end
    rect rgba(100, 150, 200, 0.5)
        Note over Container,Backend: Health Check (Exempted)
        Container->>Proxy: GET /api/tags (no auth)
        Proxy->>Backend: Forward health check
        Backend->>Proxy: Model list
        Proxy->>Container: Model list
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: 4 passed, 1 failed (warning)
Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 3
🧹 Nitpick comments (1)
.github/workflows/pr.yaml (1)
54-65: Gate this E2E job behind the existing `changes` filter.

Right now this runs on docs-only PRs too, so it bypasses the fast path the workflow already has for expensive jobs. If that is not intentional, add the same `needs: [checks, changes]` / `if: needs.changes.outputs.code == 'true'` guard used by `sandbox-images-and-e2e`.

♻️ Suggested change

```diff
 test-e2e-ollama-proxy:
+  needs: [checks, changes]
+  if: needs.changes.outputs.code == 'true'
   runs-on: ubuntu-latest
   timeout-minutes: 5
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/pr.yaml around lines 54 - 65, The E2E job "test-e2e-ollama-proxy" should be gated by the existing changes filter: add the same needs and conditional used by "sandbox-images-and-e2e" by adding needs: [checks, changes] to the job definition and adding if: needs.changes.outputs.code == 'true' so the job only runs when code changes are present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 1487-1522: Persist the generated ollamaProxyToken into the
onboarding session and handle resume: when startOllamaAuthProxy generates the
token, save it to onboardSession.ollamaProxyToken (or a similarly named field)
and reuse that value in getOllamaProxyToken instead of only the in-memory
variable; when resuming (e.g. in setupInference / after setupNim is skipped)
detect if the session provider is "ollama-local" and if
onboardSession.ollamaProxyToken exists reuse it and ensure the proxy is running,
otherwise call startOllamaAuthProxy to recreate the proxy and update
onboardSession.ollamaProxyToken so resumed runs always have a valid token and
proxy.
- Around line 1491-1499: The cleanup currently kills any PID from
runCapture(`lsof -ti :${OLLAMA_PROXY_PORT}`), which may terminate unrelated
services; instead track and validate the proxy process before killing: when you
spawn the proxy (where the proxy is started), record its PID (or write it to a
known file) and on cleanup read that PID and verify its command line contains
the proxy marker (e.g., "ollama-auth-proxy.js") using a ps lookup (or filter
lsof output by command) before calling run(`kill ...`); update the cleanup block
(the code using run, runCapture and OLLAMA_PROXY_PORT) to only kill the verified
PID and fall back to no-op if verification fails.
In `@test/e2e-ollama-proxy.sh`:
- Around line 17-20: The cleanup function and EXIT trap must guard against unset
PID variables to avoid unbound-variable errors under set -u; update cleanup (and
any EXIT trap invocation) to safely reference MOCK_PID and PROXY_PID using
parameter expansion or existence checks (e.g. ${MOCK_PID-} / ${PROXY_PID-} or if
[ -n "${MOCK_PID-}" ] ) before calling kill, so cleanup() only attempts kill
when the PID variables are set and non-empty.
---
Nitpick comments:
In @.github/workflows/pr.yaml:
- Around line 54-65: The E2E job "test-e2e-ollama-proxy" should be gated by the
existing changes filter: add the same needs and conditional used by
"sandbox-images-and-e2e" by adding needs: [checks, changes] to the job
definition and adding if: needs.changes.outputs.code == 'true' so the job only
runs when code changes are present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: f4037d3e-ad05-4c00-a2da-d3698480e911
📒 Files selected for processing (7)

- .github/workflows/pr.yaml
- scripts/ollama-auth-proxy.js
- src/lib/local-inference.test.ts
- src/lib/local-inference.ts
- src/lib/onboard.ts
- src/lib/ports.ts
- test/e2e-ollama-proxy.sh
- Persist proxy token to ~/.nemoclaw/ollama-proxy-token (mode 0600) so it survives process restarts and onboard --resume
- Add ensureOllamaAuthProxy() called on sandbox connect to auto-restart the proxy after host reboots
- Restore WSL2 compatibility: skip the proxy on WSL2, where Docker reaches the host directly (#1104); use OLLAMA_CONTAINER_PORT that adapts per platform
- Fix e2e test: replace the invalid 0.0.0.0 reachability test with a localhost liveness check (0.0.0.0 as a destination routes to loopback on both Linux and macOS)

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- killStaleProxy() now verifies process command contains
"ollama-auth-proxy" before killing, avoiding termination of
unrelated services on the same port
- Initialize MOCK_PID/PROXY_PID and guard EXIT trap with ${:-}
to prevent unbound-variable errors under set -u
Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 1
♻️ Duplicate comments (1)
test/e2e-ollama-proxy.sh (1)
17-20: ⚠️ Potential issue | 🟡 Minor

Guard `cleanup` against unset PIDs under `set -u`. Lines 17-20 can throw an unbound-variable error if the script exits before lines 47 or 55, which obscures the real failure.

🐚 Suggested fix

```diff
+MOCK_PID=""
+PROXY_PID=""
+
 cleanup() {
-  kill "$MOCK_PID" 2>/dev/null || true
-  kill "$PROXY_PID" 2>/dev/null || true
+  [ -n "${MOCK_PID:-}" ] && kill "$MOCK_PID" 2>/dev/null || true
+  [ -n "${PROXY_PID:-}" ] && kill "$PROXY_PID" 2>/dev/null || true
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e-ollama-proxy.sh` around lines 17 - 20, The cleanup function calls kill on MOCK_PID and PROXY_PID which will cause an unbound-variable error under set -u if those PIDs were never set; update cleanup to test each PID variable safely (e.g. use parameter expansion like "${MOCK_PID-}" / "${PROXY_PID-}" or an explicit [ -n "${MOCK_PID-}" ] check) before calling kill, and ensure you only attempt kill when the variable is non-empty to avoid masking the original error and to clean up processes reliably.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e-ollama-proxy.sh`:
- Around line 89-96: The test currently calls the auth-exempt endpoint /api/tags
so it doesn't validate auth; change the request in test/e2e-ollama-proxy.sh that
builds BODY (using CORRECT_AUTH, TOKEN, PROXY_PORT) to call a protected endpoint
such as /api/generate instead of /api/tags, use the correct HTTP method and JSON
payload required by /api/generate in the curl invocation (keep the -H
"Authorization: $CORRECT_AUTH" header), then assert the response BODY contains
the expected generation marker and keep the same pass/fail logic.
---
Duplicate comments:
In `@test/e2e-ollama-proxy.sh`:
- Around line 17-20: The cleanup function calls kill on MOCK_PID and PROXY_PID
which will cause an unbound-variable error under set -u if those PIDs were never
set; update cleanup to test each PID variable safely (e.g. use parameter
expansion like "${MOCK_PID-}" / "${PROXY_PID-}" or an explicit [ -n
"${MOCK_PID-}" ] check) before calling kill, and ensure you only attempt kill
when the variable is non-empty to avoid masking the original error and to clean
up processes reliably.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: cc76b675-5b18-435b-bf2e-7b78105bf763
📒 Files selected for processing (6)

- package.json
- src/lib/local-inference.test.ts
- src/lib/local-inference.ts
- src/lib/onboard.ts
- src/nemoclaw.ts
- test/e2e-ollama-proxy.sh
✅ Files skipped from review due to trivial changes (1)
- package.json
🚧 Files skipped from review as they are similar to previous changes (3)
- src/lib/local-inference.test.ts
- src/lib/onboard.ts
- src/lib/local-inference.ts
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 1535-1536: The code currently generates ollamaProxyToken in
startOllamaAuthProxy() and immediately calls persistProxyToken(ollamaProxyToken)
which can leave a stale token if the user cancels or chooses a different
provider; change this so the token is only persisted after the provider
selection/validation is committed for the Ollama branch (i.e., persist inside
the final path that sets provider to "ollama-local" / after model selection
succeeds), or alternatively add a cleanup path that clears the persisted token
when the onboarding flow leaves the Ollama branch; update references to
ensureOllamaAuthProxy() usage so it only finds a persisted token when Ollama was
actually chosen and ensure ollamaProxyToken is persisted/cleared consistently
with that flow.
- Around line 1546-1579: The current readiness check in ensureOllamaAuthProxy
uses an unauthenticated GET /api/tags (via runCapture) which can pass against an
unrelated listener; replace it with a probe that verifies the specific proxy
instance — either (A) perform an authenticated probe using the persisted token
(call runCapture with the Authorization header / token-bound endpoint so the
request only succeeds against a proxy that accepts that token, using
loadPersistedProxyToken before probing and set ollamaProxyToken accordingly) or
(B) verify the actual proxy process identity (store/read the proxy PID/command
when starting in run(...) and in ensureOllamaAuthProxy validate that PID is
running and matches the expected command). Update ensureOllamaAuthProxy to
return only when the authenticated/token-bound probe or PID/command check
succeeds; use runCapture/run and the existing OLLAMA_PROXY_PORT,
ollamaProxyToken, loadPersistedProxyToken, and scripts/ollama-auth-proxy.js
symbols to locate and change the logic.
- Around line 1493-1499: persistProxyToken currently relies on
fs.writeFileSync(..., { mode: 0o600 }) which only sets permissions on new files;
update persistProxyToken to explicitly reset permissions after writing by
calling fs.chmodSync(PROXY_TOKEN_PATH, 0o600) following the write to ensure the
token file at PROXY_TOKEN_PATH always has restrictive 0600 permissions
regardless of prior file state.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 3df64d1d-08b3-4771-8705-2cfecf8e4dfd
📒 Files selected for processing (2)

- src/lib/onboard.ts
- test/e2e-ollama-proxy.sh
✅ Files skipped from review due to trivial changes (1)
- test/e2e-ollama-proxy.sh
- Test 5 now uses POST /api/generate (protected) instead of GET /api/tags (auth-exempt) to verify token auth actually works
- Gate the test-e2e-ollama-proxy CI job behind the changes filter to skip on docs-only PRs

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/pr.yaml:
- Around line 62-63: Replace the mutable ref for the setup-node action: locate
the uses entry "uses: actions/setup-node@v6" in the workflow and change it to
the equivalent immutable commit SHA (e.g., "uses:
actions/setup-node@<commit-sha> # v6") so the action is pinned to a specific
commit; ensure the chosen SHA matches the v6 release you intend to track.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 1db7bb91-c8e0-468d-8b3d-1e535b29ff28
📒 Files selected for processing (2)

- .github/workflows/pr.yaml
- test/e2e-ollama-proxy.sh
✅ Files skipped from review due to trivial changes (1)
- test/e2e-ollama-proxy.sh
…, proxy identity check

- Delay persisting the proxy token to disk until ollama-local is confirmed in setupInference, so backing out to another provider doesn't leave a stale token that resurrects the proxy on non-Ollama sandboxes
- ensureOllamaAuthProxy now verifies the proxy accepts our token via an authenticated POST (not the auth-exempt GET /api/tags), preventing false positives from stale proxies or unrelated listeners
- Add chmodSync after writeFileSync to ensure 0600 on existing files (the Node.js mode option only applies on file creation)

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comprehensive e2e test using real Ollama (not mocks):

- Install Ollama, pull qwen2.5:0.5b (small CPU model)
- Start Ollama on 127.0.0.1 only, start the auth proxy on 0.0.0.0:11435
- Token auth: reject unauthenticated, reject wrong token, accept correct
- Real inference: /v1/chat/completions and /api/generate through the proxy
- Token persistence: file exists, 0600 permissions, content matches
- Proxy recovery: kill, verify dead, restart from the persisted token, verify inference works after restart
- Container reachability: a Docker container can reach the proxy at host.openshell.internal:11435 but cannot reach Ollama directly on 11434

Triggered via workflow_dispatch (manual), not on every PR, due to Ollama download and CPU inference time (~3-4 min).

Also updates gpu-e2e test comments for the auth proxy (127.0.0.1 binding).

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses Aaron's review: "Add an end-to-end onboard/connect regression test that provisions ollama-local, persists the token, restarts/ensures the proxy, and verifies container-reachable inference through 11435."

Adds Phase 4.5 to the existing gpu-e2e test (runs after onboard, before inference):

- Token file persisted at ~/.nemoclaw/ollama-proxy-token
- Token file permissions 600
- Auth proxy running on :11435
- Proxy rejects unauthenticated POST (401)
- Proxy accepts the persisted token
- Container reaches the proxy at host.openshell.internal:11435
- Proxy recovery: kill the proxy, verify restart via ensureOllamaAuthProxy

Updates Phase 5 comments: the inference path now goes through the auth proxy (:11435) to Ollama (:11434).

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous recovery test called `nemoclaw status` to trigger ensureOllamaAuthProxy, but that function is only called on `sandboxConnect`, not `status`. Fix by restarting the proxy directly from the persisted token (simulating what ensureOllamaAuthProxy does on connect after a reboot). Also adds verification that the recovered proxy accepts the original persisted token. Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prevents lingering proxy process on reused runners after Phase 4.5g manually restarts the proxy for recovery testing. Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Closes #709. Supersedes #1140.
Ollama has no built-in auth. Today onboarding tells users to bind `0.0.0.0:11434`, which exposes Ollama to the entire network with no authentication (CWE-668, flagged in #1140). This reimplements the auth proxy approach from #679 against the current TypeScript codebase.

Fix: keep Ollama on localhost, add an authenticated proxy.

- Auth proxy (`scripts/ollama-auth-proxy.js`): lightweight Node.js reverse proxy on `0.0.0.0:11435` that validates a per-instance Bearer token before forwarding to Ollama on `127.0.0.1:11434`. Uses `crypto.timingSafeEqual` for timing-safe comparison. `GET /api/tags` is exempt for container health checks.
- Onboard integration: Ollama binds to `127.0.0.1` instead of `0.0.0.0`. The proxy starts automatically after Ollama with a random 24-byte token. Stale proxies from previous runs are cleaned up. Startup is verified before proceeding.
- Inference routing: the sandbox provider uses the proxy port (11435) with the generated token as the `OPENAI_API_KEY` credential. OpenShell's L7 proxy injects the token at egress.
- Platform fix: the macOS hint ("On macOS, local inference also depends on OpenShell host routing support.") is now gated on `process.platform === "darwin"` instead of always shown.

Test plan

- `nemoclaw onboard` with Local Ollama, verify inference works through the proxy

🤖 Generated with Claude Code