
fix(security): add authenticated reverse proxy for local Ollama #1922

Merged
brandonpelfrey merged 16 commits into main from fix/ollama-auth-proxy
Apr 16, 2026
Conversation

@prekshivyas (Contributor) commented Apr 15, 2026

Summary

Closes #709. Supersedes #1140.

Ollama has no built-in auth. Today onboarding tells users to bind 0.0.0.0:11434, which exposes Ollama to the entire network with no authentication (CWE-668, flagged in #1140). This reimplements the auth proxy approach from #679 against the current TypeScript codebase.

Fix — keep Ollama on localhost, add an authenticated proxy:

  1. Auth proxy (scripts/ollama-auth-proxy.js): Lightweight Node.js reverse proxy on 0.0.0.0:11435 that validates a per-instance Bearer token before forwarding to Ollama on 127.0.0.1:11434. Uses crypto.timingSafeEqual for timing-safe comparison. GET /api/tags is exempt for container health checks.

  2. Onboard integration: Ollama binds to 127.0.0.1 instead of 0.0.0.0. Proxy starts automatically after Ollama with a random 24-byte token. Stale proxies from previous runs are cleaned up. Startup is verified before proceeding.

  3. Inference routing: Sandbox provider uses proxy port (11435) with the generated token as the OPENAI_API_KEY credential. OpenShell's L7 proxy injects the token at egress.

  4. Platform fix: macOS hint ("On macOS, local inference also depends on OpenShell host routing support.") now gated on process.platform === "darwin" instead of always shown.

Test plan

  • 7 e2e tests (mock backend, no real Ollama needed): unauthenticated rejection, wrong token rejection, correct token proxying, health check exemption, POST-to-health-check requires auth
  • 34 unit tests passing (updated port expectations and error messages)
  • All pre-commit hooks pass (gitleaks, shellcheck, ESLint, commitlint, TypeScript)
  • Manual: nemoclaw onboard with Local Ollama, verify inference works through proxy

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Token-gated local authentication proxy for Ollama with persisted credential, automatic lifecycle management, and container-access routing; unauthenticated GET /api/tags remains allowed.
    • Host-side connect now ensures the proxy is running after reconnects.
  • Tests

    • Added end-to-end tests validating proxy routing and authentication behavior.
    • Updated local tests to expect proxy-mediated access.
  • Chores

    • CI job added to run proxy e2e tests; minor devDependency reordering.

Ollama has no built-in auth and binding to 0.0.0.0 exposes it to the
network (CWE-668, #1140). This adds an authenticated reverse proxy so
Ollama stays on localhost while containers can still reach it.

- Add scripts/ollama-auth-proxy.js — Node.js proxy on 0.0.0.0:11435
  that validates a per-instance Bearer token before forwarding to
  Ollama on 127.0.0.1:11434. Health check (GET /api/tags) is exempt.
  Uses crypto.timingSafeEqual for timing-safe token comparison.
- Bind Ollama to 127.0.0.1 instead of 0.0.0.0 during onboard
- Start the auth proxy after Ollama, with stale proxy cleanup and
  startup verification
- Route sandbox inference through proxy port (11435) with the
  generated token as the OpenAI API key credential
- Gate macOS hint on process.platform === "darwin"
- Add OLLAMA_PROXY_PORT (11435) to ports.ts
- Add 7 e2e tests and CI job for the proxy
- Update unit tests for new port and error messages

Reimplements the approach from #679 (closed in favor of #1104) against
the current TypeScript codebase, addressing CodeRabbit findings from
the original PR (timing-safe comparison, stale proxy cleanup, startup
verification).

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
coderabbitai bot commented Apr 15, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds an Ollama authentication reverse proxy, integrates it into onboarding and runtime to start/ensure/use the proxy and token, routes container reachability checks through the proxy, updates tests and adds an e2e proxy test plus a CI job to run that test.

Changes

Cohort / File(s): Summary

  • Proxy Server Implementation (scripts/ollama-auth-proxy.js): new Node.js reverse proxy requiring OLLAMA_PROXY_TOKEN (timing-safe compare); forwards to the backend at 127.0.0.1:${OLLAMA_BACKEND_PORT}, strips the authorization and host headers, allows unauthenticated GET /api/tags, and returns 401/502 on auth/backend failures.
  • Onboarding & Runtime (src/lib/onboard.ts, src/nemoclaw.ts): adds proxy token persistence (~/.nemoclaw/ollama-proxy-token) and start/kill/ensure helpers for the auth proxy; changes non-WSL Ollama binding to 127.0.0.1; uses the proxy token as OPENAI_API_KEY for non-WSL ollama-local; exports ensureOllamaAuthProxy and calls it during sandbox connect.
  • Local Inference Routing & Tests (src/lib/local-inference.ts, src/lib/local-inference.test.ts): introduces OLLAMA_CONTAINER_PORT (uses OLLAMA_PROXY_PORT when not WSL); updates the base URL and container reachability check to host.openshell.internal:${OLLAMA_CONTAINER_PORT}; tests are parameterized for the port and expect proxy-related failure text.
  • Port Configuration (src/lib/ports.ts): adds exported OLLAMA_PROXY_PORT (env NEMOCLAW_OLLAMA_PROXY_PORT, default 11435).
  • E2E Test Script & CI (test/e2e-ollama-proxy.sh, .github/workflows/pr.yaml): adds an e2e script that launches a mock backend and the auth proxy to validate auth success/failure and the health-check exemption; adds a test-e2e-ollama-proxy job to the PR workflow, run when code changes.
  • Miscellaneous (package.json): reordered a devDependency entry for ajv (no version change).
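The port resolution in src/lib/ports.ts, as summarized above, amounts to an env override with a numeric default. A hedged sketch (`resolveProxyPort` is a hypothetical name; the env variable and default come from the summary):

```javascript
// Resolve the auth proxy port: honor NEMOCLAW_OLLAMA_PROXY_PORT when it is
// a valid TCP port, otherwise fall back to the documented default, 11435.
function resolveProxyPort(env) {
  const raw = env.NEMOCLAW_OLLAMA_PROXY_PORT;
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 && parsed <= 65535
    ? parsed
    : 11435;
}
```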

Sequence Diagram(s)

sequenceDiagram
    participant Container as Client Container
    participant Proxy as Ollama Auth Proxy\n(Port 11435)
    participant Backend as Ollama Backend\n(Port 11434)

    rect rgba(0, 200, 100, 0.5)
    Note over Container,Backend: Authenticated Request Flow
    Container->>Proxy: POST /api/generate\nAuthorization: Bearer {token}
    Proxy->>Proxy: Validate token (timing-safe)
    Proxy->>Backend: Forward request (strip auth/host)
    Backend->>Proxy: Response stream
    Proxy->>Container: Response stream
    end

    rect rgba(200, 100, 0, 0.5)
    Note over Container,Proxy: Unauthorized Request Flow
    Container->>Proxy: POST /api/generate (no/invalid token)
    Proxy->>Container: HTTP 401 Unauthorized
    end

    rect rgba(100, 150, 200, 0.5)
    Note over Container,Backend: Health Check (Exempted)
    Container->>Proxy: GET /api/tags (no auth)
    Proxy->>Backend: Forward health check
    Backend->>Proxy: Model list
    Proxy->>Container: Model list
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 I guard the tunnel on eleven-three-five,

Tokens snug while health checks may thrive.
I strip the headers, forward the call,
Reject the pretenders, let models stand tall.
A hopping proxy, so onboarding won't stall.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 25.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

  • Description Check: skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: the title clearly summarizes the main change: adding an authenticated reverse proxy for local Ollama to improve security and container reachability.
  • Linked Issues Check: the PR fully addresses issue #709's objective of enabling containerized components to reach local Ollama without binding to 0.0.0.0, via a lightweight authenticated proxy.
  • Out of Scope Changes Check: all changes directly support the authenticated proxy implementation (proxy script, onboarding integration, inference routing, tests, CI job); no unrelated modifications.



Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@wscurran added the security, Local Models, priority: high, NemoClaw CLI, and fix labels on Apr 15, 2026
@coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
.github/workflows/pr.yaml (1)

54-65: Gate this E2E job behind the existing changes filter.

Right now this runs on docs-only PRs too, so it bypasses the fast path the workflow already has for expensive jobs. If that is not intentional, add the same needs: [checks, changes] / if: needs.changes.outputs.code == 'true' guard used by sandbox-images-and-e2e.

♻️ Suggested change
  test-e2e-ollama-proxy:
+   needs: [checks, changes]
+   if: needs.changes.outputs.code == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 5
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/pr.yaml around lines 54 - 65, The E2E job
"test-e2e-ollama-proxy" should be gated by the existing changes filter: add the
same needs and conditional used by "sandbox-images-and-e2e" by adding needs:
[checks, changes] to the job definition and adding if:
needs.changes.outputs.code == 'true' so the job only runs when code changes are
present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 1487-1522: Persist the generated ollamaProxyToken into the
onboarding session and handle resume: when startOllamaAuthProxy generates the
token, save it to onboardSession.ollamaProxyToken (or a similarly named field)
and reuse that value in getOllamaProxyToken instead of only the in-memory
variable; when resuming (e.g. in setupInference / after setupNim is skipped)
detect if the session provider is "ollama-local" and if
onboardSession.ollamaProxyToken exists reuse it and ensure the proxy is running,
otherwise call startOllamaAuthProxy to recreate the proxy and update
onboardSession.ollamaProxyToken so resumed runs always have a valid token and
proxy.
- Around line 1491-1499: The cleanup currently kills any PID from
runCapture(`lsof -ti :${OLLAMA_PROXY_PORT}`), which may terminate unrelated
services; instead track and validate the proxy process before killing: when you
spawn the proxy (where the proxy is started), record its PID (or write it to a
known file) and on cleanup read that PID and verify its command line contains
the proxy marker (e.g., "ollama-auth-proxy.js") using a ps lookup (or filter
lsof output by command) before calling run(`kill ...`); update the cleanup block
(the code using run, runCapture and OLLAMA_PROXY_PORT) to only kill the verified
PID and fall back to no-op if verification fails.

In `@test/e2e-ollama-proxy.sh`:
- Around line 17-20: The cleanup function and EXIT trap must guard against unset
PID variables to avoid unbound-variable errors under set -u; update cleanup (and
any EXIT trap invocation) to safely reference MOCK_PID and PROXY_PID using
parameter expansion or existence checks (e.g. ${MOCK_PID-} / ${PROXY_PID-} or if
[ -n "${MOCK_PID-}" ] ) before calling kill, so cleanup() only attempts kill
when the PID variables are set and non-empty.

---

Nitpick comments:
In @.github/workflows/pr.yaml:
- Around line 54-65: The E2E job "test-e2e-ollama-proxy" should be gated by the
existing changes filter: add the same needs and conditional used by
"sandbox-images-and-e2e" by adding needs: [checks, changes] to the job
definition and adding if: needs.changes.outputs.code == 'true' so the job only
runs when code changes are present.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: f4037d3e-ad05-4c00-a2da-d3698480e911

📥 Commits

Reviewing files that changed from the base of the PR and between f079a37 and 2f505e1.

📒 Files selected for processing (7)
  • .github/workflows/pr.yaml
  • scripts/ollama-auth-proxy.js
  • src/lib/local-inference.test.ts
  • src/lib/local-inference.ts
  • src/lib/onboard.ts
  • src/lib/ports.ts
  • test/e2e-ollama-proxy.sh

prekshivyas and others added 2 commits April 15, 2026 14:49
- Persist proxy token to ~/.nemoclaw/ollama-proxy-token (mode 0600)
  so it survives process restarts and onboard --resume
- Add ensureOllamaAuthProxy() called on sandbox connect to auto-restart
  the proxy after host reboots
- Restore WSL2 compatibility: skip proxy on WSL2 where Docker reaches
  the host directly (#1104), use OLLAMA_CONTAINER_PORT that adapts
  per platform
- Fix e2e test: replace invalid 0.0.0.0 reachability test with
  localhost liveness check (0.0.0.0 as destination routes to loopback
  on both Linux and macOS)

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- killStaleProxy() now verifies process command contains
  "ollama-auth-proxy" before killing, avoiding termination of
  unrelated services on the same port
- Initialize MOCK_PID/PROXY_PID and guard EXIT trap with ${:-}
  to prevent unbound-variable errors under set -u

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
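The killStaleProxy() guard described in the commit above, reduced to its testable core (names are illustrative, not the PR's code; `getCommand` stands in for a `ps -o command= -p <pid>` lookup):

```javascript
// Decide which PIDs listening on the proxy port are safe to kill: only
// those whose command line contains the proxy marker, so unrelated
// services that happen to hold the port are never terminated.
function staleProxyPids(pids, getCommand, marker = "ollama-auth-proxy") {
  return pids.filter((pid) => {
    try {
      return (getCommand(pid) || "").includes(marker);
    } catch {
      // Process vanished between lookup and check: nothing to kill.
      return false;
    }
  });
}
```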
@coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
test/e2e-ollama-proxy.sh (1)

17-20: ⚠️ Potential issue | 🟡 Minor

Guard cleanup against unset PIDs under set -u.

Lines 17-20 can throw an unbound-variable error if the script exits before line 47 or 55, which obscures the real failure.

🐚 Suggested fix
+MOCK_PID=""
+PROXY_PID=""
+
 cleanup() {
-  kill "$MOCK_PID" 2>/dev/null || true
-  kill "$PROXY_PID" 2>/dev/null || true
+  [ -n "${MOCK_PID:-}" ] && kill "$MOCK_PID" 2>/dev/null || true
+  [ -n "${PROXY_PID:-}" ] && kill "$PROXY_PID" 2>/dev/null || true
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e-ollama-proxy.sh` around lines 17 - 20, The cleanup function calls
kill on MOCK_PID and PROXY_PID which will cause an unbound-variable error under
set -u if those PIDs were never set; update cleanup to test each PID variable
safely (e.g. use parameter expansion like "${MOCK_PID-}" / "${PROXY_PID-}" or an
explicit [ -n "${MOCK_PID-}" ] check) before calling kill, and ensure you only
attempt kill when the variable is non-empty to avoid masking the original error
and to clean up processes reliably.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e-ollama-proxy.sh`:
- Around line 89-96: The test currently calls the auth-exempt endpoint /api/tags
so it doesn't validate auth; change the request in test/e2e-ollama-proxy.sh that
builds BODY (using CORRECT_AUTH, TOKEN, PROXY_PORT) to call a protected endpoint
such as /api/generate instead of /api/tags, use the correct HTTP method and JSON
payload required by /api/generate in the curl invocation (keep the -H
"Authorization: $CORRECT_AUTH" header), then assert the response BODY contains
the expected generation marker and keep the same pass/fail logic.

---

Duplicate comments:
In `@test/e2e-ollama-proxy.sh`:
- Around line 17-20: The cleanup function calls kill on MOCK_PID and PROXY_PID
which will cause an unbound-variable error under set -u if those PIDs were never
set; update cleanup to test each PID variable safely (e.g. use parameter
expansion like "${MOCK_PID-}" / "${PROXY_PID-}" or an explicit [ -n
"${MOCK_PID-}" ] check) before calling kill, and ensure you only attempt kill
when the variable is non-empty to avoid masking the original error and to clean
up processes reliably.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: cc76b675-5b18-435b-bf2e-7b78105bf763

📥 Commits

Reviewing files that changed from the base of the PR and between 8ba4eca and 1916df2.

📒 Files selected for processing (6)
  • package.json
  • src/lib/local-inference.test.ts
  • src/lib/local-inference.ts
  • src/lib/onboard.ts
  • src/nemoclaw.ts
  • test/e2e-ollama-proxy.sh
✅ Files skipped from review due to trivial changes (1)
  • package.json
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/lib/local-inference.test.ts
  • src/lib/onboard.ts
  • src/lib/local-inference.ts

@coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 1535-1536: The code currently generates ollamaProxyToken in
startOllamaAuthProxy() and immediately calls persistProxyToken(ollamaProxyToken)
which can leave a stale token if the user cancels or chooses a different
provider; change this so the token is only persisted after the provider
selection/validation is committed for the Ollama branch (i.e., persist inside
the final path that sets provider to "ollama-local" / after model selection
succeeds), or alternatively add a cleanup path that clears the persisted token
when the onboarding flow leaves the Ollama branch; update references to
ensureOllamaAuthProxy() usage so it only finds a persisted token when Ollama was
actually chosen and ensure ollamaProxyToken is persisted/cleared consistently
with that flow.
- Around line 1546-1579: The current readiness check in ensureOllamaAuthProxy
uses an unauthenticated GET /api/tags (via runCapture) which can pass against an
unrelated listener; replace it with a probe that verifies the specific proxy
instance — either (A) perform an authenticated probe using the persisted token
(call runCapture with the Authorization header / token-bound endpoint so the
request only succeeds against a proxy that accepts that token, using
loadPersistedProxyToken before probing and set ollamaProxyToken accordingly) or
(B) verify the actual proxy process identity (store/read the proxy PID/command
when starting in run(...) and in ensureOllamaAuthProxy validate that PID is
running and matches the expected command). Update ensureOllamaAuthProxy to
return only when the authenticated/token-bound probe or PID/command check
succeeds; use runCapture/run and the existing OLLAMA_PROXY_PORT,
ollamaProxyToken, loadPersistedProxyToken, and scripts/ollama-auth-proxy.js
symbols to locate and change the logic.
- Around line 1493-1499: persistProxyToken currently relies on
fs.writeFileSync(..., { mode: 0o600 }) which only sets permissions on new files;
update persistProxyToken to explicitly reset permissions after writing by
calling fs.chmodSync(PROXY_TOKEN_PATH, 0o600) following the write to ensure the
token file at PROXY_TOKEN_PATH always has restrictive 0600 permissions
regardless of prior file state.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 3df64d1d-08b3-4771-8705-2cfecf8e4dfd

📥 Commits

Reviewing files that changed from the base of the PR and between 1916df2 and adfb16e.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • test/e2e-ollama-proxy.sh
✅ Files skipped from review due to trivial changes (1)
  • test/e2e-ollama-proxy.sh

- Test 5 now uses POST /api/generate (protected) instead of GET
  /api/tags (auth-exempt) to verify token auth actually works
- Gate test-e2e-ollama-proxy CI job behind changes filter to skip
  on docs-only PRs

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/pr.yaml:
- Around line 62-63: Replace the mutable ref for the setup-node action: locate
the uses entry "uses: actions/setup-node@v6" in the workflow and change it to
the equivalent immutable commit SHA (e.g., "uses:
actions/setup-node@<commit-sha> # v6") so the action is pinned to a specific
commit; ensure the chosen SHA matches the v6 release you intend to track.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 1db7bb91-c8e0-468d-8b3d-1e535b29ff28

📥 Commits

Reviewing files that changed from the base of the PR and between adfb16e and 962caa6.

📒 Files selected for processing (2)
  • .github/workflows/pr.yaml
  • test/e2e-ollama-proxy.sh
✅ Files skipped from review due to trivial changes (1)
  • test/e2e-ollama-proxy.sh

…, proxy identity check

- Delay persisting proxy token to disk until ollama-local is confirmed
  in setupInference, so backing out to another provider doesn't leave
  a stale token that resurrects the proxy on non-Ollama sandboxes
- ensureOllamaAuthProxy now verifies the proxy accepts our token via
  an authenticated POST (not auth-exempt GET /api/tags), preventing
  false positives from stale proxies or unrelated listeners
- Add chmodSync after writeFileSync to ensure 0600 on existing files
  (Node.js mode option only applies on file creation)

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
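The authenticated readiness probe from the commit message above could look roughly like this (`proxyAcceptsToken` is a hypothetical name; the endpoint choice follows the commit's description of an authenticated POST rather than the auth-exempt GET /api/tags):

```javascript
// Probe whether the proxy on `port` is alive AND accepts our token, by
// hitting a protected endpoint with the Authorization header. A stale
// proxy holding a different token, or an unrelated listener, fails the
// probe instead of producing a false positive.
async function proxyAcceptsToken(port, token) {
  try {
    const res = await fetch(`http://127.0.0.1:${port}/api/generate`, {
      method: "POST",
      headers: {
        authorization: `Bearer ${token}`,
        "content-type": "application/json",
      },
      body: JSON.stringify({}),
    });
    // Any status other than 401/403 means the auth layer accepted the
    // token, even if the backend rejects the empty payload.
    return res.status !== 401 && res.status !== 403;
  } catch {
    // Connection refused: proxy not running.
    return false;
  }
}
```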
@cv added the v0.0.18 (Release target) label on Apr 16, 2026
cv and others added 8 commits April 16, 2026 00:45
Comprehensive e2e test using real Ollama (not mocks):
- Install Ollama, pull qwen2.5:0.5b (small CPU model)
- Start Ollama on 127.0.0.1 only, start auth proxy on 0.0.0.0:11435
- Token auth: reject unauthenticated, reject wrong token, accept correct
- Real inference: /v1/chat/completions and /api/generate through proxy
- Token persistence: file exists, 0600 permissions, content matches
- Proxy recovery: kill, verify dead, restart from persisted token,
  verify inference works after restart
- Container reachability: Docker container can reach proxy at
  host.openshell.internal:11435, cannot reach Ollama directly on 11434

Triggered via workflow_dispatch (manual) — not on every PR due to
Ollama download and CPU inference time (~3-4 min).

Also updates gpu-e2e test comments for auth proxy (127.0.0.1 binding).

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses Aaron's review: "Add an end-to-end onboard/connect regression
test that provisions ollama-local, persists the token, restarts/ensures
the proxy, and verifies container-reachable inference through 11435."

Adds Phase 4.5 to the existing gpu-e2e test (runs after onboard,
before inference):
- Token file persisted at ~/.nemoclaw/ollama-proxy-token
- Token file permissions 600
- Auth proxy running on :11435
- Proxy rejects unauthenticated POST (401)
- Proxy accepts persisted token
- Container reaches proxy at host.openshell.internal:11435
- Proxy recovery: kill proxy, verify restart via ensureOllamaAuthProxy

Updates Phase 5 comments: inference path now goes through auth proxy
(:11435) → Ollama (:11434).

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous recovery test called `nemoclaw status` to trigger
ensureOllamaAuthProxy, but that function is only called on
`sandboxConnect`, not `status`. Fix by restarting the proxy
directly from the persisted token (simulating what
ensureOllamaAuthProxy does on connect after a reboot).

Also adds verification that the recovered proxy accepts the
original persisted token.

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prevents lingering proxy process on reused runners after Phase 4.5g
manually restarts the proxy for recovery testing.

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@brandonpelfrey brandonpelfrey merged commit 4f30b9d into main Apr 16, 2026
9 checks passed
@prekshivyas prekshivyas deleted the fix/ollama-auth-proxy branch April 16, 2026 17:36


Development

Successfully merging this pull request may close these issues.

[NeMoClaw][Ubuntu + Docker CE] Linux onboarding for Ollama does not explain required container-reachable bind address

5 participants