fix(onboard): propagate rotated messaging credentials to sandbox L7 proxy by ericksoa · Pull Request #1967 · NVIDIA/NemoClaw

ericksoa · 2026-04-16T14:47:13Z

Summary

Fixes credential rotation bug where re-running nemoclaw onboard --non-interactive with a new messaging token (e.g. TELEGRAM_BOT_TOKEN) reports success but the sandbox continues using the old credential
Stores SHA-256 hashes of messaging credentials in the sandbox registry at creation time; on reuse, compares current tokens against stored hashes to detect rotation
When rotation is detected, automatically rebuilds the sandbox (backup → delete → create with new providers → restore), preserving workspace state

Closes #1903

Changes

File	Change
`src/lib/registry.ts`	Add `providerCredentialHashes` field to `SandboxEntry`
`src/lib/onboard.ts`	Add `hashCredential()`, `detectMessagingCredentialRotation()`, modify reuse path to detect rotation and trigger rebuild with backup/restore, store hashes at registration
`test/credential-rotation.test.ts`	11 unit tests for hash and rotation detection logic
`test/e2e/test-token-rotation.sh`	E2E test: onboard with token A → rotate to token B (verify rebuild + workspace preservation) → re-onboard with same token B (verify reuse)

Design Notes

Legacy sandboxes (created before this fix) have no stored hashes → detectMessagingCredentialRotation returns changed: false (conservative, no false rebuilds)
Workspace preservation uses existing sandbox-state.ts backup/restore machinery (same as sandboxRebuild)

Test plan

npx vitest run test/credential-rotation.test.ts — 11/11 pass
npx vitest run test/onboard.test.ts — 117/117 existing tests pass (no regressions)
Full test suite — 1583/1583 tests pass
tsc -p tsconfig.cli.json --noEmit — clean
All pre-commit and pre-push hooks pass
E2E: test/e2e/test-token-rotation.sh with real Telegram tokens

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Detects messaging credential rotation (e.g., Telegram tokens) and triggers sandbox rebuilds when needed
- Attempts workspace backup before rebuild and restore after to preserve state
- Persists per-credential hashes in the local sandbox registry for rotation tracking
Tests
- Unit tests for credential-rotation detection and hashing
- End-to-end test validating token rotation, rebuild behavior, and workspace preservation
Chores
- Nightly E2E workflow updated to run the token-rotation job

…roxy When a user re-runs `nemoclaw onboard` with a rotated Telegram bot token, the provider update succeeds at the gateway level but the running sandbox's L7 proxy continues using the old credential. OpenShell v0.0.26 only attaches providers at `sandbox create` time and has no hot-reload mechanism. Fix: store SHA-256 hashes of messaging credentials in the sandbox registry at creation time. On reuse, compare current tokens against stored hashes. When rotation is detected, automatically rebuild the sandbox (backup → delete → create with new providers → restore) to propagate the new credential through the L7 proxy. Closes #1903 Ref: nvbug 6083165 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-16T14:47:26Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds SHA‑256 credential hashing and rotation detection; integrates rotation checks into sandbox reuse/recreate flow; persists per‑credential hashes in the sandbox registry; performs workspace backup/restore around sandbox rebuilds when rotation is detected. Adds unit and E2E tests and a nightly CI job for token-rotation E2E.

Changes

Cohort / File(s)	Summary
Core logic & exports `src/lib/onboard.ts`	Added and exported `hashCredential()` and `detectMessagingCredentialRotation()`; integrated rotation detection into sandbox creation to block reuse when messaging credentials changed; added workspace backup/restore around sandbox rebuilds; store per-credential hashes on registration.
Registry type & persistence `src/lib/registry.ts`	Extended exported `SandboxEntry` with optional `providerCredentialHashes?: Record<string,string>` and updated `registerSandbox()` to persist the field when non-empty.
Unit tests `test/credential-rotation.test.ts`	New Vitest suite validating `hashCredential()` and `detectMessagingCredentialRotation()` across legacy, matching, differing, multi-provider, null-token, and missing-sandbox scenarios; uses dynamic require and mocks.
E2E test script `test/e2e/test-token-rotation.sh`	New Bash E2E that runs three onboarding phases (initial token A, rotate to token B expecting rebuild + state restore, re-onboard expecting reuse); validates registry `providerCredentialHashes`, sandbox/provider presence, and workspace persistence.
CI workflow `.github/workflows/nightly-e2e.yaml`	Added `token-rotation-e2e` job to nightly E2E, wired it into `notify-on-failure.needs` and updated failure condition; uploads install log on failure.

Sequence Diagram

sequenceDiagram
    participant User
    participant Onboard as Onboard Process
    participant Registry
    participant State as WorkspaceState
    participant Sandbox

    User->>Onboard: run onboard --non-interactive (with messaging tokens)
    Onboard->>Registry: get sandbox entry
    Registry-->>Onboard: sandbox entry (may include providerCredentialHashes)
    Onboard->>Onboard: hash current tokens, compare to stored hashes
    alt rotation detected
        Onboard->>State: backup workspace
        State-->>Onboard: backup artifact
        Onboard->>Sandbox: delete existing sandbox
        Onboard->>Sandbox: create new sandbox
        Onboard->>Registry: register sandbox + new providerCredentialHashes
        Registry-->>Onboard: confirm registration
        Onboard->>State: restore workspace into new sandbox
        State-->>Sandbox: workspace restored
    else no rotation
        Onboard->>Sandbox: reuse existing sandbox
    end
    Onboard-->>User: onboarding complete

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I nibble tokens, trim and hash,

when old ones change I make a splash,
I tuck your files into a chest,
rebuild the burrow, do my best,
then hop back in — your work's at rest.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 45.45% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: fixing credential rotation propagation to the sandbox L7 proxy by detecting rotation and rebuilding the sandbox with new credentials.
Linked Issues check	✅ Passed	The PR implements all core coding requirements from `#1903`: credential hash storage, rotation detection, automatic sandbox rebuild on rotation, and unit/E2E tests covering token rotation scenarios.
Out of Scope Changes check	✅ Passed	All changes align with the linked issue: credential hashing and rotation detection in onboard.ts, registry field for storing hashes, unit tests for detection logic, E2E script for token rotation validation, and CI workflow updates.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/credential-rotation-propagation

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/lib/registry.ts (1)

153-169: ⚠️ Potential issue | 🔴 Critical

Persist providerCredentialHashes when registering a sandbox.

createSandbox() now passes providerCredentialHashes, but this constructor drops the field. Every new sandbox is therefore recorded as a legacy entry, so detectMessagingCredentialRotation() never sees stored hashes and the rotation-rebuild path won't fire on the next onboard run.

Suggested fix

     data.sandboxes[entry.name] = {
       name: entry.name,
       createdAt: entry.createdAt || new Date().toISOString(),
       model: entry.model || null,
       nimContainer: entry.nimContainer || null,
       provider: entry.provider || null,
       gpuEnabled: entry.gpuEnabled || false,
       policies: entry.policies || [],
       policyTier: entry.policyTier || null,
       agent: entry.agent || null,
       dangerouslySkipPermissions:
         entry.dangerouslySkipPermissions === true ? true : undefined,
       agentVersion: entry.agentVersion || null,
+      providerCredentialHashes:
+        entry.providerCredentialHashes &&
+        Object.keys(entry.providerCredentialHashes).length > 0
+          ? { ...entry.providerCredentialHashes }
+          : undefined,
     };

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/lib/registry.ts` around lines 153 - 169, registerSandbox is not
persisting providerCredentialHashes from the incoming SandboxEntry so new
sandboxes are recorded without hashes and break
detectMessagingCredentialRotation; update the object assigned to
data.sandboxes[entry.name] inside registerSandbox to include
providerCredentialHashes: entry.providerCredentialHashes || null (or undefined
as appropriate) so the stored entry retains the hashes passed by createSandbox;
ensure the property name matches SandboxEntry and any downstream checks that
read providerCredentialHashes from load().

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 2699-2715: The code calls
sandboxState.backupSandboxState(sandboxName) but only treats thrown errors; if
backup.success === false the code should abort the automatic rebuild to avoid
destroying the workspace. After calling sandboxState.backupSandboxState inside
the try, check backup.success and if false log the failure (include
backup.message or details), do not set pendingStateRestore, and abort the
rebuild flow (for example return early or throw an error) so that
credentialRotation.changed does not trigger sandbox destruction; reference the
symbols credentialRotation.changed, sandboxState.backupSandboxState,
pendingStateRestore, existingSandboxState, and sandboxName when making the
change.

In `@test/e2e/test-token-rotation.sh`:
- Around line 156-163: The exported env vars in the script (OPENAI_BASE_URL,
NEMOCLAW_INFERENCE_PROVIDER, NEMOCLAW_INFERENCE_MODEL) don't match the
non-interactive env var names consumed by the onboarding logic in
src/lib/onboard.ts, so onboarding can ignore the fake endpoint; update the
test/e2e/test-token-rotation.sh exports to set the exact non-interactive env var
names that onboard reads (replace the three ignored vars with the corresponding
env var names used in src/lib/onboard.ts) so the fake OPENAI endpoint and
provider/model are actually bound during onboarding, and remove the stale/unused
exports.
- Around line 190-223: The probe writes the marker to /tmp which is outside the
agent-backed stateDirs handled by backupSandboxState()/restoreSandboxState(),
and the test currently only logs info if the marker is missing; change the test
to write the marker into one of the backed-up state dirs (use the agent's
writable/state directory used by restoreSandboxState(), e.g. the same dir
referenced by stateDirs) instead of /tmp and then assert failure if the marker
is absent after rotation (replace the info branch that logs missing marker with
a fail call); update references to the marker variable (MARKER) and the
openshell sandbox exec calls that read the marker so they point at the backed-up
path rather than /tmp/rotation-test-marker.
- Around line 196-235: After detecting rotation in ONBOARD_OUTPUT and confirming
sandbox rebuild, add an active verification inside the sandbox using openshell
sandbox exec "$SANDBOX_NAME" to call the Telegram getMe endpoint with the old
token (token A) and the new token (token B): run getMe (or an equivalent HTTP
call) from within the sandbox to assert token A now fails and token B succeeds;
place these checks after the "Sandbox running after rotation" block and before
the Phase 3 re-onboard assertions, reusing NEMOCLAW_CMD, SANDBOX_NAME and the
existing openshell helper to locate where to run the in-sandbox commands and
perform pass/fail on their responses.

---

Outside diff comments:
In `@src/lib/registry.ts`:
- Around line 153-169: registerSandbox is not persisting
providerCredentialHashes from the incoming SandboxEntry so new sandboxes are
recorded without hashes and break detectMessagingCredentialRotation; update the
object assigned to data.sandboxes[entry.name] inside registerSandbox to include
providerCredentialHashes: entry.providerCredentialHashes || null (or undefined
as appropriate) so the stored entry retains the hashes passed by createSandbox;
ensure the property name matches SandboxEntry and any downstream checks that
read providerCredentialHashes from load().

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: da46481e-75fe-4c94-b643-43bcde0d5782

📥 Commits

Reviewing files that changed from the base of the PR and between 121148c and 8114d31.

📒 Files selected for processing (4)

src/lib/onboard.ts
src/lib/registry.ts
test/credential-rotation.test.ts
test/e2e/test-token-rotation.sh

coderabbitai · 2026-04-16T14:56:00Z

+export TELEGRAM_BOT_TOKEN="$TELEGRAM_BOT_TOKEN_A"
+export NEMOCLAW_SANDBOX_NAME="$SANDBOX_NAME"
+export NEMOCLAW_POLICY_TIER="open"
+export NEMOCLAW_NON_INTERACTIVE=1
+export OPENAI_API_KEY="fake-key"
+export OPENAI_BASE_URL="$FAKE_BASE_URL"
+export NEMOCLAW_INFERENCE_PROVIDER="openai-compatible"
+export NEMOCLAW_INFERENCE_MODEL="fake-model"


⚠️ Potential issue | 🟠 Major

Use the non-interactive env vars that onboard actually reads.

These assignments don't match the CLI contract in src/lib/onboard.ts: OPENAI_BASE_URL, NEMOCLAW_INFERENCE_PROVIDER, and NEMOCLAW_INFERENCE_MODEL are ignored here. As written, this script won't bind onboarding to the fake endpoint, so it can silently exercise the wrong provider path.

Suggested fix

-export OPENAI_API_KEY="fake-key" -export OPENAI_BASE_URL="$FAKE_BASE_URL" -export NEMOCLAW_INFERENCE_PROVIDER="openai-compatible" -export NEMOCLAW_INFERENCE_MODEL="fake-model" +export NEMOCLAW_PROVIDER="custom" +export NEMOCLAW_MODEL="fake-model" +export NEMOCLAW_ENDPOINT_URL="$FAKE_BASE_URL" +export COMPATIBLE_API_KEY="fake-key" +export NEMOCLAW_PREFERRED_API="openai-completions"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export TELEGRAM_BOT_TOKEN="$TELEGRAM_BOT_TOKEN_A"

export NEMOCLAW_SANDBOX_NAME="$SANDBOX_NAME"

export NEMOCLAW_POLICY_TIER="open"

export NEMOCLAW_NON_INTERACTIVE=1

export OPENAI_API_KEY="fake-key"

export OPENAI_BASE_URL="$FAKE_BASE_URL"

export NEMOCLAW_INFERENCE_PROVIDER="openai-compatible"

export NEMOCLAW_INFERENCE_MODEL="fake-model"

export TELEGRAM_BOT_TOKEN="$TELEGRAM_BOT_TOKEN_A"

export NEMOCLAW_SANDBOX_NAME="$SANDBOX_NAME"

export NEMOCLAW_POLICY_TIER="open"

export NEMOCLAW_NON_INTERACTIVE=1

export NEMOCLAW_PROVIDER="custom"

export NEMOCLAW_MODEL="fake-model"

export NEMOCLAW_ENDPOINT_URL="$FAKE_BASE_URL"

export COMPATIBLE_API_KEY="fake-key"

export NEMOCLAW_PREFERRED_API="openai-completions"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-token-rotation.sh` around lines 156 - 163, The exported env vars in the script (OPENAI_BASE_URL, NEMOCLAW_INFERENCE_PROVIDER, NEMOCLAW_INFERENCE_MODEL) don't match the non-interactive env var names consumed by the onboarding logic in src/lib/onboard.ts, so onboarding can ignore the fake endpoint; update the test/e2e/test-token-rotation.sh exports to set the exact non-interactive env var names that onboard reads (replace the three ignored vars with the corresponding env var names used in src/lib/onboard.ts) so the fake OPENAI endpoint and provider/model are actually bound during onboarding, and remove the stale/unused exports.

This is a dead code path — the test now uses install.sh --non-interactive which handles inference setup internally. The old fake OpenAI env vars were removed in the rewrite. The test validates rotation detection and rebuild behavior, not inference routing.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash cat test/e2e/test-token-rotation.sh

Length of output: 6619

@ericksoa, confirmed — the full file on the branch has been rewritten. The fake OpenAI server, start_fake_openai, and all the flagged env vars (OPENAI_API_KEY, OPENAI_BASE_URL, NEMOCLAW_INFERENCE_PROVIDER, NEMOCLAW_INFERENCE_MODEL) are gone entirely. Phase 0 now delegates to bash install.sh --non-interactive, and the test only validates rotation detection and rebuild behavior via nemoclaw onboard --non-interactive output. My original comment was based on an older snapshot in the diff and doesn't apply to the current code. Sorry for the noise!

🧠 Learnings used

Learnt from: ericksoa Repo: NVIDIA/NemoClaw PR: 1849 File: src/nemoclaw.ts:2168-2176 Timestamp: 2026-04-16T17:11:07.305Z Learning: In `src/nemoclaw.ts` (TypeScript, NemoClaw CLI), the `rotate-token` command's `--from-env` flag validation is intentionally delegated to `configRotateToken()` in `src/lib/sandbox-config.ts`. If `--from-env` is the last CLI argument, `tokenOpts.fromEnv` becomes `undefined`; `configRotateToken` then checks `process.env[opts.fromEnv]` and errors on empty/undefined. No guard at the CLI dispatch layer is needed or expected — do not flag the missing value-presence check for `--from-env` in the rotate-token flag parser.

Learnt from: ericksoa Repo: NVIDIA/NemoClaw PR: 1849 File: test/shields-audit.test.ts:29-31 Timestamp: 2026-04-16T17:09:26.055Z Learning: In `test/shields-audit.test.ts` (NemoClaw project), the local `appendAuditEntry()` reimplementation is intentional: (1) the test's purpose is to validate JSONL format and field structure only, (2) redaction is covered separately in `test/secret-redaction.test.ts`, and (3) the CJS/ESM boundary in `src/lib/shields-audit.ts` prevents calling the real logger directly from Vitest. Do not flag the local reimplementation as a test-coverage gap in future reviews.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-16T15:00:37Z

❌ Brev E2E (messaging-providers): FAILED on branch fix/credential-rotation-propagation — See logs

Wires test/e2e/test-token-rotation.sh into the nightly-e2e workflow with fake Telegram tokens. Validates that rotating a messaging credential and re-running onboard triggers sandbox rebuild and propagates the new token. Ref: #1903 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 587-588: The notify-on-failure aggregation is missing the
shields-config-e2e job: add "shields-config-e2e" to the needs: [...] list and
include "needs.shields-config-e2e.result == 'failure' ||" into the long if:
conditional so any failure of shields-config-e2e will trigger the
notify-on-failure job; update the existing if expression where other
needs.*.result checks appear to include this new check in the same OR chain.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: faa12449-191f-4c3a-9f00-6fb3e047720a

📥 Commits

Reviewing files that changed from the base of the PR and between d2d8135 and c93c339.

📒 Files selected for processing (1)

.github/workflows/nightly-e2e.yaml

coderabbitai · 2026-04-16T17:14:43Z

+    needs: [cloud-e2e, cloud-experimental-e2e, messaging-providers-e2e, token-rotation-e2e, sandbox-survival-e2e, hermes-e2e, skip-permissions-e2e, sandbox-operations-e2e, inference-routing-e2e, snapshot-commands-e2e, rebuild-openclaw-e2e, upgrade-stale-sandbox-e2e, rebuild-hermes-e2e, gpu-e2e]
+    if: ${{ always() && (needs.cloud-e2e.result == 'failure' || needs.cloud-experimental-e2e.result == 'failure' || needs.messaging-providers-e2e.result == 'failure' || needs.token-rotation-e2e.result == 'failure' || needs.sandbox-survival-e2e.result == 'failure' || needs.hermes-e2e.result == 'failure' || needs.skip-permissions-e2e.result == 'failure' || needs.sandbox-operations-e2e.result == 'failure' || needs.inference-routing-e2e.result == 'failure' || needs.snapshot-commands-e2e.result == 'failure' || needs.rebuild-openclaw-e2e.result == 'failure' || needs.upgrade-stale-sandbox-e2e.result == 'failure' || needs.rebuild-hermes-e2e.result == 'failure' || needs.gpu-e2e.result == 'failure') }}


⚠️ Potential issue | 🟠 Major

Include shields-config-e2e in failure aggregation.

Line 587 and Line 588 still omit shields-config-e2e, so a failure there won’t trigger notify-on-failure, despite the workflow comment saying notification should happen when any E2E job fails.

🔧 Suggested patch

- needs: [cloud-e2e, cloud-experimental-e2e, messaging-providers-e2e, token-rotation-e2e, sandbox-survival-e2e, hermes-e2e, skip-permissions-e2e, sandbox-operations-e2e, inference-routing-e2e, snapshot-commands-e2e, rebuild-openclaw-e2e, upgrade-stale-sandbox-e2e, rebuild-hermes-e2e, gpu-e2e] - if: ${{ always() && (needs.cloud-e2e.result == 'failure' || needs.cloud-experimental-e2e.result == 'failure' || needs.messaging-providers-e2e.result == 'failure' || needs.token-rotation-e2e.result == 'failure' || needs.sandbox-survival-e2e.result == 'failure' || needs.hermes-e2e.result == 'failure' || needs.skip-permissions-e2e.result == 'failure' || needs.sandbox-operations-e2e.result == 'failure' || needs.inference-routing-e2e.result == 'failure' || needs.snapshot-commands-e2e.result == 'failure' || needs.rebuild-openclaw-e2e.result == 'failure' || needs.upgrade-stale-sandbox-e2e.result == 'failure' || needs.rebuild-hermes-e2e.result == 'failure' || needs.gpu-e2e.result == 'failure') }} + needs: [cloud-e2e, cloud-experimental-e2e, messaging-providers-e2e, token-rotation-e2e, sandbox-survival-e2e, hermes-e2e, skip-permissions-e2e, sandbox-operations-e2e, inference-routing-e2e, snapshot-commands-e2e, shields-config-e2e, rebuild-openclaw-e2e, upgrade-stale-sandbox-e2e, rebuild-hermes-e2e, gpu-e2e] + if: ${{ always() && (needs.cloud-e2e.result == 'failure' || needs.cloud-experimental-e2e.result == 'failure' || needs.messaging-providers-e2e.result == 'failure' || needs.token-rotation-e2e.result == 'failure' || needs.sandbox-survival-e2e.result == 'failure' || needs.hermes-e2e.result == 'failure' || needs.skip-permissions-e2e.result == 'failure' || needs.sandbox-operations-e2e.result == 'failure' || needs.inference-routing-e2e.result == 'failure' || needs.snapshot-commands-e2e.result == 'failure' || needs.shields-config-e2e.result == 'failure' || needs.rebuild-openclaw-e2e.result == 'failure' || needs.upgrade-stale-sandbox-e2e.result == 'failure' || needs.rebuild-hermes-e2e.result == 'failure' || needs.gpu-e2e.result == 'failure') }}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.github/workflows/nightly-e2e.yaml around lines 587 - 588, The notify-on-failure aggregation is missing the shields-config-e2e job: add "shields-config-e2e" to the needs: [...] list and include "needs.shields-config-e2e.result == 'failure' ||" into the long if: conditional so any failure of shields-config-e2e will trigger the notify-on-failure job; update the existing if expression where other needs.*.result checks appear to include this new check in the same OR chain.

Good catch — shields-config-e2e was added in a separate PR and isn't part of this change, but I'll include it in the notify-on-failure aggregation since we're already touching that line.

@ericksoa, sounds good! The patch in the review comment has the exact lines ready — just drop shields-config-e2e into the needs: list and add needs.shields-config-e2e.result == 'failure' || to the if: expression at the same position (after snapshot-commands-e2e), and you'll be all set.

🧠 Learnings used

Learnt from: ericksoa Repo: NVIDIA/NemoClaw PR: 1967 File: test/e2e/test-token-rotation.sh:0-0 Timestamp: 2026-04-16T18:34:27.427Z Learning: In `test/e2e/test-token-rotation.sh` (NemoClaw project), the E2E test intentionally focuses only on messaging credential rotation detection and sandbox rebuild triggering. Workspace/state preservation across rotation is not tested here; it is covered by the `sandboxRebuild` unit tests and `test/e2e/test-sandbox-rebuild.sh`. Do not flag the absence of a workspace-preservation assertion in this script as a test gap.

Learnt from: ericksoa Repo: NVIDIA/NemoClaw PR: 1849 File: src/lib/sandbox-config.ts:354-361 Timestamp: 2026-04-16T17:09:43.616Z Learning: In `src/lib/sandbox-config.ts` (TypeScript, NemoClaw CLI), the `appendAuditEntry` call inside `configSet` intentionally uses `action: "shields_down"` rather than a dedicated `"config_set"` action. This is by design: `config set` is only permitted while shields are down, so the audit entry records the shields-down authorization context rather than the config mutation itself. Do not flag this as a wrong action type in future reviews. A dedicated `config_set` audit action type is deferred to a follow-up.

Learnt from: ericksoa Repo: NVIDIA/NemoClaw PR: 1849 File: test/shields-audit.test.ts:29-31 Timestamp: 2026-04-16T17:09:26.055Z Learning: In `test/shields-audit.test.ts` (NemoClaw project), the local `appendAuditEntry()` reimplementation is intentional: (1) the test's purpose is to validate JSONL format and field structure only, (2) redaction is covered separately in `test/secret-redaction.test.ts`, and (3) the CJS/ESM boundary in `src/lib/shields-audit.ts` prevents calling the real logger directly from Vitest. Do not flag the local reimplementation as a test-coverage gap in future reviews.

Learnt from: jayavenkatesh19 Repo: NVIDIA/NemoClaw PR: 839 File: .github/workflows/nightly-e2e.yaml:97-101 Timestamp: 2026-03-25T17:41:58.209Z Learning: In this repo’s GitHub Actions workflows, it is intentional and correct to set `NEMOCLAW_RECREATE_SANDBOX=1` in a job `env:` block when the job later runs `install.sh` in `--non-interactive` mode. `install.sh --non-interactive` orchestrates `nemoclaw onboard --non-interactive`, which reads `NEMOCLAW_RECREATE_SANDBOX` from the environment; the variable is documented in `install.sh` usage/help output, so it should not be flagged as an unknown or unused env var in workflows that call `install.sh`.

registerSandbox() explicitly constructs the stored object from known fields rather than spreading the entry. The providerCredentialHashes field was added to the SandboxEntry interface but not to the object literal in registerSandbox(), so hashes were silently dropped. Also fixes the e2e test to use install.sh for proper NemoClaw bootstrapping on bare CI runners, and fixes shellcheck warning. Verified locally: token A onboard stores hashes, token B onboard detects rotation and rebuilds, token B re-onboard reuses sandbox. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

test/e2e/test-token-rotation.sh (2)

202-214: ⚠️ Potential issue | 🟠 Major

Workspace preservation is not asserted after rebuild.

The PR objective includes preserving workspace state across rebuild, but this script has no marker write/read assertion around the rotation-triggered rebuild path.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-token-rotation.sh` around lines 202 - 214, The test lacks an
assertion that workspace state is preserved across the rebuild in "Phase 3:
Re-onboard with same token", so add a marker file write/read check around the
onboarding/re-onboarding path: before the first onboard create a unique marker
file (e.g., write a GUID to /tmp or the sandbox workspace) and commit its
expected value; after the re-onboard step in the block that checks
ONBOARD_OUTPUT (both the "reusing it" branch and the else branch where rebuild
may have occurred), read the marker and assert it matches the original value
(use the same helper functions pass/fail/info), failing the test if the marker
is missing or altered; reference the variables/functions ONBOARD_OUTPUT, pass,
fail, info and the "Phase 3" section to locate where to add the marker write
before initial onboarding and the marker read/assert after the re-onboard.

10-12: ⚠️ Potential issue | 🟠 Major

This E2E still misses the core in-sandbox Telegram verification for rotation.

The script explicitly avoids Telegram API validation (Lines 10-12), and Phase 2 only checks onboarding logs and sandbox presence. That does not prove the running sandbox’s L7 proxy uses token A before rotation and token B after rotation (issue #1903 objective).

Also applies to: 170-201

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-token-rotation.sh` around lines 10 - 12, The E2E script
test-token-rotation.sh omits an in-sandbox verification that the sandbox L7
proxy actually uses token A before rotation and token B after rotation; add a
Phase 2 check that runs a request from inside the running sandbox (e.g.,
exec/curl into the sandbox process or run the test helper that hits the Telegram
API path /bot<TOKEN>/getMe) and assert the fake Telegram server received
requests authenticated with TOKEN_A pre-rotation and TOKEN_B post-rotation; use
the existing TOKEN_A/TOKEN_B variables and the test’s sandbox identifier to
locate the running container/process and record the token observed by the fake
Telegram endpoint, failing the test if the observed token does not match
expected.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/test-token-rotation.sh`:
- Around line 161-166: The current test checks for "providerCredentialHashes"
anywhere in $REGISTRY which can false-pass; update the assertion to verify the
credential hash is stored specifically under the current sandbox key
($SANDBOX_NAME). Locate the registry check that uses REGISTRY and replace the
broad grep with a sandbox-scoped check (e.g., parse the registry JSON and assert
that the object for SANDBOX_NAME contains a providerCredentialHashes field or
grep for a pattern that ties providerCredentialHashes to the SANDBOX_NAME key)
so the test only passes if providerCredentialHashes exists for "$SANDBOX_NAME".
- Around line 175-179: The script captures ONBOARD_OUTPUT and onboard_exit but
never fails the test when onboarding returns non-zero; update the block that
sets ONBOARD_OUTPUT and onboard_exit (and the similar Phase 3 capture) to check
if onboard_exit != 0 and, if so, print the captured ONBOARD_OUTPUT for debugging
and exit the test with the same non-zero code (or call the test-failure helper
used elsewhere), ensuring a non-zero onboarding result fails the e2e test
immediately.

---

Duplicate comments:
In `@test/e2e/test-token-rotation.sh`:
- Around line 202-214: The test lacks an assertion that workspace state is
preserved across the rebuild in "Phase 3: Re-onboard with same token", so add a
marker file write/read check around the onboarding/re-onboarding path: before
the first onboard create a unique marker file (e.g., write a GUID to /tmp or the
sandbox workspace) and commit its expected value; after the re-onboard step in
the block that checks ONBOARD_OUTPUT (both the "reusing it" branch and the else
branch where rebuild may have occurred), read the marker and assert it matches
the original value (use the same helper functions pass/fail/info), failing the
test if the marker is missing or altered; reference the variables/functions
ONBOARD_OUTPUT, pass, fail, info and the "Phase 3" section to locate where to
add the marker write before initial onboarding and the marker read/assert after
the re-onboard.
- Around line 10-12: The E2E script test-token-rotation.sh omits an in-sandbox
verification that the sandbox L7 proxy actually uses token A before rotation and
token B after rotation; add a Phase 2 check that runs a request from inside the
running sandbox (e.g., exec/curl into the sandbox process or run the test helper
that hits the Telegram API path /bot<TOKEN>/getMe) and assert the fake Telegram
server received requests authenticated with TOKEN_A pre-rotation and TOKEN_B
post-rotation; use the existing TOKEN_A/TOKEN_B variables and the test’s sandbox
identifier to locate the running container/process and record the token observed
by the fake Telegram endpoint, failing the test if the observed token does not
match expected.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 383aa178-b277-429a-a854-b284f03a972e

📥 Commits

Reviewing files that changed from the base of the PR and between c93c339 and 5bf0af0.

📒 Files selected for processing (2)

src/lib/registry.ts
test/e2e/test-token-rotation.sh

🚧 Files skipped from review as they are similar to previous changes (1)

src/lib/registry.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

♻️ Duplicate comments (3)

test/e2e/test-token-rotation.sh (3)

175-179: ⚠️ Potential issue | 🟠 Major

Fail immediately when nemoclaw onboard returns non-zero.

Line 176 captures onboard_exit but never enforces it; Phase 3 has the same gap at Line 206. A failed onboard can still produce partial output and trigger misleading pass conditions.

Suggested fix

 ONBOARD_OUTPUT=$(nemoclaw onboard --non-interactive 2>&1)
 onboard_exit=$?
 
 info "onboard exit code: $onboard_exit"
+if [ "$onboard_exit" -ne 0 ]; then
+  fail "Phase 2 onboard failed (exit $onboard_exit)"
+  echo "$ONBOARD_OUTPUT" | tail -40
+  exit "$onboard_exit"
+fi
@@
 ONBOARD_OUTPUT=$(nemoclaw onboard --non-interactive 2>&1)
+onboard_exit=$?
+if [ "$onboard_exit" -ne 0 ]; then
+  fail "Phase 3 onboard failed (exit $onboard_exit)"
+  echo "$ONBOARD_OUTPUT" | tail -40
+  exit "$onboard_exit"
+fi

Also applies to: 206-207

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-token-rotation.sh` around lines 175 - 179, The script captures
the output and exit code of `nemoclaw onboard` into `ONBOARD_OUTPUT` and
`onboard_exit` but does not act on non-zero exits; update the blocks that run
`nemoclaw onboard` (the lines assigning `ONBOARD_OUTPUT`/`onboard_exit` and the
Phase 3 equivalent) to immediately fail the test when `onboard_exit` is non-zero
by printing the captured `ONBOARD_OUTPUT` to stderr and calling `exit
$onboard_exit` (or `fail`/appropriate test-failure helper) so a failed onboard
cannot produce a false-positive pass.

161-166: ⚠️ Potential issue | 🟡 Minor

Scope the registry-hash assertion to "$SANDBOX_NAME" only.

Line 162 currently matches providerCredentialHashes anywhere in $REGISTRY, which can false-pass due to unrelated entries. Assert the field on the current sandbox record specifically.

Suggested fix

-# Verify credential hashes are stored in registry
-if [ -f "$REGISTRY" ] && grep -q "providerCredentialHashes" "$REGISTRY"; then
+# Verify credential hashes are stored for this sandbox entry
+if [ -f "$REGISTRY" ] && node -e '
+const fs = require("fs");
+const [registryPath, sandboxName] = process.argv.slice(1);
+const data = JSON.parse(fs.readFileSync(registryPath, "utf8"));
+const entry =
+  (data && data[sandboxName]) ||
+  (data && data.sandboxes && data.sandboxes[sandboxName]) ||
+  (Array.isArray(data) ? data.find((x) => x && x.name === sandboxName) : null);
+process.exit(entry && entry.providerCredentialHashes ? 0 : 1);
+' "$REGISTRY" "$SANDBOX_NAME"; then
   pass "Credential hashes stored in registry"
 else
   fail "Credential hashes not found in registry"
 fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-token-rotation.sh` around lines 161 - 166, The current check
searches for "providerCredentialHashes" anywhere in $REGISTRY which can
false-pass; change it to assert the field exists for the current sandbox only by
scoping to "$SANDBOX_NAME". Update the test so it queries the registry entry for
SANDBOX_NAME (e.g., using jq to test
.sandboxes["$SANDBOX_NAME"].providerCredentialHashes or by grepping the
SANDBOX_NAME block and then checking for providerCredentialHashes) and keep the
pass/fail logic around that scoped check (reference: the variables REGISTRY and
SANDBOX_NAME and the key providerCredentialHashes).

10-12: ⚠️ Potential issue | 🟠 Major

Add in-sandbox Telegram getMe assertions to validate the real regression path.

Lines 10-11 explicitly skip Telegram-response validation, but the issue objective is about the running sandbox L7 proxy using the rotated token. Log greps alone do not prove that behavior. Add active in-sandbox checks: token A should fail (4xx) before/after rotation as applicable, and token B should return 200 after rotation.

Also applies to: 196-214

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-token-rotation.sh` around lines 10 - 12, Update the test script
to perform active in-sandbox Telegram getMe calls instead of only grepping logs:
call the sandbox L7 proxy (use the script's existing sandbox invocation helper
or curl against the sandbox endpoint) with TOKEN_A and assert it returns a 4xx
before rotation and still fails after rotation when appropriate, then call with
TOKEN_B after rotation and assert it returns HTTP 200; reference and modify the
token variables (e.g., TOKEN_A, TOKEN_B) and the sandbox invocation helper
(e.g., the existing curl/sandbox runner used elsewhere in
test-token-rotation.sh) to add these explicit assertions around the rotation
step so the test verifies the actual proxy behavior rather than only log output.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@test/e2e/test-token-rotation.sh`:
- Around line 175-179: The script captures the output and exit code of `nemoclaw
onboard` into `ONBOARD_OUTPUT` and `onboard_exit` but does not act on non-zero
exits; update the blocks that run `nemoclaw onboard` (the lines assigning
`ONBOARD_OUTPUT`/`onboard_exit` and the Phase 3 equivalent) to immediately fail
the test when `onboard_exit` is non-zero by printing the captured
`ONBOARD_OUTPUT` to stderr and calling `exit $onboard_exit` (or
`fail`/appropriate test-failure helper) so a failed onboard cannot produce a
false-positive pass.
- Around line 161-166: The current check searches for "providerCredentialHashes"
anywhere in $REGISTRY which can false-pass; change it to assert the field exists
for the current sandbox only by scoping to "$SANDBOX_NAME". Update the test so
it queries the registry entry for SANDBOX_NAME (e.g., using jq to test
.sandboxes["$SANDBOX_NAME"].providerCredentialHashes or by grepping the
SANDBOX_NAME block and then checking for providerCredentialHashes) and keep the
pass/fail logic around that scoped check (reference: the variables REGISTRY and
SANDBOX_NAME and the key providerCredentialHashes).
- Around line 10-12: Update the test script to perform active in-sandbox
Telegram getMe calls instead of only grepping logs: call the sandbox L7 proxy
(use the script's existing sandbox invocation helper or curl against the sandbox
endpoint) with TOKEN_A and assert it returns a 4xx before rotation and still
fails after rotation when appropriate, then call with TOKEN_B after rotation and
assert it returns HTTP 200; reference and modify the token variables (e.g.,
TOKEN_A, TOKEN_B) and the sandbox invocation helper (e.g., the existing
curl/sandbox runner used elsewhere in test-token-rotation.sh) to add these
explicit assertions around the rotation step so the test verifies the actual
proxy behavior rather than only log output.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 57afdba5-2a54-4576-b495-92ccd32c4433

📥 Commits

Reviewing files that changed from the base of the PR and between 5bf0af0 and a09848f.

📒 Files selected for processing (2)

.github/workflows/nightly-e2e.yaml
test/e2e/test-token-rotation.sh

🚧 Files skipped from review as they are similar to previous changes (1)

.github/workflows/nightly-e2e.yaml

- Abort credential rotation rebuild when state backup fails (returns success: false or throws), falling back to reuse with a warning - Tighten registry hash assertion in e2e test to verify the hash is stored under the specific sandbox name, not just anywhere in the file - Enforce onboard exit codes in phases 2 and 3 of the e2e test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai bot reviewed Apr 16, 2026

View reviewed changes

fix(onboard): soften internal comment on credential refresh behavior

d2d8135

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

wscurran added Integration: Telegram Use this label to identify Telegram bot integration issues with NemoClaw. fix labels Apr 16, 2026

coderabbitai bot reviewed Apr 16, 2026

View reviewed changes

ericksoa mentioned this pull request Apr 16, 2026

fix(onboard): persist model and provider per sandbox in registry (#1689) #1975

Merged

13 tasks

coderabbitai bot reviewed Apr 16, 2026

View reviewed changes

Comment thread test/e2e/test-token-rotation.sh Outdated

Comment thread test/e2e/test-token-rotation.sh

chore: remove internal bug tracker references from public files

a09848f

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai bot reviewed Apr 16, 2026

View reviewed changes

ericksoa and others added 3 commits April 16, 2026 11:32

Merge branch 'main' into fix/credential-rotation-propagation

b0ba8e4

fix(e2e): use node instead of python for registry assertion

e09ab82

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

		needs: [cloud-e2e, cloud-experimental-e2e, messaging-providers-e2e, token-rotation-e2e, sandbox-survival-e2e, hermes-e2e, skip-permissions-e2e, sandbox-operations-e2e, inference-routing-e2e, snapshot-commands-e2e, rebuild-openclaw-e2e, upgrade-stale-sandbox-e2e, rebuild-hermes-e2e, gpu-e2e]
		if: ${{ always() && (needs.cloud-e2e.result == 'failure' \|\| needs.cloud-experimental-e2e.result == 'failure' \|\| needs.messaging-providers-e2e.result == 'failure' \|\| needs.token-rotation-e2e.result == 'failure' \|\| needs.sandbox-survival-e2e.result == 'failure' \|\| needs.hermes-e2e.result == 'failure' \|\| needs.skip-permissions-e2e.result == 'failure' \|\| needs.sandbox-operations-e2e.result == 'failure' \|\| needs.inference-routing-e2e.result == 'failure' \|\| needs.snapshot-commands-e2e.result == 'failure' \|\| needs.rebuild-openclaw-e2e.result == 'failure' \|\| needs.upgrade-stale-sandbox-e2e.result == 'failure' \|\| needs.rebuild-hermes-e2e.result == 'failure' \|\| needs.gpu-e2e.result == 'failure') }}

Conversation

ericksoa commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Design Notes

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ericksoa Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Apr 16, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ericksoa Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ericksoa commented Apr 16, 2026 •

edited

Loading

coderabbitai bot commented Apr 16, 2026 •

edited

Loading

coderabbitai bot Apr 16, 2026 •

edited

Loading

coderabbitai bot Apr 16, 2026 •

edited

Loading