fix(onboard): propagate rotated messaging credentials to sandbox L7 proxy#1967
fix(onboard): propagate rotated messaging credentials to sandbox L7 proxy#1967
Conversation
…roxy When a user re-runs `nemoclaw onboard` with a rotated Telegram bot token, the provider update succeeds at the gateway level but the running sandbox's L7 proxy continues using the old credential. OpenShell v0.0.26 only attaches providers at `sandbox create` time and has no hot-reload mechanism. Fix: store SHA-256 hashes of messaging credentials in the sandbox registry at creation time. On reuse, compare current tokens against stored hashes. When rotation is detected, automatically rebuild the sandbox (backup → delete → create with new providers → restore) to propagate the new credential through the L7 proxy. Closes #1903 Ref: nvbug 6083165 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds SHA‑256 credential hashing and rotation detection; integrates rotation checks into sandbox reuse/recreate flow; persists per‑credential hashes in the sandbox registry; performs workspace backup/restore around sandbox rebuilds when rotation is detected. Adds unit and E2E tests and a nightly CI job for token-rotation E2E. Changes
Sequence DiagramsequenceDiagram
participant User
participant Onboard as Onboard Process
participant Registry
participant State as WorkspaceState
participant Sandbox
User->>Onboard: run onboard --non-interactive (with messaging tokens)
Onboard->>Registry: get sandbox entry
Registry-->>Onboard: sandbox entry (may include providerCredentialHashes)
Onboard->>Onboard: hash current tokens, compare to stored hashes
alt rotation detected
Onboard->>State: backup workspace
State-->>Onboard: backup artifact
Onboard->>Sandbox: delete existing sandbox
Onboard->>Sandbox: create new sandbox
Onboard->>Registry: register sandbox + new providerCredentialHashes
Registry-->>Onboard: confirm registration
Onboard->>State: restore workspace into new sandbox
State-->>Sandbox: workspace restored
else no rotation
Onboard->>Sandbox: reuse existing sandbox
end
Onboard-->>User: onboarding complete
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/lib/registry.ts (1)
153-169:⚠️ Potential issue | 🔴 CriticalPersist
providerCredentialHasheswhen registering a sandbox.
createSandbox()now passesproviderCredentialHashes, but this constructor drops the field. Every new sandbox is therefore recorded as a legacy entry, sodetectMessagingCredentialRotation()never sees stored hashes and the rotation-rebuild path won't fire on the next onboard run.Suggested fix
data.sandboxes[entry.name] = { name: entry.name, createdAt: entry.createdAt || new Date().toISOString(), model: entry.model || null, nimContainer: entry.nimContainer || null, provider: entry.provider || null, gpuEnabled: entry.gpuEnabled || false, policies: entry.policies || [], policyTier: entry.policyTier || null, agent: entry.agent || null, dangerouslySkipPermissions: entry.dangerouslySkipPermissions === true ? true : undefined, agentVersion: entry.agentVersion || null, + providerCredentialHashes: + entry.providerCredentialHashes && + Object.keys(entry.providerCredentialHashes).length > 0 + ? { ...entry.providerCredentialHashes } + : undefined, };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/registry.ts` around lines 153 - 169, registerSandbox is not persisting providerCredentialHashes from the incoming SandboxEntry so new sandboxes are recorded without hashes and break detectMessagingCredentialRotation; update the object assigned to data.sandboxes[entry.name] inside registerSandbox to include providerCredentialHashes: entry.providerCredentialHashes || null (or undefined as appropriate) so the stored entry retains the hashes passed by createSandbox; ensure the property name matches SandboxEntry and any downstream checks that read providerCredentialHashes from load().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 2699-2715: The code calls
sandboxState.backupSandboxState(sandboxName) but only treats thrown errors; if
backup.success === false the code should abort the automatic rebuild to avoid
destroying the workspace. After calling sandboxState.backupSandboxState inside
the try, check backup.success and if false log the failure (include
backup.message or details), do not set pendingStateRestore, and abort the
rebuild flow (for example return early or throw an error) so that
credentialRotation.changed does not trigger sandbox destruction; reference the
symbols credentialRotation.changed, sandboxState.backupSandboxState,
pendingStateRestore, existingSandboxState, and sandboxName when making the
change.
In `@test/e2e/test-token-rotation.sh`:
- Around line 156-163: The exported env vars in the script (OPENAI_BASE_URL,
NEMOCLAW_INFERENCE_PROVIDER, NEMOCLAW_INFERENCE_MODEL) don't match the
non-interactive env var names consumed by the onboarding logic in
src/lib/onboard.ts, so onboarding can ignore the fake endpoint; update the
test/e2e/test-token-rotation.sh exports to set the exact non-interactive env var
names that onboard reads (replace the three ignored vars with the corresponding
env var names used in src/lib/onboard.ts) so the fake OPENAI endpoint and
provider/model are actually bound during onboarding, and remove the stale/unused
exports.
- Around line 190-223: The probe writes the marker to /tmp which is outside the
agent-backed stateDirs handled by backupSandboxState()/restoreSandboxState(),
and the test currently only logs info if the marker is missing; change the test
to write the marker into one of the backed-up state dirs (use the agent's
writable/state directory used by restoreSandboxState(), e.g. the same dir
referenced by stateDirs) instead of /tmp and then assert failure if the marker
is absent after rotation (replace the info branch that logs missing marker with
a fail call); update references to the marker variable (MARKER) and the
openshell sandbox exec calls that read the marker so they point at the backed-up
path rather than /tmp/rotation-test-marker.
- Around line 196-235: After detecting rotation in ONBOARD_OUTPUT and confirming
sandbox rebuild, add an active verification inside the sandbox using openshell
sandbox exec "$SANDBOX_NAME" to call the Telegram getMe endpoint with the old
token (token A) and the new token (token B): run getMe (or an equivalent HTTP
call) from within the sandbox to assert token A now fails and token B succeeds;
place these checks after the "Sandbox running after rotation" block and before
the Phase 3 re-onboard assertions, reusing NEMOCLAW_CMD, SANDBOX_NAME and the
existing openshell helper to locate where to run the in-sandbox commands and
perform pass/fail on their responses.
---
Outside diff comments:
In `@src/lib/registry.ts`:
- Around line 153-169: registerSandbox is not persisting
providerCredentialHashes from the incoming SandboxEntry so new sandboxes are
recorded without hashes and break detectMessagingCredentialRotation; update the
object assigned to data.sandboxes[entry.name] inside registerSandbox to include
providerCredentialHashes: entry.providerCredentialHashes || null (or undefined
as appropriate) so the stored entry retains the hashes passed by createSandbox;
ensure the property name matches SandboxEntry and any downstream checks that
read providerCredentialHashes from load().
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: da46481e-75fe-4c94-b643-43bcde0d5782
📒 Files selected for processing (4)
src/lib/onboard.tssrc/lib/registry.tstest/credential-rotation.test.tstest/e2e/test-token-rotation.sh
| export TELEGRAM_BOT_TOKEN="$TELEGRAM_BOT_TOKEN_A" | ||
| export NEMOCLAW_SANDBOX_NAME="$SANDBOX_NAME" | ||
| export NEMOCLAW_POLICY_TIER="open" | ||
| export NEMOCLAW_NON_INTERACTIVE=1 | ||
| export OPENAI_API_KEY="fake-key" | ||
| export OPENAI_BASE_URL="$FAKE_BASE_URL" | ||
| export NEMOCLAW_INFERENCE_PROVIDER="openai-compatible" | ||
| export NEMOCLAW_INFERENCE_MODEL="fake-model" |
There was a problem hiding this comment.
Use the non-interactive env vars that onboard actually reads.
These assignments don't match the CLI contract in src/lib/onboard.ts: OPENAI_BASE_URL, NEMOCLAW_INFERENCE_PROVIDER, and NEMOCLAW_INFERENCE_MODEL are ignored here. As written, this script won't bind onboarding to the fake endpoint, so it can silently exercise the wrong provider path.
Suggested fix
-export OPENAI_API_KEY="fake-key"
-export OPENAI_BASE_URL="$FAKE_BASE_URL"
-export NEMOCLAW_INFERENCE_PROVIDER="openai-compatible"
-export NEMOCLAW_INFERENCE_MODEL="fake-model"
+export NEMOCLAW_PROVIDER="custom"
+export NEMOCLAW_MODEL="fake-model"
+export NEMOCLAW_ENDPOINT_URL="$FAKE_BASE_URL"
+export COMPATIBLE_API_KEY="fake-key"
+export NEMOCLAW_PREFERRED_API="openai-completions"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export TELEGRAM_BOT_TOKEN="$TELEGRAM_BOT_TOKEN_A" | |
| export NEMOCLAW_SANDBOX_NAME="$SANDBOX_NAME" | |
| export NEMOCLAW_POLICY_TIER="open" | |
| export NEMOCLAW_NON_INTERACTIVE=1 | |
| export OPENAI_API_KEY="fake-key" | |
| export OPENAI_BASE_URL="$FAKE_BASE_URL" | |
| export NEMOCLAW_INFERENCE_PROVIDER="openai-compatible" | |
| export NEMOCLAW_INFERENCE_MODEL="fake-model" | |
| export TELEGRAM_BOT_TOKEN="$TELEGRAM_BOT_TOKEN_A" | |
| export NEMOCLAW_SANDBOX_NAME="$SANDBOX_NAME" | |
| export NEMOCLAW_POLICY_TIER="open" | |
| export NEMOCLAW_NON_INTERACTIVE=1 | |
| export NEMOCLAW_PROVIDER="custom" | |
| export NEMOCLAW_MODEL="fake-model" | |
| export NEMOCLAW_ENDPOINT_URL="$FAKE_BASE_URL" | |
| export COMPATIBLE_API_KEY="fake-key" | |
| export NEMOCLAW_PREFERRED_API="openai-completions" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/test-token-rotation.sh` around lines 156 - 163, The exported env
vars in the script (OPENAI_BASE_URL, NEMOCLAW_INFERENCE_PROVIDER,
NEMOCLAW_INFERENCE_MODEL) don't match the non-interactive env var names consumed
by the onboarding logic in src/lib/onboard.ts, so onboarding can ignore the fake
endpoint; update the test/e2e/test-token-rotation.sh exports to set the exact
non-interactive env var names that onboard reads (replace the three ignored vars
with the corresponding env var names used in src/lib/onboard.ts) so the fake
OPENAI endpoint and provider/model are actually bound during onboarding, and
remove the stale/unused exports.
There was a problem hiding this comment.
This is a dead code path — the test now uses install.sh --non-interactive which handles inference setup internally. The old fake OpenAI env vars were removed in the rewrite. The test validates rotation detection and rebuild behavior, not inference routing.
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
cat test/e2e/test-token-rotation.sh
Length of output: 6619
@ericksoa, confirmed — the full file on the branch has been rewritten. The fake OpenAI server, start_fake_openai, and all the flagged env vars (OPENAI_API_KEY, OPENAI_BASE_URL, NEMOCLAW_INFERENCE_PROVIDER, NEMOCLAW_INFERENCE_MODEL) are gone entirely. Phase 0 now delegates to bash install.sh --non-interactive, and the test only validates rotation detection and rebuild behavior via nemoclaw onboard --non-interactive output. My original comment was based on an older snapshot in the diff and doesn't apply to the current code. Sorry for the noise!
🧠 Learnings used
Learnt from: ericksoa
Repo: NVIDIA/NemoClaw PR: 1849
File: src/nemoclaw.ts:2168-2176
Timestamp: 2026-04-16T17:11:07.305Z
Learning: In `src/nemoclaw.ts` (TypeScript, NemoClaw CLI), the `rotate-token` command's `--from-env` flag validation is intentionally delegated to `configRotateToken()` in `src/lib/sandbox-config.ts`. If `--from-env` is the last CLI argument, `tokenOpts.fromEnv` becomes `undefined`; `configRotateToken` then checks `process.env[opts.fromEnv]` and errors on empty/undefined. No guard at the CLI dispatch layer is needed or expected — do not flag the missing value-presence check for `--from-env` in the rotate-token flag parser.
Learnt from: ericksoa
Repo: NVIDIA/NemoClaw PR: 1849
File: test/shields-audit.test.ts:29-31
Timestamp: 2026-04-16T17:09:26.055Z
Learning: In `test/shields-audit.test.ts` (NemoClaw project), the local `appendAuditEntry()` reimplementation is intentional: (1) the test's purpose is to validate JSONL format and field structure only, (2) redaction is covered separately in `test/secret-redaction.test.ts`, and (3) the CJS/ESM boundary in `src/lib/shields-audit.ts` prevents calling the real logger directly from Vitest. Do not flag the local reimplementation as a test-coverage gap in future reviews.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
❌ Brev E2E (messaging-providers): FAILED on branch |
Wires test/e2e/test-token-rotation.sh into the nightly-e2e workflow with fake Telegram tokens. Validates that rotating a messaging credential and re-running onboard triggers sandbox rebuild and propagates the new token. Ref: #1903 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 587-588: The notify-on-failure aggregation is missing the
shields-config-e2e job: add "shields-config-e2e" to the needs: [...] list and
include "needs.shields-config-e2e.result == 'failure' ||" into the long if:
conditional so any failure of shields-config-e2e will trigger the
notify-on-failure job; update the existing if expression where other
needs.*.result checks appear to include this new check in the same OR chain.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: faa12449-191f-4c3a-9f00-6fb3e047720a
📒 Files selected for processing (1)
.github/workflows/nightly-e2e.yaml
| needs: [cloud-e2e, cloud-experimental-e2e, messaging-providers-e2e, token-rotation-e2e, sandbox-survival-e2e, hermes-e2e, skip-permissions-e2e, sandbox-operations-e2e, inference-routing-e2e, snapshot-commands-e2e, rebuild-openclaw-e2e, upgrade-stale-sandbox-e2e, rebuild-hermes-e2e, gpu-e2e] | ||
| if: ${{ always() && (needs.cloud-e2e.result == 'failure' || needs.cloud-experimental-e2e.result == 'failure' || needs.messaging-providers-e2e.result == 'failure' || needs.token-rotation-e2e.result == 'failure' || needs.sandbox-survival-e2e.result == 'failure' || needs.hermes-e2e.result == 'failure' || needs.skip-permissions-e2e.result == 'failure' || needs.sandbox-operations-e2e.result == 'failure' || needs.inference-routing-e2e.result == 'failure' || needs.snapshot-commands-e2e.result == 'failure' || needs.rebuild-openclaw-e2e.result == 'failure' || needs.upgrade-stale-sandbox-e2e.result == 'failure' || needs.rebuild-hermes-e2e.result == 'failure' || needs.gpu-e2e.result == 'failure') }} |
There was a problem hiding this comment.
Include shields-config-e2e in failure aggregation.
Line 587 and Line 588 still omit shields-config-e2e, so a failure there won’t trigger notify-on-failure, despite the workflow comment saying notification should happen when any E2E job fails.
🔧 Suggested patch
- needs: [cloud-e2e, cloud-experimental-e2e, messaging-providers-e2e, token-rotation-e2e, sandbox-survival-e2e, hermes-e2e, skip-permissions-e2e, sandbox-operations-e2e, inference-routing-e2e, snapshot-commands-e2e, rebuild-openclaw-e2e, upgrade-stale-sandbox-e2e, rebuild-hermes-e2e, gpu-e2e]
- if: ${{ always() && (needs.cloud-e2e.result == 'failure' || needs.cloud-experimental-e2e.result == 'failure' || needs.messaging-providers-e2e.result == 'failure' || needs.token-rotation-e2e.result == 'failure' || needs.sandbox-survival-e2e.result == 'failure' || needs.hermes-e2e.result == 'failure' || needs.skip-permissions-e2e.result == 'failure' || needs.sandbox-operations-e2e.result == 'failure' || needs.inference-routing-e2e.result == 'failure' || needs.snapshot-commands-e2e.result == 'failure' || needs.rebuild-openclaw-e2e.result == 'failure' || needs.upgrade-stale-sandbox-e2e.result == 'failure' || needs.rebuild-hermes-e2e.result == 'failure' || needs.gpu-e2e.result == 'failure') }}
+ needs: [cloud-e2e, cloud-experimental-e2e, messaging-providers-e2e, token-rotation-e2e, sandbox-survival-e2e, hermes-e2e, skip-permissions-e2e, sandbox-operations-e2e, inference-routing-e2e, snapshot-commands-e2e, shields-config-e2e, rebuild-openclaw-e2e, upgrade-stale-sandbox-e2e, rebuild-hermes-e2e, gpu-e2e]
+ if: ${{ always() && (needs.cloud-e2e.result == 'failure' || needs.cloud-experimental-e2e.result == 'failure' || needs.messaging-providers-e2e.result == 'failure' || needs.token-rotation-e2e.result == 'failure' || needs.sandbox-survival-e2e.result == 'failure' || needs.hermes-e2e.result == 'failure' || needs.skip-permissions-e2e.result == 'failure' || needs.sandbox-operations-e2e.result == 'failure' || needs.inference-routing-e2e.result == 'failure' || needs.snapshot-commands-e2e.result == 'failure' || needs.shields-config-e2e.result == 'failure' || needs.rebuild-openclaw-e2e.result == 'failure' || needs.upgrade-stale-sandbox-e2e.result == 'failure' || needs.rebuild-hermes-e2e.result == 'failure' || needs.gpu-e2e.result == 'failure') }}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/nightly-e2e.yaml around lines 587 - 588, The
notify-on-failure aggregation is missing the shields-config-e2e job: add
"shields-config-e2e" to the needs: [...] list and include
"needs.shields-config-e2e.result == 'failure' ||" into the long if: conditional
so any failure of shields-config-e2e will trigger the notify-on-failure job;
update the existing if expression where other needs.*.result checks appear to
include this new check in the same OR chain.
There was a problem hiding this comment.
Good catch — shields-config-e2e was added in a separate PR and isn't part of this change, but I'll include it in the notify-on-failure aggregation since we're already touching that line.
There was a problem hiding this comment.
@ericksoa, sounds good! The patch in the review comment has the exact lines ready — just drop shields-config-e2e into the needs: list and add needs.shields-config-e2e.result == 'failure' || to the if: expression at the same position (after snapshot-commands-e2e), and you'll be all set.
🧠 Learnings used
Learnt from: ericksoa
Repo: NVIDIA/NemoClaw PR: 1967
File: test/e2e/test-token-rotation.sh:0-0
Timestamp: 2026-04-16T18:34:27.427Z
Learning: In `test/e2e/test-token-rotation.sh` (NemoClaw project), the E2E test intentionally focuses only on messaging credential rotation detection and sandbox rebuild triggering. Workspace/state preservation across rotation is not tested here; it is covered by the `sandboxRebuild` unit tests and `test/e2e/test-sandbox-rebuild.sh`. Do not flag the absence of a workspace-preservation assertion in this script as a test gap.
Learnt from: ericksoa
Repo: NVIDIA/NemoClaw PR: 1849
File: src/lib/sandbox-config.ts:354-361
Timestamp: 2026-04-16T17:09:43.616Z
Learning: In `src/lib/sandbox-config.ts` (TypeScript, NemoClaw CLI), the `appendAuditEntry` call inside `configSet` intentionally uses `action: "shields_down"` rather than a dedicated `"config_set"` action. This is by design: `config set` is only permitted while shields are down, so the audit entry records the shields-down authorization context rather than the config mutation itself. Do not flag this as a wrong action type in future reviews. A dedicated `config_set` audit action type is deferred to a follow-up.
Learnt from: ericksoa
Repo: NVIDIA/NemoClaw PR: 1849
File: test/shields-audit.test.ts:29-31
Timestamp: 2026-04-16T17:09:26.055Z
Learning: In `test/shields-audit.test.ts` (NemoClaw project), the local `appendAuditEntry()` reimplementation is intentional: (1) the test's purpose is to validate JSONL format and field structure only, (2) redaction is covered separately in `test/secret-redaction.test.ts`, and (3) the CJS/ESM boundary in `src/lib/shields-audit.ts` prevents calling the real logger directly from Vitest. Do not flag the local reimplementation as a test-coverage gap in future reviews.
Learnt from: jayavenkatesh19
Repo: NVIDIA/NemoClaw PR: 839
File: .github/workflows/nightly-e2e.yaml:97-101
Timestamp: 2026-03-25T17:41:58.209Z
Learning: In this repo’s GitHub Actions workflows, it is intentional and correct to set `NEMOCLAW_RECREATE_SANDBOX=1` in a job `env:` block when the job later runs `install.sh` in `--non-interactive` mode. `install.sh --non-interactive` orchestrates `nemoclaw onboard --non-interactive`, which reads `NEMOCLAW_RECREATE_SANDBOX` from the environment; the variable is documented in `install.sh` usage/help output, so it should not be flagged as an unknown or unused env var in workflows that call `install.sh`.
registerSandbox() explicitly constructs the stored object from known fields rather than spreading the entry. The providerCredentialHashes field was added to the SandboxEntry interface but not to the object literal in registerSandbox(), so hashes were silently dropped. Also fixes the e2e test to use install.sh for proper NemoClaw bootstrapping on bare CI runners, and fixes shellcheck warning. Verified locally: token A onboard stores hashes, token B onboard detects rotation and rebuilds, token B re-onboard reuses sandbox. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
test/e2e/test-token-rotation.sh (2)
202-214:⚠️ Potential issue | 🟠 MajorWorkspace preservation is not asserted after rebuild.
The PR objective includes preserving workspace state across rebuild, but this script has no marker write/read assertion around the rotation-triggered rebuild path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-token-rotation.sh` around lines 202 - 214, The test lacks an assertion that workspace state is preserved across the rebuild in "Phase 3: Re-onboard with same token", so add a marker file write/read check around the onboarding/re-onboarding path: before the first onboard create a unique marker file (e.g., write a GUID to /tmp or the sandbox workspace) and commit its expected value; after the re-onboard step in the block that checks ONBOARD_OUTPUT (both the "reusing it" branch and the else branch where rebuild may have occurred), read the marker and assert it matches the original value (use the same helper functions pass/fail/info), failing the test if the marker is missing or altered; reference the variables/functions ONBOARD_OUTPUT, pass, fail, info and the "Phase 3" section to locate where to add the marker write before initial onboarding and the marker read/assert after the re-onboard.
10-12:⚠️ Potential issue | 🟠 MajorThis E2E still misses the core in-sandbox Telegram verification for rotation.
The script explicitly avoids Telegram API validation (Lines 10-12), and Phase 2 only checks onboarding logs and sandbox presence. That does not prove the running sandbox’s L7 proxy uses token A before rotation and token B after rotation (issue
#1903objective).Also applies to: 170-201
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-token-rotation.sh` around lines 10 - 12, The E2E script test-token-rotation.sh omits an in-sandbox verification that the sandbox L7 proxy actually uses token A before rotation and token B after rotation; add a Phase 2 check that runs a request from inside the running sandbox (e.g., exec/curl into the sandbox process or run the test helper that hits the Telegram API path /bot<TOKEN>/getMe) and assert the fake Telegram server received requests authenticated with TOKEN_A pre-rotation and TOKEN_B post-rotation; use the existing TOKEN_A/TOKEN_B variables and the test’s sandbox identifier to locate the running container/process and record the token observed by the fake Telegram endpoint, failing the test if the observed token does not match expected.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e/test-token-rotation.sh`:
- Around line 161-166: The current test checks for "providerCredentialHashes"
anywhere in $REGISTRY which can false-pass; update the assertion to verify the
credential hash is stored specifically under the current sandbox key
($SANDBOX_NAME). Locate the registry check that uses REGISTRY and replace the
broad grep with a sandbox-scoped check (e.g., parse the registry JSON and assert
that the object for SANDBOX_NAME contains a providerCredentialHashes field or
grep for a pattern that ties providerCredentialHashes to the SANDBOX_NAME key)
so the test only passes if providerCredentialHashes exists for "$SANDBOX_NAME".
- Around line 175-179: The script captures ONBOARD_OUTPUT and onboard_exit but
never fails the test when onboarding returns non-zero; update the block that
sets ONBOARD_OUTPUT and onboard_exit (and the similar Phase 3 capture) to check
if onboard_exit != 0 and, if so, print the captured ONBOARD_OUTPUT for debugging
and exit the test with the same non-zero code (or call the test-failure helper
used elsewhere), ensuring a non-zero onboarding result fails the e2e test
immediately.
---
Duplicate comments:
In `@test/e2e/test-token-rotation.sh`:
- Around line 202-214: The test lacks an assertion that workspace state is
preserved across the rebuild in "Phase 3: Re-onboard with same token", so add a
marker file write/read check around the onboarding/re-onboarding path: before
the first onboard create a unique marker file (e.g., write a GUID to /tmp or the
sandbox workspace) and commit its expected value; after the re-onboard step in
the block that checks ONBOARD_OUTPUT (both the "reusing it" branch and the else
branch where rebuild may have occurred), read the marker and assert it matches
the original value (use the same helper functions pass/fail/info), failing the
test if the marker is missing or altered; reference the variables/functions
ONBOARD_OUTPUT, pass, fail, info and the "Phase 3" section to locate where to
add the marker write before initial onboarding and the marker read/assert after
the re-onboard.
- Around line 10-12: The E2E script test-token-rotation.sh omits an in-sandbox
verification that the sandbox L7 proxy actually uses token A before rotation and
token B after rotation; add a Phase 2 check that runs a request from inside the
running sandbox (e.g., exec/curl into the sandbox process or run the test helper
that hits the Telegram API path /bot<TOKEN>/getMe) and assert the fake Telegram
server received requests authenticated with TOKEN_A pre-rotation and TOKEN_B
post-rotation; use the existing TOKEN_A/TOKEN_B variables and the test’s sandbox
identifier to locate the running container/process and record the token observed
by the fake Telegram endpoint, failing the test if the observed token does not
match expected.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 383aa178-b277-429a-a854-b284f03a972e
📒 Files selected for processing (2)
src/lib/registry.tstest/e2e/test-token-rotation.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- src/lib/registry.ts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (3)
test/e2e/test-token-rotation.sh (3)
175-179:⚠️ Potential issue | 🟠 MajorFail immediately when
nemoclaw onboardreturns non-zero.Line 176 captures
onboard_exitbut never enforces it; Phase 3 has the same gap at Line 206. A failed onboard can still produce partial output and trigger misleading pass conditions.Suggested fix
ONBOARD_OUTPUT=$(nemoclaw onboard --non-interactive 2>&1) onboard_exit=$? info "onboard exit code: $onboard_exit" +if [ "$onboard_exit" -ne 0 ]; then + fail "Phase 2 onboard failed (exit $onboard_exit)" + echo "$ONBOARD_OUTPUT" | tail -40 + exit "$onboard_exit" +fi @@ ONBOARD_OUTPUT=$(nemoclaw onboard --non-interactive 2>&1) +onboard_exit=$? +if [ "$onboard_exit" -ne 0 ]; then + fail "Phase 3 onboard failed (exit $onboard_exit)" + echo "$ONBOARD_OUTPUT" | tail -40 + exit "$onboard_exit" +fiAlso applies to: 206-207
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-token-rotation.sh` around lines 175 - 179, The script captures the output and exit code of `nemoclaw onboard` into `ONBOARD_OUTPUT` and `onboard_exit` but does not act on non-zero exits; update the blocks that run `nemoclaw onboard` (the lines assigning `ONBOARD_OUTPUT`/`onboard_exit` and the Phase 3 equivalent) to immediately fail the test when `onboard_exit` is non-zero by printing the captured `ONBOARD_OUTPUT` to stderr and calling `exit $onboard_exit` (or `fail`/appropriate test-failure helper) so a failed onboard cannot produce a false-positive pass.
161-166:⚠️ Potential issue | 🟡 MinorScope the registry-hash assertion to
"$SANDBOX_NAME"only.Line 162 currently matches
providerCredentialHashesanywhere in$REGISTRY, which can false-pass due to unrelated entries. Assert the field on the current sandbox record specifically.Suggested fix
-# Verify credential hashes are stored in registry -if [ -f "$REGISTRY" ] && grep -q "providerCredentialHashes" "$REGISTRY"; then +# Verify credential hashes are stored for this sandbox entry +if [ -f "$REGISTRY" ] && node -e ' +const fs = require("fs"); +const [registryPath, sandboxName] = process.argv.slice(1); +const data = JSON.parse(fs.readFileSync(registryPath, "utf8")); +const entry = + (data && data[sandboxName]) || + (data && data.sandboxes && data.sandboxes[sandboxName]) || + (Array.isArray(data) ? data.find((x) => x && x.name === sandboxName) : null); +process.exit(entry && entry.providerCredentialHashes ? 0 : 1); +' "$REGISTRY" "$SANDBOX_NAME"; then pass "Credential hashes stored in registry" else fail "Credential hashes not found in registry" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-token-rotation.sh` around lines 161 - 166, The current check searches for "providerCredentialHashes" anywhere in $REGISTRY which can false-pass; change it to assert the field exists for the current sandbox only by scoping to "$SANDBOX_NAME". Update the test so it queries the registry entry for SANDBOX_NAME (e.g., using jq to test .sandboxes["$SANDBOX_NAME"].providerCredentialHashes or by grepping the SANDBOX_NAME block and then checking for providerCredentialHashes) and keep the pass/fail logic around that scoped check (reference: the variables REGISTRY and SANDBOX_NAME and the key providerCredentialHashes).
10-12:⚠️ Potential issue | 🟠 MajorAdd in-sandbox Telegram
getMeassertions to validate the real regression path.Lines 10-11 explicitly skip Telegram-response validation, but the issue objective is about the running sandbox L7 proxy using the rotated token. Log greps alone do not prove that behavior. Add active in-sandbox checks: token A should fail (4xx) before/after rotation as applicable, and token B should return 200 after rotation.
Also applies to: 196-214
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/test-token-rotation.sh` around lines 10 - 12, Update the test script to perform active in-sandbox Telegram getMe calls instead of only grepping logs: call the sandbox L7 proxy (use the script's existing sandbox invocation helper or curl against the sandbox endpoint) with TOKEN_A and assert it returns a 4xx before rotation and still fails after rotation when appropriate, then call with TOKEN_B after rotation and assert it returns HTTP 200; reference and modify the token variables (e.g., TOKEN_A, TOKEN_B) and the sandbox invocation helper (e.g., the existing curl/sandbox runner used elsewhere in test-token-rotation.sh) to add these explicit assertions around the rotation step so the test verifies the actual proxy behavior rather than only log output.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@test/e2e/test-token-rotation.sh`:
- Around line 175-179: The script captures the output and exit code of `nemoclaw
onboard` into `ONBOARD_OUTPUT` and `onboard_exit` but does not act on non-zero
exits; update the blocks that run `nemoclaw onboard` (the lines assigning
`ONBOARD_OUTPUT`/`onboard_exit` and the Phase 3 equivalent) to immediately fail
the test when `onboard_exit` is non-zero by printing the captured
`ONBOARD_OUTPUT` to stderr and calling `exit $onboard_exit` (or
`fail`/appropriate test-failure helper) so a failed onboard cannot produce a
false-positive pass.
- Around line 161-166: The current check searches for "providerCredentialHashes"
anywhere in $REGISTRY which can false-pass; change it to assert the field exists
for the current sandbox only by scoping to "$SANDBOX_NAME". Update the test so
it queries the registry entry for SANDBOX_NAME (e.g., using jq to test
.sandboxes["$SANDBOX_NAME"].providerCredentialHashes or by grepping the
SANDBOX_NAME block and then checking for providerCredentialHashes) and keep the
pass/fail logic around that scoped check (reference: the variables REGISTRY and
SANDBOX_NAME and the key providerCredentialHashes).
- Around line 10-12: Update the test script to perform active in-sandbox
Telegram getMe calls instead of only grepping logs: call the sandbox L7 proxy
(use the script's existing sandbox invocation helper or curl against the sandbox
endpoint) with TOKEN_A and assert it returns a 4xx before rotation and still
fails after rotation when appropriate, then call with TOKEN_B after rotation and
assert it returns HTTP 200; reference and modify the token variables (e.g.,
TOKEN_A, TOKEN_B) and the sandbox invocation helper (e.g., the existing
curl/sandbox runner used elsewhere in test-token-rotation.sh) to add these
explicit assertions around the rotation step so the test verifies the actual
proxy behavior rather than only log output.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 57afdba5-2a54-4576-b495-92ccd32c4433
📒 Files selected for processing (2)
.github/workflows/nightly-e2e.yamltest/e2e/test-token-rotation.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/nightly-e2e.yaml
- Abort credential rotation rebuild when state backup fails (returns success: false or throws), falling back to reuse with a warning - Tighten registry hash assertion in e2e test to verify the hash is stored under the specific sandbox name, not just anywhere in the file - Enforce onboard exit codes in phases 2 and 3 of the e2e test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
nemoclaw onboard --non-interactivewith a new messaging token (e.g.TELEGRAM_BOT_TOKEN) reports success but the sandbox continues using the old credentialCloses #1903
Changes
src/lib/registry.tsproviderCredentialHashesfield toSandboxEntrysrc/lib/onboard.tshashCredential(),detectMessagingCredentialRotation(), modify reuse path to detect rotation and trigger rebuild with backup/restore, store hashes at registrationtest/credential-rotation.test.tstest/e2e/test-token-rotation.shDesign Notes
detectMessagingCredentialRotationreturnschanged: false(conservative, no false rebuilds)sandbox-state.tsbackup/restore machinery (same assandboxRebuild)Test plan
npx vitest run test/credential-rotation.test.ts— 11/11 passnpx vitest run test/onboard.test.ts— 117/117 existing tests pass (no regressions)tsc -p tsconfig.cli.json --noEmit— cleantest/e2e/test-token-rotation.shwith real Telegram tokens🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests
Chores