Skip to content

feat: Phase 5 -- Web Management UI with RBAC Auth#13

Merged
rodaddy merged 3 commits intomainfrom
feat/network-deployment
Mar 10, 2026
Merged

feat: Phase 5 -- Web Management UI with RBAC Auth#13
rodaddy merged 3 commits intomainfrom
feat/network-deployment

Conversation

@rodaddy
Copy link
Copy Markdown
Owner

@rodaddy rodaddy commented Mar 10, 2026

Summary

  • Web management UI at daemon root (/) with dark theme, live service dashboard
  • RBAC auth system with pluggable AuthProvider interface (admin/agent/viewer roles)
  • Username+password login alongside bearer token auth
  • ConfigManager for runtime service CRUD with disk persistence
  • REST API: 10 endpoints for service management, import from URL/GitHub, reload
  • CI/CD deploy pipeline with auto-rollback on health check failure
  • Cross-compiles for Linux x64, deploys to CT 216

Test plan

  • 788 tests pass, 0 failures
  • TypeScript typecheck clean
  • Binary compiles (macOS + Linux x64 cross-compile)
  • Manual deploy to CT 216 verified -- health check passes
  • Web UI loads at https://mcp2cli.rodaddy.live/
  • Caddy proxy configured, Let's Encrypt cert provisioned
  • CI deploy pipeline (triggers on merge to main)

🤖 Generated with Claude Code

rodaddy and others added 2 commits March 9, 2026 22:51
…client

Add TCP listen mode (MCP2CLI_LISTEN_HOST/PORT) so the daemon can serve
multiple machines over HTTP. Bearer token auth with timing-safe comparison
protects authenticated endpoints while /health and /metrics remain open.

Prometheus metrics at GET /metrics expose request counts, latency histograms,
connection pool state, memory usage, and auth failures per service/tool.

Remote client mode (MCP2CLI_REMOTE_URL) lets the CLI connect to a remote
daemon instead of starting a local one. Includes a bash wrapper
(scripts/mcp2cli-remote) for curl-only machines.

LXC deployment files in deploy/ with systemd unit, env template, services
config, and setup script. Ansible playbook lives in the infrastructure repo.

Fixes idle timer firing immediately when timeoutMs=0 (TCP mode) by guarding
touch() against zero/negative values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phase 5 implementation:
- ConfigManager with CRUD operations and disk persistence
- REST API for service management (add/update/remove/import/reload)
- Embedded web UI with dark theme, login (username+password and token tabs)
- Pluggable AuthProvider interface with TokenAuthProvider (RBAC)
- Three roles: admin (full), agent (tools+read), viewer (read-only)
- CI/CD deploy pipeline: build, deploy to CT 216, health check, auto-rollback
- Cross-compile for Linux x64 in deploy job
- tokens.json gitignored (secrets)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gitguardian
Copy link
Copy Markdown

gitguardian Bot commented Mar 10, 2026

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
28197636 Triggered Generic Password 0612ac3 tests/daemon/auth-provider.test.ts View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 10, 2026

Claude finished @rodaddy's task —— View job


PR Review: feat: Phase 5 -- Web Management UI with RBAC Auth

  • Read key source files (auth, server, ui, config-manager)
  • Review security issues
  • Review bugs and logic errors
  • Review resource leaks
  • Review error handling
  • Review TypeScript strictness

Overall this is a solid piece of work with good structural decisions (pluggable AuthProvider, timing-safe comparisons, Zod validation). Issues below are ordered by severity.


🔴 Critical Security Issues

1. Hardcoded credentials in test file (tests/daemon/auth-provider.test.ts:18-27)

GitGuardian flagged this. While these are test fixtures, the strings "admin-secret", "agent-secret", "s3cret-pass!", and "agent-pass-42" are hardcoded in the committed test file. The bigger concern: if anyone ever copies these entries verbatim into a tokens.json, they'd be using publicly-known credentials. These should use randomized or clearly-fake values like "test-token-admin" / "test-token-agent" / "test-password-1". The git history now contains them permanently unless rebased.

2. /api/auth/me returns admin context when auth is disabled (src/daemon/server.ts:262-268)

if (path === "/api/auth/me" && req.method === "GET") {
  return Response.json({
    success: true,
    userId: authCtx?.userId ?? "anonymous",
    role: authCtx?.role ?? "admin",   // ← defaults to "admin" if authCtx is null
  });
}

authCtx will be null when the path is auth-exempt and auth is enabled but the token is missing. An unauthenticated client hitting /api/auth/me without a token gets back role: "admin". This is exploitable: the UI sets currentRole from this response (ui.ts:354), so an attacker could claim admin role in the browser by manually calling /api/auth/me with no token. The default should be "viewer" (or the endpoint should require auth). Note that /api/auth/me is not in AUTH_EXEMPT_PATHS, so it should get auth-checked — but authCtx is also null when the provider is disabled (it's set to admin-via-authenticateRequest) and for some other paths. This warrants careful audit.

3. Token returned in login response is the raw bearer token (src/daemon/server.ts:250-255)

return Response.json({
  success: true,
  token: result.token,   // ← plaintext credential in response body
  ...
});

The token in tokens.json is returned verbatim over the wire and stored in sessionStorage. sessionStorage is XSS-accessible. Since the UI runs on the same origin as the API, any XSS in the dashboard HTML would steal the bearer token. This is inherent to the single-origin design, but worth noting. More critically: the Prometheus /metrics endpoint (auth-exempt) could leak service names and tool call patterns to anyone on the network.

4. SSRF via /api/services/import (src/daemon/config-manager.ts:141)

const response = await fetch(url);

No URL allowlist or scheme validation. An admin can direct the daemon to fetch from internal network addresses (http://169.254.169.254/..., file://, http://localhost:PORT/...). Since the server runs as a privileged daemon, this gives filesystem read / internal service probing. At minimum, validate url starts with https:// and reject file:, ftp:, RFC-1918 hostnames, and loopback addresses. The buildGitHubRawUrl function at least constrains to raw.githubusercontent.com, but a caller can bypass it by providing url directly.


🟡 Bugs and Logic Errors

5. Route ordering bug: DELETE regex matches before PUT regex is checked (src/daemon/server.ts:309-340)

const putMatch  = path.match(/^\/api\/services\/([^/]+)$/);  // line 309
if (putMatch && req.method === "PUT") { ... }

const deleteMatch = path.match(/^\/api\/services\/([^/]+)$/); // line 328
if (deleteMatch && req.method === "DELETE") { ... }

Both regexes are identical. Since they're just if blocks (not else if), this works correctly today because method differs. But /api/services/reload also matches [^/]+. When req.method === "POST", the reload check appears after these, which means POST /api/services/reload won't reach the reload handler — it falls through to line 373 only because neither putMatch nor deleteMatch triggers for POST. This is fragile; if someone adds more POST routes using [^/]+ the ordering bug will silently eat requests. The reload route (/api/services/reload) at line 373 should be checked before the parameterized [^/]+ routes to prevent ambiguity.

6. writeLock is not a proper async mutex (src/daemon/config-manager.ts:216-226)

private writeLock = false;

private async writeToDisk(): Promise<void> {
  if (this.writeLock) {
    throw new ConfigManagerError("Concurrent write detected -- try again");
  }
  this.writeLock = true;
  try {
    await Bun.write(...);
  } finally {
    this.writeLock = false;
  }
}

In a single-threaded JS runtime this is safe only between synchronous check and the first await. Two concurrent calls to addService() / updateService() / removeService() from concurrent HTTP requests will both pass the if (this.writeLock) check before either sets it to true. The result: concurrent writes that silently corrupt the config file. A proper async queue or write-coalescing approach is needed.

7. timingSafeEqual is duplicated in auth.ts and auth-provider.ts

The function is implemented twice, identically. The one in auth.ts is used by the deprecated checkAuth(). Not a bug, but the dead code / duplication should be removed since checkAuth itself is deprecated and appears unused outside tests.


🟡 Resource Leaks

8. AbortController timer leak on tool call timeout (src/daemon/server.ts:113)

const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), timeout);

let sdkResult: ...;
try {
  sdkResult = await Promise.race([...]);
} finally {
  clearTimeout(timer);
}

The clearTimeout(timer) is in the inner finally, which is correct. However, the AbortController signal listener (controller.signal.addEventListener("abort", ...)) is never removed. Each timed-out call leaks a listener. Use { once: true } in addEventListener or removeEventListener in the finally block.

9. No request body size limit

await req.json() is called on several endpoints (/call, /list-tools, /schema, /api/services, etc.) without checking Content-Length first. A malicious (or buggy) client can send a multi-GB body and exhaust heap memory. Bun's built-in maxRequestBodySize should be set on Bun.serve() options.


🟡 Missing Error Handling

10. importFromUrl doesn't time out (src/daemon/config-manager.ts:141)

const response = await fetch(url);

No AbortSignal / timeout. A slow or hanging remote server will stall the request handler indefinitely, preventing other requests from being processed (Bun's event loop is single-threaded). Add fetch(url, { signal: AbortSignal.timeout(10_000) }).

11. /api/auth/login doesn't validate username/password field types (src/daemon/server.ts:238-239)

const body = await req.json() as { username?: string; password?: string };
if (!body.username || !body.password) { ... }

body.username could be a number or object; the truthiness check passes through, and timingSafeEqual gets a non-string. The TextEncoder call on a non-string argument produces unexpected results. Explicit typeof body.username !== "string" check is needed.

12. Config reload doesn't update the config passed to createDaemonServer (src/daemon/index.ts:129)

const server = createDaemonServer({
  ...
  config,          // ← stale reference captured at startup
  configManager,
  ...
});

Inside the server, getConfig() correctly uses configManager.getServices() when configManager is present. But if for some reason configManager is undefined, config falls back to the stale value. This is fine given the current code always passes configManager, but fragile.


🟠 TypeScript Strictness

13. Unsafe as casts throughout server.ts

const body = (await req.json()) as DaemonCallRequest;     // line 105
const body = (await req.json()) as DaemonListToolsRequest; // line 166
const body = (await req.json()) as DaemonSchemaRequest;    // line 181
const body = await req.json() as { username?: string; password?: string }; // line 238

req.json() returns unknown. Casting to a typed interface without runtime validation means a malformed request body silently passes the type system and then throws at runtime with a confusing error. These should go through Zod or at minimum a type guard before the cast.

14. authCtx is null for non-exempt paths that weren't explicitly authenticated (server.ts:82)

let authCtx: AuthContext | null = null;
if (!isAuthExempt(path)) {
  authCtx = authenticateRequest(req, authProvider);
  if (!authCtx) { return 401; }
  ...
}
// authCtx is null for ALL exempt paths from here

At /api/auth/me (not in exempt list), authCtx is populated correctly. But any new path added to AUTH_EXEMPT_PATHS in the future without updating the downstream code could accidentally use authCtx?.userId ?? "anonymous" and leak a default value. Minor, but the type of authCtx being nullable for the entire handler scope after the check makes the flow confusing.

15. ui.ts JavaScript uses var-style global leaks

let editMode = null;    // ui.ts:269
let pollTimer = null;   // ui.ts:270
let currentRole = 'admin';  // ui.ts:271 — initialized to 'admin' before auth

currentRole defaults to 'admin' before the auth check completes. There's a brief window at page load where isAdmin() returns true and admin UI elements might render visible before applyRoleUI() is called. The applyRoleUI() call happens after the async init() function completes, so there's a visible flash of admin controls on every load.


✅ What's Done Well

  • Timing-safe comparisons for auth token and password checks
  • Zod validation for all config mutations
  • structuredClone on getServices() to prevent external mutation
  • The AuthProvider interface design is clean for future OAuth swap-in
  • Role hierarchy in ROLE_PERMISSIONS is explicit and testable
  • AbortController timeout on tool calls (issue is just the listener cleanup)
  • Prometheus metrics are properly labeled and include histograms

Summary of Must-Fix Before Merge

  1. Test credentials in git history — replace with obviously-fake fixture values, consider squashing/rebasing
  2. /api/auth/me default role should not be "admin" — change fallback to "viewer" or require auth on the endpoint
  3. SSRF in importFromUrl — validate URL scheme and reject RFC-1918/loopback addresses
  4. writeLock race — use a proper async queue for concurrent write safety
  5. importFromUrl no timeout — add AbortSignal.timeout(10_000)

- CSS: .hidden { display: none !important } to override .login-overlay flex
- Race condition: authLock prevents 401 re-showing overlay during login
- Debounce: require 2 consecutive auth failures before showing login
- await refresh() before starting poll interval
- Error handling: try/catch on doLogin() and init()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 10, 2026

Claude finished @rodaddy's task —— View job


PR Review: feat: Phase 5 -- Web Management UI with RBAC Auth

  • Read key source files
  • Review security issues
  • Review bugs and logic errors
  • Review resource leaks
  • Review error handling
  • Review TypeScript strictness

Overall: solid architecture with a clean AuthProvider interface, timing-safe comparisons, and Zod validation throughout config mutations. Issues are ordered by severity.


🔴 Critical Security Issues

1. Hardcoded credentials in committed test file (tests/daemon/auth-provider.test.ts:18-27)

GitGuardian already flagged this. The constants "admin-secret", "agent-secret", "s3cret-pass!", and "agent-pass-42" are permanently in git history. While test-only, these values look realistic enough that someone copy-pasting them into tokens.json would be using publicly-known credentials. Replace with obviously-fake values ("test-token-admin", "test-password-placeholder") and consider squashing the commit.

2. /api/auth/me defaults role to "admin" when authCtx is null (src/daemon/server.ts:266)

role: authCtx?.role ?? "admin",   // ← "admin" fallback is wrong

authCtx is null for all AUTH_EXEMPT_PATHS. The / path is exempt (auth.ts:12), but /api/auth/me is not exempt, so in normal authenticated flow authCtx is populated correctly. However, when auth is disabled (provider.enabled === false), authenticateRequest returns { role: "admin" } — so this is actually correct in that case. The bug: if auth is enabled and a future code path sets authCtx to null on /api/auth/me (e.g. via a middleware refactor), the fallback would silently grant admin. The safe fix is ?? "viewer". The UI (ui.ts:334) sets currentRole = me.role || 'viewer' which at least has the right fallback, but the server-side default is still wrong.

3. SSRF via /api/services/import (src/daemon/config-manager.ts:141)

const response = await fetch(url);   // no scheme/host validation

An admin can direct the daemon to fetch http://169.254.169.254/metadata, file:///etc/passwd, or internal services. The buildGitHubRawUrl() helper constrains to raw.githubusercontent.com but the server-side import endpoint accepts url directly, bypassing it. At minimum, reject non-https:// schemes and RFC-1918 / loopback hosts before calling fetch().

4. /metrics leaks service names and tool patterns without auth (src/daemon/auth.ts:12)

/metrics is in AUTH_EXEMPT_PATHS. The Prometheus output includes per-service and per-tool labels (service="${service}",tool="${tool}"). For a network-exposed daemon, this leaks the MCP service topology to any unauthenticated client on the network. Consider moving /metrics behind auth or requiring a scrape token.


🟡 Bugs and Logic Errors

5. doLogout() resets currentRole to 'admin' instead of 'viewer' (src/daemon/ui.ts:383)

function doLogout() {
  clearToken();
  currentRole = 'admin';   // ← should be 'viewer' or null
  showLogin('');
  ...
}

After logout, before the login overlay appears, isAdmin() returns true and admin controls are momentarily active. It also means if showLogin() fails to hide content (e.g. if the overlay CSS breaks), admin buttons are visible to unauthenticated users.

6. Route ordering: /api/services/reload and /api/services/import match the parameterized [^/]+ regex (src/daemon/server.ts:309-383)

The putMatch/deleteMatch regex ^\/api\/services\/([^/]+)$ matches /api/services/reload and /api/services/import. This works correctly today because neither is a PUT or DELETE request, but it's fragile — the reload and import routes at lines 373/386 are checked after the parameterized routes. If someone adds PUT /api/services/reload the parameterized PUT handler will eat it silently. Move reload and import routes before the [^/]+ parameterized routes.

7. writeLock is not a proper async mutex (src/daemon/config-manager.ts:216-225)

private writeLock = false;
private async writeToDisk(): Promise<void> {
  if (this.writeLock) { throw ... }  // ← check
  this.writeLock = true;             // ← set (not atomic!)
  try { await Bun.write(...); } finally { this.writeLock = false; }
}

Two concurrent HTTP requests calling addService() can both pass the if (this.writeLock) check before either sets it to true, since both checks happen synchronously before the first await. The result is concurrent writes that can corrupt services.json. Replace with an async promise-chain queue or queueMicrotask-based mutex.

8. service and tool metric label values come from user-controlled body.service/body.tool (src/daemon/server.ts:106-107, metrics.ts:160)

const key = `${service}.${tool}`;   // → label in Prometheus output

Prometheus labels appear in the unauth-accessible /metrics endpoint. A client sending service="} # injection" could corrupt the Prometheus output. The metric key should be sanitized (strip {, }, ", \n) before inclusion in labels.


🟡 Resource Leaks

9. AbortController signal listener is never removed (src/daemon/server.ts:123-128)

new Promise<never>((_, reject) => {
  controller.signal.addEventListener("abort", () => reject(...));
})

The clearTimeout(timer) in finally is correct, but the abort listener is never removed from controller.signal. Each timed-out call leaks a listener. Add { once: true } to the addEventListener call — it's a single-fire event so this is the correct fix.

10. No request body size limit

await req.json() on /call, /list-tools, /schema, /api/services, and /api/auth/login has no Content-Length guard. A client can exhaust heap with a multi-GB body. Set maxRequestBodySize in Bun.serve() options (e.g. 1 * 1024 * 1024 for 1MB).


🟡 Missing Error Handling

11. importFromUrl has no timeout (src/daemon/config-manager.ts:141)

const response = await fetch(url);   // hangs indefinitely on slow/unresponsive hosts

A slow remote will stall the event loop indefinitely. Add AbortSignal.timeout(10_000):

const response = await fetch(url, { signal: AbortSignal.timeout(10_000) });

12. /api/auth/login doesn't validate field types before timingSafeEqual (src/daemon/server.ts:238-241)

const body = await req.json() as { username?: string; password?: string };
if (!body.username || !body.password) { ... }

body.username could be a number or object — the truthiness check passes, and TextEncoder.encode(42) produces unexpected bytes. Add typeof body.username !== "string" checks before calling authenticateBasic.

13. init() in ui.ts has no timeout on the initial auth probe (src/daemon/ui.ts:623)

const probe = await fetch('/api/services');

If the daemon is slow to respond, the UI hangs on the loading screen. Add AbortSignal.timeout(5000) and show a user-visible error on timeout.


🟠 TypeScript Strictness

14. Unsafe as casts without runtime validation (src/daemon/server.ts:105, 166, 181, 238)

All four req.json() as SomeType casts bypass the type system. A malformed body silently produces runtime errors with unhelpful messages. These should either use Zod (which is already a dependency) or at minimum a type guard before the cast. The auth login body is particularly risky since it feeds directly into auth logic.

15. authCtx nullable throughout entire handler scope (src/daemon/server.ts:82-415)

authCtx is null for all exempt paths and for paths where auth succeeded. Every downstream use of authCtx in the handler requires ?. which obscures whether a given path guarantees auth or not. After the auth block, authCtx should be narrowed or a non-nullable shadow declared:

const authedCtx = authCtx!; // only after the auth block has returned 401 if null

16. timingSafeEqual is duplicated in auth.ts:121 and auth-provider.ts:246

Identical implementation exists in both files. The one in auth.ts serves only the deprecated checkAuth() function. Remove checkAuth() and the local timingSafeEqual from auth.ts (or export the one from auth-provider.ts).


✅ What's Done Well

  • Timing-safe comparisons applied consistently to both token and password checks
  • authenticateBasic iterates all entries unconditionally (prevents user-enumeration timing leak)
  • Zod validation on all ConfigManager mutations
  • structuredClone on getServices() prevents external mutation
  • AuthProvider interface is clean and swap-ready for OAuth/OIDC
  • ROLE_PERMISSIONS is explicit and well-tested
  • AbortController with clearTimeout on tool calls (listener cleanup is the only gap)
  • Prometheus histograms with proper bucket boundaries
  • Graceful shutdown with 10s force-exit fallback

Must-Fix Before Merge

# Issue File
1 Hardcoded credentials in git history tests/daemon/auth-provider.test.ts:18-27
2 /api/auth/me role fallback should be "viewer" src/daemon/server.ts:266
3 SSRF: no URL validation in importFromUrl src/daemon/config-manager.ts:141
4 writeLock race condition on concurrent writes src/daemon/config-manager.ts:216
5 importFromUrl has no fetch timeout src/daemon/config-manager.ts:141
6 doLogout resets role to 'admin' src/daemon/ui.ts:383
7 Prometheus label injection via user-supplied service/tool names src/daemon/metrics.ts:160

@rodaddy rodaddy merged commit 098acbe into main Mar 10, 2026
3 of 4 checks passed
@rodaddy rodaddy deleted the feat/network-deployment branch March 10, 2026 04:32
rodaddy added a commit that referenced this pull request Mar 13, 2026
* feat: network deployment with TCP binding, auth, metrics, and remote client

Add TCP listen mode (MCP2CLI_LISTEN_HOST/PORT) so the daemon can serve
multiple machines over HTTP. Bearer token auth with timing-safe comparison
protects authenticated endpoints while /health and /metrics remain open.

Prometheus metrics at GET /metrics expose request counts, latency histograms,
connection pool state, memory usage, and auth failures per service/tool.

Remote client mode (MCP2CLI_REMOTE_URL) lets the CLI connect to a remote
daemon instead of starting a local one. Includes a bash wrapper
(scripts/mcp2cli-remote) for curl-only machines.

LXC deployment files in deploy/ with systemd unit, env template, services
config, and setup script. Ansible playbook lives in the infrastructure repo.

Fixes idle timer firing immediately when timeoutMs=0 (TCP mode) by guarding
touch() against zero/negative values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: web management UI with RBAC auth and CI/CD deploy pipeline

Phase 5 implementation:
- ConfigManager with CRUD operations and disk persistence
- REST API for service management (add/update/remove/import/reload)
- Embedded web UI with dark theme, login (username+password and token tabs)
- Pluggable AuthProvider interface with TokenAuthProvider (RBAC)
- Three roles: admin (full), agent (tools+read), viewer (read-only)
- CI/CD deploy pipeline: build, deploy to CT 216, health check, auto-rollback
- Cross-compile for Linux x64 in deploy job
- tokens.json gitignored (secrets)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: web UI login overlay race condition and CSS specificity bug

- CSS: .hidden { display: none !important } to override .login-overlay flex
- Race condition: authLock prevents 401 re-showing overlay during login
- Debounce: require 2 consecutive auth failures before showing login
- await refresh() before starting poll interval
- Error handling: try/catch on doLogin() and init()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant