
Security audit fixes + hardening (v0.11.0)#1

Merged
thinkbig1979 merged 10 commits into main from security/audit-fixes on Jun 3, 2026


@thinkbig1979

Owner

Full security audit of Capstan with fixes across severity levels, plus a docs refresh.

Fixes

  • C1 (Critical): DR-restore destination is derived server-side (the configured restic repo), removing a client-controlled localRepoPath that fed rclone sync (arbitrary host overwrite).
  • H1/H2/H3: symlink-aware restore confinement; at-rest secret key via HKDF from a dedicated STORAGE_KEY (decoupled from JWT_SECRET, legacy-decrypt fallback); login always runs bcrypt (no username-enumeration timing oracle).
  • H4/H5: dependency CVE bumps (go-git, x/net, circl, docker→v28, axios≥1.16, fast-uri) + patched Go 1.25.11 toolchain. govulncheck reduced to 2 unfixable-upstream docker/docker advisories (init-only reachability).
  • M1/M2/M5: shared pathutil.IsContained (symlink-aware containment) wired into git path resolution and validateStackPath; snapshot-ID validation + -- guard.
  • M3/M4/M6/M7: Secure cookie via real TLS/X-Forwarded-Proto; reject pasted SSH key material; config-driven CSP connect-src; nginx security headers.
  • L1/L2/L4/L5/L8/L9/L10: fail-closed secret storage; JWT issuer binding; HSTS only over HTTPS; header hygiene; -- guard on compose service names; pinned base images; .env.example clarifications.

Behavior change

JWTs now carry an iss claim that validators require, so existing sessions are invalidated on upgrade (one-time re-login). Stored secrets remain readable (legacy key) and re-encrypt on next save.

Docs

README refreshed to match the implementation and reference only tracked files (dropped .agent-os, CLAUDE.md, and Supporting-Docs/* links; fixed Quick Start and API paths).

Backend 636 tests + frontend 526 tests pass; vet clean; frontend audit clean. Full report in Supporting-Docs/security/ (local).

thinkbig1979 and others added 10 commits June 3, 2026 14:28
…in timing oracle (H3)

C1 (Critical): POST /backups/dr-restore accepted a client-supplied
localRepoPath that flowed unvalidated into `rclone sync <remote> <path>`,
deleting/overwriting files at any host path (running as root). The frontend
never sent this field — it was attacker-only input. The restore destination
is now derived server-side as <DataDir>/dr-restore and is no longer a
parameter on RunDRRestore/LaunchDRRestore or the request body.

H3 (High): the login handler returned before any bcrypt comparison when the
username did not exist (~18 µs) versus tens of milliseconds when it existed, leaking valid
usernames via response latency. Login now always performs a bcrypt comparison
against a dummy hash on the missing-user path, equalizing timing.
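
A minimal sketch of the equalized control flow described for H3, not the actual handler. To stay dependency-free it injects the comparison as a function; in the real code that role is played by bcrypt. The names `login`, `compareFunc`, `user`, and `dummyHash` are hypothetical.

```go
package main

import "fmt"

// user is a hypothetical stored account record.
type user struct {
	name string
	hash string
}

// dummyHash stands in for a precomputed hash of a throwaway password.
// Assumption: the real handler would use a valid bcrypt hash generated at
// the same cost as production hashes, so the work done is comparable.
const dummyHash = "dummy-hash-placeholder"

// compareFunc abstracts the password check (bcrypt in the real code).
type compareFunc func(hash, password string) bool

// login calls compare exactly once whether or not the user exists, so the
// missing-user path pays the same hashing cost as the existing-user path.
func login(users map[string]user, compare compareFunc, name, password string) bool {
	u, found := users[name]
	hash := dummyHash
	if found {
		hash = u.hash
	}
	ok := compare(hash, password)
	// The dummy comparison result is never trusted on its own.
	return found && ok
}

func main() {
	calls := 0
	compare := func(hash, password string) bool {
		calls++
		return hash == "hash-of:"+password // toy stand-in, not real hashing
	}
	users := map[string]user{"admin": {name: "admin", hash: "hash-of:s3cret"}}

	fmt.Println(login(users, compare, "admin", "s3cret"))  // true
	fmt.Println(login(users, compare, "nobody", "s3cret")) // false
	fmt.Println("compare calls:", calls)                   // compare calls: 2
}
```

Counting comparisons rather than measuring latency mirrors what a regression test such as TestAuthHandler_Login_UnknownUserPerformsBcrypt can assert deterministically.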

Regression tests added:
- TestRunDRRestore_ConfinesDestinationToDataDir
- TestAuthHandler_Login_UnknownUserPerformsBcrypt

Refs: Supporting-Docs/security/security-audit-2026-06-03.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Backend (govulncheck-verified):
- go-git/v5 v5.16.5 -> v5.17.1 (GO-2026-4910, GO-2026-4909 — resolved)
- golang.org/x/net v0.49.0 -> v0.53.0 (GO-2026-4918 x/net side — resolved)
- cloudflare/circl v1.6.1 -> v1.6.3 (GO-2026-4550 — resolved)
- docker/docker v26.1.4 -> v28.5.2 (latest Moby; required an API type
  migration: StatsJSON->container.StatsResponse, prune reports and network
  options into their subpackages, EventsOptions->events.ListOptions)
- go.mod: pin toolchain go1.25.11; Dockerfiles golang:1.24-alpine ->
  golang:1.25.11-alpine. Clears ~20 reachable stdlib CVEs (crypto/tls,
  net/http cookie-parse memory exhaustion GO-2025-4012, html/template, etc.).

After these changes `govulncheck ./...` under go1.25.11 reports only
GO-2026-4887 and GO-2026-4883 (docker/docker). Both are "Fixed in: N/A"
(no upstream patch exists) and are reachable only via package init, not the
vulnerable daemon-side AuthZ plugin path — this app is a Docker client and
never executes that code. Tracked as accepted pending an upstream fix.

Frontend (pnpm audit --prod now clean):
- axios ^1.13.5 -> ^1.17.0 (cloud-metadata exfil GHSA-fvcv-3m26-pcqx,
  proxy MITM / NO_PROXY bypass CVE-2025-62718, prototype pollution)
- pnpm override fast-uri@<3.1.1 -> >=3.1.2 (path traversal, host confusion)

Backend: 621 tests pass, vet clean. Frontend: 526 tests pass, build clean.

Refs: Supporting-Docs/security/security-audit-2026-06-03.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…n (H1/M2/M1/M5)

Adds internal/pathutil.IsContained: a symlink-aware containment check that
resolves the deepest existing ancestor of a (possibly not-yet-created) target
via EvalSymlinks before a trailing-separator prefix comparison. A purely
lexical check let a symlink inside a root escape it on read/write.
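
The containment check described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in internal/pathutil: the helper names `resolveExisting`, `isContained`, and `demo` are hypothetical, and failing closed on a dangling symlink is a conservative choice made for this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// resolveExisting resolves symlinks on the deepest existing ancestor of p and
// re-appends the remainder that does not exist yet.
func resolveExisting(p string) (string, error) {
	rest := ""
	for {
		resolved, err := filepath.EvalSymlinks(p)
		if err == nil {
			return filepath.Join(resolved, rest), nil
		}
		if !errors.Is(err, fs.ErrNotExist) {
			return "", err
		}
		// A dangling symlink exists but cannot be resolved: fail closed.
		if _, lerr := os.Lstat(p); lerr == nil {
			return "", fmt.Errorf("unresolvable symlink at %s", p)
		}
		parent := filepath.Dir(p)
		if parent == p {
			return "", err // reached the filesystem root without a match
		}
		rest = filepath.Join(filepath.Base(p), rest)
		p = parent
	}
}

// isContained reports whether target resolves to root or to a path under it.
func isContained(root, target string) (bool, error) {
	absRoot, err := filepath.Abs(root)
	if err != nil {
		return false, err
	}
	absTarget, err := filepath.Abs(target)
	if err != nil {
		return false, err
	}
	realRoot, err := resolveExisting(absRoot)
	if err != nil {
		return false, err
	}
	realTarget, err := resolveExisting(absTarget)
	if err != nil {
		return false, err
	}
	// The trailing separator stops "/stacks-evil" from matching "/stacks".
	sep := string(filepath.Separator)
	prefix := strings.TrimSuffix(realRoot, sep) + sep
	return realTarget == realRoot || strings.HasPrefix(realTarget, prefix), nil
}

// demo builds a temp tree with a sibling directory and an escaping symlink.
func demo() (inside, sibling, viaLink bool, err error) {
	base, err := os.MkdirTemp("", "contain-demo")
	if err != nil {
		return
	}
	defer os.RemoveAll(base)
	root := filepath.Join(base, "stacks")
	evil := filepath.Join(base, "stacks-evil")
	if err = os.Mkdir(root, 0o755); err != nil {
		return
	}
	if err = os.Mkdir(evil, 0o755); err != nil {
		return
	}
	if err = os.Symlink(evil, filepath.Join(root, "link")); err != nil {
		return
	}
	if inside, err = isContained(root, filepath.Join(root, "app", "compose.yaml")); err != nil {
		return
	}
	if sibling, err = isContained(root, filepath.Join(evil, "x")); err != nil {
		return
	}
	viaLink, err = isContained(root, filepath.Join(root, "link", "secret.txt"))
	return
}

func main() {
	fmt.Println(demo())
}
```

Resolving both the root and the target matters: on systems where the temp directory itself sits behind a symlink, comparing a resolved target against an unresolved root would reject legitimate paths.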

Wired through all three confinement sites:
- H1: RunRestore restore-target check now uses pathutil.IsContained, so a
  symlink inside a stack dir cannot redirect `restic restore --target` outside.
- M2: validateStackPath (underpins every compose/env read+write) is now
  symlink-aware.
- M1: git resolvePathFromStack replaced its naive strings.HasPrefix (which let
  /stacks-evil pass for a /stacks root) with pathutil.IsContained, fixing both
  the sibling-prefix bypass and symlink escape.

M5: previewSnapshot now validates the snapshot ID against a hex/"latest"
pattern, and RestorePreview inserts "--" before the ID so a "-"-prefixed value
cannot be parsed as a restic flag.
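
As an illustration of the M5 validation plus "--" guard, a hedged sketch follows. The exact pattern used by previewSnapshot is not shown in this PR; the 8 to 64 character hex range is an assumption based on restic's short and full snapshot IDs, and the `ls --json` subcommand and the `previewArgs` helper are placeholders.

```go
package main

import (
	"fmt"
	"regexp"
)

// snapshotIDPattern accepts "latest" or a lowercase hex ID.
// Assumption: 8 to 64 hex characters (short ID up to a full SHA-256 ID).
var snapshotIDPattern = regexp.MustCompile(`^(latest|[0-9a-f]{8,64})$`)

// previewArgs validates the ID, then places it after "--" so that even a
// value starting with "-" could only ever be read as a positional argument.
func previewArgs(snapshotID string) ([]string, error) {
	if !snapshotIDPattern.MatchString(snapshotID) {
		return nil, fmt.Errorf("invalid snapshot ID %q", snapshotID)
	}
	// "ls --json" is a placeholder subcommand for this sketch.
	return []string{"ls", "--json", "--", snapshotID}, nil
}

func main() {
	for _, id := range []string{"latest", "deadbeef", "--target=/etc"} {
		args, err := previewArgs(id)
		fmt.Println(args, err)
	}
}
```

The two layers are independent: validation rejects malformed input early, and the "--" separator remains as defense in depth if the pattern is ever loosened.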

Regression tests: pathutil (5, incl. symlink escape + nonexistent target),
validateStackPath symlink escape, previewSnapshot malformed-ID rejection.
630 tests pass, vet clean.

Refs: Supporting-Docs/security/security-audit-2026-06-03.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…T_SECRET (H2)

The at-rest encryption key for stored secrets (git HTTPS token, restic
password) was key = SHA-256(JWT_SECRET): the same secret signed JWTs and
encrypted data, and a single SHA-256 pass is not a KDF, so a low-entropy
JWT_SECRET yielded a brute-forceable AES key.

TokenEncryptor now derives its primary AES-256-GCM key with HKDF-SHA256 from a
dedicated STORAGE_KEY (falling back to JWT_SECRET when unset, for compatibility),
domain-separated with an "info" label. A legacy AEAD keyed by SHA-256(JWT_SECRET)
is retained for DECRYPTION ONLY, so secrets written by the previous scheme stay
readable and are transparently re-encrypted under the primary key on next write.

- config: new STORAGE_KEY env var; documented in .env.example files.
- NewTokenEncryptor(storageSecret, jwtSecret); Decrypt tries primary then legacy.
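
A rough sketch of the primary-then-legacy scheme described in this commit, under stated assumptions: HKDF-SHA256 is written out by hand (RFC 5869, single expand block) to keep the example standard-library only, the info label "example/at-rest/v1" is invented, and the nonce-prefixed ciphertext layout is a common convention rather than a confirmed detail of TokenEncryptor. The lowercase names are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// hkdfSHA256 derives a 32-byte key: extract with salt, then one expand block.
func hkdfSHA256(secret, salt []byte, info string) []byte {
	if salt == nil {
		salt = make([]byte, sha256.Size) // RFC 5869 default: HashLen zero bytes
	}
	ext := hmac.New(sha256.New, salt)
	ext.Write(secret)
	prk := ext.Sum(nil)
	exp := hmac.New(sha256.New, prk)
	exp.Write([]byte(info))
	exp.Write([]byte{1}) // block counter for T(1)
	return exp.Sum(nil)
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

type encryptor struct{ primary, legacy cipher.AEAD }

// newEncryptor falls back to jwtSecret when storageSecret is unset.
func newEncryptor(storageSecret, jwtSecret string) (*encryptor, error) {
	if storageSecret == "" {
		storageSecret = jwtSecret
	}
	primary, err := newGCM(hkdfSHA256([]byte(storageSecret), nil, "example/at-rest/v1"))
	if err != nil {
		return nil, err
	}
	legacyKey := sha256.Sum256([]byte(jwtSecret)) // previous scheme, decrypt only
	legacy, err := newGCM(legacyKey[:])
	if err != nil {
		return nil, err
	}
	return &encryptor{primary: primary, legacy: legacy}, nil
}

// sealWith returns nonce||ciphertext using a fresh random nonce.
func sealWith(aead cipher.AEAD, plain []byte) ([]byte, error) {
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, plain, nil), nil
}

// encrypt always writes under the primary key.
func (e *encryptor) encrypt(plain []byte) ([]byte, error) {
	return sealWith(e.primary, plain)
}

// decrypt tries the primary key first, then the legacy key.
func (e *encryptor) decrypt(ct []byte) ([]byte, error) {
	for _, aead := range []cipher.AEAD{e.primary, e.legacy} {
		n := aead.NonceSize()
		if len(ct) < n {
			break
		}
		if plain, err := aead.Open(nil, ct[:n], ct[n:], nil); err == nil {
			return plain, nil
		}
	}
	return nil, errors.New("decryption failed")
}

func main() {
	e, _ := newEncryptor("storage-secret", "jwt-secret")
	old, _ := sealWith(e.legacy, []byte("written by the old scheme"))
	plain, err := e.decrypt(old)
	fmt.Println(string(plain), err)
}
```

Because encrypt only ever uses the primary key, a value that was read through the legacy path is re-encrypted under the new key the next time it is saved.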

Regression tests: round-trip, legacy-ciphertext decrypt (backward compat),
and primary-key-depends-on-storage-key (JWT_SECRET disclosure alone can't
decrypt). 633 tests pass, vet clean.

Refs: Supporting-Docs/security/security-audit-2026-06-03.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
M3: the Secure flag on auth/CSRF cookies now follows the real request scheme
(middleware.IsSecureRequest: TLS or X-Forwarded-Proto=https) instead of a
Host-substring heuristic that "localhost.evil.com" could downgrade.
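
A small sketch of the scheme check described for M3. It is an approximation, not the code of middleware.IsSecureRequest: the lowercase helper names and the cookie attributes are assumptions, and X-Forwarded-Proto should only be honored when a trusted reverse proxy sets it.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"strings"
)

// isSecureRequest looks at the actual transport, never at the Host value.
func isSecureRequest(r *http.Request) bool {
	if r.TLS != nil {
		return true
	}
	proto := strings.TrimSpace(r.Header.Get("X-Forwarded-Proto"))
	return strings.EqualFold(proto, "https")
}

// newRequest builds a bare request for the demo (hypothetical helper).
func newRequest(host string, overTLS bool, forwardedProto string) *http.Request {
	r := &http.Request{Host: host, Header: http.Header{}}
	if overTLS {
		r.TLS = &tls.ConnectionState{}
	}
	if forwardedProto != "" {
		r.Header.Set("X-Forwarded-Proto", forwardedProto)
	}
	return r
}

// sessionCookie sets Secure from the request scheme. The cookie name and the
// other attributes are illustrative only.
func sessionCookie(r *http.Request, token string) *http.Cookie {
	return &http.Cookie{
		Name:     "session",
		Value:    token,
		Path:     "/",
		HttpOnly: true,
		SameSite: http.SameSiteStrictMode,
		Secure:   isSecureRequest(r),
	}
}

func main() {
	// A hostname containing "localhost" no longer influences the flag.
	fmt.Println(sessionCookie(newRequest("localhost.evil.com", false, "https"), "t").Secure)
	fmt.Println(sessionCookie(newRequest("localhost:5001", false, ""), "t").Secure)
}
```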

M4: git_ssh_key is a path to a key file; UpdateGitSettings now rejects pasted
private-key material (PEM/OpenSSH headers, multi-line input) so key bytes are
never stored in or echoed back from settings.
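
One way the M4 rejection could look, as a guess at the shape of the check rather than the real looksLikePrivateKey: it flags any multi-line value and any value containing a PEM-style "-----BEGIN " header, which also covers OpenSSH private keys.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikePrivateKey reports whether a settings value resembles pasted key
// material instead of a filesystem path. Assumed heuristic for this sketch.
func looksLikePrivateKey(value string) bool {
	v := strings.TrimSpace(value)
	if strings.ContainsAny(v, "\r\n") {
		return true // a path to a key file is a single line
	}
	return strings.Contains(v, "-----BEGIN ")
}

func main() {
	fmt.Println(looksLikePrivateKey("/root/.ssh/id_ed25519"))
	fmt.Println(looksLikePrivateKey("-----BEGIN OPENSSH PRIVATE KEY-----\nb3BlbnNza..."))
}
```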

M6: CSP connect-src is built from the configured CORS origins (+ ws/wss
variants); localhost variants are added only in dev (AUTH_DISABLED), so a
cross-origin reverse-proxy deployment is no longer blocked by a localhost-only
policy.
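
To make the M6 behavior concrete, here is a hedged sketch of building connect-src from configured origins. The function name `connectSrc`, the `'self'` entry, and the specific localhost sources added in dev mode are assumptions for illustration; the PR does not list them.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// connectSrc builds a CSP connect-src directive from CORS origins, adding the
// matching WebSocket scheme for each one. Localhost sources are dev-only.
func connectSrc(corsOrigins []string, devMode bool) string {
	sources := []string{"'self'"}
	seen := map[string]bool{"'self'": true}
	add := func(s string) {
		if !seen[s] {
			seen[s] = true
			sources = append(sources, s)
		}
	}
	for _, origin := range corsOrigins {
		u, err := url.Parse(strings.TrimSpace(origin))
		if err != nil || u.Host == "" {
			continue // skip malformed origins instead of emitting them
		}
		switch u.Scheme {
		case "https":
			add("https://" + u.Host)
			add("wss://" + u.Host)
		case "http":
			add("http://" + u.Host)
			add("ws://" + u.Host)
		}
	}
	if devMode {
		add("http://localhost:*")
		add("ws://localhost:*")
	}
	return "connect-src " + strings.Join(sources, " ")
}

func main() {
	fmt.Println(connectSrc([]string{"https://capstan.example.com"}, false))
}
```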

M7: frontend/nginx.conf now sets X-Frame-Options, X-Content-Type-Options,
Referrer-Policy and a frame-ancestors CSP on the served SPA + assets (the Go
CSP never reaches the browser in the split deployment). It also preserves an
upstream proxy's X-Forwarded-Proto instead of always overwriting with $scheme,
so M3's Secure-cookie logic works behind external TLS termination.

L4: HSTS is only emitted over HTTPS (avoids includeSubDomains pinning over
plaintext/dev). L5: X-XSS-Protection set to 0 (deprecated; CSP supersedes) and
Referrer-Policy added to the backend headers.

Regression tests: Secure-cookie-from-XFP, looksLikePrivateKey, and
config-driven connect-src. 636 tests pass, vet clean. nginx.conf syntax
validated via nginx:alpine (only the compose-network upstream is unresolved
standalone).

Refs: Supporting-Docs/security/security-audit-2026-06-03.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
L1 (fail closed): DB.SetSetting now refuses to persist a sensitive setting
(restic_password, git_https_token) when no encryptor is configured, instead of
silently writing plaintext. Tests that exercise these flows now use an
encryptor-backed DB.
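
The fail-closed rule for L1 can be sketched as below. The two sensitive keys come from this commit message; the `store` type, the `Encryptor` interface, and `fakeEncryptor` are hypothetical stand-ins for the real DB and TokenEncryptor.

```go
package main

import (
	"errors"
	"fmt"
)

// Encryptor is the minimal capability the store needs (assumed interface).
type Encryptor interface {
	Encrypt(plain string) (string, error)
}

// fakeEncryptor is a test double, not real encryption.
type fakeEncryptor struct{}

func (fakeEncryptor) Encrypt(plain string) (string, error) { return "enc:" + plain, nil }

var errNoEncryptor = errors.New("refusing to store sensitive setting without an encryptor")

var sensitiveKeys = map[string]bool{
	"restic_password": true,
	"git_https_token": true,
}

type store struct {
	enc    Encryptor
	values map[string]string
}

// SetSetting encrypts sensitive values and refuses to write them in
// plaintext when no encryptor is configured.
func (s *store) SetSetting(key, value string) error {
	if sensitiveKeys[key] {
		if s.enc == nil {
			return errNoEncryptor
		}
		ct, err := s.enc.Encrypt(value)
		if err != nil {
			return err
		}
		value = ct
	}
	s.values[key] = value
	return nil
}

func main() {
	plain := &store{values: map[string]string{}}
	fmt.Println(plain.SetSetting("restic_password", "hunter2"))

	secured := &store{enc: fakeEncryptor{}, values: map[string]string{}}
	fmt.Println(secured.SetSetting("restic_password", "hunter2"), secured.values)
}
```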

L2 (JWT issuer): issued tokens carry iss="capstan" and both validators
(handlers.parseJWT, middleware.ValidateJWT) require it, binding tokens to this
app. NOTE: deploying invalidates existing sessions (one-time re-login); the
risk was already largely mitigated by the jti->session-row lookup.

L8: docker compose pull/up for single-service updates now pass "--" before the
service name so a "-"-prefixed compose service name cannot be parsed as a flag
(defense-in-depth; service names come from compose labels, not HTTP).

L9: pin floating base images — backend/Dockerfile alpine:latest -> alpine:3.21
(matches the canonical docker/Dockerfile), frontend/Dockerfile nginx:alpine ->
nginx:1.27-alpine.

L10: backend/.env.example now defaults AUTH_DISABLED=false with a trusted-network
warning, and clarifies GIT_SSH_KEY is a file path (not key contents).

Accepted (no change): L3 'unsafe-eval' (required by recharts; documented in the
CSP comment), L6 username in failed-login logs (useful for monitoring), L7
cross-stack snapshot enumeration (single-admin trust model).

636 tests pass, vet clean.

Refs: Supporting-Docs/security/security-audit-2026-06-03.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…arate dir

Follow-up to the C1 fix: the server-derived DR-restore destination must be the
configured local restic repository (bc.ResticRepository, default
<DataDir>/restic-repo) so the fetched repo lands where restic operations expect
it — as the docs describe. The interim "<DataDir>/dr-restore" subdir would have
restored the repository to the wrong location. Still server-derived (never from
client input), so the C1 guarantee holds.
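
A simplified sketch of the server-derived destination after this follow-up. It is illustrative only: `backupConfig`, `drRestoreRequest`, the `remote` field, and `planDRRestore` are hypothetical; only the ResticRepository setting and its <DataDir>/restic-repo default come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path/filepath"
)

// backupConfig holds server-side configuration only.
type backupConfig struct {
	DataDir          string
	ResticRepository string
}

// drRestoreRequest deliberately has no destination field, so a
// client-supplied localRepoPath is dropped during decoding.
type drRestoreRequest struct {
	Remote string `json:"remote"` // hypothetical field for this sketch
}

// restoreDestination is derived from configuration, never from the request.
func (bc backupConfig) restoreDestination() string {
	if bc.ResticRepository != "" {
		return bc.ResticRepository
	}
	return filepath.Join(bc.DataDir, "restic-repo")
}

// planDRRestore decodes the request and pairs it with the server-side path.
func planDRRestore(bc backupConfig, body []byte) (remote, dest string, err error) {
	var req drRestoreRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return "", "", err
	}
	return req.Remote, bc.restoreDestination(), nil
}

func main() {
	body := []byte(`{"remote":"b2:bucket/repo","localRepoPath":"/etc"}`)
	fmt.Println(planDRRestore(backupConfig{DataDir: "/data"}, body))
}
```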

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…security changes

Update deploy docs/config for the security hardening:
- README + docker/compose.yaml + docker-compose.prod.yaml: add STORAGE_KEY
  (at-rest secret encryption, independent of JWT_SECRET; falls back to it).
- README Security Considerations: note the reverse proxy must forward
  X-Forwarded-Proto: https for Secure cookies/HSTS, and that upgrading
  invalidates existing sessions (one-time re-login).
- README: Go 1.24 -> 1.25.
- backup_test.go: clarify the DR-restore test asserts a client-supplied
  localRepoPath is ignored (C1).

(Supporting-Docs/Deployment.md, which is gitignored, was updated on disk with
the matching STORAGE_KEY var, X-Forwarded-Proto requirement callout, deprecated
X-XSS-Protection example fix, and upgrade note.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summarize the application-level hardening for prospective users: auth defaults
and session handling, secrets encrypted at rest, shell-free command execution,
symlink-aware path confinement, CSRF/CORS/WebSocket-origin protections, security
headers, and dependency scanning. Keeps the honest framing that Docker socket
access is the real trust boundary and cross-links the existing section.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Fix the header logo to a tracked asset (frontend/public/capstan.svg); the old
  Supporting-Docs/Branding path is untracked and rendered broken on GitHub.
- Correct Quick Start: start-local.sh serves the all-in-one image on :5001 (not
  3001); document the native dev path (backend :5001 + Vite :5173).
- Fix API endpoint paths to the real /api/v1 routes (git under /api/v1/git,
  backups under /api/v1/backups, auth setup/me, compose-env), and the Backups
  curl examples (/api/v1/backups/...).
- Remove all references to untracked files: .agent-os (Project Structure),
  CLAUDE.md, and every Supporting-Docs/* guide link (Deployment, Migration,
  Troubleshooting, Volume-Path-Identity). Production/migration/volume guidance is
  now self-contained.
- Generalize stale commands (docker compose v2; logs without a 'backend' service).

Also revert backend/.env.example to AUTH_DISABLED=true: it is the LOCAL DEV
example consumed by start-local.sh/run-local.sh, and AUTH_DISABLED=false without
a JWT_SECRET makes the dev backend fail to start. Added a dev-only warning; the
production default lives in the root .env.example.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@thinkbig1979 thinkbig1979 merged commit a73b848 into main Jun 3, 2026
@thinkbig1979 thinkbig1979 deleted the security/audit-fixes branch June 3, 2026 18:59


1 participant