HTTP hardening: timeouts, admission control, middleware chain, and CORS #774

Merged
vfusco merged 17 commits into next/2.0 from feature/http-hardening on Apr 22, 2026

@vfusco vfusco commented Apr 17, 2026

Summary

Hardens the three HTTP surfaces (inspect, JSON-RPC, telemetry) against untrusted traffic. All controls are defense-in-depth layers behind a reverse proxy — disabled or conservative by default.

Middleware chain

Every HTTP handler is wrapped in a standard chain: Recover → RequestID → CORS → Admission → handler.
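
The ordering can be illustrated with a small composition helper. This is a minimal sketch, not the code from pkg/service: the `Middleware` type, `Chain`, and `tag` are hypothetical names, and each `tag` middleware only records its name so the traversal order is visible.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps a handler with extra behavior (hypothetical type for this sketch).
type Middleware func(http.Handler) http.Handler

// Chain applies the middlewares so that the first one listed is the outermost,
// i.e. the first to see the request.
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tag returns a placeholder middleware that appends its name to trace on entry.
func tag(name string, trace *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name)
			next.ServeHTTP(w, r)
		})
	}
}

// traceOrder sends one request through the chain and returns the visit order.
func traceOrder() []string {
	var trace []string
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		trace = append(trace, "handler")
	})
	h := Chain(final,
		tag("Recover", &trace),
		tag("RequestID", &trace),
		tag("CORS", &trace),
		tag("Admission", &trace),
	)
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return trace
}

func main() {
	fmt.Println(traceOrder()) // [Recover RequestID CORS Admission handler]
}
```

Placing Recover outermost means a panic in any later layer is still caught, and placing CORS before Admission is what lets preflight requests return without taking a permit.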

  • Recover catches panics and writes a generic 500 — Go error values never reach the response body. If bytes were already flushed, it re-panics with http.ErrAbortHandler to drop the connection cleanly instead of producing a corrupt response.
  • RequestID validates upstream X-Request-ID against a safe charset ([A-Za-z0-9._:=/+-]{1,128}) or generates a UUIDv4. Echoed on every response for log correlation.
  • CORS is disabled by default. When configured via CARTESI_*_CORS_ALLOWED_ORIGINS, only exact-match origins are reflected, with no wildcard. Preflight (OPTIONS) is short-circuited before admission so it never consumes a concurrency permit.
  • Admission caps concurrent in-flight requests per surface (default 64). Excess requests are rejected with 503 and a jittered Retry-After of 1 to 3 seconds to prevent thundering herds. Disabled with MAX_INFLIGHT=0.
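
The admission behavior described above can be sketched with a buffered channel used as a counting semaphore. This is an illustration under assumptions, not the code in pkg/service/http_admission.go: the `Admission` signature, the response text, and the `rejectedWhileSaturated` helper are hypothetical, and treating any non-positive limit as "disabled" is a simplification of MAX_INFLIGHT=0.

```go
package main

import (
	"fmt"
	"math/rand"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// Admission admits at most maxInflight concurrent requests.
// A non-positive limit disables the check (assumption for this sketch).
func Admission(maxInflight int) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		if maxInflight <= 0 {
			return next
		}
		permits := make(chan struct{}, maxInflight)
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			select {
			case permits <- struct{}{}: // permit acquired without blocking
				defer func() { <-permits }()
				next.ServeHTTP(w, r)
			default: // saturated: reject immediately with a jittered hint
				w.Header().Set("Retry-After", strconv.Itoa(1+rand.Intn(3))) // 1, 2, or 3 seconds
				http.Error(w, "service unavailable", http.StatusServiceUnavailable)
			}
		})
	}
}

// rejectedWhileSaturated holds the only permit with a blocked request and
// returns the status and Retry-After seen by a second, concurrent request.
func rejectedWhileSaturated() (int, string) {
	entered := make(chan struct{})
	release := make(chan struct{})
	h := Admission(1)(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		close(entered)
		<-release
	}))
	done := make(chan struct{})
	go func() {
		defer close(done)
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	}()
	<-entered // the first request now owns the permit

	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	close(release)
	<-done
	return rec.Code, rec.Header().Get("Retry-After")
}

func main() {
	code, retryAfter := rejectedWhileSaturated()
	fmt.Println(code, retryAfter)
}
```

Because the rejection path never blocks and writes only a short plain-text body, a flood of excess requests costs little more than the header write.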

Design choices

  • Per-surface timeout presets (DefaultInspectOptions, DefaultJSONRPCOptions, DefaultTelemetryOptions) are constructor functions returning fresh values — not mutable package globals.
  • Per-request context deadline on inspect: context.WithTimeout(InspectMaxDeadline + 30s) is set after app resolution in Inspector.ServeHTTP. The HTTP WriteTimeout (600s) is a backstop for leaked goroutines, not the deadline enforcer. This structurally eliminates the need to coordinate WriteTimeout with per-app InspectMaxDeadline.
  • Per-app fail-fast on inspect: the machine semaphore uses TryAcquire (not blocking Acquire), so one saturated application cannot starve others — its excess requests fail at the per-app gate and free HTTP-global capacity.
  • corsWriter implements Unwrap/Flush/Hijack to preserve the http.ResponseWriter wrapper chain. Without Unwrap, http.MaxBytesReader cannot walk to the real *http.response to force-close the connection after 413.
  • Error responses are text/plain, not JSON-RPC envelopes. Admission 503 and panic 500 happen before the request reaches the JSON-RPC handler — they are transport-level errors. JSON-RPC SDKs surface them as transport failures, which is the correct signal.
  • Legacy internal/services/ deleted. The old Access-Control-Allow-Origin: * middleware and HttpService test helper are replaced by pkg/service/CORSMiddleware and httptest.NewServer.
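
The two inspect-specific choices (the per-app fail-fast gate and the per-request deadline) can be sketched together. This is a hedged illustration, not the code in internal/inspect or internal/manager: `appGate`, `ErrAppBusy`, `inspect`, and `demo` are hypothetical names, and a buffered channel stands in for whatever semaphore the node actually uses.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrAppBusy is a hypothetical sentinel for "per-app inspect capacity exhausted".
var ErrAppBusy = errors.New("application inspect capacity exhausted")

// deadlineSlack mirrors the 30s added on top of the per-app maximum deadline.
const deadlineSlack = 30 * time.Second

// appGate is a non-blocking counting semaphore for one application.
type appGate struct{ permits chan struct{} }

func newAppGate(n int) *appGate { return &appGate{permits: make(chan struct{}, n)} }

// TryAcquire takes a permit if one is free and never blocks.
func (g *appGate) TryAcquire() bool {
	select {
	case g.permits <- struct{}{}:
		return true
	default:
		return false
	}
}

// Release returns a previously acquired permit.
func (g *appGate) Release() { <-g.permits }

// inspect fails fast when the app is saturated; otherwise it runs work under
// a context that expires at maxDeadline + deadlineSlack.
func inspect(ctx context.Context, g *appGate, maxDeadline time.Duration,
	work func(context.Context) error) error {
	if !g.TryAcquire() {
		return ErrAppBusy
	}
	defer g.Release()
	ctx, cancel := context.WithTimeout(ctx, maxDeadline+deadlineSlack)
	defer cancel()
	return work(ctx)
}

// demo reports whether a saturated gate rejects immediately, and the deadline
// budget (in whole seconds) seen by work for a 10s per-app maximum.
func demo() (busy bool, budgetSeconds int) {
	g := newAppGate(1)
	g.TryAcquire() // simulate one request already in flight
	err := inspect(context.Background(), g, time.Second,
		func(context.Context) error { return nil })
	busy = errors.Is(err, ErrAppBusy)
	g.Release()

	_ = inspect(context.Background(), g, 10*time.Second, func(ctx context.Context) error {
		deadline, _ := ctx.Deadline()
		budgetSeconds = int(time.Until(deadline).Round(time.Second) / time.Second)
		return nil
	})
	return busy, budgetSeconds
}

func main() {
	fmt.Println(demo()) // true 40
}
```

Since the deadline travels with the request context, the server-wide WriteTimeout only has to be larger than any per-app value; it does not need to track them.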

Operator deployment guide

docs/http-posture.md documents the full HTTP posture for operators:

  • Deployment model: the node assumes a trusted network boundary. HTTP controls are defense-in-depth, not a substitute for a reverse proxy (which should provide TLS, rate limiting, auth, and connection caps).
  • Bind defaults: all services bind to :PORT (all interfaces) for Docker compatibility. A startup warning fires for every unspecified bind address. Bare-metal deployments should override to loopback.
  • Timeouts: per-surface presets documented in a table. Inspect WriteTimeout=600s is a backstop; the per-request context deadline is the real enforcer.
  • Admission control: independent budgets per surface (inspect and JSON-RPC). MAX_INFLIGHT=0 disables. Rejection is silent (no per-request log) to avoid amplifying floods. Jittered Retry-After desynchronizes retries.
  • Request bodies: inspect 2 MiB, JSON-RPC 1 MiB, enforced by MaxBytesReader with connection close on oversize. Worst-case body buffer memory at default concurrency: 128 MiB (inspect) + 64 MiB (JSON-RPC).
  • CORS: disabled by default, exact-match allowlist, null origin rejected. For production, prefer handling CORS at the reverse proxy.
  • PostgreSQL pool sizing: rule of thumb pool_max_conns ≥ JSONRPC_MAX_INFLIGHT + steady-state writers. Documents the fail-fast (admission 503) vs fail-slow (pool acquire block) distinction.
  • Known limitations: two-layer admission on inspect (HTTP-global + per-app), single-replica assumption for the default 64.
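
The request-body figures above follow from body limit × in-flight cap, and the 413 path can be reproduced with the standard library. The sketch below is illustrative only: `bodyCapped`, `statusFor`, `worstCaseMiB`, and the constants are hypothetical names; only `http.MaxBytesReader` and `*http.MaxBytesError` (Go 1.19+) are real standard-library API.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

const (
	inspectBodyLimit   = 2 << 20 // 2 MiB, as stated for inspect
	jsonrpcBodyLimit   = 1 << 20 // 1 MiB, as stated for JSON-RPC
	defaultMaxInflight = 64      // default admission cap per surface
)

// worstCaseMiB is the body-buffer ceiling if every admitted request sends a
// body of exactly the limit.
func worstCaseMiB(limit int64, inflight int) int64 {
	return (limit * int64(inflight)) >> 20
}

// bodyCapped reads the body through MaxBytesReader and maps an overflow to 413.
func bodyCapped(limit int64) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		if _, err := io.ReadAll(r.Body); err != nil {
			var tooBig *http.MaxBytesError
			if errors.As(err, &tooBig) {
				http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
				return
			}
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
}

// statusFor posts bodyLen bytes to a handler capped at limit and returns the status.
func statusFor(limit int64, bodyLen int) int {
	rec := httptest.NewRecorder()
	body := strings.NewReader(strings.Repeat("x", bodyLen))
	bodyCapped(limit).ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/", body))
	return rec.Code
}

func main() {
	fmt.Println(worstCaseMiB(inspectBodyLimit, defaultMaxInflight)) // 128
	fmt.Println(worstCaseMiB(jsonrpcBodyLimit, defaultMaxInflight)) // 64
	fmt.Println(statusFor(8, 8), statusFor(8, 9))                   // 200 413
}
```

Under a real net/http server, MaxBytesReader also asks the server to close the connection once the limit is exceeded; the in-memory recorder used here does not exercise that part.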

@vfusco vfusco added this to the 2.0.0 milestone Apr 17, 2026
@vfusco vfusco self-assigned this Apr 17, 2026
@vfusco vfusco force-pushed the feature/http-hardening branch from 99e10a5 to bafea96 Compare April 17, 2026 15:01
Copilot AI left a comment

Pull request overview

Hardens the node’s HTTP-facing surfaces (inspect, JSON-RPC, telemetry) by introducing a shared middleware chain and server-option presets, plus adding admission control, stricter error handling, and operator-facing documentation/config knobs.

Changes:

  • Add pkg/service HTTP hardening utilities (server option presets, Recover/RequestID/CORS/Admission middleware, bind-exposure warnings) with extensive unit tests.
  • Rewire inspect + JSON-RPC servers to use the standardized middleware chain, body-size enforcement, and configurable CORS/admission settings; make per-app inspect concurrency fail fast.
  • Update integration log scanning to support level-scoped expected logs; remove legacy internal/services HTTP/CORS helpers; add operator guide (docs/http-posture.md).

Reviewed changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| test/integration/snapshot_policy_test.go | Switch integration allowlist to level-scoped expected log entries. |
| test/integration/restart_test.go | Switch integration allowlist to level-scoped expected log entries. |
| test/integration/logscanner_test.go | Replace “allowed errors” with level-scoped expected logs and stricter matching. |
| test/integration/echo_authority_test.go | Update expectations: 404 inspect logs at Info, not Error. |
| pkg/service/telemetry_test.go | Add coverage for hardened telemetry server wiring and panic recovery. |
| pkg/service/service.go | Update telemetry server creation/start semantics and wiring. |
| pkg/service/http_server_test.go | Add unit tests for HTTP server presets + bind warning + internal error writer. |
| pkg/service/http_server.go | Introduce HTTP server option presets, server builder, bind warnings, internal error response helper. |
| pkg/service/http_middleware_test.go | Add tests for RequestID + Recover middleware + writer wrapper behavior. |
| pkg/service/http_middleware.go | Implement RequestID + Recover middleware and response-writer wrapper. |
| pkg/service/http_cors_test.go | Add tests for CORS parsing and middleware behavior (preflight, vary, unwrap chain). |
| pkg/service/http_cors.go | Implement CORS config parsing + strict origin reflection + preflight short-circuiting. |
| pkg/service/http_admission_test.go | Add tests for semaphore admission + middleware rejection semantics. |
| pkg/service/http_admission.go | Implement concurrency admission control with jittered Retry-After. |
| internal/services/http_test.go | Delete legacy HTTP service test helper. |
| internal/services/http.go | Delete legacy permissive CORS middleware and HTTP service wrapper. |
| internal/manager/instance_test.go | Update inspect-capacity test to match fail-fast semantics. |
| internal/manager/instance.go | Change per-app inspect concurrency to fail fast with a typed sentinel error. |
| internal/jsonrpc/util_test.go | Allow tests to configure JSON-RPC max-inflight/CORS via config. |
| internal/jsonrpc/service_test.go | Add tests for hardened JSON-RPC server options, RequestID, admission, CORS, body limit. |
| internal/jsonrpc/service.go | Rewire JSON-RPC server with standard middleware chain + NewHTTPServer + configurable CORS/admission. |
| internal/jsonrpc/jsonrpc.go | Return 413 for MaxBytesReader overflow. |
| internal/inspect/inspect_test.go | Replace legacy HTTP test harness with httptest.Server. |
| internal/inspect/inspect.go | Rework inspector construction to use hardened server/middleware chain; enforce body cap with MaxBytesReader; add per-request deadlines and fail-fast capacity handling. |
| internal/inspect/hardening_test.go | Add comprehensive inspect hardening tests (413, 405+Allow, generic 500, chain order, CORS/admission). |
| internal/config/generated.go | Add config env vars + defaults for inspect/jsonrpc CORS and max in-flight; propagate through NodeConfig conversions. |
| internal/config/generate/code.go | Fix config getter generation for string defaults of "" (use viper.IsSet). |
| internal/config/generate/Config.toml | Define new HTTP hardening env vars and descriptions. |
| internal/advancer/service.go | Construct and run inspector via new API; wire admission + CORS config; shutdown via inspector API. |
| go.mod | Promote github.com/google/uuid to direct dependency. |
| docs/http-posture.md | Add operator deployment guide describing HTTP posture, timeouts, admission, CORS, and error semantics. |

Comment thread internal/inspect/hardening_test.go Outdated
Comment thread docs/http-posture.md Outdated
Comment thread pkg/service/http_admission.go Outdated
Comment thread internal/jsonrpc/service_test.go Outdated
@vfusco vfusco force-pushed the feature/http-hardening branch from bafea96 to 6a0f5d4 Compare April 17, 2026 16:07
@vfusco vfusco marked this pull request as ready for review April 17, 2026 16:09
mpolitzer previously approved these changes Apr 22, 2026
@github-project-automation github-project-automation Bot moved this from Todo to Waiting Merge in Rollups SDK Apr 22, 2026
renatomaia previously approved these changes Apr 22, 2026
Comment thread internal/inspect/hardening_test.go Outdated
Comment thread internal/jsonrpc/service.go Outdated
@vfusco vfusco dismissed stale reviews from renatomaia and mpolitzer via 0a37a24 April 22, 2026 19:10
@vfusco vfusco force-pushed the feature/http-hardening branch from 0a37a24 to 95a9ecb Compare April 22, 2026 20:12
@vfusco vfusco merged commit 95a9ecb into next/2.0 Apr 22, 2026
8 checks passed
@vfusco vfusco deleted the feature/http-hardening branch April 22, 2026 20:40
@github-project-automation github-project-automation Bot moved this from Waiting Merge to Done in Rollups SDK Apr 22, 2026