Codex Load Balancer

Codex load balancer is a pragmatic reverse proxy and load balancer for Codex. It aggregates multiple ChatGPT auth tokens, keeps usage in memory, and selects the best token per request to avoid rate limits.

Features

Token directory scan on startup and hot reload (polling).
Usage sync at startup and every 5 minutes by default.
Load balancing with weekly limit priority and 5-hour health degradation.
Session stickiness via common headers.
Automatic failover on rate limit responses.
WebSocket upgrade proxy support.
Per-request token usage persistence (input / cached / output) to SQLite.
Built-in web dashboard for global/account usage and quota status.
Stats dashboard for internal usage.

Requirements

Go 1.25+

Build

go build -o codex-load-balancer .

Run

./codex-load-balancer \
  --api-key your-api-key \
  --data-dir ./data \
  --port 8080 \
  --sync-interval 5m \
  --sync-concurrency 8

Flags:

--api-key (required): API key for protected proxy endpoints.
--data-dir (required): Directory containing active *.json auth files.
--port (optional): Listen port. Default 8080.
--sync-interval (optional): Usage sync interval. Default 5m.
--sync-concurrency (optional): Usage sync concurrency. Default 8.

Docker Compose

Put credential *.json files in ./data, then start the service:

CLB_API_KEY=your-api-key docker compose up -d --build

By default, Compose publishes 8080:8080. Override the host port when needed:

CLB_API_KEY=your-api-key CLB_PORT=9090 docker compose up -d --build

Compose passes runtime settings through environment variables:

CLB_API_KEY (required): API key for protected proxy endpoints.
CLB_PORT (optional): Host port to publish. Default 8080.
CLB_LISTEN_PORT (optional): Container listen port. Default 8080.
CLB_DATA_DIR (optional): Container data directory. Default /app/data.
CLB_SYNC_INTERVAL (optional): Usage sync interval. Default 5m.
CLB_SYNC_CONCURRENCY (optional): Usage sync concurrency. Default 8.

Notes:

Usage sync and dashboard state are stored in data-dir/clb.db.
The service no longer reads a TOML config file.

Token File Format

Codex load balancer stores Codex credential JSON. The proxy reads .tokens.access_token, .tokens.account_id, .tokens.refresh_token, optional .tokens.id_token, and .last_refresh from each *.json file. If id_token is present, its unsigned JWT claims are used only as a local hint for user_id and email; upstream requests still use .tokens.account_id for ChatGPT-Account-ID.

Example:

{
  "auth_mode": "chatgpt",
  "last_refresh": "2026-03-30T16:00:00Z",
  "created_at": "2026-03-30T16:00:00Z",
  "tokens": {
    "id_token": "...",
    "access_token": "...",
    "refresh_token": "...",
    "account_id": "account_123"
  }
}
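As a rough illustration of the fields listed above, the following sketch decodes one credential file in Go. The type name tokenFile, the function parseTokenFile, and the rule that a file missing access_token, refresh_token, or account_id is rejected are assumptions made for this example, not the project's actual code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// tokenFile mirrors the JSON fields the proxy is documented to read.
// The Go type and field names are hypothetical.
type tokenFile struct {
	AuthMode    string    `json:"auth_mode"`
	LastRefresh time.Time `json:"last_refresh"`
	Tokens      struct {
		IDToken      string `json:"id_token,omitempty"` // optional
		AccessToken  string `json:"access_token"`
		RefreshToken string `json:"refresh_token"`
		AccountID    string `json:"account_id"`
	} `json:"tokens"`
}

// parseTokenFile decodes one credential file. Rejecting files that lack
// a required tokens.* field is an assumption of this sketch.
func parseTokenFile(raw []byte) (tokenFile, error) {
	var tf tokenFile
	if err := json.Unmarshal(raw, &tf); err != nil {
		return tokenFile{}, fmt.Errorf("decode token file: %w", err)
	}
	if tf.Tokens.AccessToken == "" || tf.Tokens.RefreshToken == "" || tf.Tokens.AccountID == "" {
		return tokenFile{}, errors.New("token file is missing a required tokens.* field")
	}
	return tf, nil
}

func main() {
	raw := []byte(`{
  "auth_mode": "chatgpt",
  "last_refresh": "2026-03-30T16:00:00Z",
  "tokens": {"access_token": "...", "refresh_token": "...", "account_id": "account_123"}
}`)
	tf, err := parseTokenFile(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(tf.Tokens.AccountID, tf.LastRefresh.Format(time.RFC3339))
}
```

Unknown keys such as created_at are ignored by encoding/json, so the example file above decodes without changes.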

Proxy Behavior

Allowed paths: /responses, /v1/responses, /models, and /v1/models only.
/v1/responses and /v1/models are normalized by stripping the /v1 prefix before the request is forwarded upstream.
Most request headers are preserved; Authorization is replaced and Accept-Encoding is removed so the proxy can inspect upstream response bodies.
For WebSocket upstream requests, Sec-WebSocket-Extensions is stripped so usage frames stay observable as plain JSON (no per-message compression).
Upstream base URL: https://chatgpt.com/backend-api/codex.
WebSocket (Upgrade: websocket) requests are proxied through the selected token.
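The path and header rules above could be sketched as follows. This is a minimal illustration, not the project's code: the function names upstreamURL and rewriteHeaders are hypothetical, and the "Bearer " scheme for the replaced Authorization header is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

const upstreamBase = "https://chatgpt.com/backend-api/codex"

// upstreamURL maps an allowed client path to its upstream URL.
// It returns false for any path outside the documented allow-list.
func upstreamURL(path string) (string, bool) {
	p := strings.TrimPrefix(path, "/v1") // /v1/responses -> /responses
	switch p {
	case "/responses", "/models":
		return upstreamBase + p, true
	}
	return "", false
}

// rewriteHeaders applies the documented header changes in place.
// The "Bearer " scheme is an assumption of this sketch.
func rewriteHeaders(h http.Header, accessToken, accountID string, websocket bool) {
	h.Set("Authorization", "Bearer "+accessToken)
	h.Set("ChatGPT-Account-ID", accountID)
	h.Del("Accept-Encoding") // keep upstream bodies uncompressed for inspection
	if websocket {
		h.Del("Sec-WebSocket-Extensions") // no per-message compression
	}
}

func main() {
	for _, p := range []string{"/v1/responses", "/models", "/v1/chat/completions"} {
		u, ok := upstreamURL(p)
		fmt.Println(p, ok, u)
	}
}
```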

Session Stickiness

If a request includes the following header, Codex load balancer binds that session to a token:

session_id

If the bound token hits a limit error, Codex load balancer unbinds and reselects.
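One way to picture the binding is a map from session_id to token, cleared when a token hits a limit. The sketch below assumes that shape; stickyTable, pick, and unbindToken are hypothetical names, and the real data structure may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// stickyTable maps a session_id header value to a token ID.
type stickyTable struct {
	mu        sync.Mutex
	bySession map[string]string
}

func newStickyTable() *stickyTable {
	return &stickyTable{bySession: make(map[string]string)}
}

// pick returns the token bound to sessionID. If there is no binding yet,
// it calls choose, stores the result, and returns it. Requests without a
// session ID are never bound.
func (s *stickyTable) pick(sessionID string, choose func() string) string {
	if sessionID == "" {
		return choose()
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	if tok, ok := s.bySession[sessionID]; ok {
		return tok
	}
	tok := choose()
	s.bySession[sessionID] = tok
	return tok
}

// unbindToken drops every session bound to tokenID, for example after
// that token returns a limit error, so the next request reselects.
func (s *stickyTable) unbindToken(tokenID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for sid, tok := range s.bySession {
		if tok == tokenID {
			delete(s.bySession, sid)
		}
	}
}

func main() {
	st := newStickyTable()
	fmt.Println(st.pick("s1", func() string { return "token-a" }))
	fmt.Println(st.pick("s1", func() string { return "token-b" })) // still bound to token-a
	st.unbindToken("token-a")
	fmt.Println(st.pick("s1", func() string { return "token-b" }))
}
```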

Load Balancing Rules

Filter out invalid, cooled down, or exhausted tokens.
Prefer higher weekly_limit.
If the top token has <30% 5-hour remaining and another token has higher 5-hour remaining, pick the healthier token.
If weekly limits tie, pick higher 5-hour remaining.
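The four rules above can be read as a filter, a sort, and one health override. The following sketch is one interpretation under stated assumptions: the token struct and its fields are hypothetical, the 5-hour remaining value is modeled as a fraction between 0 and 1, and when the top token is below 30% the sketch picks the healthiest of the remaining tokens.

```go
package main

import (
	"fmt"
	"sort"
)

// token holds the fields this sketch needs; all names are hypothetical.
type token struct {
	ID                string
	Valid             bool
	CooledDown        bool
	Exhausted         bool
	WeeklyLimit       float64 // higher is preferred
	FiveHourRemaining float64 // fraction of the 5-hour window left, 0..1
}

// selectToken applies the documented rules and reports false when no
// token is usable.
func selectToken(tokens []token) (token, bool) {
	// Rule 1: drop invalid, cooled down, or exhausted tokens.
	var usable []token
	for _, t := range tokens {
		if t.Valid && !t.CooledDown && !t.Exhausted {
			usable = append(usable, t)
		}
	}
	if len(usable) == 0 {
		return token{}, false
	}
	// Rules 2 and 4: higher weekly limit first, ties broken by 5-hour remaining.
	sort.SliceStable(usable, func(i, j int) bool {
		if usable[i].WeeklyLimit != usable[j].WeeklyLimit {
			return usable[i].WeeklyLimit > usable[j].WeeklyLimit
		}
		return usable[i].FiveHourRemaining > usable[j].FiveHourRemaining
	})
	best := usable[0]
	// Rule 3: if the top token is under 30% on the 5-hour window,
	// switch to the healthiest alternative, if one exists.
	if best.FiveHourRemaining < 0.30 {
		for _, t := range usable[1:] {
			if t.FiveHourRemaining > best.FiveHourRemaining {
				best = t
			}
		}
	}
	return best, true
}

func main() {
	picked, ok := selectToken([]token{
		{ID: "a", Valid: true, WeeklyLimit: 100, FiveHourRemaining: 0.10},
		{ID: "b", Valid: true, WeeklyLimit: 80, FiveHourRemaining: 0.90},
		{ID: "c", Valid: false, WeeklyLimit: 200, FiveHourRemaining: 1.00},
	})
	fmt.Println(picked.ID, ok)
}
```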

Rate Limit Handling

If the upstream responds with status 429, returns a Codex usage_limit_reached error, or emits a streamed Responses/WebSocket limit failure, the current token is cooled down and its sticky sessions are cleared. Non-stream requests are retried once with another token.
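For the non-stream case, the retry-once behavior might look like the sketch below. It checks only the 429 status; detecting usage_limit_reached error bodies and streamed failures is left out. The names doWithFailover, sendFunc, pick, and coolDown are hypothetical stand-ins for whatever the proxy uses internally.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

var errNoToken = errors.New("no usable token")

// sendFunc performs one upstream attempt with the given token and
// reports the HTTP status it received.
type sendFunc func(tokenID string) (status int, err error)

// doWithFailover sends the request with one token. On a 429 it cools
// that token down and retries exactly once with a different token.
// pick returns a usable token other than exclude, or false if none exists.
func doWithFailover(pick func(exclude string) (string, bool), coolDown func(string), send sendFunc) (int, error) {
	tok, ok := pick("")
	if !ok {
		return 0, errNoToken
	}
	status, err := send(tok)
	if err != nil || status != http.StatusTooManyRequests {
		return status, err
	}
	coolDown(tok) // a real proxy would also clear this token's sticky sessions
	next, ok := pick(tok)
	if !ok {
		return status, nil // nothing left to try; surface the 429
	}
	return send(next)
}

func main() {
	pick := func(exclude string) (string, bool) {
		if exclude == "token-a" {
			return "token-b", true
		}
		return "token-a", true
	}
	coolDown := func(id string) { fmt.Println("cooling down", id) }
	send := func(id string) (int, error) {
		if id == "token-a" {
			return http.StatusTooManyRequests, nil
		}
		return http.StatusOK, nil
	}
	status, err := doWithFailover(pick, coolDown, send)
	fmt.Println(status, err)
}
```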

Usage Sync

Syncs at startup and every 5 minutes by default (configurable with --sync-interval).
Uses https://chatgpt.com/backend-api/wham/usage.
Account metadata shown in the dashboard, including user_id, account_id, email, and plan_type, comes from the usage response in real time.
Per-account usage is grouped by a local identity key: user_id, then account_id. This keeps Business Team members separate when upstream returns the same account_id.
Before proxying or syncing usage, Codex load balancer refreshes stale access tokens from the stored refresh token.
On 401, Codex load balancer refreshes the token once and retries. If the token is still unauthorized during usage sync, Codex load balancer removes the credential file and evicts the token from memory.
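The refresh-once-then-evict flow during usage sync could be sketched like this. The function syncUsage and its callbacks are hypothetical, and leaving the credential in place when the refresh call itself fails is an assumption of this example rather than documented behavior.

```go
package main

import (
	"fmt"
	"net/http"
)

// syncUsage runs one usage-sync attempt for a credential.
// fetch performs the usage request and returns the HTTP status.
// refresh exchanges the stored refresh token for a new access token.
// The evict result reports whether the credential should be removed.
func syncUsage(fetch func() (int, error), refresh func() error) (evict bool, err error) {
	status, err := fetch()
	if err != nil {
		return false, err
	}
	if status != http.StatusUnauthorized {
		return false, nil
	}
	// First 401: refresh once, then retry.
	if err := refresh(); err != nil {
		return false, err // assumption: a failed refresh call does not evict
	}
	status, err = fetch()
	if err != nil {
		return false, err
	}
	// Still unauthorized after a refresh: remove the credential.
	return status == http.StatusUnauthorized, nil
}

func main() {
	attempts := 0
	fetch := func() (int, error) {
		attempts++
		if attempts == 1 {
			return http.StatusUnauthorized, nil
		}
		return http.StatusOK, nil
	}
	refresh := func() error { return nil }
	evict, err := syncUsage(fetch, refresh)
	fmt.Println(evict, err, attempts)
}
```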

Dashboard

Endpoints:

GET /stats
GET /stats/overview

Auth:

No auth on /stats* (intended for trusted internal network only).

Dashboard data:

Overview cards: today, recent_7_days, recent_30_days, and total, each with input_tokens, cached_tokens, output_tokens, and reasoning_tokens.
Trend and composition: recent_90_days, trend.windows[] for 7/30/90 day UTC buckets, and composition for cached input / non-cached input / output split.
The dashboard page currently loads its data only from /stats/overview.
Account table: account_key, user_id, account_id, email, plan_type, totals, per-account composition, and 5-hour / weekly quota usage from usage sync (/backend-api/wham/usage).
The dashboard loads Chart.js from a pinned CDN URL in web/index.html and Alpine.js from a pinned ESM CDN import in web/app.js; Tailwind CSS remains embedded from web/tailwind.css.

Logs

Codex load balancer logs structured events via log/slog to stdout.
