
cli-bridge

Use your local coding-CLI subscriptions as one OpenAI-compatible HTTP API — with persistent session resume.

Model ids are <harness>/<model>. The harness is the agent runtime (Claude Code, Codex, opencode, claudish, …); the model is whatever that runtime can address. One string, both choices explicit.

Backends this is built for (✓ implemented, ◦ stubbed):

  Harness       What it is
  claude/       Claude Code CLI — your Claude Max / Pro subscription
  claudish/     Claude Code + claudish — Claude's workflow, a different brain
  codex/        OpenAI Codex CLI — your ChatGPT Plus/Pro subscription
  opencode/     opencode — multi-provider; the vehicle for Kimi Code via the opencode-kimi-full plugin
  factory/      Factory Droid
  amp/          Sourcegraph Amp
  forge/        Forge Code
  <provider>/   Passthrough: openai/, anthropic/, moonshot/, zai/ — direct vendor API, not a CLI

Personal productivity tool. Single-user by default. Loopback-only by default. No ambition to be a shared proxy.


The idea in two sentences

Every AI lab now ships a CLI, each with its own subscription and better session economics than the metered API. cli-bridge exposes all of them as one OpenAI-compatible endpoint so your tools (editor, aider, tangle-router, a bash script) can switch harnesses with a single string.

Model id scheme

<harness>/<model>

claude/sonnet                        # Claude Code + Anthropic Sonnet
claude/opus                          # Claude Code + Anthropic Opus
claude/claude-sonnet-4-5-20250929    # Claude Code + specific version

claudish/openrouter@deepseek/deepseek-r1   # Claude Code workflow, DeepSeek brain
claudish/google@gemini-2.0-flash           # Claude Code workflow, Gemini brain
claudish/zai@glm-4.6                       # Claude Code workflow, Z.AI brain

codex/gpt-5-codex                    # Codex CLI, Codex model
opencode/kimi-for-coding             # opencode + kimi-full plugin (Kimi Code sub)
opencode/anthropic/claude-sonnet-4-5 # opencode's configured anthropic provider

openai/gpt-4o                        # passthrough — OpenAI API, metered
zai/glm-4.6                          # passthrough — Z.AI API, metered

The registry matches on the <harness>/ prefix; first-registered-first-match wins. A harness id with no model (e.g. bare claude) defaults to that harness's default model.
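As a sketch of the matching rule above (hypothetical code, not the actual src/server.ts registry):

```typescript
// Hedged sketch of first-registered-first-match prefix routing.
// Entry and resolve are invented names; the real registry API may differ.
interface Entry { prefix: string; defaultModel: string }

function resolve(registry: Entry[], id: string): { prefix: string; model: string } | undefined {
  for (const e of registry) {
    const bare = e.prefix.replace(/\/$/, "");
    if (id === bare) return { prefix: e.prefix, model: e.defaultModel }; // bare harness id -> harness default
    if (id.startsWith(e.prefix)) return { prefix: e.prefix, model: id.slice(e.prefix.length) };
  }
  return undefined; // no registered harness claims this id
}
```

Because matching is first-registered-first-match, registration order is the tiebreaker if two prefixes could ever overlap.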

Through tangle-router

When reaching cli-bridge via tangle-router's /api/chat, prefix the whole thing with bridge/:

bridge/claude/sonnet
bridge/claudish/openrouter@deepseek/deepseek-r1
bridge/opencode/kimi-for-coding

The router's short-circuit strips the leading bridge/ and forwards the <harness>/<model> as-is.
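The strip-and-forward step could look like this (assumed behavior, not tangle-router's actual code):

```typescript
// Hypothetical sketch of the router's short-circuit: drop a leading
// "bridge/" and pass the remaining <harness>/<model> through untouched.
function stripBridgePrefix(id: string): string {
  return id.startsWith("bridge/") ? id.slice("bridge/".length) : id;
}
```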

Install

git clone https://github.com/drewstone/cli-bridge.git
cd cli-bridge
pnpm install
cp .env.example .env
# edit .env to taste
pnpm verify   # probes each configured backend, reports ready/unavailable
pnpm start
# → http://127.0.0.1:3344  (was 8787; changed to dodge port collisions)

Prereqs: Node 22+. For each backend you want enabled, install + log in on the host:

  • claude — install: npm i -g @anthropic-ai/claude-code; auth: claude /login (OAuth, opens browser)
  • claudish — install: claude above, plus run claudish locally and point CLAUDISH_URL at it; auth: claudish's own provider config
  • codex — install: brew install openai/homebrew-tap/codex; auth: codex login
  • opencode — install: brew install sst/tap/opencode (plus the opencode-kimi-full plugin for Kimi Code); auth: opencode login
  • passthrough — install: (none); auth: provider API keys in .env

Quick test

curl -N http://127.0.0.1:3344/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'X-Session-Id: my-first-session' \
  -d '{
    "model": "claude/sonnet",
    "messages": [{"role": "user", "content": "say hi in 3 words"}],
    "stream": true
  }'

Subsequent calls with the same X-Session-Id resume the conversation: Claude Code sees the prior context and doesn't re-bill you for replaying it.
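A minimal client-side sketch of pinning a session, assuming only the documented X-Session-Id header (buildChatRequest is an invented helper, not part of cli-bridge):

```typescript
// Build resumable requests against the bridge: two requests that carry the
// same X-Session-Id resume the same underlying CLI conversation.
interface ChatRequest { url: string; headers: Record<string, string>; body: string }

function buildChatRequest(sessionId: string, content: string): ChatRequest {
  return {
    url: "http://127.0.0.1:3344/v1/chat/completions",
    headers: { "Content-Type": "application/json", "X-Session-Id": sessionId },
    body: JSON.stringify({
      model: "claude/sonnet",
      messages: [{ role: "user", content }],
      stream: true,
    }),
  };
}

// First turn and follow-up share a session id, so the second turn resumes.
const first = buildChatRequest("my-first-session", "say hi in 3 words");
const followUp = buildChatRequest("my-first-session", "now expand on that");
```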

API

POST /v1/chat/completions

OpenAI Chat Completions. Model id routes via harness prefix. Supports streaming (default) or stream: false. Session resume via session_id body field or X-Session-Id header.

Extra fields this bridge accepts beyond vanilla OpenAI:

  • cwd: persist a working directory for the session and run future resumed turns there
  • agent_profile: full AgentProfile object

Behavior:

  • sandbox backends honor the full agent_profile natively
  • local harness backends (claude-code, codex, kimi-code) persist the full profile, honor the executable subset directly where possible, and compile the remaining context into a deterministic system-prompt preamble
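The preamble step for local harnesses could be sketched like this; the field names follow the README's agent_profile example, but the compilation logic itself is an assumption:

```typescript
// Hedged sketch: compile the non-executable parts of an agent_profile into a
// deterministic system-prompt preamble (same input always yields the same text).
interface AgentProfile {
  name: string;
  prompt?: { systemPrompt?: string };
  skills?: string[];
}

function compilePreamble(p: AgentProfile): string {
  const lines = [`[agent-profile: ${p.name}]`];
  if (p.prompt?.systemPrompt) lines.push(p.prompt.systemPrompt);
  if (p.skills?.length) lines.push(`Skills: ${[...p.skills].sort().join(", ")}`); // sorted for determinism
  return lines.join("\n");
}
```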

Example:

curl http://127.0.0.1:3344/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer '"$BRIDGE_BEARER" \
  -d '{
    "model": "codex/gpt-5.4-mini",
    "session_id": "agent-builder-local",
    "cwd": "/Users/drew/webb/agent-builder",
    "agent_profile": {
      "name": "local-coder",
      "prompt": { "systemPrompt": "Be surgical. No placeholder logic." },
      "skills": ["critical-audit"]
    },
    "messages": [{ "role": "user", "content": "inspect the repo and propose the smallest viable fix" }],
    "stream": false
  }'

GET /v1/models

Lists the model ids each ready backend claims, along with the harness that serves each.

GET /health

JSON report per backend — ready / unavailable / error with detail.

GET /v1/sessions · DELETE /v1/sessions/:id

Inspect / clear external-to-internal session mappings.

Claudish setup

Claudish is a separate tool (Hono-based Anthropic proxy). Run it locally:

brew install claudish   # or install-from-source per its repo
claudish --port 3456
# then in cli-bridge .env:
CLAUDISH_URL=http://127.0.0.1:3456
BRIDGE_BACKENDS=claude,claudish,passthrough

Now every claudish/<model> call spawns Claude Code with ANTHROPIC_BASE_URL=http://127.0.0.1:3456 — Claude Code's workflow, whatever-you-configured's brain.

Use with Tangle products

VerticalBench — swap claude -p subprocess calls for HTTP to cli-bridge with X-Session-Id: leaf-<id>. Durable session state across runs, no re-billing replays.

Agent Builder dev — set BYOK_CLI_ENDPOINT=http://host.docker.internal:3344 in .dev.vars. Forge drives your Claude Code subscription locally during development. Never ship that to production.

PR reviews & automations — any bash cron / GitHub Action can hit POST /v1/chat/completions with a stable X-Session-Id.

Parallel mode (Docker pool)

Default behavior spawns the CLI on the host. That's fine for one caller; under N concurrent chat() calls you hit:

  • shared ~/.claude (or ~/.kimi, ~/.codex, ~/.config/opencode) OAuth state
  • shared scratch dirs (multiple CLI processes touching the same tmp)
  • single CLI subprocess instance contending with itself

The Docker executor solves all three: each chat() runs inside a pre-warmed container slot, and session_id sticks the same caller to the same slot so --resume reads the same on-disk transcript turn-to-turn. Works for every subprocess backend — claude, kimi, codex, opencode — through the same Spawner abstraction.
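The slot-stickiness rule can be sketched as a tiny pool (SlotPool is an invented name; the real Spawner abstraction is not shown here):

```typescript
// Hedged sketch: map session_id -> pool slot so --resume always reads the
// same on-disk transcript. New sessions are assigned round-robin.
class SlotPool {
  private bySession = new Map<string, number>();
  private next = 0;
  constructor(private size: number) {}

  acquire(sessionId: string): number {
    let slot = this.bySession.get(sessionId);
    if (slot === undefined) {
      slot = this.next % this.size; // round-robin for first-seen sessions
      this.next++;
      this.bySession.set(sessionId, slot);
    }
    return slot; // same session always lands on the same slot
  }
}
```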

# 1. build the unified runtime image once (has all four CLIs installed)
docker build -f docker/Dockerfile.cli-runtime -t cli-bridge-cli-runtime:latest .

# 2. enable per backend (any subset)
cat >> .env <<'EOF'
CLAUDE_EXECUTOR=docker
CLAUDE_DOCKER_POOL_SIZE=4
KIMI_EXECUTOR=docker
KIMI_DOCKER_POOL_SIZE=2
CODEX_EXECUTOR=host
OPENCODE_EXECUTOR=host
EOF

# Or flip everything at once:
# echo 'BRIDGE_DEFAULT_EXECUTOR=docker' >> .env

# 3. start as usual
pnpm start
# [cli-bridge] claude executor: docker pool size=4 image=cli-bridge-cli-runtime:latest
# [cli-bridge] kimi   executor: docker pool size=2 image=cli-bridge-cli-runtime:latest

OAuth mount modes:

  • share (default) — bind-mounts host ~/.claude (etc) into every slot. Simplest; concurrent token-refresh can race on the same session DB.
  • per-slot — each slot gets its own named docker volume. Full OAuth isolation; one <cli> /login per slot on first run.
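The two modes differ only in what gets mounted into each slot; a hedged sketch with invented paths and volume names (not the bridge's actual flags):

```typescript
// share: every slot bind-mounts the same host config dir (simple, can race).
// per-slot: each slot gets its own named volume (isolated, needs one login each).
function mountArgs(mode: "share" | "per-slot", slot: number, hostDir: string): string[] {
  if (mode === "share") {
    return ["-v", `${hostDir}:/root/.claude`]; // shared host OAuth state
  }
  return ["-v", `cli-bridge-claude-slot${slot}:/root/.claude`]; // isolated named volume
}
```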

Topology guide

cli-bridge spawns pool containers by talking to the host docker daemon — pool slots are siblings of cli-bridge, not nested. Two shapes work:

  • cli-bridge on host (recommended for autoresearch / dev). The bridge runs as pnpm start; pool containers spawn directly via the host docker daemon. Callers (orchestrators, evals) hit 127.0.0.1:3344.

  • cli-bridge in a container (deployment). The compose stack bind-mounts /var/run/docker.sock so the bridge can drive the host daemon to spawn pool slots as siblings on the host. Set <NAME>_DOCKER_HOST_CONFIG_DIR to a HOST path (not a path inside the bridge container) — the daemon resolves binds against the host fs.

Either way, an orchestrator running in its own container hits the bridge at host.docker.internal:3344 (Docker Desktop) or the bridge gateway IP (Linux). No DinD anywhere.

Deploy

See deploy/README.md for Hetzner box (Docker or systemd). Remote deploy requires BRIDGE_BEARER — cli-bridge refuses to bind non-loopback without one.

Design notes

  • Explicit in the model id, not the env. The <harness>/<model> scheme means "what you type is what runs." No mode toggles that change behavior under the same id.
  • Harnesses are independent. Each is a class implementing Backend; add a new one in src/backends/*.ts, register it in src/server.ts.
  • Single-user assumption is deliberate. No per-call user auth beyond the optional bearer.

License

MIT
