AI Cloud OS — Native Go model routing, IAM-integrated auth, usage tracking, and KMS secrets management. Zero middlemen, pure performance.
Hanzo Cloud implements the ZAP (Zero-overhead API Protocol) — a native Go model routing layer that connects users directly to upstream AI providers with no intermediaries.
```
User → cloud-api (Go, ZAP gateway) → upstream providers (DO-AI, Fireworks, OpenAI)
              ↕                                  ↕
          Hanzo IAM                          Hanzo KMS
   (auth, billing, usage)             (multi-tenant secrets)
```
| Component | Description | Technology |
|---|---|---|
| Gateway | ZAP-native model routing, auth, billing | Go + Beego |
| Frontend | Admin UI, chat, knowledge base | React + Next.js |
| IAM | Identity, API keys (hk-), usage tracking | hanzoai/iam |
| KMS | Multi-tenant secrets, org-scoped projects | hanzoai/kms |
| Engine | Local inference (mistral.rs fork) | hanzoai/engine |
ZAP defines the fast native path from API gateway to AI inference:
- OpenAI-compatible JSON over HTTP (`/v1/chat/completions`, `/v1/models`)
- Three auth modes: IAM API key (`hk-*`), JWT (hanzo.id OAuth), provider key (`sk-*`)
- Static model routing — 66+ models mapped to 3 upstream providers in pure Go
- Per-request usage tracking — async fire-and-forget to IAM
- KMS-resolved secrets — provider API keys from Infisical with org-scoped projects
- Zero Python — no legacy proxy middleware, no extra hops
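The three auth modes above can be sketched as a prefix dispatch in Go. This is a hypothetical illustration — `classifyToken` and the `AuthMode` names are invented here, and the real gateway additionally validates `hk-` keys against IAM and verifies JWT signatures:

```go
package main

import (
	"fmt"
	"strings"
)

// AuthMode identifies which of the three ZAP auth modes a bearer token uses.
type AuthMode string

const (
	AuthIAMKey      AuthMode = "iam-api-key"  // hk-* keys, validated against Hanzo IAM
	AuthProviderKey AuthMode = "provider-key" // sk-* keys, passed through to the upstream provider
	AuthJWT         AuthMode = "jwt"          // everything else is treated as a hanzo.id OAuth JWT
)

// classifyToken dispatches on the token prefix before any full validation.
func classifyToken(token string) AuthMode {
	switch {
	case strings.HasPrefix(token, "hk-"):
		return AuthIAMKey
	case strings.HasPrefix(token, "sk-"):
		return AuthProviderKey
	default:
		return AuthJWT
	}
}

func main() {
	fmt.Println(classifyToken("hk-abc123")) // iam-api-key
}
```

Prefix dispatch keeps the hot path allocation-free; the expensive IAM round-trip happens only after the cheap classification.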
Supported models include:

- gpt-4o, gpt-5, gpt-5-mini, claude-opus-4-6, claude-sonnet-4-5, claude-haiku-4-5, o3, o3-mini, qwen3-32b, deepseek-r1-distill-70b, llama-3.3-70b, and more.
- fireworks/deepseek-r1, fireworks/deepseek-v3, fireworks/kimi-k2, fireworks/qwen3-235b-a22b, fireworks/qwen3-coder-480b, fireworks/cogito-671b, and more.
- openai-direct/gpt-4o, openai-direct/gpt-5, openai-direct/o3, openai-direct/o3-mini, openai-direct/gpt-4o-mini
- zen4-mini, zen4-pro, zen4-max, zen4-ultra, zen4-coder-flash, zen4-coder-pro, zen-vl, zen3-omni

Full model list: `GET /api/models`
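The static model routing could look like the following minimal Go sketch. This is a tiny illustrative excerpt, not the production table: the provider slugs and the `resolve` helper are invented here, and the real gateway maps 66+ models onto 3 upstream providers:

```go
package main

import "fmt"

// route describes one upstream target. Names are illustrative only.
type route struct {
	Provider string // upstream provider slug
	Upstream string // model name sent to that provider
}

// routes is a compile-time map, so lookup is a single hash probe in pure Go —
// no config reload, no Python proxy hop.
var routes = map[string]route{
	"gpt-4o":                {"do-ai", "gpt-4o"},
	"fireworks/deepseek-r1": {"fireworks", "deepseek-r1"},
	"openai-direct/gpt-4o":  {"openai", "gpt-4o"},
}

// resolve returns the upstream route for a requested model, if mapped.
func resolve(model string) (route, bool) {
	r, ok := routes[model]
	return r, ok
}

func main() {
	if r, ok := resolve("gpt-4o"); ok {
		fmt.Printf("%s → %s\n", r.Provider, r.Upstream)
	}
}
```

Unknown models simply fail the lookup, which lets the gateway return a 404 before spending anything on auth or upstream calls.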
```shell
# Build
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags="-w -s" -o cloud-api-server .

# Run
./cloud-api-server

# Test
curl -H "Authorization: Bearer hk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  https://api.hanzo.ai/v1/chat/completions \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'
```

Configuration is set via conf/app.conf or environment variables:
| Variable | Description |
|---|---|
| HANZO_API_KEY | Unified service token for internal IAM + KMS operations (billing/usage/auth support) |
| iamEndpoint | IAM service URL (production: http://iam.hanzo.svc.cluster.local:8000) |
| clientId | IAM OAuth client ID for cloud |
| clientSecret | IAM OAuth client secret for cloud |
| dataSourceName | Database DSN (do not commit; inject via KMS-managed secret) |
| KMS_CLIENT_ID | Infisical Universal Auth client ID |
| KMS_CLIENT_SECRET | Infisical Universal Auth client secret |
| KMS_PROJECT_ID | Default KMS project ID |
| KMS_ENVIRONMENT | KMS environment (default: production) |
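A hypothetical conf/app.conf fragment wiring the variables above (Beego-style key = value; every value below is a placeholder — real secrets should be injected via KMS or the environment, never committed):

```ini
# conf/app.conf — illustrative placeholders only; never commit real secrets
appname = cloud-api
HANZO_API_KEY = <injected-at-deploy>
iamEndpoint = http://iam.hanzo.svc.cluster.local:8000
clientId = <iam-oauth-client-id>
clientSecret = <injected-at-deploy>
dataSourceName = <injected-via-kms-secret>
KMS_PROJECT_ID = <project-id>
KMS_ENVIRONMENT = production
```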
```shell
# Docker
docker pull ghcr.io/hanzoai/cloud:latest

# Kubernetes (production)
kubectl apply -f k8s/kms-secrets.yaml
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
```

The GitHub Actions production deploy (.github/workflows/deploy-production.yml) resolves deployment credentials from Hanzo KMS. The preferred setup is a single GitHub secret, HANZO_API_KEY; Universal Auth (KMS_CLIENT_ID + KMS_CLIENT_SECRET) remains as a fallback.
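One way the workflow could wire those secrets (an illustrative excerpt using standard GitHub Actions syntax — the job name, action versions, and deploy step are assumptions, not taken from this repo):

```yaml
# .github/workflows/deploy-production.yml (illustrative excerpt)
jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      # Preferred: single unified token resolved against Hanzo KMS
      HANZO_API_KEY: ${{ secrets.HANZO_API_KEY }}
      # Fallback: Infisical Universal Auth pair
      KMS_CLIENT_ID: ${{ secrets.KMS_CLIENT_ID }}
      KMS_CLIENT_SECRET: ${{ secrets.KMS_CLIENT_SECRET }}
    steps:
      - uses: actions/checkout@v4
      - run: kubectl apply -f k8s/deployment.yaml  # assumes cluster credentials are configured earlier
```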