Cacheon (SN14)

Inference optimization. Fastest server wins.

Website | Docs | Discord | TAO.app


Cacheon is a Bittensor subnet (SN14) that runs an open competition for production-grade LLM inference optimization. Miners submit containerized inference servers. Validators evaluate them against a vLLM baseline on the same hardware. The fastest correct server takes all emission.

V1 arena: Qwen2.5-72B-Instruct on 4x H200 or equivalent GPUs. Beat the pinned vLLM baseline on TTFT and throughput while passing a greedy-decoding correctness gate.

How It Works

  1. Miners build an inference server, package it as a Docker image, and then commit the image reference and image digest on-chain.
  2. Validators scan the chain for new commitments, pull the image, and run it with model weights mounted at /models.
  3. Scoring measures TTFT and throughput improvement over the vLLM baseline. Correctness is checked first -- fail it and the score is zero.
  4. The fastest correct server becomes king and earns all subnet emission until someone beats it.
  5. Challengers must exceed the king's score by a small decaying margin (~1% at crowning, decaying to 0 over ~7 days) to prevent noise-driven churn.

Score formula:

if not correctness_pass:
    score = 0.0
else:
    ttft_imp = max(0, (baseline_ttft - miner_ttft) / baseline_ttft)
    tps_imp  = max(0, (miner_tps  - baseline_tps)  / baseline_tps)
    score = 0.5 * ttft_imp + 0.5 * tps_imp
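
The decaying challenge margin from step 5 can be sketched as follows. This is an illustrative sketch only: the README describes the margin as "~1% at crowning, decaying to 0 over ~7 days", so the linear schedule and the exact constants here are assumptions, not the subnet's canonical implementation.

```python
SECONDS_PER_DAY = 86_400
INITIAL_MARGIN = 0.01  # ~1% margin at the moment of crowning (assumed)
DECAY_DAYS = 7         # margin assumed to decay linearly to 0 over ~7 days

def required_score(king_score: float, seconds_since_crowning: float) -> float:
    """Score a challenger must exceed to dethrone the current king."""
    elapsed_days = seconds_since_crowning / SECONDS_PER_DAY
    margin = INITIAL_MARGIN * max(0.0, 1.0 - elapsed_days / DECAY_DAYS)
    return king_score * (1.0 + margin)
```

Right after crowning a challenger needs ~1% more than the king; a week later, any strictly better score suffices, which prevents noise-driven churn early without entrenching the king forever.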

For Miners

Build an inference server that serves Qwen2.5-72B-Instruct via /v1/chat/completions with streaming and logprobs. Package it as a Docker image (maximum 20 GB; model weights are mounted at runtime, not baked into the image). Push it to a public registry and commit on-chain.

Requirements: public container registry, Bittensor wallet registered on SN14. GPU hardware is only needed for local testing.
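
Before committing on-chain, it is worth smoke-testing your server locally against the contract above. A minimal sketch of the request body and stream handling, assuming the OpenAI-style `/v1/chat/completions` schema with server-sent events; the helper names and default model string here are ours, not part of the repo:

```python
import json

def build_request(prompt: str, model: str = "Qwen2.5-72B-Instruct") -> dict:
    """Request body exercising both streaming and logprobs, per the contract."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
        "logprobs": True,
        "max_tokens": 64,
    }

def parse_sse_line(line: str):
    """Decode one server-sent-events line from a streaming response.

    Returns the parsed JSON chunk, or None for blank lines and the
    [DONE] sentinel that ends the stream.
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)
```

With your server listening locally, POST `build_request(...)` to its `/v1/chat/completions` endpoint and feed each response line through `parse_sse_line`, checking that deltas stream incrementally and that logprobs are present.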

# Push your image
docker tag my-server:latest docker.io/myuser/cacheon-miner:v1
docker push docker.io/myuser/cacheon-miner:v1

# Commit on-chain (one shot per hotkey -- test locally first)
python miner/commit.py \
  --wallet-name <wallet> \
  --wallet-hotkey <hotkey> \
  --image "docker.io/myuser/cacheon-miner:v1" \
  --digest "sha256:..." \
  --network finney \
  --netuid 14
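
The `--digest` value must be the image's content digest, not a mutable tag. After pushing, `docker inspect --format '{{index .RepoDigests 0}}'` prints a `registry/repo@sha256:...` string; a small helper to pull the digest out of it (the helper name is ours, for illustration):

```python
def digest_from_repo_digest(repo_digest: str) -> str:
    """Extract 'sha256:...' from a 'registry/repo@sha256:...' string."""
    repo, sep, digest = repo_digest.partition("@")
    if not sep or not digest.startswith("sha256:"):
        raise ValueError(f"not a repo digest: {repo_digest!r}")
    return digest
```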

Full guide: cacheon.ai/docs/miners/overview

For Validators

The validator has two components: an always-on CPU host (chain scanning, weight setting) and an ephemeral GPU pod (eval). The GPU pod is rented on-demand only when challengers are queued.

GPU requirements: NVLink/SXM interconnect, 4x H200 or equivalent, 400 GB storage, model weights at /workspace/models/Qwen2.5-72B-Instruct.

# CPU host (always-on)
git clone https://github.com/latent-to/cacheon
cd cacheon
cp .env.example .env   # add wallet and S3 config
docker compose up --build

# GPU pod (on-demand, run when challengers appear)
bash scripts/gpu_setup/setup.sh
docker compose -f docker-compose.gpu.yml up --build

Full guide: cacheon.ai/docs/validators/overview

Documentation

|            | Miners       | Validators     | Evaluation |
|------------|--------------|----------------|------------|
| Start here | Overview     | Overview       | Scoring    |
| Reference  | API contract | Architecture   | Harness    |
| Setup      | Quickstart   | GPU pod setup  | Prompts    |
| Rules      | Rules        | CPU host setup | Roadmap    |
License

MIT
