WybeCoder: Verified Imperative Code Generation

WybeCoder is an agentic code verification framework that enables prove-as-you-generate development where code, invariants, and proofs co-evolve. It combines automatic verification condition generation and SMT solvers (cvc5) with interactive proofs in Lean 4, operating on Velvet — a Dafny-like imperative language embedded in Lean 4 via the Loom framework.
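To give a feel for what "code and invariants co-evolve" means, here is a minimal sketch in plain Python rather than Velvet syntax. The function `sum_to` and its assertions are illustrative assumptions, not taken from the repository. The precondition, loop invariant, and postcondition are checked at runtime here, whereas a verifier would discharge the corresponding verification conditions statically, by SMT solver or by interactive proof.

```python
def sum_to(n: int) -> int:
    """Return 0 + 1 + ... + n for n >= 0, checking the loop invariant at runtime."""
    assert n >= 0                          # precondition
    total, i = 0, 0
    while i < n:
        # Loop invariant: total == 0 + 1 + ... + i
        assert total == i * (i + 1) // 2
        i += 1
        total += i
    # The invariant plus the negated loop guard (i == n) gives the postcondition.
    assert total == n * (n + 1) // 2       # postcondition
    return total

print(sum_to(4))  # prints 10
```

If the loop body changed (say, to sum squares), the invariant and postcondition would have to change with it, which is the coupling the framework is built around.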

Named in homage to Edsger Wybe Dijkstra.

Paper: WybeCoder: Verified Imperative Code Generation

Project page: https://facebookresearch.github.io/wybecoder

Results

Benchmark	Solve Rate	Details
Verina (189 problems)	74.1%	128 proved + 12 disproved (Claude Opus 4.5, 32 turns x 16 agents)
Clever-Loom (161 problems)	62.1%	100 / 161 problems solved (Claude Opus 4.5, 32 turns x 16 agents)
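
The solve rates can be reproduced from the counts in the Details column. The sketch below assumes that the Verina rate counts both proved and disproved problems as solved, which is what the arithmetic suggests; the helper name `solve_rate` is hypothetical.

```python
def solve_rate(solved: int, total: int) -> float:
    """Percentage of problems solved, rounded to one decimal place."""
    return round(100 * solved / total, 1)

# Verina: 128 proved + 12 disproved out of 189 problems.
verina = solve_rate(128 + 12, 189)
# Clever-Loom: 100 of 161 problems solved.
clever_loom = solve_rate(100, 161)

print(verina, clever_loom)  # prints 74.1 62.1
```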

Agent Strategies

Sequential Agent — Single-agent turn-based loop with iterative refinement. Multiple independent attempts run in parallel (pass@k).
Subgoal Decomposition — Extracts verification subgoals, dispatches parallel provers, and reconstructs the full proof. Supports conflict-driven method modification across iterations.
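
As a rough illustration of the pass@k idea behind the Sequential Agent, the sketch below runs k independent attempts in parallel and counts the problem as solved if any attempt succeeds. The names `solved_with_k_attempts` and `toy_attempt` are hypothetical stand-ins, and a real attempt would be a full multi-turn agent run rather than a one-line function.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def solved_with_k_attempts(attempt: Callable[[int], bool], k: int) -> bool:
    """Run k independent attempts in parallel; succeed if any one succeeds."""
    with ThreadPoolExecutor(max_workers=k) as pool:
        results = list(pool.map(attempt, range(k)))
    return any(results)

# Hypothetical stand-in for one agent run: only attempt index 2 "verifies".
def toy_attempt(seed: int) -> bool:
    return seed == 2

print(solved_with_k_attempts(toy_attempt, k=4))  # prints True
print(solved_with_k_attempts(toy_attempt, k=2))  # prints False
```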

Installation

Python environment

micromamba create -n wybecoder python=3.12
micromamba activate wybecoder
pip install -r requirements.txt

Lean setup

lake update          # downloads Loom, loogle, and precompiled Mathlib
cd .lake/packages/Loom
lake build Loom CaseStudies Mathlib REPL
cd ../loogle
lake exe cache get && lake build LoogleMathlibCache && lake build
cd ../../..          # back to repo root

Recent versions of Loom should install the cvc5 solver automatically. If cvc5 is missing, install it manually:

macOS: brew install --cask cvc5/cvc5/cvc5
Linux: download from the cvc5 releases page
Verify: which cvc5
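
The same check as `which cvc5` can be done from Python, for example at the top of a setup script. This is a small sketch using only the standard library; the function name `find_cvc5` is an assumption and not part of the repository.

```python
import shutil
from typing import Optional

def find_cvc5() -> Optional[str]:
    """Return the path of the cvc5 executable found on PATH, or None."""
    return shutil.which("cvc5")

path = find_cvc5()
print(f"cvc5 found at {path}" if path else "cvc5 not found on PATH")
```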

Verify your setup

Run the quick-start command below with --max_repls 1 to confirm that the Lean REPL starts and the model API responds (see Quick Start).

API Keys

Set the environment variable for the model provider you plan to use:

Model prefix	Provider	Environment variable
`gemini-*`	Google Gemini	`GOOGLE_API_KEY`
`claude-*`	Llama API (hosts Claude via Vertex)	`LLAMA_API_KEY`
`gpt-*`	Azure OpenAI	`AZURE_OPENAI_API_KEY` + `AZURE_OPENAI_ENDPOINT`
`vllm-*`	Local vLLM server	None (auto-started or set `vllm_base_url`)
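
Before launching a run, it can help to confirm that the variables for the chosen model are set. The sketch below encodes the table above as a dictionary; the helper `missing_env` and the model name `gpt-example` are hypothetical, and how the repository itself resolves providers may differ.

```python
import os
from typing import Dict, List, Mapping, Tuple

# Model prefix -> environment variables listed in the table above.
REQUIRED_ENV: Dict[str, Tuple[str, ...]] = {
    "gemini-": ("GOOGLE_API_KEY",),
    "claude-": ("LLAMA_API_KEY",),
    "gpt-": ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"),
    "vllm-": (),  # local server, no key required
}

def missing_env(model: str, env: Mapping[str, str] = os.environ) -> List[str]:
    """List required variables that are unset or empty for this model name."""
    for prefix, names in REQUIRED_ENV.items():
        if model.startswith(prefix):
            return [n for n in names if not env.get(n)]
    raise ValueError(f"unrecognised model prefix: {model!r}")

# Only the key is set, so the endpoint is reported as missing.
print(missing_env("gpt-example", env={"AZURE_OPENAI_API_KEY": "x"}))
# prints ['AZURE_OPENAI_ENDPOINT']
```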

Quick Start

Run a lightweight smoke test (2 workers, 1 attempt):

export LLAMA_API_KEY=your-key        # or GOOGLE_API_KEY for Gemini configs
python -m src.worker configs/other_models/clever_linear_claude_4_5_sonnet_32t_32agents.yaml \
  --n_worker_threads 2 --n_attempts 1 --max_repls 2

You should see the message "Started successfully with config: ..." followed by HTTP 200 responses. Results are written to runs/dumps/<run_name>/trajectories/. Stop the run with Ctrl-C once you are satisfied.

View results with the trajectory viewer:

python scripts/build_viewer_data.py   # generate viewer data
python scripts/serve_viewer.py        # serve at http://localhost:8000/viewer.html

Running Full Experiments

Single node:

python -m src.worker configs/decomp.yaml

Multi-node with SLURM:

srun -N <nodes> --ntasks-per-node=1 python -u -m src.worker configs/decomp.yaml

Available Configs

Directory	Strategy	Examples
`configs/`	Decomposition / multi-agent	`decomp.yaml`, `clever_decomp.yaml`, `multi.yaml`, `sort_decomp.yaml`
`configs/other_models/`	Sequential (linear) agent	`clever_linear_claude_4_5_sonnet_32t_32agents.yaml`, `verina_linear_gemini_2_5_pro_32t_32agents.yaml`
`configs/ablation/`	Ablation studies	`a1_no_invariant.yaml`, `a2_no_fallback.yaml`, ...

Datasets

Clever-Loom (data/clever_loom.jsonl) — 161 algorithm and data structure problems from the CLEVER benchmark, manually converted and verified in Loom format.
Verina — 189 program verification problems. Convert to Loom format with python -m scripts.verina_to_loom.
Sorting (data/sorting.jsonl) — Sorting algorithm verification (Heapsort, Mergesort, Quicksort, etc.).
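
The .jsonl datasets hold one JSON object per line, so they can be inspected with the standard library alone. The sketch below only counts records and makes no assumption about field names, since the record schema is not described here; `load_jsonl` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Any, Dict, List

def load_jsonl(path: Path) -> List[Dict[str, Any]]:
    """Parse a JSON Lines file: one JSON object per non-empty line."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

dataset = Path("data/clever_loom.jsonl")
if dataset.exists():  # only runs from the repo root
    problems = load_jsonl(dataset)
    print(len(problems), "problems loaded")
```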

MCP Integration (Optional)

The agent supports the Model Context Protocol (MCP) for theorem search via Loogle and LeanExplore.

Setup

# Loogle
./src/start_loogle_server.sh       # starts on localhost:8088
python -m scripts.test_loogle

# Leanexplore
leanexplore data fetch
python -m scripts.test_leanexplore

# Test both together
python -m scripts.test_mcp

# End-to-end with the agent
python -m scripts.test_agent_with_mcp

Enable in a config:

use_mcp: true
use_loogle: true
use_leanexplore: true

See docs/mcp_architecture.md for implementation details.

Citation

@article{gloeckle2026wybecoder,
  title     = {WybeCoder: Verified Imperative Code Generation},
  author    = {Gloeckle, Fabian and Bak{\v{s}}ys, Mantas and Feher, Darius
               and Zheng, Kunhao and Hayat, Amaury and Holden, Sean B.
               and Synnaeve, Gabriel and O'Hearn, Peter},
  journal   = {Preprint},
  year      = {2026}
}

License

This project is licensed under CC-BY-NC 4.0. See LICENSE for details.
