A multi-agent orchestration system for VS Code Copilot. ControlFlow coordinates 13 specialized agents under deterministic P.A.R.T contracts (Prompt → Archive → Resources → Tools), structured text outputs, and layered reliability gates.
- Why ControlFlow?
- Quick Start
- When to Use Which Agent
- Pipeline by Complexity
- Orchestration State Machine
- Failure Routing
- Agent Architecture
- Evaluation Suite
- Project Structure
- Documentation
- Installation
- License
| | Single Agent | ControlFlow (13 agents) |
|---|---|---|
| Planning | Agent guesses architecture on-the-fly | Planner runs structured idea interview, produces phased plan with Mermaid diagrams |
| Quality gates | None | PlanAuditor + AssumptionVerifier + ExecutabilityVerifier audit before implementation |
| Execution | Sequential, monolithic | Wave-based parallel execution with inter-phase contracts |
| Failures | Silent or catastrophic | Classified (transient/fixable/needs_replan/escalate) with automatic retry routing |
| Scope drift | Common | LLM Behavior Guidelines enforce surgical changes |
| Verification | Manual | Offline eval suite + CodeReviewer gates every phase |
# 1. Clone
git clone https://github.com/Smithbox-ai/ControlFlow.git
# 2. Copy to your VS Code prompts directory (or symlink)
# Windows: %APPDATA%\Code\User\prompts
# macOS: ~/Library/Application Support/Code/User/prompts
# Linux: ~/.config/Code/User/prompts
# 3. Enable in VS Code settings:
# { "chat.customAgentInSubagent.enabled": true,
# "github.copilot.chat.responsesApiReasoningEffort": "high" }
# 4. Reload VS Code → type @Planner in Copilot Chat
# 5. Verify evals
cd evals && npm install && npm test

First task? Type `@Planner "Add OAuth login with Google"` — the system handles the rest.
| Scenario | Agent | What happens |
|---|---|---|
| Abstract idea or vague goal | `@Planner` | Idea interview → phased plan → Mermaid diagram |
| Detailed task, clear requirements | `@Orchestrator` | Dispatches subagents → verification gates → phase-by-phase execution |
| Research question | `@Researcher` | Evidence-based investigation with confidence scores |
| Quick codebase exploration | `@CodeMapper` | Read-only discovery — files, dependencies, entry points |
Typical workflow: @Planner authors a plan → you approve → @Orchestrator executes it with full subagent coordination, review gates, and approvals.
| Tier | Scope | Review Agents | Max Iterations |
|---|---|---|---|
| TRIVIAL | 1–2 files, single concern | None (CodeReviewer still runs per-phase) | — |
| SMALL | 3–5 files, single domain | PlanAuditor | 2 |
| MEDIUM | 6–15 files, cross-domain | PlanAuditor + AssumptionVerifier | 5 |
| LARGE | 15+ files, system-wide | PlanAuditor + AssumptionVerifier + ExecutabilityVerifier | 5 |
Any plan with an unresolved HIGH-impact risk_review entry forces the full pipeline regardless of tier.
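
The tier table and the HIGH-impact override can be read as a small lookup. The sketch below is illustrative only, not ControlFlow's actual implementation: the `RiskEntry` shape, the `LOW`/`MEDIUM` impact values, and the function name are assumptions, and "full pipeline" is interpreted here as the LARGE row.

```typescript
// Hypothetical types; tier names and reviewer lists mirror the table above.
type ComplexityTier = "TRIVIAL" | "SMALL" | "MEDIUM" | "LARGE";

interface ReviewPipeline {
  reviewAgents: string[];
  maxIterations: number | null; // null where the table lists no limit
}

const PIPELINE_BY_TIER: Record<ComplexityTier, ReviewPipeline> = {
  TRIVIAL: { reviewAgents: [], maxIterations: null },
  SMALL: { reviewAgents: ["PlanAuditor"], maxIterations: 2 },
  MEDIUM: { reviewAgents: ["PlanAuditor", "AssumptionVerifier"], maxIterations: 5 },
  LARGE: {
    reviewAgents: ["PlanAuditor", "AssumptionVerifier", "ExecutabilityVerifier"],
    maxIterations: 5,
  },
};

// Assumed shape of a risk_review entry; the real schema may differ.
interface RiskEntry {
  impact: "LOW" | "MEDIUM" | "HIGH";
  resolved: boolean;
}

function selectReviewPipeline(tier: ComplexityTier, riskReview: RiskEntry[]): ReviewPipeline {
  // An unresolved HIGH-impact entry overrides the tier (assumed to mean the LARGE row).
  const hasOpenHighRisk = riskReview.some((r) => r.impact === "HIGH" && !r.resolved);
  return hasOpenHighRisk ? PIPELINE_BY_TIER.LARGE : PIPELINE_BY_TIER[tier];
}
```

Under these assumptions, `selectReviewPipeline("TRIVIAL", [{ impact: "HIGH", resolved: false }])` returns the three-reviewer pipeline even though the plan touches only one or two files.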
Mermaid diagram:
stateDiagram-v2
[*] --> PLANNING
PLANNING --> WAITING_APPROVAL: plan ready
WAITING_APPROVAL --> PLAN_REVIEW: user approved
PLAN_REVIEW --> ACTING: audit passed
PLAN_REVIEW --> PLANNING: needs revision
WAITING_APPROVAL --> ACTING: trivial plan (skip review)
ACTING --> REVIEWING: phase complete
REVIEWING --> WAITING_APPROVAL: review done
WAITING_APPROVAL --> ACTING: next phase approved
WAITING_APPROVAL --> COMPLETE: all phases done
COMPLETE --> [*]
Simplified — the REJECTED transition, HIGH_RISK_APPROVAL_GATE, and ABSTAIN paths are omitted for clarity. See `Orchestrator.agent.md` for the full state machine.
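
For readers who prefer code to diagrams, the simplified state machine can be expressed as a transition table. This is a sketch under the same simplification (no REJECTED, HIGH_RISK_APPROVAL_GATE, or ABSTAIN paths); the helper names are hypothetical and do not come from the repository.

```typescript
type OrchestrationState =
  | "PLANNING"
  | "WAITING_APPROVAL"
  | "PLAN_REVIEW"
  | "ACTING"
  | "REVIEWING"
  | "COMPLETE";

// Edges copied from the simplified diagram above.
const ALLOWED_TRANSITIONS: Record<OrchestrationState, OrchestrationState[]> = {
  PLANNING: ["WAITING_APPROVAL"],
  WAITING_APPROVAL: ["PLAN_REVIEW", "ACTING", "COMPLETE"],
  PLAN_REVIEW: ["ACTING", "PLANNING"],
  ACTING: ["REVIEWING"],
  REVIEWING: ["WAITING_APPROVAL"],
  COMPLETE: [],
};

function canTransition(from: OrchestrationState, to: OrchestrationState): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Returns the index of the first illegal step, or -1 when the whole path is valid.
function firstInvalidStep(path: OrchestrationState[]): number {
  for (let i = 1; i < path.length; i++) {
    if (!canTransition(path[i - 1], path[i])) {
      return i;
    }
  }
  return -1;
}
```

A table like this makes illegal jumps easy to spot: `canTransition("PLANNING", "ACTING")` is false because every plan must pass through WAITING_APPROVAL first.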
| Classification | Action | Max Retries |
|---|---|---|
| `transient` | Retry same agent | 3 |
| `fixable` | Retry with fix hint | 1 |
| `needs_replan` | Delegate to Planner | 1 |
| `escalate` | Stop — present to user | 0 |
When any retry budget is exhausted, the phase escalates to the user with the accumulated failure evidence.
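
The routing rules above amount to a per-classification retry budget with escalation as the fallback. A minimal sketch, assuming a simple attempt counter per classification; the action strings and the `routeFailure` name are hypothetical labels, not identifiers from the repository:

```typescript
type FailureClass = "transient" | "fixable" | "needs_replan" | "escalate";

interface RoutingRule {
  action: string; // hypothetical label for the action column
  maxRetries: number;
}

// Budgets copied from the failure routing table.
const FAILURE_ROUTING: Record<FailureClass, RoutingRule> = {
  transient: { action: "retry_same_agent", maxRetries: 3 },
  fixable: { action: "retry_with_fix_hint", maxRetries: 1 },
  needs_replan: { action: "delegate_to_planner", maxRetries: 1 },
  escalate: { action: "present_to_user", maxRetries: 0 },
};

// attemptsSoFar counts retries already spent on this classification in the current phase.
function routeFailure(classification: FailureClass, attemptsSoFar: number): string {
  const rule = FAILURE_ROUTING[classification];
  // Once the budget is spent, every classification falls through to the user.
  return attemptsSoFar < rule.maxRetries ? rule.action : "present_to_user";
}
```

With this shape, `routeFailure("transient", 2)` still retries the same agent, while `routeFailure("transient", 3)` hands the phase to the user.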
Mermaid diagram:
graph TB
User((User))
subgraph Orchestration
Orchestrator[Orchestrator<br/><i>conductor & gate controller</i>]
Planner[Planner<br/><i>structured planning</i>]
end
subgraph "Adversarial Review"
PlanAuditor[PlanAuditor<br/><i>plan audit</i>]
AssumptionVerifier[AssumptionVerifier<br/><i>mirage detection</i>]
ExecutabilityVerifier[ExecutabilityVerifier<br/><i>executability check</i>]
end
subgraph Research
Researcher[Researcher<br/><i>evidence-first research</i>]
CodeMapper[CodeMapper<br/><i>codebase discovery</i>]
end
subgraph Implementation
CoreImplementer[CoreImplementer<br/><i>backend implementation</i>]
UIImplementer[UIImplementer<br/><i>frontend implementation</i>]
PlatformEngineer[PlatformEngineer<br/><i>CI/CD & infrastructure</i>]
end
subgraph Verification
CodeReviewer[CodeReviewer<br/><i>code review & safety</i>]
BrowserTester[BrowserTester<br/><i>E2E & accessibility</i>]
end
subgraph Documentation
TechnicalWriter[TechnicalWriter<br/><i>docs & diagrams</i>]
end
User -->|idea / vague goal| Planner
User -->|detailed task| Orchestrator
User -->|research question| Researcher
User -->|codebase question| CodeMapper
Planner -->|structured plan| Orchestrator
Orchestrator -->|dispatch| Research
Orchestrator -->|dispatch| Implementation
Orchestrator -->|dispatch| Verification
Orchestrator -->|dispatch| Documentation
Orchestrator -->|audit| PlanAuditor
Orchestrator -->|audit| AssumptionVerifier
Orchestrator -->|audit| ExecutabilityVerifier
style Orchestrator fill:#4A90D9,color:#fff
style Planner fill:#7B68EE,color:#fff
style PlanAuditor fill:#E74C3C,color:#fff
style AssumptionVerifier fill:#E74C3C,color:#fff
style ExecutabilityVerifier fill:#E74C3C,color:#fff
style Researcher fill:#2ECC71,color:#fff
style CodeMapper fill:#2ECC71,color:#fff
style CoreImplementer fill:#F39C12,color:#fff
style UIImplementer fill:#F39C12,color:#fff
style PlatformEngineer fill:#F39C12,color:#fff
style CodeReviewer fill:#1ABC9C,color:#fff
style BrowserTester fill:#1ABC9C,color:#fff
style TechnicalWriter fill:#9B59B6,color:#fff
| Agent | File | Role |
|---|---|---|
| Orchestrator | `Orchestrator.agent.md` | Conductor, gate controller, delegation |
| Planner | `Planner.agent.md` | Structured planning, idea interviews |

| Agent | File | Role |
|---|---|---|
| Researcher | `Researcher-subagent.agent.md` | Evidence-first research |
| CodeMapper | `CodeMapper-subagent.agent.md` | Read-only codebase discovery |
| CodeReviewer | `CodeReviewer-subagent.agent.md` | Code review and safety gates |
| PlanAuditor | `PlanAuditor-subagent.agent.md` | Adversarial plan audit |
| AssumptionVerifier | `AssumptionVerifier-subagent.agent.md` | Assumption-fact confusion detection |
| ExecutabilityVerifier | `ExecutabilityVerifier-subagent.agent.md` | Cold-start plan executability simulation |
| CoreImplementer | `CoreImplementer-subagent.agent.md` | Backend implementation |
| UIImplementer | `UIImplementer-subagent.agent.md` | Frontend implementation |
| PlatformEngineer | `PlatformEngineer-subagent.agent.md` | CI/CD, containers, infrastructure |
| TechnicalWriter | `TechnicalWriter-subagent.agent.md` | Documentation, diagrams, code-doc parity |
| BrowserTester | `BrowserTester-subagent.agent.md` | E2E browser testing, accessibility audits |
Models are resolved at runtime via governance/model-routing.json — see docs/agent-engineering/MODEL-ROUTING.md.
`cd evals && npm test` runs the full offline suite — schema compliance, reference integrity, P.A.R.T section ordering, tool grant consistency, behavioral invariants, orchestration handoff discipline, and drift detection. No live agents, no network.
See evals/README.md for pass descriptions and how to add scenarios.
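
To give a feel for what an offline check such as P.A.R.T section ordering involves, here is a rough sketch. It is not the suite's actual code: the function name is hypothetical, and it assumes each section is introduced by a level-2 Markdown heading (`## Prompt`, `## Archive`, and so on), which may not match the real agent files.

```typescript
const PART_SECTIONS = ["Prompt", "Archive", "Resources", "Tools"];

// Assumption: each section starts with a level-2 heading such as "## Prompt".
function partSectionsInOrder(markdown: string): boolean {
  const positions = PART_SECTIONS.map((section) =>
    markdown.search(new RegExp("^##\\s+" + section + "\\b", "m"))
  );
  // Every section must exist, and each must begin after the previous one.
  return positions.every(
    (pos, i) => pos !== -1 && (i === 0 || pos > positions[i - 1])
  );
}
```

Because the check only reads text, it needs no live agents and no network, which matches the offline character of the suite described above.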
├── Orchestrator.agent.md # Conductor agent
├── Planner.agent.md # Planning agent
├── *-subagent.agent.md # 11 specialized subagents
├── .github/
│ └── copilot-instructions.md # Shared agent policy (loaded by all agents)
├── schemas/ # JSON Schema contracts
├── docs/
│ ├── agent-engineering/ # Governance policies and reliability gates
│ └── tutorial-ru/ # Full Russian-language tutorial (19 chapters)
├── governance/ # Operational knobs and tool grants
├── skills/ # Reusable domain pattern library (11 patterns)
├── evals/ # Offline validation suite
│ └── scenarios/ # Eval scenario fixtures
├── plans/ # Plan artifacts and templates
└── NOTES.md # Active objective state (repo-persistent)
- docs/tutorial-en/ — full English tutorial: architecture, agents, orchestration, planning, review pipeline, schemas, governance, skills, memory, failure taxonomy, evals, case studies, exercises, glossary, FAQ.
- docs/tutorial-ru/ — the same tutorial in Russian.
- docs/agent-engineering/ — authoritative governance specs: P.A.R.T, reliability gates, clarification policy, tool routing, scoring, observability, memory architecture.
- CONTRIBUTING.md — how to add agents, schemas, eval scenarios.
- CHANGELOG.md — version history.
VS Code prompts directory:
- Windows: `%APPDATA%\Code\User\prompts`
- macOS: `~/Library/Application Support/Code/User/prompts`
- Linux: `~/.config/Code/User/prompts`

Steps:
- Clone this repository.
- Copy the entire repo contents into the prompts directory (or symlink the repo there).
- Enable custom agents in VS Code settings: `{ "chat.customAgentInSubagent.enabled": true, "github.copilot.chat.responsesApiReasoningEffort": "high" }`
- Reload VS Code.
- Verify: type `@Planner` in Copilot Chat — the agent should appear in suggestions.
- Run evals: `cd evals && npm install && npm test`
Without `.github/copilot-instructions.md`, agents will not have access to the shared failure classification, conventions, and governance references.
Create a new .agent.md file following the P.A.R.T structure (Prompt → Archive → Resources → Tools). See CONTRIBUTING.md for the full 4-step process.
MIT. Copyright (c) 2026 ControlFlow Contributors.
ControlFlow was inspired by and builds upon ideas from:
- Github-Copilot-Atlas — original multi-agent orchestration concept for VS Code Copilot.
- claude-bishx — agent engineering patterns and structured workflows.
- copilot-orchestra
- oh-my-opencode