
Development

deepelement.ai edited this page May 9, 2026 · 2 revisions

💻 ClawCode for Development — User Guide

From prompt to production — how ClawCode accelerates your entire development workflow.



Overview

ClawCode is a terminal-native AI coding assistant that gives an AI agent a full development toolkit — file operations, shell execution, code search, browser automation, and more — all mediated through a structured agent loop with configurable safety boundaries.

From a developer's perspective, ClawCode serves five core purposes:

Purpose What It Means
Write code Generate, edit, patch, and refactor files across your project
Understand code Search, read, and analyze your codebase with architecture awareness
Execute & test Run shell commands, scripts, tests, and build tools
Plan & track Use plan/spec modes for structured development with task tracking
Integrate Interact with browsers, APIs, git, LSP diagnostics, and MCP servers

The architecture of ClawCode's development system can be understood in four layers:

┌──────────────────────────────────────────────────────────┐
│                   User Interface                          │
│   TUI (interactive)  |  CLI (non-interactive / -p flag)   │
├──────────────────────────────────────────────────────────┤
│                   Agent Loop (ReAct)                      │
│   Reasoning → Tool Selection → Execution → Observation    │
├──────────────────────────────────────────────────────────┤
│                   Tool System (20+ tools)                 │
│   Files │ Shell │ Search │ Edit │ Browser │ Desktop │ Envs│
├──────────────────────────────────────────────────────────┤
│                   Intelligence Layer                      │
│   LSP │ Code Awareness │ Git │ Plan/Spec │ Summarization  │
└──────────────────────────────────────────────────────────┘

Tool System — The Agent's Toolkit

ClawCode equips the AI agent with 20+ specialized tools covering every aspect of the software development lifecycle. Each tool has a defined schema (name, description, parameters) that the LLM uses to decide what to invoke. Tools are permission-gated — you control what the agent can do.
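As a rough illustration of the schema-plus-permission idea, a tool can be modeled as a small record that is checked against an allowlist before it runs. The `ToolSpec` class, the `is_allowed` helper, and the parameter type labels below are hypothetical names chosen for this sketch; they are not ClawCode's actual types.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical tool schema holding the three fields named above."""
    name: str
    description: str
    parameters: dict = field(default_factory=dict)  # parameter name -> type label


def is_allowed(tool: ToolSpec, allowlist: set) -> bool:
    """Permission gate: a tool may run only if its name is on the allowlist."""
    return tool.name in allowlist


view = ToolSpec(
    name="View",
    description="Read file contents with offset/limit",
    parameters={"file_path": "string", "offset": "integer", "limit": "integer"},
)
write = ToolSpec(
    name="Write",
    description="Create or overwrite a file",
    parameters={"file_path": "string", "content": "string"},
)

read_only = {"View", "Glob", "Grep"}
print(is_allowed(view, read_only))   # True
print(is_allowed(write, read_only))  # False
```

Keeping the gate separate from the schema means the same tool definition can be reused under different permission sets.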

File Operations

The agent can work with your files using four dedicated tools:

| Tool | Purpose | Key Parameters |
| --- | --- | --- |
| View | Read file contents with offset/limit | file_path, offset, limit |
| Write | Create or overwrite a file | file_path, content |
| Edit | Search-and-replace within a file | file_path, old_string, new_string |
| Patch | Apply diffs to files | file_path, diff |

Path resolution is intelligent — the tool layer:

  • Resolves stale absolute paths from older sessions
  • Recovers workspace-relative pseudo-paths (e.g., -p/clawcode/...)
  • Expands environment variables and ~ home directories
  • Validates that paths stay within the workspace boundary
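The last two steps (expansion and the workspace boundary check) could look roughly like the sketch below. The function name `resolve_in_workspace` is hypothetical, and the stale-path and pseudo-path recovery steps are not covered because the text does not describe how they work.

```python
import os
from pathlib import Path


def resolve_in_workspace(raw: str, workspace: Path) -> Path:
    """Expand variables and ~, then reject paths that leave the workspace."""
    expanded = os.path.expanduser(os.path.expandvars(raw))
    candidate = Path(expanded)
    if not candidate.is_absolute():
        candidate = workspace / candidate  # treat relative paths as workspace-relative
    resolved = candidate.resolve()         # collapses ".." and follows symlinks
    root = workspace.resolve()
    if resolved != root and root not in resolved.parents:
        raise ValueError(f"{raw!r} resolves outside the workspace")
    return resolved
```

Resolving both the candidate and the workspace root before comparing them keeps `..` segments and symlinks from slipping past the check.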

For large offsets (more than 500 lines), file reads use a binary seek instead of scanning the file line by line.

Shell & Script Execution

The Bash tool lets the agent execute shell commands directly in your terminal environment:

Feature Detail
Timeout Default 30s, configurable per-command
Working directory Inherits your project workspace
Shell detection Auto-detects cmd, PowerShell, bash, git-bash on Windows
UTF-8 output Proper encoding handling with fallback to system locale
Python fallback Graceful fallback when shell commands fail (e.g., ls on Windows)
Permission gate Destructive commands require user approval

The ExecuteCode tool provides a dual-mode execution environment:

Mode Description
kind="shell" Execute shell commands, returns {stdout, stderr, returncode} as JSON
kind="python" Run Python in a sandboxed subprocess — blocked __import__, blocked open, restricted builtins, enforced timeout

The sandbox is not a complete security boundary, but it blocks the most common escapes an automated run is likely to attempt.
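A minimal sketch of the kind="python" idea follows, assuming a child interpreter that runs the code with a small builtins allowlist. The names `run_sandboxed` and `_RUNNER`, the choice of allowed builtins, and the timeout result shape are all assumptions made for illustration; ClawCode's real sandbox may be built differently, and this sketch is likewise not a security boundary.

```python
import subprocess
import sys

# Hypothetical runner executed in the child process: it replaces __import__
# and open with a function that raises, then runs the code read from stdin.
_RUNNER = r"""
import builtins, sys

def _blocked(*args, **kwargs):
    raise PermissionError("blocked in sandbox")

allowed = ("print", "len", "range", "sum", "min", "max", "sorted",
           "str", "int", "float", "list", "dict")
safe = {name: getattr(builtins, name) for name in allowed}
safe["__import__"] = _blocked
safe["open"] = _blocked
exec(sys.stdin.read(), {"__builtins__": safe})
"""


def run_sandboxed(code: str, timeout: float = 5.0) -> dict:
    """Run code in a child interpreter and return stdout, stderr, returncode."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", _RUNNER],
            input=code, capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return {"stdout": "", "stderr": "timeout", "returncode": -1}
    return {"stdout": proc.stdout, "stderr": proc.stderr, "returncode": proc.returncode}
```

Running the code in a separate process is what makes the timeout enforceable: the parent can kill the child without affecting its own state.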

Code Search

ClawCode's search system is Rust-powered for native performance:

Tool Purpose Example
Glob Find files by pattern glob pattern="**/*.tsx" path="src/"
Grep Search file contents with regex `grep pattern="TODO

Grep output follows ripgrep-style formatting: path:line_number: matched_line. Context lines (-A, -B, -C) are supported. Results can be limited via head_limit.
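To make the output format concrete, here is a pure-Python sketch that emits matches as path:line_number: matched_line and honors a head_limit. It only illustrates the format; the real Grep tool is backed by the Rust engine, and the function signature below is an assumption.

```python
import re
from pathlib import Path
from typing import List, Optional


def grep(pattern: str, root: Path, head_limit: Optional[int] = None) -> List[str]:
    """Return matches as 'path:line_number: matched_line', capped at head_limit."""
    regex = re.compile(pattern)
    results: List[str] = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                relative = path.relative_to(root).as_posix()
                results.append(f"{relative}:{number}: {line}")
                if head_limit is not None and len(results) >= head_limit:
                    return results
    return results
```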

The native Rust engine (clawcode_performance) also provides:

  • Native AST parsing for code structure analysis
  • Native diff generation for efficient change comparison
  • Native git operations for workspace state queries
  • Native file type detection and MIME classification
  • Native text truncation and stream processing

Advanced Editing & Patching

Beyond basic file operations, the advanced tools provide:

Tool/Feature Description
Smart parameter coercion Handles non-JSON inputs (single-quoted Python dicts, loose object strings)
Alias normalization filePath → file_path, text → content, cmd → command
Leading raw prefix stripping Fixes raw={}{} prefixes sometimes emitted by proxy models
Edit SearchReplace tool — precise line-range edits using SEARCH/REPLACE pattern matching
WebFetch HTTP GET requests to fetch web content as Markdown
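Parameter coercion and alias normalization can be sketched as follows, assuming the aliases map from the loose names (filePath, text, cmd) to the canonical ones. The function name `coerce_params` and the fallback order (strict JSON first, then a Python literal) are assumptions for illustration.

```python
import ast
import json

# Alias table taken from the row above; direction is loose name -> canonical name.
ALIASES = {"filePath": "file_path", "text": "content", "cmd": "command"}


def coerce_params(raw: str) -> dict:
    """Parse tool arguments that may be JSON or a single-quoted Python dict."""
    raw = raw.strip()
    try:
        params = json.loads(raw)
    except json.JSONDecodeError:
        # literal_eval only accepts literals, so it cannot run arbitrary code
        params = ast.literal_eval(raw)
    if not isinstance(params, dict):
        raise ValueError("tool parameters must be an object")
    return {ALIASES.get(key, key): value for key, value in params.items()}


print(coerce_params("{'filePath': 'a.py', 'text': 'hi'}"))
# {'file_path': 'a.py', 'content': 'hi'}
```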

Task Tracking

The agent can maintain a persistent task list during development using two tools:

| Tool | Purpose | Persistence |
| --- | --- | --- |
| TodoWrite | Create/update a structured task list with priorities and statuses | .clawcode/todos.json |
| TodoRead | Read the current task list | .clawcode/todos.json |

Each task entry has:

  • id — Unique identifier
  • content — Task description
  • status — pending, in_progress, completed
  • priority — high, medium, low

The TUI HUD bar renders TODO progress visually (e.g., ●●●○○), giving you instant visibility into what the agent is working on.
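The task entry and the progress indicator could be modeled as in the sketch below. The `Todo` class and `progress_bar` function are hypothetical. The rule for when a dot is filled is also an assumption: this sketch fills a dot for every task that has been started (in_progress or completed), which is consistent with the end-to-end example later in this guide, where one in-progress task and two pending tasks render as ●○○.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Todo:
    id: str
    content: str
    status: str = "pending"    # pending, in_progress, completed
    priority: str = "medium"   # high, medium, low


def progress_bar(todos: List[Todo]) -> str:
    """Render one dot per task: filled once started, hollow while pending."""
    started = sum(1 for todo in todos if todo.status != "pending")
    return "●" * started + "○" * (len(todos) - started)


todos = [
    Todo("1", "Fix auth logic", status="in_progress", priority="high"),
    Todo("2", "Add unit tests"),
    Todo("3", "Run test suite"),
]
print(progress_bar(todos))  # ●○○
```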

Browser Automation

The agent can control a web browser for testing, scraping, and automation:

Action Description
browser_navigate Navigate to a URL
browser_click Click an element by index or selector
browser_type Type text into an input field
browser_scroll Scroll the page
browser_snapshot Capture the current page state (accessibility tree)
browser_vision AI vision-based page analysis
browser_back Navigate back
browser_console Read browser console output
browser_get_images Extract images from the page
browser_close Close the browser
web_search Web search (requires API key)
web_extract Extract structured content from URLs

Browser automation is backed by two providers:

  • BrowserUse — Python-based browser automation agent
  • Browserbase — Cloud browser infrastructure

Desktop Automation

For desktop-level interaction (testing GUI applications, reviewing designs):

Tool Description
Desktop Screenshot Capture full or partial screen screenshots
Desktop Mouse Click, move, drag, scroll
Desktop Keyboard Type text, press key combinations

These tools are gated — they're only available when explicitly requested, protecting against unintended desktop interaction.

Subagent Spawning

The Agent/Task tool enables hierarchical task delegation:

Feature Description
Nested agents Spawn subagents with isolated tool sets
Tool isolation Each subagent's tools are bounded by its role's tools allowlist
Auto-approval Subagents auto-approve within their tool boundaries
Iteration budget Configurable maxTurns prevents runaway execution
Plugin hooks Integration with the hook engine for lifecycle events

Execution Environments

Code execution can happen in multiple environments:

Environment Use Case
Local Direct execution on your machine (default)
Docker Containerized, isolated execution
SSH Remote server execution
Daytona Cloud development environment
Modal Serverless cloud execution
Singularity HPC container execution
Persistent Shell Long-running shell sessions with state preservation

These environments are configured via environment variables (e.g., CLAWCODE_TERMINAL_PERSISTENT, Docker/SSH connection strings).

Native Performance Engine (Rust)

For performance-critical operations, ClawCode ships a Rust native extension (clawcode_performance) built with PyO3:

Module Capability
grep High-performance regex search with ripgrep-level speed
glob Fast file pattern matching with gitignore-aware filtering
ast Language-aware AST parsing (symbol extraction, structure analysis)
diff Native unified diff generation
git Git repository status and operations
clipboard Cross-platform clipboard access
highlight Syntax highlighting for code blocks
image Image processing and analysis
text Text processing, truncation, stream operations
json_parse Fast JSON parsing
xxhash High-speed content hashing

The Rust engine is pre-built for Windows x64, macOS (arm64/x64), and Linux (arm64/x64).


Structured Development Workflows

ClawCode offers three structured modes that enforce development discipline — from safe exploration to full autonomous coding.

Plan Mode — Safe Research Before Changes

Activated via /plan, Plan mode enforces a strict read-only contract:

Allowed tools:

  • view, ls, glob, grep — Read-only file inspection
  • diagnostics — LSP error checking
  • fetch — Web content retrieval
  • bash (read-only commands only) — Non-destructive shell commands
  • Agent, Task — Subagent spawning (plan/explore/review only)

Blocked tools:

  • write, edit, patch — No file modifications
  • execute_code — No arbitrary code execution
  • cronjob — No scheduled tasks
  • Destructive bash commands: rm, mv, cp, sed, git add/commit/push, pip install, etc.

Use cases:

  • "What does the authentication module look like?"
  • "Find all places where we use the deprecated API"
  • "Analyze the database schema and suggest improvements"

Plan mode ensures the agent can research thoroughly without accidentally changing anything. Plans are persisted to .claw/plans/ with versioned snapshots.
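One simple way to approximate the read-only bash check is a denylist over the leading tokens of a command, as sketched below. The function name `is_read_only` and both sets are assumptions built from the examples listed above; the actual gate is likely more thorough, since this sketch ignores pipes, `&&` chains, redirection, and subshells.

```python
import shlex

# Denylist built from the commands named in the "Blocked tools" list.
BLOCKED_COMMANDS = {"rm", "mv", "cp", "sed"}
BLOCKED_SUBCOMMANDS = {
    ("git", "add"), ("git", "commit"), ("git", "push"), ("pip", "install"),
}


def is_read_only(command: str) -> bool:
    """Return False when the command starts with a blocked (sub)command."""
    tokens = shlex.split(command)
    if not tokens:
        return True
    if tokens[0] in BLOCKED_COMMANDS:
        return False
    if len(tokens) > 1 and (tokens[0], tokens[1]) in BLOCKED_SUBCOMMANDS:
        return False
    return True


print(is_read_only("git status"))         # True
print(is_read_only("git commit -m fix"))  # False
```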

Spec Mode — Analyze, Implement, Verify

Activated via /spec, Spec mode extends Plan with a three-phase execution contract:

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│ spec_pending │────▶│spec_executing│────▶│spec_verifying│
│  Read-Only   │     │  Full Access │     │  Test-Only   │
└──────────────┘     └──────────────┘     └──────────────┘

| Phase | Tool Access | Purpose |
| --- | --- | --- |
| spec_pending | Read-only (same as Plan mode) | Analyze the codebase, produce a specification |
| spec_executing | Full tool access | Implement the spec — write code, run commands freely |
| spec_verifying | Read-only + test commands only | Verify the work: pytest, jest, cargo test, ruff, mypy, eslint |

What Spec mode produces:

File Purpose
spec.md Detailed specification with goals and constraints
tasks.md Ordered task breakdown with dependencies
checklist.md Acceptance verification items (with auto_verify_command)
meta.json Structured machine-readable data for automation

Each SpecTask includes depends_on, acceptance_criteria, files_to_modify, and priority. Each CheckItem can have an auto_verify_command — allowing automated acceptance testing.
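The depends_on field implies that tasks form a dependency graph. A sketch of how such a graph could be turned into an execution order is shown below, using the standard library's `graphlib`. The `id` field and the `execution_order` helper are hypothetical; only depends_on, acceptance_criteria, files_to_modify, and priority come from the description above.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import List


@dataclass
class SpecTask:
    id: str  # hypothetical identifier used to express dependencies
    depends_on: List[str] = field(default_factory=list)
    acceptance_criteria: List[str] = field(default_factory=list)
    files_to_modify: List[str] = field(default_factory=list)
    priority: str = "medium"


def execution_order(tasks: List[SpecTask]) -> List[str]:
    """Order task ids so every task comes after the tasks it depends on."""
    graph = {task.id: set(task.depends_on) for task in tasks}
    # static_order() raises graphlib.CycleError if the dependencies loop
    return list(TopologicalSorter(graph).static_order())


tasks = [
    SpecTask("tests", depends_on=["fix"], files_to_modify=["tests/test_auth.py"]),
    SpecTask("fix", files_to_modify=["auth.py"]),
    SpecTask("verify", depends_on=["tests"]),
]
print(execution_order(tasks))  # ['fix', 'tests', 'verify']
```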

Crash recovery: Spec mode supports normalize_stale_build_after_restart(), so interrupted sessions can resume cleanly.

Claw Mode — Lightweight Autonomous Coding

Activated via /claw, Claw mode provides a budget-controlled autonomous agent with full tool access:

Feature Description
Full tool access No plan-mode restrictions — read, write, execute, all available
Iteration budget Shared cap across all subagents prevents runaway loops
Per-turn reset Budget resets per user turn, giving controlled autonomy
Same tool surface Uses the same tools as the default coder agent
Dedicated prompt System prompt extended with "Claw mode" suffix indicating full access

Claw mode is ideal when you want the agent to work autonomously on a well-defined task — it will iterate, search, edit, and test until the budget is exhausted or the task is complete.
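The shared, per-turn budget can be pictured as a single counter object that the main agent and its subagents all draw from. The `IterationBudget` class below is a hypothetical sketch of that idea, not ClawCode's implementation.

```python
class IterationBudget:
    """Counter shared by an agent and its subagents within one user turn."""

    def __init__(self, max_iterations: int) -> None:
        self.max_iterations = max_iterations
        self.used = 0

    def consume(self) -> bool:
        """Spend one iteration; return False once the cap is reached."""
        if self.used >= self.max_iterations:
            return False
        self.used += 1
        return True

    def reset(self) -> None:
        """Called at the start of each user turn."""
        self.used = 0


budget = IterationBudget(max_iterations=3)
parent_steps = [budget.consume(), budget.consume()]  # main agent uses two
child_steps = [budget.consume(), budget.consume()]   # subagent shares the cap
print(parent_steps, child_steps)  # [True, True] [True, False]
```

Passing the same object to every subagent is what makes the cap global rather than per agent.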


LLM Configuration & Provider Ecosystem

ClawCode supports an exceptionally wide range of LLM providers, giving you full control over which model powers your development.

Supported Providers

Provider Models Key Features
Anthropic Claude 3.5 Sonnet/Haiku/Opus, Claude 3.7 Sonnet Extended thinking, prompt caching, tool calling, streaming
OpenAI GPT-4o, o1/o3/o4, GPT-4.1 Plus all OpenAI-compatible APIs (DeepSeek, Qwen, GLM, Moonshot/Kimi, Mistral, MiniMax, Cohere, Volcengine/Doubao, StepFun, Baichuan, SiliconFlow, Novita, Lepton, Fireworks, Together, Perplexity, DeepInfra, Nebius)
Google Gemini 1.5 Pro, 2.0 Flash, 2.5 Pro/Flash Up to 2M context window via native SDK
Groq Llama, Mixtral, Qwen, DeepSeek Ultra-fast inference via Groq's API
Azure OpenAI models on Azure Deployment-based routing, enterprise compliance
OpenRouter 100+ models from any vendor Unified API, vendor/model ID format
xAI Grok-2, Grok-2-mini Up to 131K context
AWS Bedrock Claude, Titan, Llama, Mistral, Cohere, AI21 AWS-native, boto3 integration
GitHub Copilot GPT-4o, GPT-4.1, Claude Sonnet, Gemini, O-series Token exchange auth, free-tier access

Each provider inherits from a common BaseProvider interface, ensuring consistent behavior regardless of which model you use.

Configuring Your Models

Configuration lives in .clawcode.json (project-root or ~/.clawcode.json):

{
  "providers": {
    "anthropic": {
      "api_key": "sk-ant-..."
    },
    "openai": {
      "api_key": "sk-...",
      "base_url": "https://api.deepseek.com/v1",
      "models": ["deepseek-chat", "deepseek-reasoner"],
      "timeout": 120
    },
    "gemini": {
      "api_key": "AIza..."
    }
  },
  "agents": {
    "coder": {
      "model": "claude-sonnet-4-20250514",
      "max_tokens": 8192
    },
    "task": {
      "model": "claude-sonnet-4-20250514",
      "max_tokens": 8192
    },
    "title": {
      "model": "claude-3-5-haiku-20241022",
      "max_tokens": 100
    },
    "summarizer": {
      "model": "claude-3-5-sonnet-20241022",
      "max_tokens": 4096
    }
  }
}

Agent roles and their default configurations:

| Agent | Role | Default Model | Default Max Tokens |
| --- | --- | --- | --- |
| coder | Main coding agent | claude-3-5-sonnet-20241022 | 8192 |
| task | Subagent (team tasks) | claude-3-5-sonnet-20241022 | 8192 |
| title | Session title generation | claude-3-5-haiku-20241022 | 100 |
| summarizer | Conversation compaction | claude-3-5-sonnet-20241022 | 4096 |

Provider auto-resolution — you don't need to specify the provider for each model. ClawCode infers it from the model name:

Model Name Pattern Resolved Provider
Contains claude or anthropic Anthropic
Contains gpt, o1, o3, o4 OpenAI
Contains gemini Gemini
Contains deepseek OpenAI (compatible endpoint)
Contains grok xAI
Contains qwen, qwq, qvq OpenAI (compatible endpoint)
vendor/model (single slash) OpenRouter
anthropic/<model> Anthropic (compatibility gateway)
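The resolution table can be read as an ordered set of rules. The sketch below is one possible encoding; the function name `resolve_provider`, the returned labels, and especially the order in which rules are tried (the anthropic/ prefix first, then the single-slash OpenRouter form, then substring checks) are assumptions, since the table does not state precedence.

```python
def resolve_provider(model: str) -> str:
    """Map a model name to a provider label using the patterns in the table."""
    name = model.lower()
    if name.startswith("anthropic/"):
        return "anthropic"        # compatibility gateway form
    if name.count("/") == 1:
        return "openrouter"       # vendor/model
    if "claude" in name or "anthropic" in name:
        return "anthropic"
    if "gemini" in name:
        return "gemini"
    if "grok" in name:
        return "xai"
    if any(tag in name for tag in ("deepseek", "qwen", "qwq", "qvq")):
        return "openai"           # served through an OpenAI-compatible endpoint
    if any(tag in name for tag in ("gpt", "o1", "o3", "o4")):
        return "openai"
    raise ValueError(f"cannot infer a provider for {model!r}")


print(resolve_provider("claude-sonnet-4-20250514"))  # anthropic
print(resolve_provider("deepseek-chat"))             # openai
```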

Model Switching

Method How
TUI Model Picker Ctrl + O — interactive dialog listing available models with provider labels
Direct slash /model — change the model for the current session
Config file Edit .clawcode.json → restart or use model picker
Per-agent override Set provider_key in AgentConfig to pin a specific provider slot

The model picker displays models from:

  1. Your Provider.models arrays in config
  2. Models referenced by your agents
  3. A bundled reference catalog of 121+ models across 32 providers

Cost Tracking & Usage Monitoring

ClawCode tracks token usage and cost at multiple levels:

Level What's Tracked
Per-message input_tokens, output_tokens, cache_creation_tokens, cache_read_tokens, cost
Per-session Accumulated prompt_tokens, completion_tokens, cost (persisted in database)
Per-turn Live input/output token counts shown in HUD

The /usage command shows:

Model: claude-sonnet-4-20250514
Context window (approx.): 200,000 tokens
Context fill (estimate): 45%
Session tokens (DB): prompt 12,345 / completion 6,789
This turn (live): input 2,340 / output 1,200

Billing errors are caught gracefully with a user-friendly message asking you to top up or switch providers.


Context Management

Context Windows

Each model has a defined context window size used for auto-compaction decisions:

Model Family Context Window
Claude 3.5 Sonnet/Haiku/Opus 200,000 tokens
GPT-4o / GPT-4o-mini 128,000 tokens
GPT-4 Turbo 128,000 tokens
Gemini 2.0 Flash 1,000,000 tokens
Gemini 1.5 Pro 2,000,000 tokens
DeepSeek / Qwen / Kimi / GLM 131,072 tokens
MiniMax 204,800 tokens

Automatic Summarization

Long conversations are automatically compacted to stay within context limits:

| Parameter | Default | Meaning |
| --- | --- | --- |
| Trigger threshold | 70% of context window | When to start summarization |
| Minimum messages | 10 | Don't summarize short conversations |
| Safety margin | 62% ratio for input | Ensures summarizer prompt fits |
| Kept messages | 4 most recent | Recent context is always preserved |
| Summary max | 4,096 tokens | Compact enough to save significant space |
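Using the defaults from the table, the trigger decision and the split between archived and kept messages might be expressed as below. The function names and signatures are hypothetical, and the sketch assumes both conditions (threshold and minimum message count) must hold before compaction starts.

```python
from typing import List, Tuple


def should_compact(used_tokens: int, context_window: int, message_count: int,
                   threshold: float = 0.70, min_messages: int = 10) -> bool:
    """Compact only when the conversation is both long and near the limit."""
    return message_count >= min_messages and used_tokens >= threshold * context_window


def split_history(messages: List[str], keep_recent: int = 4) -> Tuple[List[str], List[str]]:
    """Return (messages to summarize, most recent messages kept verbatim)."""
    if len(messages) <= keep_recent:
        return [], list(messages)
    return list(messages[:-keep_recent]), list(messages[-keep_recent:])


print(should_compact(150_000, 200_000, message_count=12))  # True
print(should_compact(150_000, 200_000, message_count=5))   # False
```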

Chunked fallback: When history is too large for a single summarization call, the summarizer splits messages into chunks and builds a running summary across segments.
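A running summary over chunks can be sketched as a simple fold: each call sees the previous summary plus the next slice of messages. The `chunked_summary` function, the prompt wording, and chunking by message count (rather than by tokens) are assumptions; `summarize` stands in for whatever LLM call produces a summary.

```python
from typing import Callable, List


def chunked_summary(messages: List[str], summarize: Callable[[str], str],
                    chunk_size: int = 20) -> str:
    """Fold the history into one summary, one chunk at a time."""
    summary = ""
    for start in range(0, len(messages), chunk_size):
        chunk = messages[start:start + chunk_size]
        # Carry the summary so far into the next call so nothing is dropped
        carried = [f"Summary so far: {summary}"] if summary else []
        summary = summarize("\n".join(carried + chunk))
    return summary
```

Because each call receives only one chunk plus a short summary, the input to the summarizer stays bounded regardless of how long the history is.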

Heuristic fallback: If LLM summarization fails entirely, short excerpts from archived turns are extracted — ensuring you never lose all context.

Memory preservation: Before compaction, the agent proactively calls the memory tool to persist durable facts, preventing important information from being lost.

You can also manually trigger summarization with /compact.


Code Intelligence

Code Awareness Panel

The Code Awareness Panel in the right sidebar gives you real-time visibility into what the agent understands about your project:

Feature What You See
Architecture map Project directories classified into layers: CORE, API, CONFIG, TEST, DOCS
Modified files Green highlights with #N sequence numbers — which files the agent edited
Read files Blue highlights with R#N sequence numbers — which files the agent inspected
Symbol outlines Top-level symbols extracted for key files (via LSP or regex heuristics)
Role tags [entry], [api], [test] markers showing each directory's purpose
Per-question history Which files were touched in each turn, archived for review

How architecture is classified:

The system uses a two-stage LLM approach, with a heuristic fallback:

  1. Stage 1: BFS outline + README snippet → infer architecture layers
  2. Stage 2: Map directories to those layers in batches (120 directories per batch)
  3. Fallback: Name-based heuristics when LLM is unavailable (src/ = CORE, tests/ = TEST, api/ = API)
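The fallback heuristic and the batching step are simple enough to sketch directly. Only the three name mappings given above are included; the helper names, matching on the final path component, and returning None for unknown directories are assumptions.

```python
from typing import Iterator, List, Optional

# Name-based fallback taken from the examples above.
LAYER_BY_NAME = {"src": "CORE", "tests": "TEST", "api": "API"}


def classify_by_name(directory: str) -> Optional[str]:
    """Guess a layer from the last path component, or None if unknown."""
    parts = [part for part in directory.replace("\\", "/").split("/") if part]
    return LAYER_BY_NAME.get(parts[-1]) if parts else None


def batches(directories: List[str], size: int = 120) -> Iterator[List[str]]:
    """Yield directories in groups of at most `size` for the Stage 2 mapping."""
    for start in range(0, len(directories), size):
        yield directories[start:start + size]


print(classify_by_name("src/"))         # CORE
print(classify_by_name("backend/api"))  # API
```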

LSP Integration

ClawCode implements a full Language Server Protocol 3.17 client, connecting to 30+ pre-configured language servers:

Language Server
Python pylsp
Go gopls
TypeScript/JavaScript typescript-language-server
Rust rust-analyzer
Java jdtls
C/C++ clangd
C# omnisharp
Kotlin kotlin-language-server
Scala metals
Swift sourcekit-lsp
Dart dart
Zig zls
Ruby solargraph
PHP intelephense
Lua lua-language-server
Haskell haskell-language-server
Elixir elixir-ls
Erlang erlang_ls
OCaml ocamllsp
Clojure clojure-lsp
HTML/CSS/Vue/Svelte vscode-* language servers
JSON/YAML/TOML yaml-language-server, taplo
SQL/GraphQL sql-language-server
Bash/PowerShell bash-language-server
Terraform terraform-ls
Dockerfile/Protobuf/Markdown/LaTeX Various

The LSP integration provides:

  • Real-time diagnostics — errors and warnings surfaced to the agent via the diagnostics tool
  • Symbol extraction — document_symbols() feeds the Code Awareness panel
  • Diagnostic callbacks — the TUI reacts to new diagnostics in real time
  • Auto-start — language servers launch on-demand when files of their language are opened

Git & Version Control

GitHub PR Integration

The agent can work with GitHub Pull Requests directly:

Capability Description
PR parsing Accepts URLs (github.com/owner/repo/pull/123) or plain #123 with git remote detection
PR comments Fetches issue comments, review comments, and reviews
PR review context Retrieves PR metadata + file list + truncated patches for LLM-based code review
Diff generation Best-effort diff against main/master, with fallback to HEAD~50..HEAD
Authentication Via GITHUB_TOKEN environment variable or gh auth token CLI
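The two accepted PR reference forms can be recognized with a pair of regular expressions, as in the sketch below. The `PullRequestRef` type and `parse_pr_ref` function are hypothetical; for the plain #123 form the owner and repo are left as None, since the text says they come from git remote detection, which is not shown here.

```python
import re
from typing import NamedTuple, Optional


class PullRequestRef(NamedTuple):
    owner: Optional[str]
    repo: Optional[str]
    number: int


_URL = re.compile(r"github\.com/(?P<owner>[^/\s]+)/(?P<repo>[^/\s]+)/pull/(?P<number>\d+)")
_SHORT = re.compile(r"^#(?P<number>\d+)$")


def parse_pr_ref(text: str) -> PullRequestRef:
    """Parse a PR URL or a bare '#123' reference."""
    text = text.strip()
    match = _URL.search(text)
    if match:
        return PullRequestRef(match["owner"], match["repo"], int(match["number"]))
    match = _SHORT.match(text)
    if match:
        # owner/repo would be filled in from the git remote by the caller
        return PullRequestRef(None, None, int(match["number"]))
    raise ValueError(f"not a pull request reference: {text!r}")


print(parse_pr_ref("https://github.com/owner/repo/pull/123"))
# PullRequestRef(owner='owner', repo='repo', number=123)
```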

The /rewind Command

/rewind safely undoes the agent's changes — restoring files to their pre-modification state:

/rewind — restore all tracked files to HEAD
Feature Description
Git detection Only works in git repositories
Safe scope Only restores tracked files (never touches untracked/new files)
Pre-restore stash Changes are stashed before restore (recoverable)
Diff summary Shows what changed via git diff --stat before rewinding

CLI & Non-Interactive Mode

ClawCode supports scripting and CI/CD integration through its non-interactive mode:

# Basic one-shot query
clawcode -p "Explain the authentication module"

# JSON output for scripting
clawcode -p "Fix the bug in auth.py" -f json

# Quiet mode (no spinner)
clawcode -p "Run the test suite and report results" -q

# Specify working directory
clawcode -p "Add TypeScript types" -c /path/to/project

# Debug mode
clawcode -p "Analyze the build error" -d
Flag Purpose
-p / --prompt Run a single prompt non-interactively
-f / --output-format text (default) or json (outputs {"response": "..."})
-q / --quiet Hide the spinner
-d / --debug Enable debug logging
-c / --cwd Set working directory

How it works under the hood:

  1. Creates a session named Non-interactive: {prompt[:50]}
  2. Builds a minimal agent runtime
  3. Streams AgentEventType events
  4. Collects response content from RESPONSE and CONTENT_DELTA events
  5. Outputs in the requested format

CI/CD example:

# In your CI pipeline — generate a report as JSON
clawcode -p "Review this PR for security issues" -f json | jq .response > security_review.md
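The same JSON output can be consumed from a script instead of jq. The sketch below uses only the flags documented above and assumes that stdout contains nothing but the {"response": "..."} object; the helper names `extract_response` and `run_clawcode` are hypothetical.

```python
import json
import subprocess
from typing import Optional


def extract_response(stdout: str) -> str:
    """Pull the response text out of the JSON printed by -f json."""
    payload = json.loads(stdout)
    return payload["response"]


def run_clawcode(prompt: str, cwd: Optional[str] = None) -> str:
    """Run one non-interactive prompt and return the response text."""
    command = ["clawcode", "-p", prompt, "-f", "json", "-q"]
    if cwd:
        command += ["-c", cwd]
    # check=True raises CalledProcessError on a non-zero exit code
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return extract_response(completed.stdout)


print(extract_response('{"response": "No issues found."}'))  # No issues found.
```

Passing the command as a list avoids shell quoting problems when the prompt contains quotes or special characters.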

Developer Toolchain — How It All Works Together

Here's how a typical development session flows end-to-end:

You: "Fix the authentication bug and add tests"
    │
    ▼
┌────────────────────────────────────────────────────────────┐
│  1. Agent loads system prompt + project context             │
│     - Reads CLAUDE.md / .clawcode.md                        │
│     - Appends project tree structure                        │
│     - Loads active TODOs from .clawcode/todos.json          │
│     - Appends available skills from plugins                 │
└─────────────────────────┬──────────────────────────────────┘
                          │
                          ▼
┌────────────────────────────────────────────────────────────┐
│  2. Agent researches (Plan-like phase)                      │
│     - glob "**/*auth*"  → finds auth files                  │
│     - grep "authenticate" → finds relevant code             │
│     - View file contents → understands the implementation   │
│     - diagnostics → checks for existing errors              │
│     - Code Awareness lights up read files in blue           │
└─────────────────────────┬──────────────────────────────────┘
                          │
                          ▼
┌────────────────────────────────────────────────────────────┐
│  3. Agent plans and tracks                                  │
│     - TodoWrite → creates task list:                        │
│       1. Fix auth logic (in_progress)                       │
│       2. Add unit tests (pending)                           │
│       3. Run test suite (pending)                           │
│     - HUD shows TODO progress: ●○○                          │
└─────────────────────────┬──────────────────────────────────┘
                          │
                          ▼
┌────────────────────────────────────────────────────────────┐
│  4. Agent implements                                        │
│     - Edit auth.py → precise line-range replacement         │
│     - Write tests/test_auth.py → new test file              │
│     - Code Awareness highlights modified files in green     │
└─────────────────────────┬──────────────────────────────────┘
                          │
                          ▼
┌────────────────────────────────────────────────────────────┐
│  5. Agent executes and verifies                             │
│     - Bash: pytest tests/test_auth.py -v                    │
│     - Reads test output → verifies all tests pass           │
│     - LSP diagnostics → confirms no new errors              │
│     - TodoWrite → marks all tasks completed                 │
└─────────────────────────┬──────────────────────────────────┘
                          │
                          ▼
┌────────────────────────────────────────────────────────────┐
│  6. Agent reports back                                      │
│     - Summarizes changes made                               │
│     - Shows diff summary (via git integration)              │
│     - Lists files modified (from Code Awareness tracking)   │
│     - Session tokens/cost accumulated in DB                 │
│                                                             │
│     You can now:                                            │
│     - Review diffs with /rewind safety net                  │
│     - Check usage with /usage                               │
│     - Continue with a new session (Ctrl+N)                  │
└────────────────────────────────────────────────────────────┘

ClawCode: Creative Engineering Cockpit for Serious AI Builders

Your terminal. Your toolkit. Ship faster.