The AI-Native Code Intelligence Backend. Extract the Signal, Discard the Noise.
Giving LLM agents deterministic, AST-level understanding of any codebase, at nuclear token efficiency.
Most AI coding agents rely on tools built for human eyeballs: cat, grep, tree, git diff. For an LLM these are toxic: they flood the context window with whitespace, comments, and full file dumps, and they force the agent into "amnesia" from pagination.
CortexAST is a sensory system built strictly for AI brains.
Powered by Tree-sitter and written in pure Rust, it gives agents a deterministic, high-fidelity understanding of entire codebases, cutting token usage by up to 90 % while preserving 100 % of the architectural logic.
| Task | Standard (For Humans) | 🧠 CortexAST (For AI) | Result |
|---|---|---|---|
| Exploration | `tree` / `ls`: filenames only | `map_repo`: files + public symbols inside | Instant architecture map |
| Reading Code | `cat`: 2 000-line dump | `read_symbol`: exact AST node only | Nuclear token savings |
| Finding Stuff | `grep`: string matches incl. comments | `find_usages`: AST-accurate, zero false positives | Calls / Type Refs / Fields |
| Refactoring | `git diff`: line & whitespace noise | `save_checkpoint` + `compare_checkpoint` | Crystal-clear semantic diff |
| Cross-Service | Manual file-by-file search | `propagation_checklist`: Proto → Rust → TS | Prevents missing propagation |
| Blast Radius | Guessing | `call_hierarchy`: Incoming + Outgoing calls | Safe rename / delete |
Returns a hierarchical codebase map showing files and their exported symbols.
- `search_filter`: case-insensitive substring, OR via `|` (e.g. `"auth|user"`); matches file paths and, for repos ≤ 300 files, symbol names too
- `ignore_gitignore`: set `true` to include generated / git-ignored files
- `max_chars`: output cap (default 8 000 chars)
- Built-in guardrails: did-you-mean path recovery, regex-input warning, overflow diagnostics
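The `search_filter` matching rule described above (case-insensitive substring, with `|` acting as OR) can be sketched in a few lines of Rust. This is an illustration only, not CortexAST's actual implementation: the function name `matches_filter` is hypothetical, and treating an empty filter as "match everything" is an assumption.

```rust
/// Hypothetical helper (not part of CortexAST's API): returns true if
/// `candidate` contains any of the `|`-separated alternatives in
/// `filter`, ignoring case.
/// Assumption: an empty filter matches everything.
fn matches_filter(filter: &str, candidate: &str) -> bool {
    let haystack = candidate.to_lowercase();
    let mut alternatives = filter
        .split('|')
        .map(|alt| alt.to_lowercase())
        .filter(|alt| !alt.is_empty())
        .peekable();
    // No usable alternatives: nothing to filter on.
    if alternatives.peek().is_none() {
        return true;
    }
    // OR semantics: one matching alternative is enough.
    alternatives.any(|alt| haystack.contains(&alt))
}
```

Under these assumptions, `matches_filter("auth|user", "src/Auth/login.rs")` is true because the `auth` alternative matches regardless of case, while `src/billing.rs` is rejected.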
Extracts the exact, full source of any symbol (function, struct, class, const) via AST.
- `symbol_names: ["A","B","C"]`: batch mode, multiple symbols in one call
- "Symbol not found" error: lists up to 30 available symbols + recovery hint pointing to `find_usages` / `map_repo`
Always use instead of grep / rg. 100 % accurate AST usages across the workspace, zero false positives from comments or strings. Categorises hits:
- Calls: function / method invocations
- TypeRefs: type annotations, generics
- FieldInits: struct field assignments
Use before any function rename, move, or delete. Shows who calls the function (Incoming) and what the function calls (Outgoing).
Token-budget-aware XML slice of a directory or file. Skeletonises all source (bodies pruned, imports collapsed).
- `query`: optional semantic vector search; ranks files by relevance first
- Inline / spill: output ≤ 8 KB returned inline; larger output written to `/tmp/cortexast_slice_{hash}.xml`; use `read_file` to access it
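The inline / spill rule can be pictured as a simple size check. The sketch below is an assumption-heavy illustration rather than the real code path: `SliceOutput`, `route_output`, and `spill_path` are hypothetical names, "8 KB" is read as 8 × 1024 bytes, and the standard library's `DefaultHasher` stands in for whatever hash CortexAST actually uses. It only decides where the output would go; it performs no file I/O.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assumed threshold: the text says "8 KB"; 8 * 1024 bytes is a guess.
const INLINE_LIMIT_BYTES: usize = 8 * 1024;

/// Where a slice ends up: returned directly, or spilled to a file.
#[derive(Debug, PartialEq)]
enum SliceOutput {
    Inline(String),
    Spilled(String), // path of the spill file
}

/// Hypothetical helper: derive the spill path from a content hash.
/// DefaultHasher is a stand-in, not the hash CortexAST is known to use.
fn spill_path(xml: &str) -> String {
    let mut hasher = DefaultHasher::new();
    xml.hash(&mut hasher);
    format!("/tmp/cortexast_slice_{:016x}.xml", hasher.finish())
}

/// Decide between inline and spill based on size only (no I/O here).
fn route_output(xml: String) -> SliceOutput {
    if xml.len() <= INLINE_LIMIT_BYTES {
        SliceOutput::Inline(xml)
    } else {
        SliceOutput::Spilled(spill_path(&xml))
    }
}
```

An agent consuming the result would branch on the variant: use the `Inline` text directly, or pass the `Spilled` path to `read_file`.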
Auto-detects project type (cargo check / tsc --noEmit), runs the compiler, maps errors directly to AST source lines.
Save structural snapshots before edits and compare semantics after, without whitespace or line-number noise.
- `save_checkpoint`: use before any non-trivial edit or refactor. Snapshots a symbol's AST to disk with a semantic tag (e.g. `pre-refactor`)
- `list_checkpoints`: shows all saved snapshots grouped by tag
- `compare_checkpoint`: structural diff between two snapshots; ignores whitespace and line-number noise
Use before changing any shared type, struct, interface, or API contract.
Generates a strict Markdown checklist grouped by language / domain (Proto → Rust → TS → Python → Other).
- `symbol_name`: AST-traces the symbol across the entire workspace
- `ignore_gitignore: true`: includes generated stubs (gRPC, Protobuf, etc.)
- Line numbers per file (up to 5 shown, `…` suffix if more)
- Hard cap: 50 files, 8 000 chars; BLAST RADIUS WARNING if exceeded
- `changed_path`: legacy file-based mode (still supported)
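The "up to 5 shown, `…` suffix if more" rule for line numbers is easy to picture in code. The following is a minimal sketch under stated assumptions: `format_lines` is a hypothetical name, and the comma-separated output format is a guess, since the source does not show the checklist's exact layout.

```rust
/// Maximum number of line numbers shown per file (from the text above).
const MAX_LINES_SHOWN: usize = 5;

/// Hypothetical formatter: joins at most five line numbers and appends
/// an ellipsis when more were found. The separator is an assumption.
fn format_lines(lines: &[u32]) -> String {
    let shown: Vec<String> = lines
        .iter()
        .take(MAX_LINES_SHOWN)
        .map(|n| n.to_string())
        .collect();
    let mut out = shown.join(", ");
    if lines.len() > MAX_LINES_SHOWN {
        out.push_str(" …");
    }
    out
}
```

With seven hits, `format_lines(&[3, 9, 27, 58, 91, 140, 202])` keeps the first five numbers and marks the truncation with the ellipsis; with three hits, all are listed and no suffix is added.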
Target: CortexAST source (10+ Rust files, core logic)
Hardware: Apple M4 Pro / 14 CPU · 20 GPU · 24 GB RAM
| Metric | Raw Copy-Paste | 🧠 CortexAST |
|---|---|---|
| Total Size | 127 536 bytes | 9 842 bytes (92.3 % smaller) |
| Est. Token Cost | ~31 884 tokens | ~2 460 tokens |
| Processing Time | N/A | < 0.1 s (Pure Rust) |
| Information Density | Low (noise-heavy) | High (pure logic) |
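The figures in the table can be checked with a little arithmetic. The sketch below assumes the token estimates come from a rule of thumb of roughly 4 bytes per token; the source does not state how they were derived, but that heuristic reproduces both numbers. The function names are illustrative.

```rust
/// Percentage reduction in size when going from `raw` to `compressed` bytes.
fn reduction_percent(raw: u64, compressed: u64) -> f64 {
    (1.0 - compressed as f64 / raw as f64) * 100.0
}

/// Rough token estimate. Assumption: ~4 bytes per token, a common
/// heuristic that happens to match the table; not a documented method.
fn estimate_tokens(bytes: u64) -> u64 {
    bytes / 4
}
```

Plugging in the benchmark sizes, `reduction_percent(127_536, 9_842)` rounds to 92.3, and `estimate_tokens` gives 31 884 and 2 460 tokens for the raw and compressed output respectively.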
- Nuclear Skeletonisation: function bodies collapse to signatures, imports stripped, indentation flattened
- JIT Hybrid Vector Search: `model2vec-rs` (pure Rust, < 100 MB RAM); `xxh3` content hashing; incremental updates on-demand only
- Enterprise Workspace Engine: auto-discovers nested microservices (`Cargo.toml`, `package.json`, `pyproject.toml`) and routes token budgets across monorepos
- Bulletproof Safety: null-byte detection, 1 MB file cap, minified-bundle guard, UTF-8 lossy fallback, index auto-repair
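Three of the safety guards listed above (file-size cap, null-byte detection, UTF-8 lossy fallback) can be sketched with the standard library alone. This is a hedged illustration, not CortexAST's code: `Guard` and `guard_file` are hypothetical names, "1 MB" is read as 1024 × 1024 bytes, and the order of the checks is an assumption. The minified-bundle guard and index auto-repair are not covered.

```rust
/// Assumed interpretation of the "1 MB file cap".
const MAX_FILE_BYTES: usize = 1024 * 1024;

/// Outcome of the pre-parse checks (hypothetical type).
#[derive(Debug, PartialEq)]
enum Guard {
    /// File accepted; text decoded as UTF-8, lossily if necessary.
    Accept(String),
    /// File exceeds the size cap and is skipped.
    TooLarge,
    /// File contains a null byte and is treated as binary.
    Binary,
}

/// Apply the size cap, then null-byte detection, then lossy decoding.
fn guard_file(bytes: &[u8]) -> Guard {
    if bytes.len() > MAX_FILE_BYTES {
        return Guard::TooLarge;
    }
    if bytes.contains(&0) {
        return Guard::Binary;
    }
    // Invalid UTF-8 sequences become U+FFFD instead of causing an error.
    Guard::Accept(String::from_utf8_lossy(bytes).into_owned())
}
```

The lossy fallback means a source file with a stray invalid byte still gets parsed, with the bad byte replaced by U+FFFD, rather than being dropped entirely.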
Download from Releases:
| Platform | File |
|---|---|
| macOS Apple Silicon | cortexast-macos-aarch64 |
| macOS Intel | cortexast-macos-x86_64 |
| Linux x86_64 | cortexast-linux-x86_64 |
| Linux ARM64 | cortexast-linux-aarch64 |
| Windows x86_64 | cortexast-windows-x86_64.exe |
chmod +x cortexast-macos-aarch64
./cortexast-macos-aarch64 --help
Or build from source:
git clone https://github.com/DevsHero/CortexAST.git
cd CortexAST
cargo build --release
# Binary: ./target/release/cortexast
See docs/BUILDING.md for cross-compilation instructions.
Add to your MCP client config (Claude Desktop / VS Code / Cursor / Cline / Windsurf):
{
"mcpServers": {
"cortexast": {
"command": "/absolute/path/to/cortexast",
"args": ["mcp"]
}
}
}
See docs/MCP_SETUP.md for per-client setup instructions.
To maximise CortexAST's effectiveness, add the rules below to your AI assistant's instruction file. This ensures the agent always prefers CortexAST tools over basic shell commands and follows the correct workflow to minimise hallucination and token waste.
File: .github/copilot-instructions.md
## CortexAST Priority Rules
- ALWAYS use `map_repo` instead of ls/tree/find
- ALWAYS use `read_symbol` instead of cat/head/any file read
- ALWAYS use `find_usages` instead of grep/rg/ag/semantic_search
- ALWAYS call `save_checkpoint` before any non-trivial edit or refactor
- ALWAYS call `propagation_checklist` before changing a shared type, struct, or interface
- Use `get_context_slice` with a `query` for multi-file overviews (output may spill to file; read it)
- Use `call_hierarchy` to measure blast radius before renaming, moving, or deleting any function
File: .cursor/rules/cortexast.mdc (or .cursorrules)
---
description: CortexAST tool priority rules for AI agents
globs: ["**/*"]
alwaysApply: true
---
## CortexAST Priority Rules
- ALWAYS use `map_repo` instead of ls/tree/find
- ALWAYS use `read_symbol` instead of cat/head/any file read
- ALWAYS use `find_usages` instead of grep/rg/ag
- ALWAYS call `save_checkpoint` before any non-trivial edit or refactor
- ALWAYS call `propagation_checklist` before changing a shared type, struct, or interface
- Use `get_context_slice` with a `query` for multi-file overviews (output may spill to file; read it)
- Use `call_hierarchy` to measure blast radius before renaming, moving, or deleting any function
File: .windsurfrules
## CortexAST Priority Rules
- ALWAYS use `map_repo` instead of ls/tree/find
- ALWAYS use `read_symbol` instead of cat/head/any file read
- ALWAYS use `find_usages` instead of grep/rg/ag
- ALWAYS call `save_checkpoint` before any non-trivial edit or refactor
- ALWAYS call `propagation_checklist` before changing a shared type, struct, or interface
- Use `get_context_slice` with a `query` for multi-file overviews (output may spill to file; read it)
- Use `call_hierarchy` to measure blast radius before renaming, moving, or deleting any function
File: .clinerules
## CortexAST Priority Rules
- ALWAYS use `map_repo` instead of ls/tree/find
- ALWAYS use `read_symbol` instead of cat/head/any file read
- ALWAYS use `find_usages` instead of grep/rg/ag
- ALWAYS call `save_checkpoint` before any non-trivial edit or refactor
- ALWAYS call `propagation_checklist` before changing a shared type, struct, or interface
- Use `get_context_slice` with a `query` for multi-file overviews (output may spill to file; read it)
- Use `call_hierarchy` to measure blast radius before renaming, moving, or deleting any function
Add to claude_desktop_config.json → systemPrompt:
CortexAST Priority Rules:
- ALWAYS use map_repo instead of ls/tree/find
- ALWAYS use read_symbol instead of cat/head/any file read
- ALWAYS use find_usages instead of grep/rg/ag
- ALWAYS call save_checkpoint before any non-trivial edit or refactor
- ALWAYS call propagation_checklist before changing a shared type, struct, or interface
- Use get_context_slice with a query for multi-file overviews (output may spill to file; read it)
- Use call_hierarchy to measure blast radius before renaming, moving, or deleting any function
PRs welcome.
- Core: Rust (Tokio, Rayon, Model2Vec, Tree-sitter)
- Focus: performance, compression ratio, multi-language correctness
See CHANGELOG.md for version history.
Crafted with 🦀 by DevsHero.