A self-improving knowledge base about LLM agent infrastructure.
Start here: The Landscape of LLM Agent Infrastructure
| | |
|---|---|
| The State of LLM Knowledge Substrate | The State of Agent Memory |
| The State of Context Engineering | The State of Agent Architecture |
| The State of Multi-Agent Systems | The State of Self-Improving Systems |
| Knowledge Graph | Compilation Pipeline |
|---|---|
| ![]() | ![]() |
Inspired by Andrej Karpathy's tweet about using LLMs to compile and maintain markdown wikis from raw sources. This repo applies that pattern to the topic of LLM knowledge systems itself, then adds a self-improvement loop. The repo IS the demo.
- Self-improving — the compiler extracts atomic claims, verifies each against its cited source, and auto-fixes source attribution errors. The Karpathy loop (eval → analyze failures → update prompts → recompile) improved accuracy from 63.9% → 78.6% → 80.0% across three iterations.
- Incremental — `bun run compile --incremental` detects source changes via content hashing and recompiles only affected buckets and entities. `--status` shows pending changes without compiling.
- Deep research — the pipeline clones repos, reads 15-25 source files, fetches docs, and synthesizes architecture-level analysis.
- Dual compilation — both a deterministic script pipeline and an agent-native skill graph produce the same output.
- Neutral — all projects (including the author's own) receive the same depth and the same criticism.
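The change detection behind incremental mode can be sketched as content hashing plus a stored manifest: hash each raw source, compare against the hash recorded at the last compile, and mark mismatches dirty. This is an illustrative sketch, not the repo's actual implementation; function and manifest names are assumptions.

```typescript
import { createHash } from "node:crypto";

// Hash a source's raw content; any edit changes the digest.
function contentHash(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Compare current sources against a manifest of hashes from the last
// compile and return the "dirty" source ids that need recompilation.
function dirtySources(
  sources: Record<string, string>, // id -> raw content
  manifest: Record<string, string>, // id -> previous hash
): string[] {
  return Object.entries(sources)
    .filter(([id, text]) => manifest[id] !== contentHash(text))
    .map(([id]) => id);
}

const manifest = { "a.md": contentHash("old text"), "b.md": contentHash("same") };
const sources = { "a.md": "new text", "b.md": "same" };
console.log(dirtySources(sources, manifest)); // only "a.md" is dirty
```

A source absent from the manifest hashes to `undefined !== hash`, so new files are automatically treated as dirty.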
How it was built: METHODOLOGY.md | System Design
This is a general-purpose knowledge compiler. To build your own wiki on any topic:
- Fork this repo and clear
raw/andwiki/ - Edit one file —
config/domain.tsdefines your topic, audience, taxonomy buckets, and scoring calibration - Add sources —
bun run ingest <url>scores automatically, or add.mdfiles manually - Compile —
bun run compilegenerates the full wiki
Both compilation paths read from config/domain.ts, so they adapt automatically to your topic.
Example topics: ML papers survey, security research tracker, startup playbook, programming language ecosystem map, open-source alternatives directory.
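For orientation, a hypothetical shape of `config/domain.ts` is sketched below. The field names (`topic`, `audience`, `buckets`, `scoring`) are illustrative assumptions about what such a config might contain, not the repo's actual schema; consult the file itself before editing.

```typescript
// Hypothetical sketch of config/domain.ts — field names are assumptions.
interface DomainConfig {
  topic: string;
  audience: string;
  buckets: string[]; // taxonomy buckets used to classify sources
  scoring: { dimensions: string[]; minRelevance: number };
}

const domain: DomainConfig = {
  topic: "LLM agent infrastructure",
  audience: "engineers building agent systems",
  buckets: [
    "knowledge substrate",
    "agent memory",
    "context engineering",
    "agent architecture",
    "multi-agent systems",
    "self-improving systems",
  ],
  // Dimension names below are placeholders for the 4-dimension score.
  scoring: { dimensions: ["relevance", "depth", "novelty", "authority"], minRelevance: 0.5 },
};

export default domain;
```

Because both compilation paths read this one file, swapping in a new topic and bucket list is the whole customization step.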
The easiest contribution is a new source — PR a .md file into raw/ or open an issue with a URL. See CONTRIBUTING.md for details.
```bash
bun install
cp .env.example .env   # add your ANTHROPIC_API_KEY
```

Environment variables:

- `ANTHROPIC_API_KEY` — for compilation and scoring
- `APIFY_API_TOKEN` — for Twitter scraping (ingestion only)
- `GITHUB_TOKEN` — for GitHub API (ingestion only)
- `XQUIK_API_KEY` — for X article extraction (optional, ingestion only)
```bash
bun run ingest <url1> [url2] ...     # ingest sources (auto-detects platform)
bun run research <url1> [url2] ...   # deep-research specific repos or papers
bun run research --all               # deep-research all unresearched sources
```

The ingestion script detects the platform (GitHub, arXiv, X/Twitter, general articles), handles awesome-list detection, and extracts X articles via Xquik. Each source automatically gets taxonomy tags (via Haiku), a 4-dimension relevance score (via Sonnet), and an extracted key insight. To re-score all sources (e.g., after changing `config/domain.ts`), run `bun run rescore`.
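The platform dispatch can be sketched as simple hostname matching. This is a naive illustration of the kind of routing the ingest script performs; the heuristics and type names are assumptions, and the real script handles more cases (awesome lists, redirects).

```typescript
type Platform = "github" | "arxiv" | "x" | "article";

// Naive URL-based platform detection (illustrative, not the repo's code).
function detectPlatform(url: string): Platform {
  const host = new URL(url).hostname.replace(/^www\./, "");
  if (host === "github.com") return "github";
  if (host === "arxiv.org") return "arxiv";
  if (host === "x.com" || host === "twitter.com") return "x";
  return "article"; // fall through to the general-article handler
}

console.log(detectPlatform("https://arxiv.org/abs/2401.00001")); // "arxiv"
```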
Deep research goes further — cloning repos, reading 15-25 key source files, fetching documentation, then synthesizing structured analysis (architecture, design tradeoffs, failure modes, benchmarks) into raw/deep/. See the deep-research skill for the full methodology.
Ask any AI coding agent: "Compile the wiki from raw sources."
The compile-wiki skill orchestrates a 6-phase pipeline using subagents — each phase has its own skill with focused context. Synthesis articles and reference cards compile in parallel via subagents. Works with Claude Code, Codex, Cursor, or any agent that can read .claude/skills/.
For incremental updates after ingesting new sources, use the incremental-compile skill — it detects what changed and only regenerates affected articles.
```bash
bun run compile    # raw/ → build/ → wiki/
bun run lint       # verify structural integrity
bun run diagrams   # generate D2 + D3 visualizations
```

Both paths produce the same output structure. Run both for a comparison diff between agent-native and deterministic compilation.
- Sources: 142 curated (31 tweets, 71 repos, 16 papers, 24 articles) + 57 deep research files
- Taxonomy: 6 buckets (knowledge substrate, agent memory, context engineering, agent architecture, multi-agent systems, self-improving systems)
- Wiki: 155 articles (6 synthesis, 87 project cards, 61 concept explainers, field map, indexes)
- Deep research: 157K words of source-code-level analysis
- Self-eval: 268 atomic claims extracted, sampled and verified against sources each compilation
- Compilation: Script pipeline (`bun run compile`) or agent skill graph (`.claude/skills/compile-wiki/`)
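The self-eval stats above involve sampling extracted claims and scoring verdicts against cited sources. A minimal sketch of that bookkeeping, with assumed names and an assumed sampling scheme (the repo's actual sampling strategy is not specified here):

```typescript
// Illustrative self-eval bookkeeping — types and names are assumptions.
interface AtomicClaim { text: string; sourceId: string }

// Sample a fraction of claims for verification; `rand` is injectable
// so the sampling is testable.
function sampleClaims<T>(claims: T[], rate: number, rand: () => number): T[] {
  return claims.filter(() => rand() < rate);
}

// Accuracy = verified claims / sampled claims.
function accuracy(verdicts: boolean[]): number {
  return verdicts.length === 0
    ? 0
    : verdicts.filter(Boolean).length / verdicts.length;
}

console.log(accuracy([true, true, true, false, true])); // → 0.8
```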
- Incremental recompilation — `bun run compile --incremental` skips unchanged sources, regenerates only dirty buckets/entities
- Source acquisition — fill coverage gaps in thin buckets, add historical retrospectives and production case studies
- Cross-article synthesis — sequential compilation with evidence registry to eliminate cross-article repetition
- Claims-first migration — invert pipeline to raw → claims → articles for better attribution accuracy and reliable incremental recompilation
- Temporal claim decay — auto-expire time-sensitive claims (star counts, benchmarks) and flag articles for refresh
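One way the temporal-decay item might work is a per-claim-type TTL: volatile metrics expire quickly, structural facts never do, and expired claims flag their articles for refresh. Everything below (claim kinds, TTL values, field names) is a hypothetical sketch of the roadmap idea, not a committed design.

```typescript
// Hypothetical model for temporal claim decay — all names are assumptions.
type ClaimKind = "star-count" | "benchmark" | "architecture";

interface Claim { id: string; kind: ClaimKind; extractedAt: number }

const TTL_DAYS: Record<ClaimKind, number> = {
  "star-count": 30,         // volatile metrics expire fast
  "benchmark": 180,         // benchmarks go stale more slowly
  "architecture": Infinity, // structural facts don't decay
};

// Return ids of claims older than their kind's TTL at time `now` (ms).
function staleClaims(claims: Claim[], now: number): string[] {
  const day = 86_400_000;
  return claims
    .filter((c) => now - c.extractedAt > TTL_DAYS[c.kind] * day)
    .map((c) => c.id);
}
```

An article containing any stale claim would then be queued for recompilation on the next incremental run.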
See DESIGN.md for the full architectural vision and evaluation findings.
Code: MIT. Wiki content: CC-BY-SA 4.0. See LICENSE.



