Merged
22 changes: 15 additions & 7 deletions docs/blog/2026-04-12-welcome/index.mdx
@@ -16,7 +16,7 @@ Traditional RAG systems rely on vector embeddings and similarity search. This ap
Vectorless takes a different path:

- **Hierarchical Semantic Trees** — Documents are parsed into a tree of sections, preserving structure and relationships.
- **LLM Navigation** — Queries are resolved by intelligently traversing the tree, not by comparing vectors.
- **LLM Agent Navigation** — Queries are resolved by agents that navigate the tree using commands (ls, cd, cat, find, grep), making every decision through LLM reasoning.
- **Zero Infrastructure** — No vector DB, no embedding models, no similarity search. Just an LLM API key.

## Quick Start
@@ -25,7 +25,7 @@ Vectorless takes a different path:

```python
import asyncio
from vectorless import Engine, IndexContext
from vectorless import Engine, IndexContext, QueryContext

async def main():
engine = Engine(
@@ -38,7 +38,9 @@ async def main():
doc_id = result.doc_id

# Query
answer = await engine.query(doc_id, "What is the total revenue?")
answer = await engine.query(
QueryContext("What is the total revenue?").with_doc_ids([doc_id])
)
print(answer.single().content)

asyncio.run(main())
@@ -69,11 +71,17 @@ async fn main() -> vectorless::Result<()> {
}
```

## What's Next?
## How It Works

1. **Index** — Documents are parsed into hierarchical semantic trees with pre-computed navigation indexes and keyword mappings.
2. **Query** — The Orchestrator coordinates multi-document retrieval by dispatching Worker agents. Each Worker navigates the tree using commands, collects evidence, and self-evaluates sufficiency.
3. **Result** — Evidence is deduplicated, ranked by BM25 relevance, and returned as original document text.

## What's Next

- Cross-document relationship graph
- Incremental indexing with content fingerprinting
- Multi-format support (Markdown, PDF, DOCX)
- Cross-document graph-aware retrieval with score boosting
- DOCX format support
- Streaming query results with real-time progress events

The project is open source under Apache-2.0. Contributions welcome!

59 changes: 47 additions & 12 deletions docs/docs/architecture.mdx
@@ -16,7 +16,7 @@ Vectorless transforms documents into hierarchical semantic trees and uses LLM-po
┌──────────────┐ ┌──────▼───────┐
│ Result │◀────│ Retrieval │
(Answer) │ │ Pipeline │
(Evidence) │ │ Pipeline │
└──────────────┘ └──────────────┘
```

@@ -33,6 +33,7 @@ The indexing pipeline processes documents through ordered stages:
| **Enhance** | 30 | Generate LLM summaries (Full, Selective, or Lazy strategy) |
| **Enrich** | 40 | Calculate metadata, page ranges, resolve cross-references |
| **Reasoning Index** | 45 | Build keyword-to-node mappings, synonym expansion, summary shortcuts |
| **Navigation Index** | 50 | Build NavEntry + ChildRoute data for agent navigation |
| **Optimize** | 60 | Final tree optimization |

Each stage is independently configurable. The pipeline supports incremental re-indexing via content fingerprinting.
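
The fingerprinting skip can be sketched in a few lines. This is an illustrative sketch, not Vectorless's actual implementation; the names `fingerprint` and `needs_reindex` are assumptions.

```python
import hashlib
from pathlib import Path

def fingerprint(path: Path) -> str:
    """Hash file bytes; identical content always yields the same fingerprint."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def needs_reindex(path: Path, seen: dict[str, str]) -> bool:
    """True when the file is new or its content changed since the last index run."""
    fp = fingerprint(path)
    if seen.get(str(path)) == fp:
        return False  # unchanged: skip all pipeline stages for this file
    seen[str(path)] = fp
    return True
```

An unchanged file short-circuits before any LLM call, which is where most indexing cost lives.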
@@ -70,11 +71,11 @@ Engine.query()
→ Dispatcher
→ Query Understanding (LLM) → QueryPlan (intent, concepts, strategy)
→ Orchestrator (always — single or multi-doc)
→ Analyze (LLM selects documents + tasks)
→ Analyze (LLM reviews DocCards, selects documents + tasks)
→ Supervisor Loop:
Dispatch Workers → Evaluate (LLM sufficiency check)
→ if insufficient → Replan (LLM) → loop
→ Rerank (dedup → BM25 score → synthesis/fusion)
→ Rerank (dedup → BM25 score → evidence formatting)
```
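
The supervisor loop reduces to plain control flow. Everything below is illustrative: `supervise`, `dispatch`, `evaluate`, and `replan` are hypothetical stand-ins for the LLM-backed steps, not the library's API.

```python
from dataclasses import dataclass, field

@dataclass
class Verdict:
    """Result of the LLM sufficiency check."""
    sufficient: bool
    missing: list = field(default_factory=list)

def supervise(query, tasks, dispatch, evaluate, replan, max_rounds=3):
    """Supervisor loop sketch: dispatch Workers, check sufficiency, replan on gaps."""
    evidence = []
    for _ in range(max_rounds):
        for task in tasks:
            evidence.extend(dispatch(task))      # fan-out Workers (parallel in practice)
        verdict = evaluate(query, evidence)      # LLM sufficiency check
        if verdict.sufficient:
            break
        tasks = replan(query, verdict.missing)   # new tasks targeting the gaps
    return evidence
```

In the real system the dispatch step fans out in parallel and each callback is an LLM call; the control flow is the part this sketch preserves.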

### Query Understanding
@@ -84,41 +85,75 @@ Every query first passes through LLM-based understanding:
| Field | Description |
|-------|-------------|
| **Intent** | Factual, Analytical, Navigational, or Summary |
| **Complexity** | Simple, Moderate, or Complex |
| **Key Concepts** | LLM-extracted concepts (distinct from keywords) |
| **Strategy Hint** | focused, exploratory, comparative, or summary |
| **Key Concepts** | LLM-extracted concepts (distinct from keywords) |

### Orchestrator (Supervisor)

The Orchestrator is the central coordinator. It always runs — even for single-document queries. Its supervisor loop:

1. **Analyze** — LLM reviews DocCards and selects relevant documents with specific tasks
1. **Analyze** — LLM reviews DocCards (lightweight metadata) and selects relevant documents with specific tasks
2. **Dispatch** — Fan-out Workers in parallel (one per document)
3. **Evaluate** — LLM checks if collected evidence is sufficient to answer the query
4. **Replan** (if insufficient) — LLM identifies missing information and dispatches additional Workers

When the user specifies document IDs directly, the Orchestrator skips the analysis phase and dispatches Workers immediately.

### Worker (Evidence Collector)

Each Worker navigates a single document's tree to collect evidence:
Each Worker navigates a single document's tree to collect evidence through a command-based loop:

1. **Bird's-eye** — `ls` the root for an overview
2. **Plan** — LLM generates a navigation plan
3. **Navigate** — Loop: LLM command → execute → repeat (with budget)
2. **Plan** — LLM generates a navigation plan based on keyword index hits
3. **Navigate** — Loop: LLM selects command → execute → observe result → repeat
4. **Return** — Collected evidence only — no answer synthesis

Workers use tree commands (`ls`, `cd`, `cat`, `grep`, `find`, `findtree`) and a `check` command for self-evaluation.
#### Available Commands

| Command | Description |
|---------|-------------|
| `ls` | List children at current position (with summaries and leaf counts) |
| `cd <name>` | Enter a child node |
| `cd ..` | Go back to parent |
| `cat <name>` | Read node content (automatically collected as evidence) |
| `head <name>` | Preview first N lines (does NOT collect evidence) |
| `find <keyword>` | Search the document's ReasoningIndex for a keyword |
| `findtree <pattern>` | Search for nodes by title pattern (case-insensitive) |
| `grep <pattern>` | Regex search across content in current subtree |
| `wc <name>` | Show content size (lines, words, chars) |
| `pwd` | Show current navigation path |
| `check` | Evaluate if collected evidence is sufficient |
| `done` | End navigation |
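
A Worker's loop amounts to dispatching these commands against the tree. Here is a minimal sketch over a toy dict-based tree; the node shape and the `execute` helper are assumptions, and only a few of the commands above are covered.

```python
def make_node(name, content="", children=()):
    """Toy tree node: a dict with a name, content, and child list."""
    return {"name": name, "content": content, "children": list(children)}

def execute(cmd, arg, path, evidence):
    """Run one Worker command against the tree and return the observation."""
    node = path[-1]
    if cmd == "ls":
        return [c["name"] for c in node["children"]]
    if cmd == "cd":
        if arg == "..":
            if len(path) > 1:
                path.pop()
        else:
            path.append(next(c for c in node["children"] if c["name"] == arg))
        return path[-1]["name"]
    if cmd == "cat":
        target = next(c for c in node["children"] if c["name"] == arg)
        evidence.append(target["content"])   # cat auto-collects evidence
        return target["content"]
    if cmd == "pwd":
        return "/".join(n["name"] for n in path)
    raise ValueError(f"unknown command: {cmd}")
```

The key design point mirrored here: `cat` has the side effect of collecting evidence, while read-only commands like `ls` and `pwd` only produce observations for the LLM's next decision.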

#### Navigation Strategy

Workers prioritize keyword-based navigation over manual exploration:

1. When keyword index hits are available, Workers use `find` with the exact keyword to jump directly to relevant sections
2. Workers use `ls` when no keyword hints exist or when discovering unknown structure
3. Workers use `findtree` when the section title pattern is known but not the exact name
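
That priority order is effectively a small decision rule. A sketch, with the helper name and inputs being illustrative rather than part of the library:

```python
def choose_command(keyword_hits, known_title_pattern):
    """Pick the next exploration command per the priority order above."""
    if keyword_hits:
        return ("find", keyword_hits[0])       # jump straight to indexed sections
    if known_title_pattern:
        return ("findtree", known_title_pattern)  # title shape known, name unknown
    return ("ls", None)                        # fall back to structural discovery
```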

#### Dynamic Re-planning

After a `check` command finds insufficient evidence, the Worker triggers a re-plan — the LLM generates a new navigation plan based on what's missing. This allows the Worker to adapt its strategy mid-navigation.

### Rerank Pipeline

After all Workers complete, the Orchestrator runs the final pipeline:

1. **Dedup** — Remove duplicate and low-quality evidence
2. **BM25 Scoring** — Rank evidence by keyword relevance
3. **Answer Generation** — LLM synthesizes or fuses evidence into a final answer
3. **Evidence Formatting** — Return original document text with source attribution

The system returns raw evidence text — no LLM synthesis or paraphrasing. This ensures the user sees the exact document content that matches their query.
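
The dedup-then-score step can be sketched with a standard BM25 formula. This is a generic BM25 implementation, not Vectorless's; the parameter defaults `k1=1.5`, `b=0.75` are conventional choices, not values confirmed by the docs.

```python
import math
from collections import Counter

def bm25_rank(query, passages, k1=1.5, b=0.75):
    """Dedup passages, then rank them against the query with BM25."""
    # Dedup: keep the first occurrence of each whitespace/case-normalized text.
    seen, unique = set(), []
    for p in passages:
        key = " ".join(p.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(p)
    docs = [p.lower().split() for p in unique]
    avgdl = sum(len(d) for d in docs) / len(docs)
    N = len(docs)

    def idf(term):
        df = sum(term in d for d in docs)
        return math.log(1 + (N - df + 0.5) / (df + 0.5))

    def score(doc):
        tf = Counter(doc)
        return sum(
            idf(t) * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(doc) / avgdl))
            for t in query.lower().split()
        )

    return sorted(unique, key=lambda p: score(p.lower().split()), reverse=True)
```

Note that the returned items are the original passage strings, matching the "original document text, no paraphrasing" guarantee above.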

## DocCard Catalog

When multiple documents are indexed, Vectorless maintains a lightweight `catalog.bin` containing DocCard metadata for each document. This allows the Orchestrator to analyze and select relevant documents without loading the full document trees — a significant optimization for workspaces with many documents.
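
A plausible shape for a DocCard, and the kind of cheap pre-filtering it enables. The field names here are assumptions for illustration; the real schema lives in `catalog.bin`.

```python
from dataclasses import dataclass, field

@dataclass
class DocCard:
    """Lightweight per-document metadata (illustrative fields, not the real schema)."""
    doc_id: str
    title: str
    summary: str                                   # root-level summary for selection
    top_keywords: list = field(default_factory=list)
    leaf_count: int = 0

def select_candidates(query_terms, catalog):
    """Cheap pre-filter: keep cards sharing at least one keyword with the query."""
    terms = {t.lower() for t in query_terms}
    return [c for c in catalog if terms & {k.lower() for k in c.top_keywords}]
```

Because a card is a few fields rather than a full tree, the Orchestrator can scan hundreds of them in one prompt.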

## Cross-Document Graph

When multiple documents are indexed, Vectorless automatically builds a relationship graph based on shared keywords and Jaccard similarity. This graph enables cross-document retrieval with score boosting.
When multiple documents are indexed, Vectorless automatically builds a relationship graph based on shared keywords and Jaccard similarity. The graph is constructed as a background task after each indexing operation.

## Zero Infrastructure

32 changes: 9 additions & 23 deletions docs/docs/features/cross-document-graph.mdx
@@ -23,36 +23,18 @@ Document A ←── 0.72 ──→ Document B

### Graph Building

After each indexing operation, the graph is automatically rebuilt:
After each indexing operation, the graph is automatically rebuilt as a background task:

1. Extract keyword profiles from each document's reasoning index
2. Compute pairwise Jaccard similarity
3. Create edges for document pairs exceeding the similarity threshold
4. Store the graph in the workspace

### Graph-Aware Retrieval
The graph builder uses keyword weights from the ReasoningIndex — keywords that appear in titles get 2.0× weight, summaries 1.5×, and content 1.0×. This ensures that structurally important keywords have more influence on the similarity calculation.
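
With those weights, the pairwise similarity is a weighted Jaccard over keyword-to-weight maps. A sketch under the stated 2.0/1.5/1.0 weighting; the function names are illustrative.

```python
def keyword_profile(title_kws, summary_kws, content_kws):
    """Weight keywords by where they appear: title 2.0x, summary 1.5x, content 1.0x."""
    profile = {}
    for kws, w in ((content_kws, 1.0), (summary_kws, 1.5), (title_kws, 2.0)):
        for k in kws:
            profile[k] = max(profile.get(k, 0.0), w)  # keep the strongest placement
    return profile

def weighted_jaccard(a, b):
    """Weighted Jaccard over keyword -> weight maps: sum(min) / sum(max)."""
    keys = set(a) | set(b)
    num = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    den = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return num / den if den else 0.0
```

A keyword shared in both titles contributes twice as much to the edge weight as one shared only in body content, which is exactly the bias toward structurally important keywords described above.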

When using the cross-document strategy, the graph boosts scores for connected documents:
### Accessing the Graph

1. Search each document independently
2. Identify high-confidence results (score > 0.5)
3. For each high-confidence result, boost neighbor documents' scores
4. Re-rank the merged result set

```python
from vectorless import Engine, QueryContext

engine = Engine(api_key="sk-...", model="gpt-4o")

# Query across all documents with graph boosting
result = await engine.query(
QueryContext("Compare the approaches")
)
```

## Accessing the Graph

### Python
#### Python

```python
graph = await engine.get_graph()
@@ -67,7 +49,7 @@ if graph:
print(f" → {edge.target_doc_id} (weight: {edge.weight:.2f})")
```

### Rust
#### Rust

```rust
if let Some(graph) = engine.get_graph().await? {
@@ -100,3 +82,7 @@ min_keyword_jaccard: 0.1 — Minimum Jaccard similarity threshold
max_keywords_per_doc: 50 — Max keywords extracted per document
max_edges_per_node: 10 — Max edges per document node
```

## Current Status

The graph is built and persisted during indexing. Graph-aware retrieval features (such as score boosting for connected documents) are planned for a future release. Currently, the graph serves as a relationship discovery and inspection tool accessible via the API.
27 changes: 20 additions & 7 deletions docs/docs/features/summary-strategies.mdx
@@ -4,7 +4,7 @@

# Summary Strategies

Summaries are critical for retrieval quality. The Pilot uses summaries to evaluate candidate nodes during tree navigation. Without summaries, the Pilot can only use node titles for decision-making, which significantly reduces accuracy.
Summaries are critical for retrieval quality. The Worker agent uses summaries in the NavigationIndex to decide which branches to explore and where to navigate. Without summaries, the Worker can only use node titles for decision-making, which significantly reduces accuracy.

## Available Strategies

@@ -32,7 +32,7 @@ let strategy = SummaryStrategy::selective(100, true);
- `min_tokens` — Minimum content tokens to generate a summary (default: 100)
- `branch_only` — Only generate for non-leaf nodes (default: true)

**Trade-off**: Lower indexing cost, but leaf nodes lack summaries. The Pilot falls back to title-only evaluation at leaf level.
**Trade-off**: Lower indexing cost, but leaf nodes lack summaries. The Worker falls back to title-only evaluation at leaf level.

### Lazy

@@ -56,10 +56,23 @@ let strategy = SummaryStrategy::lazy(true);

## How Summaries Are Used

During retrieval, the Pilot builds context for each candidate node:
During retrieval, the Worker agent reads summary data from the NavigationIndex at each decision point:

1. **Title** — Always available, highest priority signal
2. **Summary** — Used for semantic evaluation at fork points
3. **Content** — Used for BM25 scoring and final result
1. **`ls` output** — Child nodes show their descriptions (derived from summaries) and leaf counts
2. **Navigation decisions** — The LLM evaluates summaries to decide which branch to enter
3. **Keyword index** — Topic tags from summaries are indexed for `find` command lookups
4. **DocCards** — Root-level summaries power the Orchestrator's document selection

When a node has no summary, the Pilot's decision quality degrades. This is why **Full** is the default — it ensures the Pilot always has summaries to work with.
When a node has no summary, the Worker's navigation quality degrades. This is why **Full** is the default — it ensures the Worker always has summary context to work with.

## Navigation-Oriented Summaries

Branch nodes receive structured summaries with three components:

| Component | Purpose | Used By |
|-----------|---------|---------|
| **OVERVIEW** | 2-3 sentence routing summary | Worker's `ls` output |
| **QUESTIONS** | 3-5 typical questions this branch can answer | Keyword index |
| **TAGS** | 2-4 topic keywords | ReasoningIndex `find` command |

This structured format enables the Worker to quickly assess whether a branch is worth exploring without reading its full content.
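
If the three components are stored as labeled sections of the summary text, a consumer could split them apart like this. The exact on-disk format is an assumption; only the OVERVIEW/QUESTIONS/TAGS component names come from the table above.

```python
def parse_branch_summary(text):
    """Split a structured branch summary into its OVERVIEW / QUESTIONS / TAGS parts."""
    sections = {"OVERVIEW": [], "QUESTIONS": [], "TAGS": []}
    current = None
    for line in text.splitlines():
        line = line.strip()
        header = line.rstrip(":").upper()
        if header in sections:
            current = header          # switch to the section this label opens
        elif line and current:
            sections[current].append(line.lstrip("- "))
    return {
        "overview": " ".join(sections["OVERVIEW"]),
        "questions": sections["QUESTIONS"],
        "tags": sections["TAGS"],
    }
```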
7 changes: 2 additions & 5 deletions docs/docs/getting-started.mdx
@@ -84,10 +84,7 @@ async fn main() -> vectorless::Result<()> {
let result = engine.query(
QueryContext::new("What is the total revenue?").with_doc_ids(vec![doc_id.to_string()])
).await?;

if let Some(item) = result.single() {
println!("{}", item.content);
}
println!("{}", result.content);

Ok(())
}
@@ -97,4 +94,4 @@

- [Architecture](/docs/architecture) — Understand the indexing and retrieval pipeline
- [Indexing Overview](/docs/indexing/overview) — Learn about each pipeline stage
- [Retrieval Strategies](/docs/retrieval/strategies) — Choose the right strategy for your use case
- [Retrieval Strategies](/docs/retrieval/strategies) — Understand how queries are processed
15 changes: 9 additions & 6 deletions docs/docs/intro.mdx
@@ -12,7 +12,7 @@ It transforms documents into hierarchical semantic trees and uses LLMs to naviga

1. **Parse** — Documents (Markdown, PDF) are parsed into hierarchical semantic trees, preserving structure and relationships between sections.
2. **Index** — Trees are stored with metadata, keywords, and summaries. The pipeline resolves cross-references ("see Section 2.1") and expands keywords with LLM-generated synonyms for improved recall. Incremental indexing skips unchanged files via content fingerprinting.
3. **Query** — An LLM navigates the tree to find the most relevant sections. Multiple search algorithms (Beam Search, MCTS, Greedy) are available, and the Pilot component provides LLM-guided navigation at key decision points.
3. **Query** — An LLM-powered agent navigates the tree to find the most relevant sections. The Orchestrator coordinates multi-document queries, dispatching Workers that use `ls`, `cd`, `cat`, `find`, and `grep` commands to explore the tree and collect evidence.

## Quick Start

Expand All @@ -24,7 +24,7 @@ pip install vectorless

```python
import asyncio
from vectorless import Engine, IndexContext
from vectorless import Engine, IndexContext, QueryContext

async def main():
engine = Engine(
@@ -35,7 +35,9 @@ async def main():
result = await engine.index(IndexContext.from_path("./report.pdf"))
doc_id = result.doc_id

answer = await engine.query(doc_id, "What is the total revenue?")
answer = await engine.query(
QueryContext("What is the total revenue?").with_doc_ids([doc_id])
)
print(answer.single().content)

asyncio.run(main())
@@ -74,11 +76,12 @@ async fn main() -> vectorless::Result<()> {
## Features

- **Hierarchical Semantic Trees** — Preserves document structure, not flat chunks
- **LLM-Powered Retrieval** — Structural reasoning over the tree, not vector similarity
- **Cross-Reference Navigation** — Automatically resolves "see Section 2.1", "Appendix G" references and follows them during retrieval
- **LLM-Powered Agent Navigation** — Worker agents navigate the tree using commands (ls, cd, cat, find, grep), making every retrieval decision through LLM reasoning
- **Cross-Reference Resolution** — Automatically resolves "see Section 2.1", "Appendix G" references during indexing
- **Synonym Expansion** — LLM-generated synonyms for indexed keywords improve recall for differently-worded queries
- **Multi-Algorithm Search** — Beam Search, MCTS, Greedy, and ToC Navigator with LLM Pilot guidance
- **Orchestrator Supervisor Loop** — Multi-document queries are coordinated by an LLM supervisor that dispatches Workers, evaluates evidence, and replans when needed
- **Cross-Document Graph** — Automatic relationship discovery between documents via shared keywords
- **Incremental Indexing** — Content fingerprinting skips unchanged files
- **DocCard Catalog** — Lightweight document metadata index enables fast multi-document analysis without loading full documents
- **Multi-Format** — Markdown and PDF support
- **Zero Infrastructure** — No vector DB, no embedding models, just an LLM API key