feat(rag): improve RAG system — hybrid search, re-ranking, semantic chunking

## Summary

The current RAG system runs FTS5 (BM25) and vector cosine-similarity search in
parallel, but the results are not re-ranked and chunking is fixed-size. As a
result, irrelevant context gets injected, and relevant context is missed on
long documents.

## Improvements

### 1. Semantic chunking
Replace fixed-size chunking with sentence/paragraph-aware splitting:
- Split on sentence boundaries rather than arbitrary token counts
- Respect markdown headers as natural chunk boundaries
- Configurable overlap between chunks
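
A minimal sketch of the splitting rules above, assuming a naive punctuation-based sentence splitter and measuring chunk size and overlap in sentences rather than tokens. `Chunk`, `splitSentences`, and `ChunkMarkdown` are illustrative names, not existing code in `internal/rag/chunker.go`:

```go
package main

import "strings"

// Chunk is a hypothetical type: one retrievable unit plus the header it sits under.
type Chunk struct {
	Header string
	Text   string
}

// splitSentences is a naive splitter: it cuts after '.', '!' or '?' when the
// next byte is whitespace or the end of input. Abbreviations such as "e.g."
// will be split incorrectly; a real chunker would need something better.
func splitSentences(s string) []string {
	var out []string
	start := 0
	for i := 0; i < len(s); i++ {
		c := s[i]
		atEnd := i+1 == len(s) || s[i+1] == ' ' || s[i+1] == '\n'
		if (c == '.' || c == '!' || c == '?') && atEnd {
			if t := strings.TrimSpace(s[start : i+1]); t != "" {
				out = append(out, t)
			}
			start = i + 1
		}
	}
	if t := strings.TrimSpace(s[start:]); t != "" {
		out = append(out, t)
	}
	return out
}

// ChunkMarkdown starts a new chunk at every line beginning with '#', groups
// at most maxSentences sentences per chunk, and repeats the last `overlap`
// sentences of a chunk at the start of the next one within the same section.
// Limitation: '#' lines inside fenced code blocks are also treated as headers.
func ChunkMarkdown(doc string, maxSentences, overlap int) []Chunk {
	if maxSentences < 1 {
		maxSentences = 1
	}
	if overlap < 0 {
		overlap = 0
	}
	if overlap >= maxSentences {
		overlap = maxSentences - 1
	}
	var chunks []Chunk
	var body []string
	header := ""
	flush := func() {
		sents := splitSentences(strings.Join(body, " "))
		step := maxSentences - overlap
		for i := 0; i < len(sents); i += step {
			end := i + maxSentences
			if end > len(sents) {
				end = len(sents)
			}
			chunks = append(chunks, Chunk{Header: header, Text: strings.Join(sents[i:end], " ")})
			if end == len(sents) {
				break
			}
		}
		body = body[:0]
	}
	for _, line := range strings.Split(doc, "\n") {
		if strings.HasPrefix(line, "#") {
			flush()
			header = strings.TrimSpace(strings.TrimLeft(line, "#"))
			continue
		}
		body = append(body, line)
	}
	flush()
	return chunks
}
```

With `maxSentences = 2` and `overlap = 1`, a section containing three sentences yields two chunks that share the middle sentence. Whether overlap should be counted in sentences or tokens is an open design choice.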

### 2. Hybrid search scoring (RRF)
Fuse the BM25 and vector result lists with Reciprocal Rank Fusion (RRF) instead
of taking their union. RRF scores each document by its rank in each list:
```go
// k is a smoothing constant (60 is a common default); ranks are 1-based.
rrfScore := 1.0/float64(k+bm25Rank) + 1.0/float64(k+vectorRank)
```
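This produces a single ranked list. Because RRF uses only ranks, the BM25 and
cosine scores need no normalization against each other, and a document missing
from one list simply contributes nothing for that list. The fused order should
improve precision over the plain union.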
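
The formula can be sketched as a small fusion function. This assumes each search returns document IDs ordered best first with no duplicates within a list; `Scored` and `FuseRRF` are hypothetical names, not the current API of `internal/rag/search.go`:

```go
package main

import "sort"

// Scored is a hypothetical result type: a document ID and its fused score.
type Scored struct {
	ID    string
	Score float64
}

// FuseRRF merges ranked ID lists (best first) with Reciprocal Rank Fusion.
// A document missing from a list contributes nothing for that list.
func FuseRRF(k int, lists ...[]string) []Scored {
	scores := make(map[string]float64)
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / float64(k+i+1) // i is 0-based, ranks are 1-based
		}
	}
	out := make([]Scored, 0, len(scores))
	for id, s := range scores {
		out = append(out, Scored{ID: id, Score: s})
	}
	sort.Slice(out, func(a, b int) bool {
		if out[a].Score != out[b].Score {
			return out[a].Score > out[b].Score
		}
		return out[a].ID < out[b].ID // deterministic tie-break
	})
	return out
}
```

For example, `FuseRRF(60, []string{"a", "b", "c"}, []string{"b", "d", "a"})` ranks `b` first, because it sits near the top of both lists, followed by `a`, `d`, and `c`.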

### 3. Re-ranking (optional cross-encoder)
When a cross-encoder model is available locally (e.g. via Ollama),
use it to re-rank the top-N candidates before injecting them into context.
When no such model is available, the fused order is used as-is.
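
One way to keep the step optional is to hide the model behind a small interface. The sketch below is an assumption about how that could look: `Reranker`, `RerankFunc`, and `Rerank` are hypothetical names, and the call into a locally served model is deliberately left out:

```go
package main

import "sort"

// Reranker is a hypothetical interface; a cross-encoder backend would
// implement Score by asking the model how relevant passage is to query.
type Reranker interface {
	Score(query, passage string) (float64, error)
}

// RerankFunc adapts a plain function to the Reranker interface.
type RerankFunc func(query, passage string) (float64, error)

// Score calls f(query, passage).
func (f RerankFunc) Score(query, passage string) (float64, error) {
	return f(query, passage)
}

// Rerank re-orders the first topN candidates by descending reranker score
// and leaves the tail untouched. If rr is nil or any scoring call fails,
// the original order is returned unchanged.
func Rerank(rr Reranker, query string, candidates []string, topN int) []string {
	if rr == nil || topN <= 1 {
		return candidates
	}
	if topN > len(candidates) {
		topN = len(candidates)
	}
	type scored struct {
		text  string
		score float64
	}
	head := make([]scored, topN)
	for i, c := range candidates[:topN] {
		s, err := rr.Score(query, c)
		if err != nil {
			return candidates
		}
		head[i] = scored{text: c, score: s}
	}
	sort.SliceStable(head, func(a, b int) bool { return head[a].score > head[b].score })
	out := make([]string, 0, len(candidates))
	for _, h := range head {
		out = append(out, h.text)
	}
	return append(out, candidates[topN:]...)
}
```

Falling back to the input order on any error keeps retrieval working when the model is missing or slow to respond, at the cost of silently skipping the re-rank.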

### 4. Memory TTL and auto-pruning
Add a configurable TTL for memory entries so that old, rarely accessed records
are pruned automatically, keeping the vector store lean.
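
The pruning rule could be sketched in memory as follows. This assumes "old" means not accessed within the TTL and "rarely accessed" means fewer than some minimum access count; `Record`, `Prune`, and `minAccess` are hypothetical and do not reflect the real `vector_records` schema:

```go
package main

import "time"

// Record is a hypothetical in-memory stand-in for a row in vector_records.
type Record struct {
	ID           string
	LastAccessed time.Time
	AccessCount  int
}

// Prune returns the records to keep. A record is dropped only when it is
// both expired (last accessed before now-ttl) and rarely accessed
// (AccessCount < minAccess). A non-positive ttl disables pruning.
func Prune(records []Record, now time.Time, ttl time.Duration, minAccess int) []Record {
	if ttl <= 0 {
		return records
	}
	cutoff := now.Add(-ttl)
	kept := make([]Record, 0, len(records))
	for _, r := range records {
		expired := r.LastAccessed.Before(cutoff)
		if expired && r.AccessCount < minAccess {
			continue
		}
		kept = append(kept, r)
	}
	return kept
}
```

In a SQL-backed store the same predicate would more likely run as a single `DELETE` on store open; the in-memory form is only meant to pin down the rule.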

### 5. Namespace isolation
Ensure RAG queries are always scoped to the current session namespace
to prevent cross-session memory leakage.
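
A minimal sketch of enforcing the scope at the query layer, assuming every stored entry carries its session namespace. `Entry`, `ScopedSearch`, and `ErrNoNamespace` are illustrative names, and substring matching stands in for the real search:

```go
package main

import (
	"errors"
	"strings"
)

// Entry is a hypothetical stored memory tagged with its session namespace.
type Entry struct {
	Namespace string
	Text      string
}

// ErrNoNamespace is returned when a query does not name a namespace.
var ErrNoNamespace = errors.New("rag: query requires a namespace")

// ScopedSearch returns the entries in namespace ns whose text contains term.
// An empty namespace is rejected rather than treated as "all namespaces".
func ScopedSearch(entries []Entry, ns, term string) ([]Entry, error) {
	if ns == "" {
		return nil, ErrNoNamespace
	}
	var hits []Entry
	for _, e := range entries {
		if e.Namespace == ns && strings.Contains(e.Text, term) {
			hits = append(hits, e)
		}
	}
	return hits, nil
}
```

Making the namespace a required argument, and failing on an empty one, means a caller cannot leak across sessions by forgetting to set a filter.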

## Acceptance criteria

- [ ] Semantic chunker in `internal/rag/chunker.go`
- [ ] RRF fusion in `internal/rag/search.go` replacing the current union approach
- [ ] TTL field on `vector_records` table, pruning job runs on store open
- [ ] Namespace isolation enforced at the query layer
- [ ] Benchmarks: RRF recall@5 >= current implementation on the existing test fixtures
- [ ] `docs/` updated (RAG section in architecture.md or new `docs/rag.md`)
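
For the benchmark criterion, recall@5 could be computed along these lines. This is a sketch that assumes each fixture provides a set of relevant document IDs and that result lists contain no duplicates; `RecallAtK` is a hypothetical helper:

```go
package main

// RecallAtK returns the fraction of relevant IDs that appear in the first k
// results. It returns 0 when there are no relevant IDs or k is not positive.
func RecallAtK(results []string, relevant map[string]bool, k int) float64 {
	if len(relevant) == 0 || k <= 0 {
		return 0
	}
	if k > len(results) {
		k = len(results)
	}
	found := 0
	for _, id := range results[:k] {
		if relevant[id] {
			found++
		}
	}
	return float64(found) / float64(len(relevant))
}
```

The criterion then amounts to comparing the mean of `RecallAtK(results, relevant, 5)` over the fixtures for the RRF path against the same mean for the current union path.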
