feat(routing,context,adapters): explicit pipeline + embedding backend + history-aware routing (#8, #27, #56)#236

Open
dgenio wants to merge 1 commit into main from claude/triage-issues-VZT4U
@dgenio dgenio commented May 17, 2026

#56 — Routing pipeline decomposition

  • New routing/pipeline.py composer with explicit stages:
    retrieve -> rerank -> navigate -> pack
  • New Navigator + CardPacker protocols in protocols.py
  • BeamSearchNavigator lifted verbatim from router.py (byte-identical
    default behaviour; verified by the existing 50+ router regression tests
    and make scorecard-check)
  • DefaultCardPacker wraps make_choice_cards with a soft budget_tokens cap
  • Router.route() now delegates to RoutingPipeline.navigate(); the public
    API surface and every RouteResult field are preserved
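
The stage order above can be pictured with a small sketch. Everything in it is a hypothetical illustration: the `*Stage` protocols, `SketchPipeline`, and the toy stages are assumptions made for this example and are not the real `RoutingPipeline`, `Navigator`, or `CardPacker` signatures from the PR.

```python
from dataclasses import dataclass
from typing import Protocol

Scored = list[tuple[str, float]]  # (item_id, score) pairs


class RetrieveStage(Protocol):
    def retrieve(self, query: str) -> Scored: ...


class RerankStage(Protocol):
    def rerank(self, query: str, scored: Scored) -> Scored: ...


class NavigateStage(Protocol):
    def navigate(self, scored: Scored) -> list[str]: ...


class PackStage(Protocol):
    def pack(self, item_ids: list[str]) -> str: ...


@dataclass
class SketchPipeline:
    """Hypothetical composer: runs retrieve -> rerank -> navigate -> pack."""

    retriever: RetrieveStage
    reranker: RerankStage
    navigator: NavigateStage
    packer: PackStage

    def run(self, query: str) -> str:
        scored = self.retriever.retrieve(query)
        scored = self.reranker.rerank(query, scored)
        chosen = self.navigator.navigate(scored)
        return self.packer.pack(chosen)


class KeywordRetriever:
    """Toy retriever: score = number of query tokens found in the doc."""

    def __init__(self, docs: dict[str, str]) -> None:
        self.docs = docs

    def retrieve(self, query: str) -> Scored:
        terms = set(query.lower().split())
        return [
            (doc_id, float(len(terms & set(text.lower().split()))))
            for doc_id, text in self.docs.items()
        ]


class NoOpRerank:
    def rerank(self, query: str, scored: Scored) -> Scored:
        return scored  # leaves the shortlist order untouched


class TopKNavigator:
    def __init__(self, top_k: int) -> None:
        self.top_k = top_k

    def navigate(self, scored: Scored) -> list[str]:
        # Sort by (-score, id) so ties break deterministically.
        ranked = sorted(scored, key=lambda pair: (-pair[1], pair[0]))
        return [item_id for item_id, _ in ranked[: self.top_k]]


class LinePacker:
    def pack(self, item_ids: list[str]) -> str:
        return "\n".join(f"- {item_id}" for item_id in item_ids)


pipeline = SketchPipeline(
    retriever=KeywordRetriever(
        {
            "db_read": "read rows from database",
            "db_write": "write rows to database",
            "send_email": "send an email",
        }
    ),
    reranker=NoOpRerank(),
    navigator=TopKNavigator(top_k=2),
    packer=LinePacker(),
)
print(pipeline.run("read database rows"))
```

Because each stage is only a structural protocol, any one of them can be swapped without touching the others, which is the property the decomposition is after.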

#8 — Optional embedding-based retrieval backend

  • New EmbeddingBackend protocol in protocols.py
  • New [embeddings] extra (pip install 'contextweaver[embeddings]')
  • New extras/embeddings.py: SentenceTransformerBackend + the
    HybridEmbeddingRetriever (70/30 embedding+TF-IDF weighted sum so
    lexical exact-id / exact-tag hits keep their floor)
  • Router(embedding_backend=...) constructs the hybrid retriever via a
    lazy import so the core install never pulls torch
  • Mock HashEmbeddingBackend in tests provides a deterministic
    default-install test path; real sentence-transformers integration
    test is gated behind pytest.importorskip
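
A rough sketch of the 70/30 weighted sum and of a deterministic hash-based mock backend follows. All names here (`hash_embed`, `lexical_overlap`, `hybrid_score`) are hypothetical, and the lexical term is a simple token-overlap stand-in rather than real TF-IDF; only the weight default and its range check mirror what the PR shows.

```python
import hashlib
import math


def hash_embed(text: str, dim: int = 16) -> list[float]:
    """Deterministic toy embedding: each token increments one hashed bucket."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lexical_overlap(query: str, doc: str) -> float:
    """Stand-in for the TF-IDF signal: fraction of query tokens present in doc."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def hybrid_score(query: str, doc: str, embedding_weight: float = 0.7) -> float:
    """Weighted sum: embedding_weight * embedding + (1 - embedding_weight) * lexical."""
    if not 0.0 <= embedding_weight <= 1.0:
        raise ValueError(
            f"embedding_weight must be in [0.0, 1.0], got {embedding_weight}"
        )
    emb = cosine(hash_embed(query), hash_embed(doc))
    lex = lexical_overlap(query, doc)
    return embedding_weight * emb + (1.0 - embedding_weight) * lex


# An exact-id query still receives the full lexical share of the score.
print(hybrid_score("db_read", "db_read"))
```

Keeping a fixed lexical share means an exact id or tag match contributes at least `0.3 * 1.0` to the score regardless of what the embedding model thinks of the string.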

#27 — History-aware re-routing + tool-dependency metadata

  • Phase 1: new routing/history.py with RouteHistory dataclass +
    adjust_scores helper. Router.route(..., history=RouteHistory(...))
    applies a repeat penalty to already-called tools and boosts
    candidates whose description resembles last_result_summary (computed
    via the router's fitted retriever, so the boost is in the same
    scoring space as the primary query). Per-item deltas surface on the
    new RouteResult.history_adjustments field + trace.extra.
  • Phase 2: SelectableItem gains optional depends_on / provides /
    requires fields; adjust_scores applies a satisfaction boost when a
    candidate's requires are fully covered by provides of already-
    called tools, and a penalty when depends_on references an
    uncalled tool. All three default to None and round-trip omitted
    when unset.
  • ContextManager.build_route_prompt auto-constructs a RouteHistory
    from the event log (tools whose tool_result is in the log). Set
    history_from_log=False or pass history=... explicitly to opt out
    / override.
  • Catalog.validate_dependencies() returns human-readable warnings for
    depends_on entries pointing at unknown tool ids.
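
The three adjustment rules and the dependency check can be sketched as below. This is an illustration under assumptions: `SketchItem`, `SketchHistory`, `adjust`, and the penalty and boost magnitudes are hypothetical, since the PR description does not state the real `adjust_scores` signature or its constants, and the summary-similarity boost is left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SketchItem:
    id: str
    depends_on: list[str] | None = None
    provides: list[str] | None = None
    requires: list[str] | None = None


@dataclass
class SketchHistory:
    called_tool_ids: list[str] = field(default_factory=list)


# Hypothetical magnitudes, chosen only for the example.
REPEAT_PENALTY = 0.5
SATISFIED_BOOST = 0.2
MISSING_DEP_PENALTY = 0.3


def adjust(
    scores: dict[str, float],
    items: dict[str, SketchItem],
    history: SketchHistory,
) -> tuple[dict[str, float], dict[str, float]]:
    """Return (adjusted scores, per-item deltas)."""
    called = set(history.called_tool_ids)
    # Everything the already-called tools provide.
    provided = {
        cap for tid in called if tid in items for cap in (items[tid].provides or [])
    }
    adjusted: dict[str, float] = {}
    deltas: dict[str, float] = {}
    for item_id, score in scores.items():
        item = items[item_id]
        delta = 0.0
        if item_id in called:
            delta -= REPEAT_PENALTY  # already-called tool
        if item.requires and set(item.requires) <= provided:
            delta += SATISFIED_BOOST  # all requires are covered
        if item.depends_on and any(dep not in called for dep in item.depends_on):
            delta -= MISSING_DEP_PENALTY  # depends on an uncalled tool
        adjusted[item_id] = score + delta
        deltas[item_id] = delta
    return adjusted, deltas


def validate_dependencies(items: dict[str, SketchItem]) -> list[str]:
    """Warn about depends_on entries that point at unknown tool ids."""
    return [
        f"{item.id}: depends_on references unknown tool id {dep!r}"
        for item in items.values()
        for dep in (item.depends_on or [])
        if dep not in items
    ]


items = {
    "db_read": SketchItem("db_read", provides=["rows"]),
    "summarize": SketchItem("summarize", requires=["rows"]),
    "export": SketchItem("export", depends_on=["summarize"]),
}
adjusted, deltas = adjust(
    {"db_read": 1.0, "summarize": 1.0, "export": 1.0},
    items,
    SketchHistory(called_tool_ids=["db_read"]),
)
print(adjusted, deltas)
```

Returning the deltas next to the adjusted scores is what lets a caller surface them per item, in the spirit of the `history_adjustments` field described above.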

Schemas + extras

  • schemas/catalog.schema.json + docs/schemas/v0/catalog.schema.json
    regenerated (3 new nullable array fields on items).
  • pyproject.toml adds [embeddings] extra and a sentence_transformers
    mypy override.

Verification
ruff format --check src/ tests/ examples/ scripts/ -> clean
ruff check src/ tests/ examples/ scripts/ -> clean
mypy src/ -> 0 issues / 79 files
pytest -q -> 1111 passed, 6 skipped
(+70 new tests)
scripts/gen_schemas.py --check -> schemas up to date
scripts/render_scorecard.py --check -> exit 0 (no drift)
make example -> all scripts clean
make demo -> Demo complete
make llms-check -> up to date

Closes #8
Closes #27
Closes #56

https://claude.ai/code/session_017YLnTSUmEXLXV85JC29oYf

Copilot AI review requested due to automatic review settings May 17, 2026 18:02
@github-actions

Benchmark delta (vs main)

Soft regression feedback only — this comment never blocks the PR.
Latency budget: ⚠️ when head > base × 1.3. Accuracy budget: ⚠️ when head < base - 1pp.

Routing summary (single backend × catalog sizes)

| size | recall@k (head, Δ vs base) | MRR (head, Δ vs base) | p99 (ms) |
|------|----------------------------|-----------------------|----------|
| 50 | ✅ 0.5649 (+0.0000) | ✅ 0.4978 (+0.0000) | ✅ 0.503 (base 0.463) |
| 83 | ✅ 0.3825 (+0.0000) | ✅ 0.3242 (+0.0000) | ✅ 0.723 (base 0.876) |
| 1000 | ✅ 0.1475 (+0.0000) | ✅ 0.1456 (+0.0000) | ✅ 38.163 (base 31.897) |

Per-backend × per-size matrix

| backend | size | recall@k (Δ) | MRR (Δ) | p99 (ms) |
|---------|------|--------------|---------|----------|
| bm25 | 100 | ✅ 0.3825 (+0.0000) | ✅ 0.3399 (+0.0000) | ✅ 6.386 (base 5.642) |
| bm25 | 500 | ✅ 0.2250 (+0.0000) | ✅ 0.2165 (+0.0000) | ✅ 31.224 (base 27.538) |
| bm25 | 1000 | ✅ 0.1575 (+0.0000) | ✅ 0.1525 (+0.0000) | ✅ 86.143 (base 78.368) |
| fuzzy | 100 | ✅ 0.0000 (+0.0000) | ✅ 0.0000 (+0.0000) | ✅ 0.000 (base 0.000) |
| fuzzy | 500 | ✅ 0.0000 (+0.0000) | ✅ 0.0000 (+0.0000) | ✅ 0.000 (base 0.000) |
| fuzzy | 1000 | ✅ 0.0000 (+0.0000) | ✅ 0.0000 (+0.0000) | ✅ 0.000 (base 0.000) |
| tfidf | 100 | ✅ 0.3825 (+0.0000) | ✅ 0.3220 (+0.0000) | ✅ 1.055 (base 0.872) |
| tfidf | 500 | ✅ 0.2325 (+0.0000) | ✅ 0.2314 (+0.0000) | ✅ 9.815 (base 8.660) |
| tfidf | 1000 | ✅ 0.1475 (+0.0000) | ✅ 0.1456 (+0.0000) | ✅ 38.649 (base 30.071) |

Context pipeline (per scenario)

| scenario | tokens | dropped | dedup |
|----------|--------|---------|-------|
| large_catalog | 1514 (base 1514, Δ+0) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| long_conversation | 2548 (base 2548, Δ+0) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| short_conversation | 496 (base 496, Δ+0) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| stress_conversation | 6651 (base 6651, Δ+0) | 7 (base 7, Δ+0) | 4 (base 4, Δ+0) |

Numbers come from make benchmark / make benchmark-matrix.
Latency is hardware-dependent — treat the markers as a rough guide.
See benchmarks/scorecard.md for the full picture.
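
The two soft budgets stated at the top of this comment can be expressed as a pair of predicates. This is only a sketch: the function names are hypothetical, and it assumes recall@k and MRR are reported as fractions in [0, 1], so that one percentage point equals 0.01.

```python
def latency_over_budget(head_ms: float, base_ms: float, factor: float = 1.3) -> bool:
    """True when head p99 exceeds base p99 times the allowed factor."""
    return head_ms > base_ms * factor


def accuracy_under_budget(head: float, base: float, budget_pp: float = 1.0) -> bool:
    """True when head falls more than budget_pp percentage points below base."""
    return head < base - budget_pp / 100.0


# tfidf at size 1000 from the matrix above: 38.649 ms against a 30.071 ms base.
# The threshold is 30.071 * 1.3 = 39.0923 ms, so the row stays within budget.
print(latency_over_budget(38.649, 30.071))
# recall@k is unchanged at 0.1475, so the accuracy budget is not tripped either.
print(accuracy_under_budget(0.1475, 0.1475))
```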


Copilot AI left a comment


Pull request overview

This PR advances the routing engine by making routing stages explicit and swappable (pipeline composer + navigator + packer), adding an optional embedding-based retriever behind an extra, and introducing history-aware routing adjustments plus dependency metadata on SelectableItem. It also wires ContextManager.build_route_prompt() to optionally auto-construct routing history from the event log and updates schemas/docs/tests accordingly.

Changes:

  • Introduce explicit routing pipeline components (RoutingPipeline, BeamSearchNavigator, DefaultCardPacker) and refactor Router to delegate navigation via the pipeline.
  • Add optional embedding retrieval via [embeddings] extra (SentenceTransformerBackend, HybridEmbeddingRetriever) and expose an EmbeddingBackend protocol.
  • Add history-aware score adjustments (RouteHistory, adjust_scores) plus dependency metadata fields (depends_on / provides / requires) and catalog validation warnings.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 8 comments.

| File | Description |
|------|-------------|
| `tests/test_types.py` | Adds round-trip + default/omit tests for new dependency metadata fields on SelectableItem. |
| `tests/test_router.py` | Adds regression tests for pipeline delegation and history-aware routing adjustments. |
| `tests/test_pipeline.py` | Adds tests for RoutingPipeline construction and stage-level entry points. |
| `tests/test_packer.py` | Adds tests for DefaultCardPacker ordering and budget behavior. |
| `tests/test_navigator.py` | Adds tests for BeamSearchNavigator determinism and eligibility behavior. |
| `tests/test_manager.py` | Adds tests for ContextManager.build_route_prompt() history-from-log behavior and overrides. |
| `tests/test_history.py` | Adds unit tests for RouteHistory serialization and adjust_scores rules. |
| `tests/test_extras_embeddings.py` | Adds deterministic mock backend tests and gated sentence-transformers integration tests. |
| `tests/test_catalog.py` | Adds tests for Catalog.validate_dependencies() warnings. |
| `src/contextweaver/types.py` | Extends SelectableItem with optional dependency metadata + serialization behavior. |
| `src/contextweaver/routing/router.py` | Adds pipeline support, embedding backend option, and history-aware adjustments to routing results. |
| `src/contextweaver/routing/pipeline.py` | Introduces RoutingPipeline composer + from_config factory and stage entry points. |
| `src/contextweaver/routing/packer.py` | Introduces DefaultCardPacker and a soft token-budget cap for card lists. |
| `src/contextweaver/routing/navigator.py` | Extracts beam-search navigation into a standalone, protocol-driven navigator. |
| `src/contextweaver/routing/history.py` | Adds RouteHistory dataclass and deterministic history-based score adjustment logic. |
| `src/contextweaver/routing/catalog.py` | Adds dependency reference validation (depends_on) and warnings. |
| `src/contextweaver/protocols.py` | Adds Navigator, CardPacker, EmbeddingBackend protocols + NavigationResult. |
| `src/contextweaver/extras/embeddings.py` | Implements optional sentence-transformers backend + hybrid embedding/TF-IDF retriever. |
| `src/contextweaver/extras/__init__.py` | Documents the new embeddings extra in the extras package overview. |
| `src/contextweaver/context/manager.py` | Adds history / history_from_log to build_route_prompt() + constructs history from the event log. |
| `src/contextweaver/__init__.py` | Re-exports new routing/public surface types (pipeline, navigator, packer, history, embedding protocol). |
| `schemas/catalog.schema.json` | Regenerates catalog schema with new nullable dependency metadata fields. |
| `docs/schemas/v0/catalog.schema.json` | Regenerates published v0 schema mirror with dependency metadata fields. |
| `pyproject.toml` | Adds [embeddings] optional dependency group and mypy override for sentence-transformers. |
| `CHANGELOG.md` | Documents the new pipeline, embeddings backend, and history/dependency routing features. |
| `AGENTS.md` | Updates module map and routing pipeline description to include the new routing modules/features. |

Comment on lines +789 to +807
        items = self._event_log.all()
        tool_results = [i for i in items if i.kind == ItemKind.tool_result]
        if not tool_results:
            return None
        called_ids: list[str] = []
        seen: set[str] = set()
        for item in tool_results:
            tid = item.parent_id or item.id
            if tid in seen:
                continue
            seen.add(tid)
            called_ids.append(tid)
        last = tool_results[-1]
        summary = (last.text or "")[:500] or None
        return _RouteHistory(
            called_tool_ids=called_ids,
            last_result_summary=summary,
            step_number=len(called_ids) + 1,
        )
Comment thread tests/test_manager.py
Comment on lines +468 to +485
    log.append(
        ContextItem(
            id="tc1",
            parent_id=None,
            kind=ItemKind.tool_call,
            text="db_read invoked",
        )
    )
    # The tool_result.parent_id is the tool call's id but we expose called
    # tool ids via that — see the _build_route_history_from_log helper.
    log.append(
        ContextItem(
            id="tr1",
            parent_id="db_read",
            kind=ItemKind.tool_result,
            text="rows: id, name, email",
        )
    )
Comment on lines 333 to +450
@@ -342,6 +367,8 @@ def __init__(
         routing_config: RoutingConfig | None = None,
         retriever: Retriever | None = None,
         engine_registry: EngineRegistry | None = None,
+        embedding_backend: EmbeddingBackend | None = None,
+        pipeline: RoutingPipeline | None = None,
     ) -> None:
         if routing_config is not None:
             beam_width = routing_config.beam_width
@@ -355,6 +382,12 @@ def __init__(
                 f"Unknown scorer_backend {scorer_backend!r}; "
                 f"valid options: {sorted(_SCORER_BACKENDS)}"
             )
+        if embedding_backend is not None and retriever is not None:
+            raise ConfigError(
+                "Pass either retriever= or embedding_backend=, not both. "
+                "Construct an embedding-aware Retriever and pass it via retriever= "
+                "if you need both signals combined under a custom policy."
+            )
         self._graph = graph
         self._beam_width = beam_width
         self._max_depth = max_depth
@@ -366,6 +399,14 @@ def __init__(
         if retriever is not None:
             self._retriever: Retriever = retriever
             self._retriever_engine_name = self._engine_registry.default_for("retriever") or "tfidf"
+        elif embedding_backend is not None:
+            # Late import keeps the core install free of any sentence-
+            # transformers / hnswlib / torch dependency. Importing the
+            # adapter only happens when a backend is actually supplied.
+            from contextweaver.extras.embeddings import HybridEmbeddingRetriever
+
+            self._retriever = HybridEmbeddingRetriever(embedding_backend)
+            self._retriever_engine_name = "embedding+tfidf"
         elif scorer is not None:
             self._retriever = _ScorerRetriever(scorer)
             self._retriever_engine_name = "tfidf"
@@ -377,9 +418,42 @@ def __init__(
             self._retriever_engine_name = self._engine_registry.default_for("retriever") or "tfidf"
         self._indexed = False
         self._doc_id_to_idx: dict[str, int] = {}
+        self._pipeline = self._build_pipeline(pipeline)
         if items is not None:
             self.set_items(items)
 
+    def _build_pipeline(self, override: RoutingPipeline | None) -> RoutingPipeline:
+        """Construct the routing pipeline (issue #56).
+
+        When *override* is supplied, its navigator / packer / reranker
+        replace the bundled defaults; the retriever is always set to the
+        one this :class:`Router` already resolved so corpus indexing has
+        a single source of truth.
+        """
+        navigator = BeamSearchNavigator(
+            beam_width=self._beam_width,
+            max_depth=self._max_depth,
+            top_k=self._top_k,
+            confidence_gap=self._confidence_gap,
+        )
+        if override is None:
+            return RoutingPipeline(
+                retriever=self._retriever,
+                reranker=None,
+                navigator=navigator,
+            )
+        return RoutingPipeline(
+            retriever=self._retriever,
+            reranker=override.reranker,
+            navigator=override.navigator or navigator,
+            packer=override.packer,
+        )
Comment on lines +653 to +674
+    def _result_similarity_map(
         self,
-        collected: dict[str, tuple[float, list[str]]],
-        active_items: dict[str, SelectableItem],
-    ) -> list[tuple[str, tuple[float, list[str]]]]:
-        """Return *collected* sorted by ``(-score, id)``, untrimmed.
-
-        Truncation to ``self._top_k`` is the caller's responsibility so
-        ambiguity / runner-up reads can use the full ranking even when
-        ``top_k=1`` (issue #14).
+        history: RouteHistory,
+        scored: list[tuple[str, float]],
+    ) -> dict[str, float] | None:
+        """Per-candidate similarity to ``history.last_result_summary``.
+
+        Reuses the router's fitted retriever so the boost is computed in
+        the same scoring space as the primary query. Returns ``None`` when
+        the history has no summary so :func:`adjust_scores` can skip the
+        boost stage entirely.
         """
-        return sorted(
-            (entry for entry in collected.items() if entry[0] in active_items),
-            key=lambda x: (-x[1][0], x[0]),
-        )
-
-    def _expand_subtree(
-        self,
-        query: str,
-        node_id: str,
-        base_score: float,
-        base_path: list[str],
-        active_items: dict[str, SelectableItem],
-        eligible_internals: set[str],
-        *,
-        max_depth: int | None = None,
-    ) -> dict[str, tuple[float, list[str]]]:
-        """Expand children of *node_id* recursively, collecting items.
-
-        Children outside *active_items* (leaves) or *eligible_internals*
-        (internals) are skipped before scoring so excluded subtrees do
-        not consume backtracking work (issue #112 / #22).
-        """
-        depth_limit = max_depth if max_depth is not None else self._max_depth
-        result: dict[str, tuple[float, list[str]]] = {}
-        stack: list[tuple[float, str, list[str], int]] = [(base_score, node_id, base_path, 0)]
-        while stack:
-            score, nid, path, depth = stack.pop()
-            children = self._graph.successors(nid)
-            if not children or depth >= depth_limit:
-                if nid in active_items:
-                    result[nid] = (score, path[1:])
+        summary = history.last_result_summary
+        if not summary:
+            return None
+        sims: dict[str, float] = {}
+        for item_id, _ in scored:
+            idx = self._doc_id_to_idx.get(item_id)
+            if idx is None:
                 continue
-            for child in sorted(children):
-                if not self._is_eligible_child(child, active_items, eligible_internals):
-                    continue
-                s = self._score_node(query, child)
-                new_path = path + [child]
-                if child in self._items:
-                    result[child] = (score + s, new_path[1:])
-                else:
-                    stack.append((score + s, child, new_path, depth + 1))
-        return result
+            sims[item_id] = self._retriever.score_one(summary, idx)
+        return sims
Comment on lines +11 to +14
``embedding_backend=`` argument is supplied. Importing this module
without the ``sentence-transformers`` dependency raises ``ImportError``
with the exact install hint above — matching the convention used by
:mod:`contextweaver.extras.otel`.
Comment on lines +154 to +162
    def __init__(
        self,
        backend: EmbeddingBackend,
        *,
        embedding_weight: float = 0.7,
    ) -> None:
        if not 0.0 <= embedding_weight <= 1.0:
            raise ValueError(f"embedding_weight must be in [0.0, 1.0], got {embedding_weight}")
        self._backend = backend
Comment on lines +17 to +18
2. ``rerank`` — :class:`~contextweaver.protocols.Reranker` re-orders the
shortlist. Defaults to :class:`NoOpReranker` which leaves order
Comment on lines +388 to +394
"""Pluggable card-rendering stage of the routing pipeline.

A packer turns a ranked list of items into :class:`ChoiceCard` instances
(and optionally a rendered text block) within a token budget. The
bundled default :class:`~contextweaver.routing.packer.DefaultCardPacker`
wraps :func:`contextweaver.routing.cards.make_choice_cards` +
:func:`contextweaver.routing.cards.render_cards_text`.
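
One plausible reading of a "soft" token budget for a packer is sketched here. It is an assumption and not the `DefaultCardPacker` implementation: `pack_within_budget` is a hypothetical helper, token cost is approximated by whitespace splitting, and "soft" is taken to mean that the top-ranked card is always kept even if it alone exceeds the budget.

```python
def pack_within_budget(cards: list[str], budget_tokens: int) -> list[str]:
    """Keep ranked cards in order until the next one would exceed the budget."""
    kept: list[str] = []
    used = 0
    for card in cards:
        cost = len(card.split())  # crude token estimate (assumption)
        # Soft cap: never drop the first card, only later ones.
        if kept and used + cost > budget_tokens:
            break
        kept.append(card)
        used += cost
    return kept


ranked_cards = [
    "db_read: read rows",          # 3 tokens
    "db_write: write rows",        # 3 tokens
    "send_email: send an email",   # 4 tokens
]
print(pack_within_budget(ranked_cards, budget_tokens=6))
```

Stopping at the first card that does not fit, rather than skipping it and trying smaller ones, keeps the output a strict prefix of the ranking, so packing never reorders the candidates.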