feat(routing,context,adapters): explicit pipeline + embedding backend + history-aware routing (#8, #27, #56) by dgenio · Pull Request #236 · dgenio/contextweaver

dgenio · 2026-05-17T18:02:04Z

#56 — Routing pipeline decomposition

- New routing/pipeline.py composer with explicit stages: retrieve -> rerank -> navigate -> pack
- New Navigator + CardPacker protocols in protocols.py
- BeamSearchNavigator lifted verbatim from router.py (byte-identical default behaviour; verified by the existing 50+ router regression tests and `make scorecard-check`)
- DefaultCardPacker wraps make_choice_cards with a soft budget_tokens cap
- Router.route() now delegates to RoutingPipeline.navigate(); the public API surface and every RouteResult field are preserved
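
The stage order above can be pictured as a small composer over protocol-typed stages. The sketch below is illustrative only: `SketchPipeline`, the stage method signatures, and the stand-in stages (`KeywordRetriever`, `TopKNavigator`, `LabelPacker`) are assumptions made for this example and are not the actual `RoutingPipeline` API.

```python
from dataclasses import dataclass
from typing import Protocol

Scored = list[tuple[str, float]]  # (item id, score) pairs


class Retriever(Protocol):
    def retrieve(self, query: str) -> Scored: ...


class Reranker(Protocol):
    def rerank(self, query: str, scored: Scored) -> Scored: ...


class Navigator(Protocol):
    def navigate(self, query: str, scored: Scored) -> list[str]: ...


class CardPacker(Protocol):
    def pack(self, item_ids: list[str]) -> list[str]: ...


class KeywordRetriever:
    """Stand-in retriever: scores documents by query-token overlap."""

    def __init__(self, docs: dict[str, str]) -> None:
        self._docs = docs

    def retrieve(self, query: str) -> Scored:
        terms = set(query.lower().split())
        scored: Scored = []
        for doc_id, text in self._docs.items():
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                scored.append((doc_id, float(overlap)))
        return scored


class NoOpReranker:
    """Leaves the shortlist order unchanged."""

    def rerank(self, query: str, scored: Scored) -> Scored:
        return scored


class TopKNavigator:
    """Stand-in navigator: keeps the top_k ids ordered by (-score, id)."""

    def __init__(self, top_k: int) -> None:
        self._top_k = top_k

    def navigate(self, query: str, scored: Scored) -> list[str]:
        ranked = sorted(scored, key=lambda pair: (-pair[1], pair[0]))
        return [item_id for item_id, _ in ranked[: self._top_k]]


class LabelPacker:
    """Stand-in packer: renders each id as a numbered one-line card."""

    def pack(self, item_ids: list[str]) -> list[str]:
        return [f"[{n}] {item_id}" for n, item_id in enumerate(item_ids, 1)]


@dataclass
class SketchPipeline:
    retriever: Retriever
    reranker: Reranker
    navigator: Navigator
    packer: CardPacker

    def run(self, query: str) -> list[str]:
        # retrieve -> rerank -> navigate -> pack
        scored = self.retriever.retrieve(query)
        scored = self.reranker.rerank(query, scored)
        chosen = self.navigator.navigate(query, scored)
        return self.packer.pack(chosen)


docs = {
    "db_read": "read rows from database",
    "db_write": "write rows to database",
    "send_email": "send an email message",
}
pipeline = SketchPipeline(KeywordRetriever(docs), NoOpReranker(), TopKNavigator(2), LabelPacker())
cards = pipeline.run("read database rows")
```

Because each stage is typed against a protocol rather than a concrete class, any one stage can be swapped without touching the others, which is the property the decomposition is after.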

#8 — Optional embedding-based retrieval backend

- New EmbeddingBackend protocol in protocols.py
- New [embeddings] extra (`pip install 'contextweaver[embeddings]'`)
- New extras/embeddings.py: SentenceTransformerBackend + the HybridEmbeddingRetriever (70/30 embedding+TF-IDF weighted sum so lexical exact-id / exact-tag hits keep their floor)
- Router(embedding_backend=...) constructs the hybrid retriever via a lazy import so the core install never pulls torch
- Mock HashEmbeddingBackend in tests provides a deterministic default-install test path; real sentence-transformers integration test is gated behind pytest.importorskip
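
As a rough illustration of the 70/30 blend and of a deterministic mock backend, the sketch below combines a hash-derived embedding similarity with a lexical score. Everything in it is an assumption for illustration: `hash_embed`, `lexical_score`, and `hybrid_score` are hypothetical names, the lexical score is plain token overlap rather than TF-IDF, and the real `HybridEmbeddingRetriever` may normalise or combine its signals differently.

```python
import hashlib
import math


def hash_embed(text: str, dim: int = 16) -> list[float]:
    """Deterministic bag-of-tokens embedding: each token hashes to one bucket."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Unit-normalise so the dot product below is a cosine similarity.
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product of two unit-normalised (or all-zero) vectors."""
    return sum(x * y for x, y in zip(a, b))


def lexical_score(query: str, doc: str) -> float:
    """Fraction of query tokens that appear verbatim in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def hybrid_score(query: str, doc: str, embedding_weight: float = 0.7) -> float:
    """Weighted sum of the embedding and lexical signals."""
    if not 0.0 <= embedding_weight <= 1.0:
        raise ValueError(f"embedding_weight must be in [0.0, 1.0], got {embedding_weight}")
    emb = cosine(hash_embed(query), hash_embed(doc))
    lex = lexical_score(query, doc)
    return embedding_weight * emb + (1.0 - embedding_weight) * lex
```

In this sketch an exact lexical hit contributes its full 30% share regardless of the embedding signal, which is one way to read the "keep their floor" remark above.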

#27 — History-aware re-routing + tool-dependency metadata

- Phase 1: new routing/history.py with RouteHistory dataclass + adjust_scores helper. Router.route(..., history=RouteHistory(...)) applies a repeat penalty to already-called tools and boosts candidates whose description resembles last_result_summary (computed via the router's fitted retriever, so the boost is in the same scoring space as the primary query). Per-item deltas surface on the new RouteResult.history_adjustments field + trace.extra.
- Phase 2: SelectableItem gains optional depends_on / provides / requires fields; adjust_scores applies a satisfaction boost when a candidate's `requires` are fully covered by `provides` of already-called tools, and a penalty when `depends_on` references an uncalled tool. All three default to None and round-trip omitted when unset.
- ContextManager.build_route_prompt auto-constructs a RouteHistory from the event log (tools whose tool_result is in the log). Set history_from_log=False or pass history=... explicitly to opt out / override.
- Catalog.validate_dependencies() returns human-readable warnings for depends_on entries pointing at unknown tool ids.
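
The adjustment rules and the dependency check described above might look roughly like the following. This is a sketch under stated assumptions: the `Tool` and `History` stand-ins, the three constants, and the warning wording are hypothetical (the PR text does not give the real magnitudes or messages), and the `last_result_summary` similarity boost is left out for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tool:
    id: str
    depends_on: list[str] | None = None
    provides: list[str] | None = None
    requires: list[str] | None = None


@dataclass
class History:
    called_tool_ids: list[str] = field(default_factory=list)


# Hypothetical magnitudes; the real values are not stated in the PR.
REPEAT_PENALTY = 0.5
SATISFIED_BOOST = 0.2
UNMET_DEPENDENCY_PENALTY = 0.3


def adjust_scores(
    scored: list[tuple[str, float]],
    tools: dict[str, Tool],
    history: History,
) -> tuple[list[tuple[str, float]], dict[str, float]]:
    """Return re-ranked scores plus the per-item delta that was applied."""
    called = set(history.called_tool_ids)
    # Everything the already-called tools declare that they provide.
    provided = {cap for tid in called if tid in tools for cap in (tools[tid].provides or [])}
    adjusted: list[tuple[str, float]] = []
    deltas: dict[str, float] = {}
    for item_id, score in scored:
        tool = tools[item_id]
        delta = 0.0
        if item_id in called:
            delta -= REPEAT_PENALTY  # discourage calling the same tool again
        if tool.requires and set(tool.requires) <= provided:
            delta += SATISFIED_BOOST  # all inputs are already available
        if any(dep not in called for dep in (tool.depends_on or [])):
            delta -= UNMET_DEPENDENCY_PENALTY  # a prerequisite has not run yet
        adjusted.append((item_id, score + delta))
        if delta:
            deltas[item_id] = delta
    adjusted.sort(key=lambda pair: (-pair[1], pair[0]))
    return adjusted, deltas


def validate_dependencies(tools: dict[str, Tool]) -> list[str]:
    """Warn about depends_on entries that name a tool id not in the catalog."""
    return [
        f"{tool.id}: depends_on references unknown tool id {dep!r}"
        for tool in tools.values()
        for dep in (tool.depends_on or [])
        if dep not in tools
    ]


tools = {
    "db_read": Tool("db_read", provides=["rows"]),
    "summarize": Tool("summarize", requires=["rows"]),
    "send_report": Tool("send_report", depends_on=["summarize"]),
    "archive": Tool("archive", depends_on=["compress"]),
}
scored = [("db_read", 0.9), ("summarize", 0.6), ("send_report", 0.5)]
adjusted, deltas = adjust_scores(scored, tools, History(["db_read"]))
warnings = validate_dependencies(tools)
```

After `db_read` has been called, `summarize` moves to the top because its `requires` are satisfied, while `db_read` itself and the not-yet-unblocked `send_report` drop.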

Schemas + extras

- schemas/catalog.schema.json + docs/schemas/v0/catalog.schema.json regenerated (3 new nullable array fields on items).
- pyproject.toml adds [embeddings] extra and a sentence_transformers mypy override.

Verification
ruff format --check src/ tests/ examples/ scripts/ -> clean
ruff check src/ tests/ examples/ scripts/ -> clean
mypy src/ -> 0 issues / 79 files
pytest -q -> 1111 passed, 6 skipped (+70 new tests)
scripts/gen_schemas.py --check -> schemas up to date
scripts/render_scorecard.py --check -> exit 0 (no drift)
make example -> all scripts clean
make demo -> Demo complete
make llms-check -> up to date

Closes #8
Closes #27
Closes #56

https://claude.ai/code/session_017YLnTSUmEXLXV85JC29oYf


github-actions · 2026-05-17T18:05:27Z

Benchmark delta (vs `main`)

Soft regression feedback only — this comment never blocks the PR.
Latency budget: ⚠️ when head > base × 1.3. Accuracy budget: ⚠️ when head < base - 1pp.
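
The two budgets stated above amount to simple threshold checks. A minimal sketch, assuming recall@k and MRR are fractions in [0, 1] so that one percentage point is 0.01; the function names are hypothetical and not taken from the benchmark scripts.

```python
def latency_warns(head_ms: float, base_ms: float, factor: float = 1.3) -> bool:
    """True when head latency exceeds the base latency by more than the factor."""
    return head_ms > base_ms * factor


def accuracy_warns(head: float, base: float, tolerance: float = 0.01) -> bool:
    """True when the head metric falls more than one percentage point below base."""
    return head < base - tolerance
```

For the tfidf row at size 1000 in the per-backend matrix, the head p99 of 38.649 ms is above the base of 30.071 ms but still under 30.071 × 1.3 ≈ 39.09 ms, so under this reading it stays within budget.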

Routing summary (single backend × catalog sizes)

size	recall@k (head Δ vs base)	MRR (head Δ vs base)	p99 (ms)
50	✅ 0.5649 (+0.0000)	✅ 0.4978 (+0.0000)	✅ 0.503 (base 0.463)
83	✅ 0.3825 (+0.0000)	✅ 0.3242 (+0.0000)	✅ 0.723 (base 0.876)
1000	✅ 0.1475 (+0.0000)	✅ 0.1456 (+0.0000)	✅ 38.163 (base 31.897)

Per-backend × per-size matrix

backend	size	recall@k (Δ)	MRR (Δ)	p99 (ms)
bm25	100	✅ 0.3825 (+0.0000)	✅ 0.3399 (+0.0000)	✅ 6.386 (base 5.642)
bm25	500	✅ 0.2250 (+0.0000)	✅ 0.2165 (+0.0000)	✅ 31.224 (base 27.538)
bm25	1000	✅ 0.1575 (+0.0000)	✅ 0.1525 (+0.0000)	✅ 86.143 (base 78.368)
fuzzy	100	✅ 0.0000 (+0.0000)	✅ 0.0000 (+0.0000)	✅ 0.000 (base 0.000)
fuzzy	500	✅ 0.0000 (+0.0000)	✅ 0.0000 (+0.0000)	✅ 0.000 (base 0.000)
fuzzy	1000	✅ 0.0000 (+0.0000)	✅ 0.0000 (+0.0000)	✅ 0.000 (base 0.000)
tfidf	100	✅ 0.3825 (+0.0000)	✅ 0.3220 (+0.0000)	✅ 1.055 (base 0.872)
tfidf	500	✅ 0.2325 (+0.0000)	✅ 0.2314 (+0.0000)	✅ 9.815 (base 8.660)
tfidf	1000	✅ 0.1475 (+0.0000)	✅ 0.1456 (+0.0000)	✅ 38.649 (base 30.071)

Context pipeline (per scenario)

scenario	tokens	dropped	dedup
large_catalog	1514 (base 1514, Δ+0)	0 (base 0, Δ+0)	0 (base 0, Δ+0)
long_conversation	2548 (base 2548, Δ+0)	0 (base 0, Δ+0)	0 (base 0, Δ+0)
short_conversation	496 (base 496, Δ+0)	0 (base 0, Δ+0)	0 (base 0, Δ+0)
stress_conversation	6651 (base 6651, Δ+0)	7 (base 7, Δ+0)	4 (base 4, Δ+0)

Numbers come from make benchmark / make benchmark-matrix.
Latency is hardware-dependent — treat the markers as a rough guide.
See benchmarks/scorecard.md for the full picture.

Copilot

Pull request overview

This PR advances the routing engine by making routing stages explicit and swappable (pipeline composer + navigator + packer), adding an optional embedding-based retriever behind an extra, and introducing history-aware routing adjustments plus dependency metadata on SelectableItem. It also wires ContextManager.build_route_prompt() to optionally auto-construct routing history from the event log and updates schemas/docs/tests accordingly.

Changes:

Introduce explicit routing pipeline components (RoutingPipeline, BeamSearchNavigator, DefaultCardPacker) and refactor Router to delegate navigation via the pipeline.
Add optional embedding retrieval via [embeddings] extra (SentenceTransformerBackend, HybridEmbeddingRetriever) and expose an EmbeddingBackend protocol.
Add history-aware score adjustments (RouteHistory, adjust_scores) plus dependency metadata fields (depends_on / provides / requires) and catalog validation warnings.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
tests/test_types.py	Adds round-trip + default/omit tests for new dependency metadata fields on `SelectableItem`.
tests/test_router.py	Adds regression tests for pipeline delegation and history-aware routing adjustments.
tests/test_pipeline.py	Adds tests for `RoutingPipeline` construction and stage-level entry points.
tests/test_packer.py	Adds tests for `DefaultCardPacker` ordering and budget behavior.
tests/test_navigator.py	Adds tests for `BeamSearchNavigator` determinism and eligibility behavior.
tests/test_manager.py	Adds tests for `ContextManager.build_route_prompt()` history-from-log behavior and overrides.
tests/test_history.py	Adds unit tests for `RouteHistory` serialization and `adjust_scores` rules.
tests/test_extras_embeddings.py	Adds deterministic mock backend tests and gated sentence-transformers integration tests.
tests/test_catalog.py	Adds tests for `Catalog.validate_dependencies()` warnings.
src/contextweaver/types.py	Extends `SelectableItem` with optional dependency metadata + serialization behavior.
src/contextweaver/routing/router.py	Adds pipeline support, embedding backend option, and history-aware adjustments to routing results.
src/contextweaver/routing/pipeline.py	Introduces `RoutingPipeline` composer + `from_config` factory and stage entry points.
src/contextweaver/routing/packer.py	Introduces `DefaultCardPacker` and a soft token-budget cap for card lists.
src/contextweaver/routing/navigator.py	Extracts beam-search navigation into a standalone, protocol-driven navigator.
src/contextweaver/routing/history.py	Adds `RouteHistory` dataclass and deterministic history-based score adjustment logic.
src/contextweaver/routing/catalog.py	Adds dependency reference validation (`depends_on`) and warnings.
src/contextweaver/protocols.py	Adds `Navigator`, `CardPacker`, `EmbeddingBackend` protocols + `NavigationResult`.
src/contextweaver/extras/embeddings.py	Implements optional sentence-transformers backend + hybrid embedding/TF-IDF retriever.
src/contextweaver/extras/__init__.py	Documents the new embeddings extra in the extras package overview.
src/contextweaver/context/manager.py	Adds `history` / `history_from_log` to `build_route_prompt()` + constructs history from the event log.
src/contextweaver/__init__.py	Re-exports new routing/public surface types (pipeline, navigator, packer, history, embedding protocol).
schemas/catalog.schema.json	Regenerates catalog schema with new nullable dependency metadata fields.
docs/schemas/v0/catalog.schema.json	Regenerates published v0 schema mirror with dependency metadata fields.
pyproject.toml	Adds `[embeddings]` optional dependency group and mypy override for sentence-transformers.
CHANGELOG.md	Documents the new pipeline, embeddings backend, and history/dependency routing features.
AGENTS.md	Updates module map and routing pipeline description to include the new routing modules/features.

+        items = self._event_log.all()
+        tool_results = [i for i in items if i.kind == ItemKind.tool_result]
+        if not tool_results:
+            return None
+        called_ids: list[str] = []
+        seen: set[str] = set()
+        for item in tool_results:
+            tid = item.parent_id or item.id
+            if tid in seen:
+                continue
+            seen.add(tid)
+            called_ids.append(tid)
+        last = tool_results[-1]
+        summary = (last.text or "")[:500] or None
+        return _RouteHistory(
+            called_tool_ids=called_ids,
+            last_result_summary=summary,
+            step_number=len(called_ids) + 1,
+        )
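
For readers who want to run the logic in the hunk above in isolation, here is a self-contained sketch with stand-in types. `ContextItem`, `ItemKind`, and `RouteHistory` are simplified substitutes for the library's classes, and `history_from_items` is a hypothetical free-function name; only the dedup, summary, and step-number logic mirrors the diff.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class ItemKind(Enum):
    tool_call = "tool_call"
    tool_result = "tool_result"


@dataclass
class ContextItem:
    id: str
    kind: ItemKind
    text: str = ""
    parent_id: str | None = None


@dataclass
class RouteHistory:
    called_tool_ids: list[str]
    last_result_summary: str | None
    step_number: int


def history_from_items(items: list[ContextItem]) -> RouteHistory | None:
    """Build a RouteHistory from the tool_result items, or None if there are none."""
    tool_results = [i for i in items if i.kind is ItemKind.tool_result]
    if not tool_results:
        return None
    # dict.fromkeys de-duplicates while keeping first-seen order.
    called_ids = list(dict.fromkeys(i.parent_id or i.id for i in tool_results))
    # Cap the summary at 500 characters; an empty text becomes None.
    summary = (tool_results[-1].text or "")[:500] or None
    return RouteHistory(called_ids, summary, len(called_ids) + 1)


log = [
    ContextItem(id="tc1", kind=ItemKind.tool_call, text="db_read invoked"),
    ContextItem(
        id="tr1",
        kind=ItemKind.tool_result,
        text="rows: id, name, email",
        parent_id="db_read",
    ),
]
history = history_from_items(log)
```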


+    log.append(
+        ContextItem(
+            id="tc1",
+            parent_id=None,
+            kind=ItemKind.tool_call,
+            text="db_read invoked",
+        )
+    )
+    # In this fixture parent_id holds the tool id ("db_read") rather than the
+    # tool call's item id ("tc1"), because called tool ids are read from
+    # tool_result.parent_id; see the _build_route_history_from_log helper.
+    log.append(
+        ContextItem(
+            id="tr1",
+            parent_id="db_read",
+            kind=ItemKind.tool_result,
+            text="rows: id, name, email",
+        )
+    )


@@ -342,6 +367,8 @@ def __init__(
        routing_config: RoutingConfig | None = None,
        retriever: Retriever | None = None,
        engine_registry: EngineRegistry | None = None,
+        embedding_backend: EmbeddingBackend | None = None,
+        pipeline: RoutingPipeline | None = None,
    ) -> None:
        if routing_config is not None:
            beam_width = routing_config.beam_width
@@ -355,6 +382,12 @@ def __init__(
                f"Unknown scorer_backend {scorer_backend!r}; "
                f"valid options: {sorted(_SCORER_BACKENDS)}"
            )
+        if embedding_backend is not None and retriever is not None:
+            raise ConfigError(
+                "Pass either retriever= or embedding_backend=, not both. "
+                "Construct an embedding-aware Retriever and pass it via retriever= "
+                "if you need both signals combined under a custom policy."
+            )
        self._graph = graph
        self._beam_width = beam_width
        self._max_depth = max_depth
@@ -366,6 +399,14 @@ def __init__(
        if retriever is not None:
            self._retriever: Retriever = retriever
            self._retriever_engine_name = self._engine_registry.default_for("retriever") or "tfidf"
+        elif embedding_backend is not None:
+            # Late import keeps the core install free of any sentence-
+            # transformers / hnswlib / torch dependency.  Importing the
+            # adapter only happens when a backend is actually supplied.
+            from contextweaver.extras.embeddings import HybridEmbeddingRetriever
+
+            self._retriever = HybridEmbeddingRetriever(embedding_backend)
+            self._retriever_engine_name = "embedding+tfidf"
        elif scorer is not None:
            self._retriever = _ScorerRetriever(scorer)
            self._retriever_engine_name = "tfidf"
@@ -377,9 +418,42 @@ def __init__(
            self._retriever_engine_name = self._engine_registry.default_for("retriever") or "tfidf"
        self._indexed = False
        self._doc_id_to_idx: dict[str, int] = {}
+        self._pipeline = self._build_pipeline(pipeline)
        if items is not None:
            self.set_items(items)

+    def _build_pipeline(self, override: RoutingPipeline | None) -> RoutingPipeline:
+        """Construct the routing pipeline (issue #56).
+
+        When *override* is supplied, its navigator / packer / reranker
+        replace the bundled defaults; the retriever is always set to the
+        one this :class:`Router` already resolved so corpus indexing has
+        a single source of truth.
+        """
+        navigator = BeamSearchNavigator(
+            beam_width=self._beam_width,
+            max_depth=self._max_depth,
+            top_k=self._top_k,
+            confidence_gap=self._confidence_gap,
+        )
+        if override is None:
+            return RoutingPipeline(
+                retriever=self._retriever,
+                reranker=None,
+                navigator=navigator,
+            )
+        return RoutingPipeline(
+            retriever=self._retriever,
+            reranker=override.reranker,
+            navigator=override.navigator or navigator,
+            packer=override.packer,
+        )


+    def _result_similarity_map(
        self,
-        collected: dict[str, tuple[float, list[str]]],
-        active_items: dict[str, SelectableItem],
-    ) -> list[tuple[str, tuple[float, list[str]]]]:
-        """Return *collected* sorted by ``(-score, id)``, untrimmed.
-
-        Truncation to ``self._top_k`` is the caller's responsibility so
-        ambiguity / runner-up reads can use the full ranking even when
-        ``top_k=1`` (issue #14).
+        history: RouteHistory,
+        scored: list[tuple[str, float]],
+    ) -> dict[str, float] | None:
+        """Per-candidate similarity to ``history.last_result_summary``.
+
+        Reuses the router's fitted retriever so the boost is computed in
+        the same scoring space as the primary query.  Returns ``None`` when
+        the history has no summary so :func:`adjust_scores` can skip the
+        boost stage entirely.
        """
-        return sorted(
-            (entry for entry in collected.items() if entry[0] in active_items),
-            key=lambda x: (-x[1][0], x[0]),
-        )
-
-    def _expand_subtree(
-        self,
-        query: str,
-        node_id: str,
-        base_score: float,
-        base_path: list[str],
-        active_items: dict[str, SelectableItem],
-        eligible_internals: set[str],
-        *,
-        max_depth: int | None = None,
-    ) -> dict[str, tuple[float, list[str]]]:
-        """Expand children of *node_id* recursively, collecting items.
-
-        Children outside *active_items* (leaves) or *eligible_internals*
-        (internals) are skipped before scoring so excluded subtrees do
-        not consume backtracking work (issue #112 / #22).
-        """
-        depth_limit = max_depth if max_depth is not None else self._max_depth
-        result: dict[str, tuple[float, list[str]]] = {}
-        stack: list[tuple[float, str, list[str], int]] = [(base_score, node_id, base_path, 0)]
-        while stack:
-            score, nid, path, depth = stack.pop()
-            children = self._graph.successors(nid)
-            if not children or depth >= depth_limit:
-                if nid in active_items:
-                    result[nid] = (score, path[1:])
+        summary = history.last_result_summary
+        if not summary:
+            return None
+        sims: dict[str, float] = {}
+        for item_id, _ in scored:
+            idx = self._doc_id_to_idx.get(item_id)
+            if idx is None:
                continue
-            for child in sorted(children):
-                if not self._is_eligible_child(child, active_items, eligible_internals):
-                    continue
-                s = self._score_node(query, child)
-                new_path = path + [child]
-                if child in self._items:
-                    result[child] = (score + s, new_path[1:])
-                else:
-                    stack.append((score + s, child, new_path, depth + 1))
-        return result
+            sims[item_id] = self._retriever.score_one(summary, idx)
+        return sims


+``embedding_backend=`` argument is supplied.  Importing this module
+without the ``sentence-transformers`` dependency raises ``ImportError``
+with the exact install hint above — matching the convention used by
+:mod:`contextweaver.extras.otel`.
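
The import-time behaviour described in this docstring is a common pattern for optional extras. A minimal sketch, assuming a hypothetical helper named `require_optional`; the actual module may simply wrap its top-level import in a `try` / `except` instead.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str, package: str = "contextweaver") -> ModuleType:
    """Import an optional dependency or raise ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this feature. "
            f"Install it with: pip install '{package}[{extra}]'"
        ) from exc
```

Calling the helper inside the function that needs the dependency, rather than at package import time, is what keeps a default install free of the heavy packages.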


+    def __init__(
+        self,
+        backend: EmbeddingBackend,
+        *,
+        embedding_weight: float = 0.7,
+    ) -> None:
+        if not 0.0 <= embedding_weight <= 1.0:
+            raise ValueError(f"embedding_weight must be in [0.0, 1.0], got {embedding_weight}")
+        self._backend = backend


+2. ``rerank`` — :class:`~contextweaver.protocols.Reranker` re-orders the
+   shortlist.  Defaults to :class:`NoOpReranker` which leaves order


+    """Pluggable card-rendering stage of the routing pipeline.
+
+    A packer turns a ranked list of items into :class:`ChoiceCard` instances
+    (and optionally a rendered text block) within a token budget.  The
+    bundled default :class:`~contextweaver.routing.packer.DefaultCardPacker`
+    wraps :func:`contextweaver.routing.cards.make_choice_cards` +
+    :func:`contextweaver.routing.cards.render_cards_text`.
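
One possible reading of a "soft" token budget is sketched below. It is an assumption, not the documented `DefaultCardPacker` behaviour: the first card is always kept even if it alone exceeds the budget, later cards are added only while they fit, and `estimate_tokens` is a crude hypothetical stand-in for a real token counter.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in estimate: roughly four characters per token."""
    return max(1, len(text) // 4)


def pack_cards(cards: list[str], budget_tokens: int) -> list[str]:
    """Keep ranked cards in order until the next one would exceed the budget."""
    packed: list[str] = []
    used = 0
    for card in cards:
        cost = estimate_tokens(card)
        # Soft cap: never return an empty list just because the first card is large.
        if packed and used + cost > budget_tokens:
            break
        packed.append(card)
        used += cost
    return packed
```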


Copilot AI review requested due to automatic review settings May 17, 2026 18:02

Copilot started reviewing on behalf of dgenio May 17, 2026 18:02

Copilot AI reviewed May 17, 2026

dgenio wants to merge 1 commit into main from claude/triage-issues-VZT4U
