
feat(adapters,extras): CrewAI adapter + Mem0 external-memory backend (#193, #195) #258

Open
dgenio wants to merge 1 commit into main from claude/triage-issues-SS79Y

@dgenio (Owner) commented May 17, 2026

Lands Phase 1 of the two-issue "interop expansion" group selected by
the triage pass. Single combined PR per Mode B owner authorisation.

#193 — CrewAI adapter (representative for the framework-adapter epic)

  • adapters/crewai.py — thin stateless converter mirroring
    adapters/fastmcp.py's shape. Plain-dict core
    (crewai_tool_to_selectable, crewai_tools_to_catalog,
    infer_crewai_namespace) works without the [crewai] extra
    installed; load_crewai_catalog consumes live
    crewai.tools.BaseTool instances when the extra is available.
  • New [crewai] optional-dependency group (crewai>=0.80) +
    crewai>=0.80 added to [dev] so CI exercises the real upstream
    BaseTool wire shape end-to-end.
  • examples/crewai_adapter_demo.py is wired into `make example` and
    runs the plain-dict path, so it succeeds without the optional extra.
  • 14 tests in tests/test_adapters_crewai.py (namespace inference,
    dict + live BaseTool conversion, batch → Catalog, error paths).
  • docs/integration_crewai.md is the public guide; cookbook §5
    points at the demo + integration page.
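
The plain-dict path described in the list above can be illustrated with a small sketch. Everything here is an assumption for illustration: `ToolItem` stands in for contextweaver's `SelectableItem`, the namespace rule and the `crewai:` ID prefix are guesses, and the function names are deliberately different from the real `crewai_tool_to_selectable` / `crewai_tools_to_catalog` helpers, whose signatures are not shown in this PR description.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable, Mapping


@dataclass(frozen=True)
class ToolItem:
    """Hypothetical stand-in for contextweaver's SelectableItem."""

    id: str
    name: str
    description: str
    namespace: str
    metadata: dict[str, Any] = field(default_factory=dict)


def infer_namespace(tool_name: str, default: str = "crewai") -> str:
    """Assumed rule: text before the first '.', else before the first '_'."""
    for sep in (".", "_"):
        head, found, _rest = tool_name.partition(sep)
        if found and head:
            return head.lower()
    return default


def tool_dict_to_item(tool: Mapping[str, Any]) -> ToolItem:
    """Convert one plain-dict tool definition; no crewai import is needed."""
    name = str(tool.get("name", "")).strip()
    if not name:
        raise ValueError("tool definition needs a non-empty 'name'")
    known = {"name", "description"}
    return ToolItem(
        id=f"crewai:{name}",
        name=name,
        description=str(tool.get("description", "")),
        namespace=infer_namespace(name),
        # Unrecognised keys (e.g. cache_function) are preserved, not interpreted.
        metadata={k: v for k, v in tool.items() if k not in known},
    )


def tools_to_catalog(tools: Iterable[Mapping[str, Any]]) -> dict[str, ToolItem]:
    """Batch conversion that rejects duplicate IDs instead of overwriting."""
    catalog: dict[str, ToolItem] = {}
    for tool in tools:
        item = tool_dict_to_item(tool)
        if item.id in catalog:
            raise ValueError(f"duplicate tool id: {item.id}")
        catalog[item.id] = item
    return catalog


if __name__ == "__main__":
    catalog = tools_to_catalog(
        [
            {"name": "github.search_issues", "description": "Search issues"},
            {"name": "calculator", "description": "Evaluate arithmetic"},
        ]
    )
    for item in catalog.values():
        print(item.id, item.namespace)
```

Keeping the converter stateless and dict-based is what lets the demo run without the [crewai] extra; only the live-`BaseTool` loader would need the real package.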

Follow-ups (deferred per Lean scope-shape): Pydantic AI,
smolagents, and Agno adapters are tracked on the same issue.

#195 — Mem0 external-memory backend (representative)

  • extras/memory/mem0.py — Mem0EpisodicStore + Mem0FactStore
    implement the existing store.protocols.EpisodicStore / FactStore
    Protocols verbatim. No Protocol widening (per
    docs/agent-context/invariants.md and the issue's "don't widen
    the protocol" guidance). Writes go through
    Memory.add(infer=False) so the raw text is stored as-is;
    records are stamped with cw_episode_id / cw_fact_id metadata
    for canonical-ID resolution against mem0's generated UUIDs.
  • Module-load guarded import with friendly ImportError mirroring
    extras/otel.py:71-76.
  • New [mem0] optional-dependency group (mem0ai>=0.1) + mem0ai>=0.1
    added to [dev] so CI exercises the real Memory class signatures
    (LLM-touching methods stubbed via MagicMock(spec=Memory)).
  • 21 tests in tests/test_extras_memory_mem0.py (an always-run
    ImportError-message path plus functional tests behind the HAS_MEM0
    gate, covering both stores' full protocol surface).
  • docs/integration_memory.md is the decision-matrix doc covering
    Mem0 / Zep / LangMem; cookbook §6 ships the 8-line wiring
    snippet.
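
The metadata-stamping idea in the list above (write with infer=False, tag each record with cw_episode_id, then resolve that canonical ID back to the backend's generated UUID) can be sketched as follows. This is a minimal illustration, not the code in extras/memory/mem0.py: `FakeMemory` is a hypothetical in-memory stand-in whose method shapes are only loosely modelled on mem0's `Memory`, and `EpisodeStoreSketch` with its `put` / `get` / `delete` methods is an assumed interface rather than the real EpisodicStore Protocol.

```python
import uuid
from typing import Any, Optional


class FakeMemory:
    """Hypothetical stand-in; return shapes are assumptions, not mem0's API."""

    def __init__(self) -> None:
        self._rows: dict[str, dict[str, Any]] = {}

    def add(self, text: str, *, user_id: str, metadata: dict[str, Any],
            infer: bool = True) -> dict[str, Any]:
        mem_id = str(uuid.uuid4())  # backend-generated ID, not the caller's
        self._rows[mem_id] = {
            "id": mem_id,
            "memory": text,  # with infer=False the text is stored as-is
            "user_id": user_id,
            "metadata": dict(metadata),
        }
        return {"results": [{"id": mem_id, "memory": text}]}

    def get_all(self, *, user_id: str) -> dict[str, Any]:
        rows = [r for r in self._rows.values() if r["user_id"] == user_id]
        return {"results": rows}

    def delete(self, memory_id: str) -> None:
        self._rows.pop(memory_id, None)


class EpisodeStoreSketch:
    """Maps caller-chosen episode IDs onto backend UUIDs via metadata."""

    def __init__(self, memory: FakeMemory, user_id: str) -> None:
        self._memory = memory
        self._user_id = user_id

    def put(self, episode_id: str, text: str) -> None:
        self._memory.add(
            text,
            user_id=self._user_id,
            metadata={"cw_episode_id": episode_id},  # canonical-ID stamp
            infer=False,
        )

    def _find_row(self, episode_id: str) -> Optional[dict[str, Any]]:
        for row in self._memory.get_all(user_id=self._user_id)["results"]:
            if (row.get("metadata") or {}).get("cw_episode_id") == episode_id:
                return row
        return None

    def get(self, episode_id: str) -> Optional[str]:
        row = self._find_row(episode_id)
        return None if row is None else row["memory"]

    def delete(self, episode_id: str) -> None:
        row = self._find_row(episode_id)
        if row is not None:  # deleting an unknown ID is a no-op
            self._memory.delete(row["id"])
```

The linear scan in `_find_row` keeps the sketch short; a real backend would more plausibly filter on metadata server-side or cache the ID mapping.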

Follow-ups: Zep and LangMem backends share the same Protocol shape;
their rows are already listed in the decision matrix and interop.md.
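
The guarded import with a friendly ImportError mentioned in the #195 list is a common pattern for optional extras, and follow-up backends would likely reuse it. A minimal sketch, assuming a helper named `require_optional` (hypothetical) and assuming the distribution is installable as `contextweaver`; the real module may simply wrap a plain `import` in `try` / `except` at load time:

```python
import importlib
from types import ModuleType


def require_optional(module: str, extra: str,
                     package: str = "contextweaver") -> ModuleType:
    """Import an optional dependency or fail with an actionable message."""
    try:
        return importlib.import_module(module)
    except ImportError as exc:
        # Chain the original error so the underlying cause stays visible.
        raise ImportError(
            f"{module!r} is required for this backend. "
            f"Install it with: pip install '{package}[{extra}]'"
        ) from exc
```

A backend module would call something like `require_optional("mem0", "mem0")` once at import time, so a missing extra fails immediately with install instructions rather than later with an opaque AttributeError.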

Shared bookkeeping

  • mkdocs.yml + docs/interop.md interop matrix gain the new rows.
  • README.md Framework Integrations table updated in both
    occurrences (lines 222-242 and 357-367) plus the FAQ list at
    lines 531-545.
  • AGENTS.md Module Map gains 4 new entries
    (adapters/crewai.py, extras/otel.py, extras/memory/,
    extras/memory/mem0.py).
  • CHANGELOG.md [Unreleased] entries for both issues.

Test-quality fix (Mode B latitude, documented delta)

  • test_module_does_not_import_provider_sdk_at_load_time in the
    three provider-message test files now spawns a fresh
    subprocess instead of asserting against the in-session
    sys.modules. The previous form was sensitive to whatever any
    other test had already imported (crewai → openai is the
    concrete case once [dev] pulls crewai in). The invariant the
    tests are checking is unchanged; the assertion is now
    independent of test ordering and installed extras.
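
The subprocess approach described above can be sketched with the standard library alone. This is an illustration of the technique, not the test code in this PR: `modules_after_import` is a hypothetical helper, and `json` / `wave` are stand-ins for "the module under test" and "the provider SDK that must not be loaded".

```python
import subprocess
import sys

# Runs in the child interpreter: import the target, then list sys.modules.
_PROBE = (
    "import importlib, sys\n"
    "importlib.import_module(sys.argv[1])\n"
    "print('\\n'.join(sorted(sys.modules)))\n"
)


def modules_after_import(module: str) -> set[str]:
    """Import `module` in a fresh interpreter and return its sys.modules keys."""
    result = subprocess.run(
        [sys.executable, "-c", _PROBE, module],
        capture_output=True,
        text=True,
        check=True,  # a failing import in the child raises CalledProcessError
    )
    return set(result.stdout.split())


def test_module_does_not_import_unrelated_module_at_load_time() -> None:
    loaded = modules_after_import("json")
    assert "json" in loaded
    # Stand-in for the SDK check; the parent's sys.modules is irrelevant here.
    assert "wave" not in loaded
```

Because the child starts with a clean `sys.modules`, the result no longer depends on what earlier tests imported in the parent session, which is the ordering sensitivity the previous in-session assertion suffered from.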

Module-size note

extras/memory/mem0.py lands at 461 lines, over the soft 300-line
guide, in line with the adapters/mcp.py (401) /
adapters/proxy_runtime.py (462) precedent. Mode B authorised the
modest overrun rather than splitting one cohesive backend
(Mem0EpisodicStore + Mem0FactStore share three private helpers)
across two files for fragmentation's sake.

Verification

ruff format --check src/ tests/ examples/ scripts/ → clean
ruff check src/ tests/ examples/ scripts/ → clean
mypy src/ → 0 issues / 80 files
python -m pytest -q → 1182 passed, 6 skipped (+35 new tests, including 3 subprocess-isolated)
scripts/gen_schemas.py --check → schemas up to date
scripts/render_scorecard.py --check → clean (no benchmark drift)
make example → all 15 scripts ran
make demo → Demo complete

Closes neither issue outright; both remain open with this
PR linked. Each issue's follow-up backends (Pydantic AI /
smolagents / Agno for #193; Zep / LangMem for #195) share the
landed Protocol surface and shim shape.

Copilot AI review requested due to automatic review settings May 17, 2026 22:03

Copilot AI left a comment


Pull request overview

This PR expands contextweaver interop with a CrewAI tool adapter and a Mem0-backed external memory implementation, with accompanying docs, examples, tests, and optional dependency groups.

Changes:

  • Adds contextweaver.adapters.crewai for CrewAI tool definitions → SelectableItem / Catalog.
  • Adds contextweaver.extras.memory.mem0 implementing episodic and fact store protocols over Mem0.
  • Updates integration docs, cookbook, interop matrix, README, examples, dependency extras, and provider-SDK import tests.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.

Show a summary per file
| File | Description |
|---|---|
| src/contextweaver/adapters/crewai.py | New CrewAI adapter conversion helpers. |
| src/contextweaver/adapters/__init__.py | Re-exports CrewAI adapter helpers. |
| src/contextweaver/extras/memory/mem0.py | New Mem0 episodic/fact store backend. |
| src/contextweaver/extras/memory/__init__.py | Documents external-memory subpackage. |
| src/contextweaver/extras/__init__.py | Adds Mem0 memory extra to extras overview. |
| tests/test_adapters_crewai.py | Adds CrewAI adapter tests. |
| tests/test_extras_memory_mem0.py | Adds Mem0 backend tests. |
| tests/test_adapters_openai_messages.py | Isolates provider SDK leak test in subprocess. |
| tests/test_adapters_anthropic_messages.py | Isolates provider SDK leak test in subprocess. |
| tests/test_adapters_gemini_contents.py | Isolates provider SDK leak test in subprocess. |
| examples/crewai_adapter_demo.py | Adds runnable CrewAI adapter demo. |
| Makefile | Wires CrewAI demo into make example. |
| pyproject.toml | Adds [crewai], [mem0], dev deps, and mypy ignores. |
| docs/integration_crewai.md | Adds CrewAI integration guide. |
| docs/integration_memory.md | Adds external memory backend guide. |
| docs/cookbook.md | Adds CrewAI and external-memory recipes. |
| docs/interop.md | Updates interop matrix with CrewAI and memory rows. |
| README.md | Adds CrewAI and Mem0 links to integration sections. |
| mkdocs.yml | Adds new docs pages to navigation. |
| CHANGELOG.md | Records new adapter/backend and test changes. |
| AGENTS.md | Updates module map for CrewAI and memory extras. |

Comment on lines +102 to +103
bundle = StoreBundle(episodic=episodic, facts=facts)
ctx_mgr = ContextManager(store_bundle=bundle)
Comment thread docs/cookbook.md
Comment on lines +311 to +314
    episodic=Mem0EpisodicStore(memory, user_id="agent:support-bot"),
    facts=Mem0FactStore(memory, user_id="agent:support-bot"),
)
ctx_mgr = ContextManager(store_bundle=bundle)
Comment on lines +138 to +144
    item = ctx_mgr.ingest_tool_result(
        tool_name=tool.name,
        content=str(raw),
    )
    # Return the firewalled summary; the raw bytes stay addressable
    # in ctx_mgr.artifact_store and are accessible via drilldown.
    return item.text
Comment on lines +117 to +118
- ``cache_function`` (optional): CrewAI cache predicate; preserved in
``metadata`` only — not exercised by contextweaver's routing.
Comment on lines +177 to +179
If you need the original description without the preamble, parse it out
of `item.metadata` or convert from a plain dict via
`crewai_tool_to_selectable({...})`.
Comment on lines +350 to +351
if mem_id is not None:
    self._memory.delete(mem_id)
@github-actions

Benchmark delta (vs main)

Soft regression feedback only — this comment never blocks the PR.
Latency budget: ⚠️ when head > base × 1.3. Accuracy budget: ⚠️ when head < base - 1pp.
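
The two budget rules above reduce to a pair of comparisons. A minimal sketch, assuming recall@k and MRR are reported as fractions so that 1pp equals 0.01; the function names are illustrative and are not taken from the benchmark scripts:

```python
def latency_over_budget(head_ms: float, base_ms: float,
                        factor: float = 1.3) -> bool:
    """Warn when head latency exceeds base latency times the factor."""
    return head_ms > base_ms * factor


def accuracy_under_budget(head: float, base: float,
                          tolerance: float = 0.01) -> bool:
    """Warn when head accuracy is more than one percentage point below base."""
    return head < base - tolerance
```

With the size-1000 routing figures in this report (head 42.776 ms, base 31.897 ms), the latency check trips because 31.897 × 1.3 ≈ 41.47 ms, which is below the head value. The bm25 size-1000 row (head 89.023 ms, base 78.368 ms) stays within budget because its threshold is about 101.88 ms.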

Routing summary (single backend × catalog sizes)

| size | recall@k (head Δ vs base) | MRR (head Δ vs base) | p99 (ms) |
|---|---|---|---|
| 50 | ✅ 0.5649 (+0.0000) | ✅ 0.4978 (+0.0000) | ✅ 0.501 (base 0.463) |
| 83 | ✅ 0.3825 (+0.0000) | ✅ 0.3242 (+0.0000) | ✅ 0.711 (base 0.876) |
| 1000 | ✅ 0.1475 (+0.0000) | ✅ 0.1456 (+0.0000) | ⚠️ 42.776 (base 31.897) |

Per-backend × per-size matrix

| backend | size | recall@k (Δ) | MRR (Δ) | p99 (ms) |
|---|---|---|---|---|
| bm25 | 100 | ✅ 0.3825 (+0.0000) | ✅ 0.3399 (+0.0000) | ✅ 6.492 (base 5.642) |
| bm25 | 500 | ✅ 0.2250 (+0.0000) | ✅ 0.2165 (+0.0000) | ✅ 31.579 (base 27.538) |
| bm25 | 1000 | ✅ 0.1575 (+0.0000) | ✅ 0.1525 (+0.0000) | ✅ 89.023 (base 78.368) |
| fuzzy | 100 | ✅ 0.0000 (+0.0000) | ✅ 0.0000 (+0.0000) | ✅ 0.000 (base 0.000) |
| fuzzy | 500 | ✅ 0.0000 (+0.0000) | ✅ 0.0000 (+0.0000) | ✅ 0.000 (base 0.000) |
| fuzzy | 1000 | ✅ 0.0000 (+0.0000) | ✅ 0.0000 (+0.0000) | ✅ 0.000 (base 0.000) |
| tfidf | 100 | ✅ 0.3825 (+0.0000) | ✅ 0.3220 (+0.0000) | ✅ 1.026 (base 0.872) |
| tfidf | 500 | ✅ 0.2325 (+0.0000) | ✅ 0.2314 (+0.0000) | ✅ 9.534 (base 8.660) |
| tfidf | 1000 | ✅ 0.1475 (+0.0000) | ✅ 0.1456 (+0.0000) | ✅ 37.248 (base 30.071) |

Context pipeline (per scenario)

| scenario | tokens | dropped | dedup |
|---|---|---|---|
| large_catalog | 1514 (base 1514, Δ+0) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| long_conversation | 2548 (base 2548, Δ+0) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| short_conversation | 496 (base 496, Δ+0) | 0 (base 0, Δ+0) | 0 (base 0, Δ+0) |
| stress_conversation | 6651 (base 6651, Δ+0) | 7 (base 7, Δ+0) | 4 (base 4, Δ+0) |

Numbers come from make benchmark / make benchmark-matrix.
Latency is hardware-dependent — treat the markers as a rough guide.
See benchmarks/scorecard.md for the full picture.
