feat(openclaw): remove LLM dependency via agent-driven extraction#20

Open
yyiilluu wants to merge 1 commit into main from feat/openclaw-llm-free-integration
@yyiilluu
Contributor

Summary

  • Remove the LLM provider API key requirement from the OpenClaw integration. Previously, setup forced users to configure an LLM provider in ~/.reflexio/.env because the Reflexio server ran LLM-based extraction when the hook published captured conversations. OpenClaw agents are already LLM-powered, so we move extraction into the agent's own session and keep the server on pure CRUD + semantic search.
  • Hook becomes search-only. No SQLite buffering, no message:sent, no command:stop, no /api/publish_interaction. Only agent:bootstrap profile injection and message:received playbook retrieval remain. better-sqlite3 dropped from dependencies.
  • /reflexio-extract is now agent-driven. The slash command embeds the v3.0.0 extraction rubric in its SKILL.md; the agent applies the rubric to its current conversation, then upserts each extracted playbook via reflexio user-playbooks: search first, then update on a match, else add.
  • /reflexio-aggregate deleted. Cross-instance aggregation requires server-side LLM clustering and has no agent-driven equivalent. Teams that need it can use Managed Reflexio or the Claude Code integration.
  • Server-side fallback for missing embeddings. save_user_playbooks tolerates _get_embedding failures and stores SQL NULL in the embedding column (FTS still populated, vec table skipped). Reads already fall back to FTS-only ranking via unified_search_service. The NULL marker leaves the door open for a dedicated re-embed migration.
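
To illustrate the read-side behaviour described in the last bullet, here is a minimal sketch of a ranker that degrades to text-only scoring when no query embedding is available. It is not the code in unified_search_service: the function names, the row shape, and the token-overlap score (a stand-in for SQLite FTS ranking) are all assumptions made for the example.

```python
import math
from typing import Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query tokens found in the text (hypothetical FTS stand-in)."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    return len(query_tokens & text_tokens) / len(query_tokens) if query_tokens else 0.0


def rank(query: str, rows: list, query_embedding: Optional[Sequence[float]] = None) -> list:
    """Return row ids ordered by score, best first.

    The vector term is added only when BOTH the query and the row have an
    embedding; rows stored with a NULL embedding are still reachable by text.
    """
    scored = []
    for row in rows:
        score = keyword_score(query, row["content"])
        if query_embedding is not None and row.get("embedding"):
            score += cosine(query_embedding, row["embedding"])
        if score > 0:
            scored.append((score, row["id"]))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [row_id for _, row_id in scored]
```

With no query embedding, only rows whose text overlaps the query are returned, which mirrors the "FTS-only ranking" path; supplying an embedding lets rows with vectors contribute a similarity term as well.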

Changes

Hook (reflexio/integrations/openclaw/hook/)

  • handler.js — remove publishSession, SQLite store, smartTruncate, getSessionId fallback, message:sent buffering, command:stop flush, and bootstrap retry-unpublished. Keep apiPost, formatSearchResults, resolveUserId, bootstrap profile injection, per-message search.
  • package.json — bump to 3.0.0, drop better-sqlite3 dependency, update description.
  • package-lock.json — regenerated clean (empty deps).
  • HOOK.md — update frontmatter events to ["agent:bootstrap", "message:received"]. Rewrite Prerequisites and Privacy sections to match the single-hop localhost model.

Commands (reflexio/integrations/openclaw/commands/)

  • reflexio-extract/SKILL.md — rewrite around the new extract→search→upsert flow. Embed a condensed v3.0.0 rubric (Correction SOPs + Success Path Recipes, trigger quality tests, output schema, tautology check). Explicitly note the canonical rubric file at reflexio/server/prompt/prompt_bank/playbook_extraction_context/v3.0.0.prompt.md is a maintainer reference only and is not shipped with the ClawHub bundle.
  • reflexio-aggregate/ — directory deleted.
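
The search, then update-or-add decision at the heart of the new /reflexio-extract flow can be sketched as follows. This is an illustrative in-memory model, not the command's implementation: the real flow goes through the reflexio user-playbooks CLI and semantic search, whereas the `InMemoryPlaybooks` class, the `Playbook` fields, and matching on a normalized trigger string are assumptions chosen to keep the example self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Playbook:
    trigger: str   # when the playbook applies
    content: str   # what the agent should do


@dataclass
class InMemoryPlaybooks:
    """Hypothetical stand-in for the server's user-playbook CRUD surface."""
    rows: Dict[int, Playbook] = field(default_factory=dict)
    next_id: int = 1

    def search(self, trigger: str) -> Optional[int]:
        # Assumption: an exact match on the normalized trigger counts as a hit.
        wanted = trigger.strip().lower()
        for playbook_id, playbook in self.rows.items():
            if playbook.trigger.strip().lower() == wanted:
                return playbook_id
        return None

    def add(self, playbook: Playbook) -> int:
        playbook_id = self.next_id
        self.rows[playbook_id] = playbook
        self.next_id += 1
        return playbook_id

    def update(self, playbook_id: int, playbook: Playbook) -> None:
        self.rows[playbook_id] = playbook


def upsert(store: InMemoryPlaybooks, playbook: Playbook) -> str:
    """Search first; update on a match, otherwise add. Returns the action taken."""
    match = store.search(playbook.trigger)
    if match is None:
        store.add(playbook)
        return "add"
    store.update(match, playbook)
    return "update"
```

Running the extraction twice on the same learning should therefore produce one add followed by one update, rather than two near-duplicate playbooks.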

Skill, rules, README, TESTING

  • skill/SKILL.md — replace the "two distinct network hops" privacy framing with "single hop, localhost only." Rewrite "Step-by-Step: When to Publish" as "When to Persist a Learning" using /reflexio-extract. Drop references to reflexio publish, publish_interaction, /reflexio-aggregate, and any LLM-provider language. Update command reference table.
  • rules/reflexio.md — simplify transparency section to one sentence. Drop ~/.reflexio/.env and LLM-provider language. Instruct the agent to run /reflexio-extract at natural milestones.
  • README.md — replace the 3-subgraph mermaid diagram with 2 subgraphs (Retrieve, Persist). Drop Scheduled Aggregation, Agent Playbooks, and LLM-key prerequisite paragraphs. Update comparison table: "Server dependencies" row now reads "no LLM provider key required" for OpenClaw.
  • TESTING.md — update Phase 3 to describe explicit /reflexio-extract instead of implicit command:stop flushing. Remove Phase 5.2 /reflexio-aggregate test.
  • publish_clawhub.sh — rewrite the embedded SKILL.md privacy disclosure and First-Use Setup blocks to match the new model. Drop the "step 2 is interactive — it prompts you for an LLM provider" warning.

CLI setup wizard (reflexio/cli/commands/setup_cmd.py)

  • openclaw() command: skip _prompt_llm_provider and _prompt_embedding_provider entirely. Only prompt for storage backend. Docstring and summary output updated.

Storage (reflexio/server/services/storage/sqlite_storage/_playbook.py)

  • save_user_playbooks: wrap _get_embedding (both the parallel-executor and serial branches) in try/except. On failure, log a warning and assign up.embedding = []. At INSERT time, pass _json_dumps(up.embedding or None) so the DB column gets SQL NULL. The existing if up.embedding: guard correctly skips _vec_upsert.
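
A minimal sketch of the tolerant write path, assuming a simplified schema: the `playbooks` and `playbook_vec` tables, the `save_playbook` helper, and the `failing_embedder` callable are all hypothetical, and a plain table stands in for the real vec index. The point it illustrates is that a failed embedding call results in SQL NULL in the column and no vector row, while the playbook itself is still saved.

```python
import json
import logging
import sqlite3
from typing import Callable, List

logger = logging.getLogger(__name__)


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE playbooks (id INTEGER PRIMARY KEY, content TEXT NOT NULL, embedding TEXT)"
    )
    conn.execute(
        "CREATE TABLE playbook_vec (playbook_id INTEGER PRIMARY KEY, embedding TEXT)"
    )
    return conn


def failing_embedder(text: str) -> List[float]:
    """Simulates a server with no embedding provider configured."""
    raise RuntimeError("no embedding provider configured")


def save_playbook(conn: sqlite3.Connection, content: str,
                  embed: Callable[[str], List[float]]) -> int:
    try:
        embedding = embed(content)
    except Exception as exc:
        logger.warning("Embedding failed; saving without a vector: %s", exc)
        embedding = []

    # An empty embedding becomes Python None, which sqlite3 binds as SQL NULL.
    stored = json.dumps(embedding) if embedding else None
    cursor = conn.execute(
        "INSERT INTO playbooks (content, embedding) VALUES (?, ?)", (content, stored)
    )
    playbook_id = cursor.lastrowid

    # Only rows with a real embedding get a vector entry.
    if embedding:
        conn.execute(
            "INSERT INTO playbook_vec (playbook_id, embedding) VALUES (?, ?)",
            (playbook_id, stored),
        )
    conn.commit()
    return playbook_id
```

Storing NULL rather than an empty JSON array keeps "never embedded" distinguishable from any real value, which is what makes a later `WHERE embedding IS NULL` query reliable.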

Tests (tests/server/services/storage/test_sqlite_storage.py)

  • test_save_user_playbooks_tolerates_embedding_failure — patches _get_embedding to raise RuntimeError, saves a playbook, and asserts: the row is persisted with SQL NULL embedding, the FTS row is populated, the vec table has no entry, and FTS-only search returns the playbook.
  • test_save_user_playbooks_tolerates_embedding_failure_with_expansion — same shape, but additionally patches _should_expand_documents to True so the parallel-executor branch is exercised.

Diagrams

flowchart TD
    subgraph "Before: Capture → Extract → Aggregate (server-side LLM)"
        B1["Hook buffers turns to SQLite"] --> B2["command:stop → POST /api/publish_interaction"]
        B2 --> B3["Server LLM extracts playbooks"]
        B3 --> B4["Server LLM aggregates into agent playbooks"]
        B4 --> B5["reflexio search"]
        B5 --> B6["Hook injects REFLEXIO_CONTEXT.md"]
    end
flowchart TD
    subgraph "After: Retrieve + agent-driven Persist"
        A1["message:received → /api/search"] --> A2["Hook injects REFLEXIO_CONTEXT.md"]
        A3["User runs /reflexio-extract"] --> A4["Agent applies v3.0.0 rubric in-context"]
        A4 --> A5["reflexio user-playbooks search"]
        A5 -->|match| A6["reflexio user-playbooks update"]
        A5 -->|no match| A7["reflexio user-playbooks add"]
        A6 --> A1
        A7 --> A1
    end

Test Plan

Automated:

  • uv run ruff check on all modified Python — clean.
  • uv run pyright on modified source files (_playbook.py, setup_cmd.py) — 0 errors. (The 10 _FakeRow/sqlite3.Row pyright errors in test_sqlite_storage.py are pre-existing on main, unrelated to this PR.)
  • uv run pytest on affected suites — 327 passed:
    • tests/cli/ (all CLI subcommands including test_setup_cmd.py)
    • tests/lib/test_user_playbook_unit.py
    • tests/server/services/storage/test_sqlite_storage.py (includes two new tests)
    • tests/server/services/storage/test_storage_contract_playbook.py
    • tests/e2e_tests/test_openclaw_integration.py
    • tests/server/services/playbook/test_playbook_generation_service.py

Manual verification recommended before merge:

  • reflexio setup openclaw on a fresh machine with no ~/.reflexio/.env — wizard completes without asking for an LLM provider key; hook, skill, /reflexio-extract command, and workspace rule land in ~/.openclaw/.
  • Start an OpenClaw session and run a task that triggers a correction. Invoke /reflexio-extract — verify the agent runs reflexio user-playbooks search --agent-version openclaw-agent followed by add (or update on a re-run). Confirm with reflexio user-playbooks list --agent-version openclaw-agent --limit 10.
  • Verify the hook does not create ~/.reflexio/sessions.db (no SQLite buffering).
  • Move ~/.reflexio/.env aside, restart the server, and repeat the extract flow — both the write and the subsequent search must still succeed (FTS-only ranking).
  • ./publish_clawhub.sh --stage-only — verify the staged SKILL.md does not mention a required LLM provider key, and the bundle contains no reflexio-aggregate/ directory.

Follow-ups (not in this PR)

  • eval/dataset.json + eval/runner.py still exercise the old publish_interaction endpoint (excluded from the ClawHub bundle via .clawhubignore, so no user-facing impact). Worth updating in a separate PR to test the new CRUD flow.
  • A dedicated "re-embed stored user playbooks" CLI command would help users who start on an LLM-free setup and later configure an embedding provider — the new NULL column marker lets such a migration target rows via WHERE embedding IS NULL, but the command itself does not exist yet.
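
For the second follow-up, a re-embed pass could look roughly like the sketch below. No such command exists in this PR; the `reembed_missing` function, the `playbooks` table layout, and the `embed` callable are hypothetical, and a real migration would also need to populate the vec table.

```python
import json
import sqlite3
from typing import Callable, Sequence


def reembed_missing(conn: sqlite3.Connection,
                    embed: Callable[[str], Sequence[float]],
                    batch_size: int = 100) -> int:
    """Fill in embeddings for rows saved while no provider was configured.

    Assumes a hypothetical table `playbooks(id, content, embedding)` where a
    NULL embedding marks a row that was never embedded. Returns the number of
    rows updated. If `embed` raises, the exception propagates and batches that
    were already committed stay committed.
    """
    updated = 0
    while True:
        rows = conn.execute(
            "SELECT id, content FROM playbooks "
            "WHERE embedding IS NULL ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            break
        for playbook_id, content in rows:
            vector = list(embed(content))
            conn.execute(
                "UPDATE playbooks SET embedding = ? WHERE id = ?",
                (json.dumps(vector), playbook_id),
            )
            updated += 1
        conn.commit()
    return updated
```

Working in batches keeps each transaction short, and because every updated row stops matching `embedding IS NULL`, the pass can be interrupted and rerun without redoing finished rows.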

Refactor the OpenClaw integration so the Reflexio server no longer needs
an LLM provider API key. Playbook extraction now runs in the agent's own
session via /reflexio-extract (applies the v3.0.0 rubric in-context, then
upserts via direct CRUD on reflexio user-playbooks). The hook becomes
search-only — no SQLite buffering, no /api/publish_interaction, no
message:sent / command:stop handlers.

Scope of changes:

- Hook (handler.js, HOOK.md, package.json): drop publishSession, SQLite,
  better-sqlite3 dependency, message:sent, command:stop. Keep
  agent:bootstrap profile injection and message:received search. Bump to
  v3.0.0 and regenerate package-lock.json.
- Commands: rewrite /reflexio-extract around extract-rubric → search →
  add/update. Delete /reflexio-aggregate (server-side LLM clustering has
  no agent-driven equivalent). Embed the v3.0.0 rubric verbatim in the
  command file since the canonical source is not shipped with the
  ClawHub bundle.
- Skill / rules / README / TESTING / publish_clawhub.sh: rewrite the
  privacy story from two network hops to single-hop localhost; drop the
  "LLM provider key required" disclosure.
- CLI setup wizard (setup_cmd.py::openclaw): skip _prompt_llm_provider
  and _prompt_embedding_provider; only prompt for storage backend.
- Storage (sqlite_storage/_playbook.py): make save_user_playbooks
  tolerate missing embeddings — log a warning and save with SQL NULL in
  the embedding column so a future re-embed migration can target these
  rows via WHERE embedding IS NULL. Reads already fall back to FTS-only
  via unified_search_service when no query embedding is available.
- Tests: add regression coverage for the missing-embedding write path
  (expansion and non-expansion branches); verify NULL column, FTS
  population, vec table stays empty, and FTS search still finds the row.