feat(openclaw): remove LLM dependency via agent-driven extraction #20
Refactor the OpenClaw integration so the Reflexio server no longer needs an LLM provider API key. Playbook extraction now runs in the agent's own session via `/reflexio-extract` (applies the v3.0.0 rubric in-context, then upserts via direct CRUD on `reflexio user-playbooks`). The hook becomes search-only: no SQLite buffering, no `/api/publish_interaction`, no `message:sent` / `command:stop` handlers.

Scope of changes:

- Hook (`handler.js`, `HOOK.md`, `package.json`): drop `publishSession`, SQLite, the `better-sqlite3` dependency, `message:sent`, and `command:stop`. Keep `agent:bootstrap` profile injection and `message:received` search. Bump to v3.0.0 and regenerate `package-lock.json`.
- Commands: rewrite `/reflexio-extract` around extract-rubric → search → add/update. Delete `/reflexio-aggregate` (server-side LLM clustering has no agent-driven equivalent). Embed the v3.0.0 rubric verbatim in the command file, since the canonical source is not shipped with the ClawHub bundle.
- Skill / rules / README / TESTING / `publish_clawhub.sh`: rewrite the privacy story from two network hops to single-hop localhost; drop the "LLM provider key required" disclosure.
- CLI setup wizard (`setup_cmd.py::openclaw`): skip `_prompt_llm_provider` and `_prompt_embedding_provider`; only prompt for the storage backend.
- Storage (`sqlite_storage/_playbook.py`): make `save_user_playbooks` tolerate missing embeddings: log a warning and save with SQL NULL in the embedding column, so a future re-embed migration can target these rows via `WHERE embedding IS NULL`. Reads already fall back to FTS-only via `unified_search_service` when no query embedding is available.
- Tests: add regression coverage for the missing-embedding write path (expansion and non-expansion branches); verify the NULL column, FTS population, that the vec table stays empty, and that FTS search still finds the row.
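The storage bullet above can be sketched as follows. This is a minimal model with a hypothetical table layout and function signatures; the real implementation lives in `sqlite_storage/_playbook.py`, uses `_get_embedding` and `_json_dumps`, and also handles the vec-table upsert.

```python
import json
import sqlite3


def _json_dumps(value):
    # Mirrors the helper described in the PR: None maps to SQL NULL,
    # not the JSON string "null".
    return None if value is None else json.dumps(value)


def save_user_playbooks(conn, playbooks, get_embedding):
    """Sketch of the tolerant write path (hypothetical signature)."""
    for pb in playbooks:
        try:
            embedding = get_embedding(pb["content"])
        except Exception as exc:
            # Provider failure degrades gracefully: warn, store SQL NULL,
            # and rely on FTS-only ranking at read time.
            print(f"warning: embedding failed for {pb['id']}: {exc}")
            embedding = []
        conn.execute(
            "INSERT OR REPLACE INTO user_playbooks (id, content, embedding) "
            "VALUES (?, ?, ?)",
            (pb["id"], pb["content"], _json_dumps(embedding or None)),
        )
        # Same shape as the real `if up.embedding:` guard: no embedding means
        # no vec-table upsert, so a later migration can find these rows with
        # WHERE embedding IS NULL.
    conn.commit()
```

The `embedding or None` trick is what turns the empty-list failure marker into a true SQL NULL at INSERT time.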
## Summary
- An LLM provider key in `~/.reflexio/.env` was previously required, because the Reflexio server ran LLM-based extraction when the hook published captured conversations. OpenClaw agents are already LLM-powered, so we move extraction into the agent's own session and keep the server on pure CRUD + semantic search.
- The hook is now search-only: no `message:sent`, no `command:stop`, no `/api/publish_interaction`. Only `agent:bootstrap` profile injection and `message:received` playbook retrieval remain. `better-sqlite3` dropped from dependencies.
- `/reflexio-extract` is now agent-driven. The slash command embeds the v3.0.0 extraction rubric in its SKILL.md; the agent applies the rubric to its current conversation, then upserts each extracted playbook via `reflexio user-playbooks search` → `update` on match, else `add`.
- `/reflexio-aggregate` deleted. Cross-instance aggregation requires server-side LLM clustering and has no agent-driven equivalent. Teams that need it can use Managed Reflexio or the Claude Code integration.
- `save_user_playbooks` tolerates `_get_embedding` failures and stores SQL NULL in the embedding column (FTS still populated, vec table skipped). Reads already fall back to FTS-only ranking via `unified_search_service`. The NULL marker leaves the door open for a dedicated re-embed migration.

## Changes
### Hook (`reflexio/integrations/openclaw/hook/`)

- `handler.js` — remove `publishSession`, the SQLite store, `smartTruncate`, the `getSessionId` fallback, `message:sent` buffering, `command:stop` flush, and bootstrap retry-unpublished. Keep `apiPost`, `formatSearchResults`, `resolveUserId`, bootstrap profile injection, and per-message search.
- `package.json` — bump to `3.0.0`, drop the `better-sqlite3` dependency, update the description.
- `package-lock.json` — regenerated clean (empty deps).
- `HOOK.md` — update frontmatter `events` to `["agent:bootstrap", "message:received"]`. Rewrite the Prerequisites and Privacy sections to match the single-hop localhost model.

### Commands (`reflexio/integrations/openclaw/commands/`)

- `reflexio-extract/SKILL.md` — rewrite around the new extract → search → upsert flow. Embed a condensed v3.0.0 rubric (Correction SOPs + Success Path Recipes, trigger quality tests, output schema, tautology check). Explicitly note that the canonical rubric file at `reflexio/server/prompt/prompt_bank/playbook_extraction_context/v3.0.0.prompt.md` is a maintainer reference only and is not shipped with the ClawHub bundle.
- `reflexio-aggregate/` — directory deleted.

### Skill, rules, README, TESTING

- `skill/SKILL.md` — replace the "two distinct network hops" privacy framing with "single hop, localhost only." Rewrite "Step-by-Step: When to Publish" as "When to Persist a Learning" using `/reflexio-extract`. Drop references to `reflexio publish`, `publish_interaction`, `/reflexio-aggregate`, and any LLM-provider language. Update the command reference table.
- `rules/reflexio.md` — simplify the transparency section to one sentence. Drop `~/.reflexio/.env` and LLM-provider language. Instruct the agent to run `/reflexio-extract` at natural milestones.
- `README.md` — replace the 3-subgraph mermaid diagram with 2 subgraphs (Retrieve, Persist). Drop the Scheduled Aggregation, Agent Playbooks, and LLM-key prerequisite paragraphs. Update the comparison table: the "Server dependencies" row now reads "no LLM provider key required" for OpenClaw.
- `TESTING.md` — update Phase 3 to describe explicit `/reflexio-extract` instead of implicit `command:stop` flushing. Remove the Phase 5.2 `/reflexio-aggregate` test.
- `publish_clawhub.sh` — rewrite the embedded SKILL.md privacy disclosure and First-Use Setup blocks to match the new model. Drop the "step 2 is interactive — it prompts you for an LLM provider" warning.

### CLI setup wizard (`reflexio/cli/commands/setup_cmd.py`)

- `openclaw()` command: skip `_prompt_llm_provider` and `_prompt_embedding_provider` entirely. Only prompt for the storage backend. Docstring and summary output updated.

### Storage (`reflexio/server/services/storage/sqlite_storage/_playbook.py`)

- `save_user_playbooks`: wrap `_get_embedding` (both the parallel-executor and serial branches) in try/except. On failure, log a warning and assign `up.embedding = []`. At INSERT time, pass `_json_dumps(up.embedding or None)` so the DB column gets SQL NULL. The existing `if up.embedding:` guard correctly skips `_vec_upsert`.

### Tests (`tests/server/services/storage/test_sqlite_storage.py`)

- `test_save_user_playbooks_tolerates_embedding_failure` — patches `_get_embedding` to raise `RuntimeError`, saves a playbook, and asserts: the row is persisted with a SQL NULL embedding, the FTS row is populated, the vec table has no entry, and FTS-only search returns the playbook.
- `test_save_user_playbooks_tolerates_embedding_failure_with_expansion` — same shape, but additionally patches `_should_expand_documents` to True so the parallel-executor branch is exercised.

### Diagrams
```mermaid
flowchart TD
    subgraph "Before: Capture → Extract → Aggregate (server-side LLM)"
        B1["Hook buffers turns to SQLite"] --> B2["command:stop → POST /api/publish_interaction"]
        B2 --> B3["Server LLM extracts playbooks"]
        B3 --> B4["Server LLM aggregates into agent playbooks"]
        B4 --> B5["reflexio search"]
        B5 --> B6["Hook injects REFLEXIO_CONTEXT.md"]
    end
```

```mermaid
flowchart TD
    subgraph "After: Retrieve + agent-driven Persist"
        A1["message:received → /api/search"] --> A2["Hook injects REFLEXIO_CONTEXT.md"]
        A3["User runs /reflexio-extract"] --> A4["Agent applies v3.0.0 rubric in-context"]
        A4 --> A5["reflexio user-playbooks search"]
        A5 -->|match| A6["reflexio user-playbooks update"]
        A5 -->|no match| A7["reflexio user-playbooks add"]
        A6 --> A1
        A7 --> A1
    end
```

## Test Plan
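As a minimal, self-contained model of what the new tests assert (a NULL embedding column plus a still-searchable FTS row), the following sketch uses a hypothetical schema; the real tests patch `_get_embedding` inside `test_sqlite_storage.py` against the actual storage layer.

```python
import sqlite3

# Hypothetical two-table model: the base table plus an FTS5 index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_playbooks (id TEXT PRIMARY KEY, content TEXT, embedding TEXT)"
)
conn.execute("CREATE VIRTUAL TABLE playbooks_fts USING fts5(id, content)")


def save_without_embedding(pb_id: str, content: str) -> None:
    # Embedding provider failed: persist the row with SQL NULL and still
    # populate FTS so keyword search keeps working. The vec table is skipped.
    conn.execute("INSERT INTO user_playbooks VALUES (?, ?, NULL)", (pb_id, content))
    conn.execute("INSERT INTO playbooks_fts (id, content) VALUES (?, ?)", (pb_id, content))


save_without_embedding("p1", "When pyright flags sqlite3.Row, prefer typed row factories")

# The row is persisted with a NULL embedding ...
null_rows = conn.execute(
    "SELECT id FROM user_playbooks WHERE embedding IS NULL"
).fetchall()

# ... and FTS-only ranking still finds it without any query embedding.
hits = conn.execute(
    "SELECT id FROM playbooks_fts WHERE playbooks_fts MATCH ?", ("pyright",)
).fetchall()
```

The same `WHERE embedding IS NULL` predicate is what a future re-embed migration would use to locate these degraded rows.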
Automated:
- `uv run ruff check` on all modified Python — clean.
- `uv run pyright` on modified source files (`_playbook.py`, `setup_cmd.py`) — 0 errors. (The 10 `_FakeRow` / `sqlite3.Row` pyright errors in `test_sqlite_storage.py` are pre-existing on `main`, unrelated to this PR.)
- `uv run pytest` on affected suites — 327 passed:
  - `tests/cli/` (all CLI subcommands including `test_setup_cmd.py`)
  - `tests/lib/test_user_playbook_unit.py`
  - `tests/server/services/storage/test_sqlite_storage.py` (includes the two new tests)
  - `tests/server/services/storage/test_storage_contract_playbook.py`
  - `tests/e2e_tests/test_openclaw_integration.py`
  - `tests/server/services/playbook/test_playbook_generation_service.py`

Manual verification recommended before merge:
- Run `reflexio setup openclaw` on a fresh machine with no `~/.reflexio/.env` — the wizard completes without asking for an LLM provider key; the hook, skill, `/reflexio-extract` command, and workspace rule land in `~/.openclaw/`.
- Run `/reflexio-extract` — verify the agent runs `reflexio user-playbooks search --agent-version openclaw-agent` followed by `add` (or `update` on a re-run). Confirm with `reflexio user-playbooks list --agent-version openclaw-agent --limit 10`.
- Confirm no `~/.reflexio/sessions.db` is created (no SQLite buffering).
- Move `~/.reflexio/.env` aside, restart the server, and repeat the extract flow — both the write and the subsequent search must still succeed (FTS-only ranking).
- Run `./publish_clawhub.sh --stage-only` — verify the staged SKILL.md does not mention a required LLM provider key, and that the bundle contains no `reflexio-aggregate/` directory.

## Follow-ups (not in this PR)
- `eval/dataset.json` + `eval/runner.py` still exercise the old `publish_interaction` endpoint (excluded from the ClawHub bundle via `.clawhubignore`, so no user-facing impact). Worth updating in a separate PR to test the new CRUD flow.
- A dedicated re-embed migration can target the degraded rows via `WHERE embedding IS NULL`, but the command itself does not exist yet.
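For reference, the future re-embed migration might look roughly like this, assuming the same NULL-marker convention. All names here are hypothetical; no such command exists yet.

```python
import json
import sqlite3


def reembed_missing(conn: sqlite3.Connection, get_embedding) -> int:
    """Backfill rows that were saved without an embedding (hypothetical).

    The SQL NULL written by save_user_playbooks makes these rows trivial
    to find, so the migration never touches rows with a valid embedding.
    """
    rows = conn.execute(
        "SELECT id, content FROM user_playbooks WHERE embedding IS NULL"
    ).fetchall()
    for pb_id, content in rows:
        embedding = get_embedding(content)
        conn.execute(
            "UPDATE user_playbooks SET embedding = ? WHERE id = ?",
            (json.dumps(embedding), pb_id),
        )
        # The real command would also upsert into the vec table here.
    conn.commit()
    return len(rows)
```

Running it twice is safe by construction: the second pass finds zero NULL rows and returns 0.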