[RFC]: Experience Distillation for cross-quest knowledge sharing #64

@droidlyx

Summary

Cross-quest knowledge sharing enables future quests to learn from past successes and mistakes. DeepScientist already provides the cross-quest memory store: MemoryService with global + quest scopes, the knowledge card kind, multi-tag cards, MCP memory.write/read/search/list_recent/promote_to_global, and a read_visibility_mode toggle for cross-quest reads.

What's missing is discipline and recall: agents tend to write quest-scope notes that never surface in future quests, and stage-entry reads default to recent rather than relevant. Quest 014 made this concrete — 9 quest-scope cards, 1 global; useful lessons stayed locked inside the producing quest.

This RFC adds an opt-in discipline + recall layer on top of the existing memory: a distill stage skill that runs at closure, gates that reject closure when reviews are missing, a tag-overlap recall MCP tool for stage-entry priors, and small default and prose nudges that bias everyday writes toward global.

No new MCP namespace. Two startup_contract toggles, both default off.

Motivation

Quests accumulate two kinds of durable knowledge — research lessons (what worked / was falsified / why a baseline fails on a distribution) and framework pitfalls (closure-protocol gotchas, validator paths, contract behaviour). Main's memory can already store and serve both globally, but in practice agents write to quest scope, don't query global at stage entry, and skip distillation at quest end. This RFC closes that gap with gates, a server-side filter, and a default flip — so the discipline doesn't have to swim upstream.

What main already provides (not in scope for this RFC)

  • MemoryService two-scope storage: ~/DeepScientist/memory/ (global) + <quest>/memory/ (quest)
  • Card kinds: papers, ideas, decisions, episodes, knowledge, templates
  • Per-card tags: [...] with normalization
  • MCP memory tools: write, read, search, list_recent, promote_to_global
  • read_visibility_mode: independent | shared_across_quests (config-level cross-quest reads)
  • _visible_quest_roots / list_visible_quest_cards / search_visible_quest_cards helpers

Nothing in this RFC introduces new card storage, new scope semantics, or a new MCP namespace.

Proposed delta

1. Two opt-in toggles (startup_contract)

Both default off.

  • experience_distill: enables the distill skill, the closure gate, and the decision(action='distill_review') validation path.
  • recall_priors: enables the recall_priors_rule injection in stage skills.
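As an illustration of the "both default off" behaviour, here is a minimal sketch of how the two toggles might be read. It assumes the startup_contract is available as a plain mapping; the `DistillToggles` class and `read_toggles` helper are hypothetical names, not existing DeepScientist code.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class DistillToggles:
    """Hypothetical container for the two opt-in toggles."""
    experience_distill: bool = False
    recall_priors: bool = False


def read_toggles(startup_contract: Optional[Mapping[str, Any]]) -> DistillToggles:
    """Read both toggles; anything other than a literal True counts as off."""
    contract = startup_contract or {}
    return DistillToggles(
        experience_distill=contract.get("experience_distill") is True,
        recall_priors=contract.get("recall_priors") is True,
    )
```

Treating only a literal `True` as "on" is a design choice in this sketch: a missing key, a `None` contract, or a stray string value all leave the quest on today's behaviour.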

2. Distill review — reuse the decision artifact

No new artifact kind. The review is a decision artifact with action='distill_review' and review-specific fields:

kind: decision
action: distill_review
verdict, reason
reviewed_run_ids: [run-id, ...]
cards_written: [{ card_id, scope, action: new|patch, target_run_id }]
neighbor_decisions: [{ candidate_card_id, decision: patch|new|neighbor_but_separate, target_run_id, reason }]

The validator scans <quest>/artifacts/_index.jsonl together with <quest>/.ds/worktrees/*/artifacts/_index.jsonl, so a run recorded in any worktree counts as covered. Cross-quest patches go through validate_cross_quest_patch — a card patched into another quest's lineage must carry a lineage chain back to its origin.
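The multi-worktree scan described above could look roughly like the following. This is a sketch under assumptions: it supposes each line of `_index.jsonl` is a JSON object that carries `kind`, `action`, and `reviewed_run_ids` directly, which the RFC does not specify. The helper names are illustrative.

```python
import json
from pathlib import Path
from typing import Iterator, Set


def iter_index_records(quest_root: Path) -> Iterator[dict]:
    """Yield records from the quest index and from every worktree index."""
    index_paths = [quest_root / "artifacts" / "_index.jsonl"]
    index_paths += sorted(quest_root.glob(".ds/worktrees/*/artifacts/_index.jsonl"))
    for path in index_paths:
        if not path.is_file():
            continue
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:
                    yield json.loads(line)


def covered_run_ids(quest_root: Path) -> Set[str]:
    """Collect run ids named by any distill_review decision (assumed record shape)."""
    covered: Set[str] = set()
    for record in iter_index_records(quest_root):
        if record.get("kind") == "decision" and record.get("action") == "distill_review":
            covered.update(record.get("reviewed_run_ids", []))
    return covered
```

Because the union is taken over all indexes, a review recorded in one worktree covers runs regardless of which worktree produced them.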

3. distill stage skill

New stage skill at the write / finalize anchors. Its prose specifies:

  • per-batch summary scan of completed runs
  • explicit neighbor decision against existing cards (patch / new / neighbor_but_separate)
  • minimum metadata: claim + lineage only (no required subtype / mechanism / conditions / confidence depth)
  • write target: always scope=global

4. Closure gates

When experience_distill: true:

  • artifact.submit_paper_bundle(...) rejects if any completed run lacks a covering distill_review decision.
  • artifact.complete_quest(...) rejects under the same condition.
  • The error lists pending run ids and points to the distill skill.
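The gate logic above can be sketched as a set difference between completed runs and covered runs. The exception class and function below are hypothetical; the real closure tools may well return a structured MCP rejection rather than raise.

```python
from typing import Iterable, List


class DistillCoverageError(Exception):
    """Hypothetical rejection raised when completed runs lack a covering review."""

    def __init__(self, pending_run_ids: List[str]):
        self.pending_run_ids = pending_run_ids
        super().__init__(
            "closure rejected; runs without a covering distill_review decision: "
            + ", ".join(pending_run_ids)
            + ". Run the distill skill to cover them."
        )


def check_closure_gate(
    experience_distill: bool,
    completed_run_ids: Iterable[str],
    covered_run_ids: Iterable[str],
) -> None:
    """Raise if the toggle is on and any completed run is uncovered."""
    if not experience_distill:
        return  # toggle off: closure behaves as it does today
    pending = sorted(set(completed_run_ids) - set(covered_run_ids))
    if pending:
        raise DistillCoverageError(pending)
```

Both `submit_paper_bundle` and `complete_quest` would call the same check, so the two closure paths cannot drift apart.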

5. Recall path

memory.list_knowledge_summaries(domain_tags=[...], kind=None, limit=20) — new MCP tool. Server-side filter: returns cards whose tags overlap domain_tags, ranked by overlap count desc, then updated_at desc. Empty domain_tags=[] falls back to recent summaries.
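A minimal in-memory sketch of the ranking rule follows. `CardSummary` and `rank_by_tag_overlap` are illustrative stand-ins rather than the proposed server code, and the lowercase/strip normalization is an assumption, since the RFC only says tags are normalized.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Sequence


@dataclass
class CardSummary:
    """Hypothetical summary record; the real card schema may differ."""
    card_id: str
    kind: str
    tags: List[str]
    updated_at: datetime


def _normalize(tags: Sequence[str]) -> set:
    # Assumed normalization: trim whitespace and lowercase.
    return {t.strip().lower() for t in tags if t.strip()}


def rank_by_tag_overlap(
    cards: Sequence[CardSummary],
    domain_tags: Sequence[str],
    kind: Optional[str] = None,
    limit: int = 20,
) -> List[CardSummary]:
    """Rank by overlap count desc, then updated_at desc; empty tags means recent."""
    pool = [c for c in cards if kind is None or c.kind == kind]
    wanted = _normalize(domain_tags)
    if not wanted:
        return sorted(pool, key=lambda c: c.updated_at, reverse=True)[:limit]
    scored = []
    for card in pool:
        overlap = len(wanted & _normalize(card.tags))
        if overlap:  # cards with no shared tag are filtered out
            scored.append((overlap, card))
    scored.sort(key=lambda pair: (pair[0], pair[1].updated_at), reverse=True)
    return [card for _, card in scored[:limit]]
```

Comparing tags as sets means a tag repeated on a card counts once, which matches the "tag-set semantics" the validation plan calls for.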

recall_priors_rule is injected into stage skill prompts when the toggle is on, instructing the agent to assemble domain_tags from brief.domains + current_stage + relevant tool/type tags before calling list_knowledge_summaries.
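To make the tag-assembly step concrete, here is a small sketch of what the agent is being asked to do before the recall call. The helper name and the normalization are assumptions; in practice this step lives in prompt prose rather than code.

```python
from typing import Iterable, List


def assemble_domain_tags(
    brief_domains: Iterable[str],
    current_stage: str,
    extra_tags: Iterable[str] = (),
) -> List[str]:
    """Merge brief domains, the current stage, and tool/type tags, deduplicated in order."""
    seen = set()
    ordered: List[str] = []
    for tag in [*brief_domains, current_stage, *extra_tags]:
        normalized = tag.strip().lower()  # assumed normalization
        if normalized and normalized not in seen:
            seen.add(normalized)
            ordered.append(normalized)
    return ordered
```

The resulting list would then be passed as `domain_tags` to `memory.list_knowledge_summaries`.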

6. Bias everyday writes toward global

Three small changes flowing from the same lesson (9 quest-scope vs 1 global):

  • memory.write MCP default scope flips from quest to global. Explicit scope=quest still works.
  • Stage skill prose at every anchor that writes lessons explicitly prefers scope=global for any reusable insight; quest-scope is reserved for quest-internal task tracking.
  • New MCP tool artifact.list_recent(kind=None, limit=20) returns the current quest's recent artifacts. Stage skills use this for the "what did I just do" runtime view, so memory.list_recent is no longer co-opted as a runtime-state read.
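The default-scope flip amounts to changing one fallback value. The sketch below uses a hypothetical `resolve_write_scope` helper to show that callers passing `scope` explicitly see no change; it is not the actual `memory.write` handler.

```python
from typing import Optional

VALID_SCOPES = ("global", "quest")


def resolve_write_scope(scope: Optional[str] = None) -> str:
    """Return the scope a write should use; an omitted scope now means global."""
    if scope is None:
        return "global"  # previously "quest" under the old default
    if scope not in VALID_SCOPES:
        raise ValueError(f"unknown scope: {scope!r}")
    return scope
```

Only call sites that relied on the implicit default change behaviour, which is the compatibility surface worth auditing.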

Public surface

| Existing tool | Change |
| --- | --- |
| memory.write | default scope flips quest → global |
| memory.list_recent / read / search / promote_to_global | unchanged |
| artifact.record(kind='decision', action='distill_review', ...) | new validation path under existing decision kind |
| artifact.submit_paper_bundle / complete_quest | hard-rejected when toggle on and any run uncovered |

| New tool | Purpose |
| --- | --- |
| memory.list_knowledge_summaries(domain_tags, kind, limit) | server-side tag-overlap recall |
| artifact.list_recent(kind, limit) | current-quest recent artifacts (runtime view) |
| artifact.list_distill_candidates() | enumerate completed runs not yet covered |

No new public MCP namespace.

Validation plan

  • Closure-gate regression tests cover multi-worktree aggregation.
  • Tag-overlap recall tests cover ordering, empty domain_tags, and tag-set semantics.
  • Cross-quest patch invariant tests cover the lineage requirement.
  • MCP server tests cover new tools and structured rejection paths.

Alternatives considered

  • No cross-quest discipline (status quo). promote_to_global is opt-in and rarely used; quest-scope notes don't surface elsewhere. Quest 014's framework-bug discoveries would have spared quest 015 multiple wasted turns at finalize.
  • Free-form running lessons file. Doesn't scale (whole-file scan), can't gate closure, conflates runtime state with knowledge.
  • Separate distill_review artifact kind (earlier draft). Folded into decision instead — a review is a decision about coverage; one less kind to maintain.
  • Per-slice auto-routing (earlier draft). Routing per analysis slice was overreach; the closure gate alone is enough.

Affected areas

Core:

  • src/deepscientist/artifact/schemas.py — extend DECISION_ACTIONS with distill_review
  • src/deepscientist/artifact/service.py — closure gate scans decisions filtered by action
  • src/deepscientist/artifact/experience_distill.py — toggle reads, candidate iteration, multi-worktree aggregator
  • src/deepscientist/mcp/server.py — new tools; default-scope flip on memory.write
  • src/deepscientist/prompts/builder.py: recall_priors_rule, distill_required_rule, global-preference nudges

Skills:

  • src/skills/distill/SKILL.md
  • Stage skills using the recent-activity read pattern adopt artifact.list_recent

Tests:

  • tests/test_experience_distill_*.py, tests/test_mcp_servers.py, tests/test_memory_*.py

Docs:

  • docs/en/34_EXPERIENCE_DISTILL_GUIDE.md
  • Pointers from docs/en/07_MEMORY_AND_MCP.md and docs/en/14_PROMPT_SKILLS_AND_MCP_GUIDE.md

Compatibility

  • Both toggles default off; quests that do not opt in see no gate or prompt changes.
  • memory.write default-scope flip changes the default only; explicit scope=quest still works. Call sites that pass scope explicitly are unaffected.
  • When experience_distill: true, closure tools hard-reject without a covering review. This is the intended discipline.
