[Core] Feat: cross-quest experience distill#60

Open
droidlyx wants to merge 77 commits into ResearAI:main from droidlyx:feat_experience_distill

droidlyx commented Apr 26, 2026

Summary

This PR adds an opt-in discipline + recall layer on top of DeepScientist's existing memory subsystem so each quest can review its own completed runs, write reusable knowledge cards into global memory, and surface those cards as priors to future quests. Paper-bundle submission is gated on the review. The whole feature can be toggled on or off when creating a quest.

Main already provides cross-quest memory storage (MemoryService two-scope, knowledge card kind, MCP memory.write/read/search/list_recent/promote_to_global, read_visibility_mode). What it lacks is a discipline that ensures lessons get written globally before a quest closes, and a recall hook that surfaces relevant priors at stage entry. This PR closes that gap.

Changes included:

  • add experience_distill and recall_priors fields to the startup contract and project-creation form (both default off)
  • add distill_review as a decision(action='distill_review') validation path under the existing decision artifact kind (not a new kind)
  • add distill stage skill implementing per-batch summary-scan + neighbor decisions over candidate runs; cards require only claim + lineage metadata
  • add artifact.list_distill_candidates, artifact.list_recent, and memory.list_knowledge_summaries MCP tools (all under existing namespaces; no new public namespace)
  • flip memory.write MCP default scope from quest to global so reusable lessons land in the right place by default; explicit scope='quest' still works
  • add hard guards on artifact.submit_paper_bundle and artifact.complete_quest blocking submission until every completed run is covered by a decision(action='distill_review'); aggregate the check across quest_root/artifacts plus every .ds/worktrees/*/artifacts
  • reroute decision(write|finalize) guidance to the distill skill while the gate is open
  • inject a recall_priors_rule cue into stage-skill prompts when recall_priors: on
  • add ds distill-quest CLI for retroactive draft emission on existing repos
  • classify claude 401 as non-retryable and probe enabled runners on daemon startup so auth/config failures surface immediately

Why

DeepScientist already accumulates per-run artifacts and per-quest paper bundles, and main's MemoryService already supports global cards with cross-quest read visibility. In practice, though, knowledge stayed locked inside the producing quest: agents wrote to quest scope, didn't query global memory at stage entry, and skipped distillation at quest end. Quest 014 made this concrete, writing 9 quest-scope cards and only 1 global card.

This PR makes "review and externalize what you learned" an explicit gate without hard-coding which conclusions are worth keeping. The knowledge is saved as global cards and read by future quests through a tag-overlap recall MCP tool (memory.list_knowledge_summaries). Three small defaults (memory.write defaulting to global, stage prose preferring global, and artifact.list_recent for the runtime view) make global the path of least resistance instead of a road less travelled.

What changed since the original draft

The original draft introduced a separate distill_review artifact kind, a per-slice analysis-routing trigger, deep card-metadata enforcement, and a Chinese distill guide. The slim revision (this PR's current shape) drops those:

  • distill_review is no longer a separate artifact kind — it's decision(action='distill_review'). One less public kind to maintain.
  • ❌ Per-slice analysis-slice auto-routing is gone. The closure gate (at submit_paper_bundle / complete_quest) is the sole discipline mechanism; forcing a distill detour after every analysis slice was overreach.
  • ❌ Card metadata enforces only claim + lineage. Previous required depth (subtype, mechanism, conditions, confidence) was overreach — agents would game shallow taxonomy fields, and the cross-quest patch invariant only ever needed lineage.
  • docs/zh/34_EXPERIENCE_DISTILL_GUIDE.md deleted; zh readers follow the en guide. zh 07_MEMORY_AND_MCP.md and 14_PROMPT_SKILLS_AND_MCP_GUIDE.md cross-reference the en distill guide.
  • memory.write MCP default scope flipped from quest to global.
  • ➕ Stage skill prose at lesson-writing anchors prefers scope=global; quest-scope is reserved for quest-internal task tracking.
  • ➕ New MCP artifact.list_recent(kind, limit) exposes "what did I just do" via the artifact namespace, so memory.list_recent is no longer co-opted as a runtime-state surface.

Tracking: #64.

Public surface

| Existing tool | Change |
| --- | --- |
| memory.write | default scope flips quest → global; explicit scope='quest' still works |
| memory.list_recent / read / search / promote_to_global | unchanged |
| artifact.record(kind='decision', action='distill_review', ...) | new validation path under existing decision kind |
| artifact.submit_paper_bundle / artifact.complete_quest | hard-rejected when experience_distill: on and any completed run is uncovered |

| New MCP tool | Purpose |
| --- | --- |
| memory.list_knowledge_summaries(domain_tags, kind, limit) | server-side tag-overlap recall |
| artifact.list_recent(kind, limit) | current-quest recent artifacts (runtime view) |
| artifact.list_distill_candidates() | enumerate completed runs not yet covered by a distill_review decision |

No new public MCP namespace.
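
To make "server-side tag-overlap recall" concrete, here is a minimal sketch of how such a ranking could work. It is not the PR's implementation: the `CardSummary` class, the in-memory `cards` list, and the tie-breaking rule are assumptions for illustration. Only the tool name and its `domain_tags`, `kind`, and `limit` parameters come from the table above.

```python
from dataclasses import dataclass, field


@dataclass
class CardSummary:
    """Hypothetical stand-in for a stored knowledge-card summary."""
    card_id: str
    kind: str
    domain_tags: set[str] = field(default_factory=set)


def list_knowledge_summaries(cards, domain_tags, kind="knowledge", limit=5):
    """Rank cards by the number of query tags they share; skip cards with no overlap."""
    query = {t.strip().lower() for t in domain_tags}
    scored = []
    for card in cards:
        if card.kind != kind:
            continue
        overlap = len(query & {t.lower() for t in card.domain_tags})
        if overlap:
            scored.append((overlap, card))
    # Highest overlap first; ties broken by card_id so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1].card_id))
    return [card for _, card in scored[:limit]]


cards = [
    CardSummary("c1", "knowledge", {"optimizer", "warm-up"}),
    CardSummary("c2", "knowledge", {"vision"}),
    CardSummary("c3", "knowledge", {"optimizer"}),
]
top = list_knowledge_summaries(cards, ["Optimizer", "warm-up"])
```

A stage skill with recall_priors on would pass its own domain tags and read the top few cards as priors; cards with zero overlap never surface.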

Scope

This PR intentionally does not:

  • prescribe what counts as a good knowledge card (only the schema and the neighbor-decision discipline)
  • add venue-, domain-, or metric-specific gating
  • merge or auto-deduplicate cards across quests (neighbor decisions are recorded but kept as separate cards)
  • change behavior for quests with experience_distill: off (the default)
  • replace any of main's existing memory storage primitives — it builds on them

Both new toggles are opt-in advanced fields. There is no migration impact: existing quests keep experience_distill: off and behave as before; the gate only fires when the toggle is explicitly turned on. The memory.write default-scope flip changes the default only; call sites passing scope= explicitly are unaffected.

Tests

  • pytest tests/test_experience_distill_*.py tests/test_mcp_servers.py tests/test_memory_and_artifact.py tests/test_prompt_builder.py — green
  • 102 distill tests across 4 consolidated test files (schema, gate, integration, config_cli) plus a shared fixtures module tests/_distill_fixtures.py. The original 10 distill files were merged to reduce file-count proliferation while preserving every test name and assertion.

Docs

  • new standalone guide docs/en/34_EXPERIENCE_DISTILL_GUIDE.md covering the toggles, finalize gate, hard guards, distill skill flow, decision(action='distill_review') payload, cross-quest recall, MCP tools, retroactive ds distill-quest CLI, and debugging
  • pointers added in docs/en/07_MEMORY_AND_MCP.md, docs/en/14_PROMPT_SKILLS_AND_MCP_GUIDE.md and their Chinese mirrors so existing readers land on the en distill guide

Related

Related: #64

AI Assistance

This PR was prepared with AI assistance and reviewed locally before submission.

Yuxuan Liu and others added 30 commits April 26, 2026 22:38
Implementation plan for the persistent experience distillation skill:
opt-in companion skill plus post-record routing hook in
ArtifactService.record(). Experience is a knowledge kind subtype.
Retroactive CLI ds distill-quest produces drafts for human review.
8 TDD tasks with complete test code, no placeholders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Imports will be re-added in Tasks 3 and 7 when they're actually used.
Avoids F401 lint warnings on the Task 1 commit.
…ience_distill

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…k 6)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fix 1: Use json.dumps() to safely quote slice titles in the YAML frontmatter
(e.g. 'effect of "warm-up"' → valid YAML quoted scalar). Prevents injection
if a title contains double-quotes. Other fields (quest, run, campaign) remain
unquoted since they're internal IDs without special characters.

Fix 2: Capture emit_experience_drafts' return value (written file paths) and
report len(written) instead of len(records), ensuring the JSON output reflects
the actual number of files created (some records may be filtered).

Test: Added test_emit_experience_drafts_escapes_title_with_quote to verify
titles with quotes parse correctly; extended
test_iter_analysis_slice_records_filters_index to verify malformed JSONL lines
and missing files are skipped gracefully.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
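
The quoting fix in the commit above relies on `json.dumps` producing a double-quoted string that YAML also accepts as a double-quoted scalar. A minimal sketch of that idea follows; the helper name `frontmatter_title_line` is hypothetical and not taken from the PR.

```python
import json


def frontmatter_title_line(title: str) -> str:
    """Render a `title:` frontmatter line with the value JSON-quoted."""
    # json.dumps wraps the value in double quotes and escapes inner
    # quotes and backslashes, so the result stays on one YAML line.
    return f"title: {json.dumps(title)}"


line = frontmatter_title_line('effect of "warm-up"')
print(line)  # title: "effect of \"warm-up\""
```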
Phase 1 added the distill-quest subcommand to deepscientist.cli but
forgot the bin/ds.js launcher allowlist, so users got
"Unexpected launcher argument" instead of reaching Python.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Past failed installs left .staging.* dirs lying around,
and stale src/ui/dist or src/tui/dist could fool the freshness check
into reusing an obsolete bundle (or get the build into a wedged state
that surfaces as Rollup-cant-resolve-deps errors). Sweep both at the
start of every install so users do not have to clean caches by hand.

Source tree node_modules are intentionally left alone because they
support local npm run dev workflows; --clean-style flag can be added
later if observation justifies it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
git_readiness() previously surfaced missing user.name/email as
warnings + guidance only, so ds startup printed a hint and continued.
The first agent commit then died with "empty ident name" deep into
quest execution, far from the actual cause.

Move both checks from warnings to errors. The existing ds.js block
on payload.git.installed === false is extended to also bail when
payload.git.errors is non-empty, surfacing the configuration step
the user needs to run.

ok now reflects whether errors is empty so doctor and other callers
get a single source of truth.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Quest 009 confirmed Phase 1 distill never fires when the agent skips
analysis_campaign and routes main_experiment results straight to write.
The spec adds a finalize-gate: when decision(action='write'|'finalize')
lands, scan completed runs (analysis.slice / main_experiment /
experiment) against a reviewed-set derived from new distill_review
artifacts; if anything is undistilled, route to the distill skill before
allowing write/finalize. Cursor is set-difference, not timestamp, so the
gate is idempotent and stateless.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
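
The set-difference cursor described in the commit above can be sketched as follows. This is an illustration only: the record shape (`artifact_id`, `kind`, `status`, `reviewed_run_ids` as dict keys) and the function name `evaluate_gate` are assumptions, and the review is shown in the later `decision(action='distill_review')` form rather than the separate artifact kind this commit originally used.

```python
COMPLETED_RUN_KINDS = {"analysis.slice", "main_experiment", "experiment"}


def evaluate_gate(records):
    """Return the completed runs that no distill review covers yet."""
    candidates = {
        r["artifact_id"]
        for r in records
        if r.get("kind") in COMPLETED_RUN_KINDS and r.get("status") == "completed"
    }
    reviewed = set()
    for r in records:
        if r.get("kind") == "decision" and r.get("action") == "distill_review":
            reviewed.update(r.get("reviewed_run_ids", []))
    # Pure set difference: no timestamps or stored cursor, so re-running
    # the check on the same records always gives the same answer.
    pending = sorted(candidates - reviewed)
    return {"gate_open": bool(pending), "pending_run_ids": pending}


records = [
    {"artifact_id": "run-1", "kind": "main_experiment", "status": "completed"},
    {"artifact_id": "run-2", "kind": "analysis.slice", "status": "completed"},
    {"artifact_id": "run-3", "kind": "experiment", "status": "running"},
    {"artifact_id": "d-1", "kind": "decision", "action": "distill_review",
     "reviewed_run_ids": ["run-1"]},
]
result = evaluate_gate(records)
```

Because the gate is derived from the records alone, adding a review that covers `run-2` closes it without any extra bookkeeping.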
Nine TDD tasks: distill_review schema, candidate iterator, gate evaluator,
overlay function, service.py wire-in, list_distill_candidates MCP tool,
SKILL.md rewrite, retroactive CLI alignment, end-to-end lifecycle test.

Two spec deviations called out in plan (both net simplifications):
  1. Hook moves from guidance.py to experience_distill.py overlay + service.py
     wire-in - preserves build_guidance_for_record() purity.
  2. Replaces "extend list_runs filter" with a new dedicated
     list_distill_candidates MCP tool - list_runs does not currently exist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pe cases

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…uard

Apply .strip().lower() normalization to both 'kind' and 'action' checks in
maybe_inject_distill_finalize_gate to absorb upstream casing variations,
matching the pattern used elsewhere in guidance.py. Add test covering
uppercase and whitespace-padded action values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
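
The normalization in the commit above amounts to comparing stripped, lower-cased values. A minimal sketch, assuming records are plain dicts; the helper name `is_gated_decision` is hypothetical:

```python
GATED_ACTIONS = {"write", "finalize"}


def is_gated_decision(record: dict) -> bool:
    """True when the record is a decision whose action should trigger the gate."""
    # Coerce to str first so None or missing values normalize to "".
    kind = str(record.get("kind") or "").strip().lower()
    action = str(record.get("action") or "").strip().lower()
    return kind == "decision" and action in GATED_ACTIONS
```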
Also extends maybe_inject_distill_finalize_gate to handle the gate-clear
case: when a previously gate-injected guidance_vm is re-evaluated and
the gate no longer fires, the previous_recommended_skill is restored.
This ensures suppressed-equivalent decision records don't carry a stale
distill redirect after a distill_review has covered all candidates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds experience_distill to START_SETUP_FORM_FIELDS so the field
round-trips through the start-setup MCP patch surface, and coerces
the value as bool in _sanitize_start_setup_form_patch. Pairs with
the UI toggle in CreateProjectDialog.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…or symmetry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Yuxuan Liu and others added 20 commits April 26, 2026 22:38
…st helper

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…le and complete_quest

Quest 011 validation showed the Phase 3 finalize gate only fires on
`record(kind='decision', action ∈ {write,finalize})`. Real closure paths
(`submit_paper_bundle` records `report`; `complete_quest` is a separate
MCP tool) bypass that surface entirely, so an agent with experience_distill
on can ship and complete without ever invoking distill.

Two complementary changes:

- B (prompt cue): _priority_memory_block now injects `distill_required_rule`
  whenever distill is on, the gate has pending candidates, and the active
  skill is a stage skill — naming both closure tools so the agent learns
  the constraint before acting.
- C (hard guards): submit_paper_bundle raises ValueError and complete_quest
  returns `{ok: false, status: 'distill_required', ...}` when the gate has
  pending candidates, regardless of which artifact route the agent picked.

Tests: 4 new prompt-builder cases (cue fires/skips for stage/companion/off);
7 new closure-guard cases covering each gate state per tool. 187 distill +
prompt + paper-bundle + complete-quest tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…fact dirs

Quest 011 re-validation showed the closure guards from the previous
commit only looked at quest_root/artifacts. Real `main_experiment`
records typically live in the active idea worktree at
`quest_root/.ds/worktrees/<name>/artifacts/` until the git-graph merge
promotes them — so a worktree with pending runs would still slip past
the prompt cue and both hard guards.

Adds `evaluate_distill_gate_for_quest(quest_root)` which walks every
workspace artifact dir (canonical + each worktree subdir), dedupes
candidates and reviews by artifact_id, and returns the same payload
shape as `evaluate_distill_gate`. Switches the prompt cue and both
service guards to the new aggregator.

Tests: 4 new closure-guard cases (worktree-only candidate; dedupe
across workspaces; review in quest_root covers worktree run). 133
distill + prompt-builder tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
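
The cross-workspace aggregation described in the commit above can be sketched as a walk over the canonical artifact dir plus each worktree's artifact dir, deduplicating by `artifact_id`. The directory layout and the `_index.jsonl` file name come from this PR's commit messages; the function names and the "first copy wins" rule are assumptions.

```python
import json
from pathlib import Path


def workspace_artifact_dirs(quest_root: Path):
    """Yield the canonical artifacts dir, then each worktree's artifacts dir."""
    yield quest_root / "artifacts"
    worktrees = quest_root / ".ds" / "worktrees"
    if worktrees.is_dir():
        for child in sorted(worktrees.iterdir()):
            yield child / "artifacts"


def read_index(artifact_dir: Path):
    """Read _index.jsonl, skipping missing files and malformed lines."""
    index = artifact_dir / "_index.jsonl"
    if not index.is_file():
        return
    for line in index.read_text(encoding="utf-8").splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and record.get("artifact_id"):
            yield record


def collect_records(quest_root: Path):
    """Merge records from every workspace, keeping the first copy of each id."""
    seen = {}
    for artifact_dir in workspace_artifact_dirs(quest_root):
        for record in read_index(artifact_dir):
            seen.setdefault(record["artifact_id"], record)
    return list(seen.values())
```

Feeding the merged records into the gate check is what lets a review recorded in quest_root cover a run that still lives only in a worktree.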
…yable

Quest 011 audit found 137 HTTP 401 errors across multiple quests in the last
2 days. Each failed turn triggered the runner retry loop (default 4
attempts), and each attempt retried claude's OAuth refresh, eventually
rate-limiting the refresh endpoint and locking the account out for hours
even after credentials were restored.

Adds a runner_failures pattern that matches "Failed to authenticate" /
"API Error: 401 ... authentication_error" / "Invalid authentication
credentials" only when runner=='claude', returns retriable=False, and
points the user at `claude login` or the ANTHROPIC_API_KEY override.

This stops the retry storm at the daemon's _non_retryable_failure_diagnosis
gate before it reaches the retry policy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
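
A minimal sketch of the non-retryable classification described in the commit above. The three message patterns, the `claude`-only condition, `retriable=False`, and the `claude_authentication_failed` code are taken from the commit messages in this PR; the `FailureDiagnosis` fields and the `diagnose_failure` signature are assumptions.

```python
import re
from dataclasses import dataclass
from typing import Optional

_CLAUDE_AUTH_PATTERNS = [
    re.compile(r"Failed to authenticate", re.IGNORECASE),
    re.compile(r"API Error: 401.*authentication_error", re.IGNORECASE | re.DOTALL),
    re.compile(r"Invalid authentication credentials", re.IGNORECASE),
]


@dataclass(frozen=True)
class FailureDiagnosis:
    """Hypothetical shape: a stable code, a retry flag, and user guidance."""
    code: str
    retriable: bool
    guidance: str


def diagnose_failure(runner: str, output: str) -> Optional[FailureDiagnosis]:
    """Return a non-retryable diagnosis for claude auth failures, else None."""
    if runner != "claude":
        return None
    if any(p.search(output) for p in _CLAUDE_AUTH_PATTERNS):
        return FailureDiagnosis(
            code="claude_authentication_failed",
            retriable=False,
            guidance="Run `claude login` or set ANTHROPIC_API_KEY.",
        )
    return None
```

A retry loop would consult the diagnosis first and stop immediately when `retriable` is false, instead of spending its remaining attempts on another OAuth refresh.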
…ce immediately

Reuses the same `probe_runner_bootstrap` call that `ds doctor` runs for each
enabled runner, but kicks it from `serve()` in a background thread on daemon
start. Probe results land in daemon.jsonl as either
`daemon.runner_probe_ok` (info) or `daemon.runner_probe_failed` (error if a
deterministic FailureDiagnosis matches, warning otherwise) — so a stale
claude OAuth token shows up the moment the daemon comes up rather than
waiting for the first quest turn to discover and pause itself.

Probe runs once per daemon lifetime, off the HTTP serving thread, with a
90s per-runner cap inherited from the existing probe code. Combined with
the previous claude_authentication_failed diagnosis, a dead token now
fails fast in two distinct surfaces (startup probe + per-turn diagnose)
without any retry amplification.

Tests: 5 new cases covering enabled/disabled gating, deterministic
diagnosis routing, no-diagnosis fallback, exception handling, and default
runner-list resolution.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
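
The startup probe described in the commit above can be sketched as a one-shot background thread. The event names `daemon.runner_probe_ok` and `daemon.runner_probe_failed` come from the commit message; the `probe` and `log` callables, the `ok` result key, and the function name are hypothetical stand-ins, and the 90s cap and severity routing are omitted.

```python
import threading


def start_runner_probes(enabled_runners, probe, log):
    """Probe each enabled runner once on a daemon thread and log the outcome."""

    def _run():
        for runner in enabled_runners:
            try:
                result = probe(runner)
            except Exception as exc:
                # A crashing probe must not take the daemon down with it.
                log("daemon.runner_probe_failed", runner=runner, error=str(exc))
                continue
            event = "daemon.runner_probe_ok" if result.get("ok") else "daemon.runner_probe_failed"
            log(event, runner=runner)

    # daemon=True keeps the probe from blocking process shutdown.
    thread = threading.Thread(target=_run, name="runner-probe", daemon=True)
    thread.start()
    return thread
```

Returning the thread lets tests join it; the serving thread itself never waits on the probe.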
…est worktrees

The MCP wrapper sets DS_WORKTREE_ROOT once at server start (= quest root) and
does not refresh it when activate_branch swaps the active worktree, so
service.record(...) is invoked with workspace_root=quest_root even when the run
lives only in the active worktree's artifacts/_index.jsonl. The distill_review
run-id validator scanned only write_root/artifacts and therefore rejected
perfectly valid reviews with "unknown run artifact_ids", while
list_distill_candidates and evaluate_distill_gate_for_quest already aggregated
across every workspace. Quest 012 hit this in production and worked around it
by manually mirroring the run entry into the quest-root index.

Make the validator collect known_run_ids across _quest_workspace_artifact_dirs(),
matching the closure-side aggregation. write_root and the actual on-disk write
location are unchanged, so this only widens the acceptance set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lution flakes

vite/rollup intermittently fails to resolve hoisted deps (clsx, framer-motion,
react-router) on the fuse-overlayfs container even though npm install reports
success. Retry the build (not the install) until it converges; failure rate
observed ~10-30%, max 5 attempts keeps p(all-fail) below 1%. Could not
reproduce in isolation (11/11 attempts succeeded in a probe), so the underlying
filesystem/cache cause is left for a future bisect — this is a workaround,
not a fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
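
The "below 1%" figure in the commit above can be checked with a short calculation, assuming each build attempt fails independently with the same probability (an assumption the commit does not state):

```python
def p_all_fail(p_fail: float, max_attempts: int = 5) -> float:
    """Probability that every one of max_attempts independent builds fails."""
    return p_fail ** max_attempts


for p in (0.10, 0.30):
    print(f"per-attempt failure {p:.0%}: all 5 fail with p={p_all_fail(p):.5f}")
```

At the pessimistic end of the observed range, 0.3 ** 5 is about 0.0024, so five attempts stay under 1% only if failures are not strongly correlated between attempts.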
Without this check, passing a wrong quest_id silently returns drafts=0
exit 0, since read_distill_reviews and iter_distill_candidate_records
both early-return when their target dir does not exist. Print a clear
"Quest not found" message to stderr and return 1 instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
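
A minimal sketch of the existence check described in the commit above. `distill_quest_command` is the handler name mentioned later in this PR, but its real signature is not shown; the `quests_home` parameter and the directory layout here are assumptions.

```python
import sys
from pathlib import Path


def distill_quest_command(quests_home: Path, quest_id: str) -> int:
    """Return 1 with a stderr message when the quest directory is missing."""
    quest_root = quests_home / quest_id
    if not quest_root.is_dir():
        print(f"Quest not found: {quest_id}", file=sys.stderr)
        return 1
    # Draft emission would run here; omitted in this sketch.
    return 0
```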
Sync the two paragraphs added on the English side (07_MEMORY_AND_MCP.md
section 8 and 14_PROMPT_SKILLS_AND_MCP_GUIDE.md section 13) so the
Chinese docs cover the experience_distill toggle, list_distill_candidates
MCP tool, and distill_review artifact at the same level of detail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the experience-distillation pipeline out of 07_MEMORY_AND_MCP.md
section 8 and 14_PROMPT_SKILLS_AND_MCP_GUIDE.md section 13 into a new
standalone document 34_EXPERIENCE_DISTILL_GUIDE.md (English and Chinese),
covering the experience_distill / recall_priors toggles, the finalize
gate, hard guards on submit_paper_bundle and complete_quest, the distill
skill flow, the distill_review schema, the cross-quest recall surface,
the new MCP tools, the retroactive ds distill-quest CLI, and operational
notes / debugging.

The original 07 / 14 sections now point at the new doc as the single
source of truth; index entries added to docs/en/README.md and
docs/zh/README.md under the architecture section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
These plan and spec files were development-process notes produced by the
local superpowers plugin while iterating on the distill pipeline. They
are not user- or maintainer-facing documentation; the canonical reference
is now docs/en/34_EXPERIENCE_DISTILL_GUIDE.md (and the Chinese mirror).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… specifiers

Vite/rollup occasionally exits 0 on this filesystem but leaves bare specifiers
like `import"scheduler"` in the main bundle (also seen with clsx, framer-motion,
react-router). The browser can't resolve those at runtime, producing
`Failed to resolve module specifier "scheduler"` and a blank page. The previous
retry loop only triggered on non-zero exit and so missed this mode.

After a successful `npm run build`, scan dist/assets/*.js for any of these bare
imports; if any are found, drop dist/ and retry the build. Same max-5 attempts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
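
The post-build scan described in the commit above can be illustrated in Python, although the PR's installer is not necessarily written in Python. The dependency names come from the commit message; the regex and the function name are assumptions.

```python
import re
from pathlib import Path

BARE_DEPS = ("scheduler", "clsx", "framer-motion", "react-router")
# Matches minified forms such as import"scheduler" or from"clsx".
_BARE_IMPORT = re.compile(
    r"""(?:import|from)\s*["'](%s)["']""" % "|".join(re.escape(d) for d in BARE_DEPS)
)


def find_bare_imports(assets_dir: Path):
    """Map each bundle file name to the unresolved bare specifiers it still contains."""
    hits = {}
    for js_file in sorted(assets_dir.glob("*.js")):
        text = js_file.read_text(encoding="utf-8", errors="replace")
        found = sorted(set(_BARE_IMPORT.findall(text)))
        if found:
            hits[js_file.name] = found
    return hits
```

A non-empty result would be the signal to drop dist/ and rebuild, even though the build process itself exited 0.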
@droidlyx droidlyx changed the title Feat: cross-quest experience distill [Core] Feat: cross-quest experience distill Apr 28, 2026
droidlyx and others added 9 commits April 28, 2026 23:48
…edicated kind)

- DECISION_ACTIONS gains 'distill_review'; review payload fields
  (reviewed_run_ids, cards_written, neighbor_decisions) validated under
  the decision kind when action=='distill_review'
- Drop 'distill_review' from ARTIFACT_DIRS; one less public kind
- Closure gate and MCP error messages updated to suggest
  artifact.record(kind='decision', action='distill_review', ...)
- Tests + en docs + skill examples updated to the new shape

Per RFC ResearAI#64 slim: a distill review is a decision about coverage;
folding it into the existing kind keeps the public surface narrower
without losing any validation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Remove maybe_inject_distill_routing and iter_analysis_slice_records;
  the per-slice trigger forced every analysis slice through a distill
  detour even when no batch was ready
- Closure gate (maybe_inject_distill_finalize_gate) stays as the sole
  discipline mechanism — its coverage check already catches every
  uncovered run at quest end
- Drop tests/test_experience_distill_routing.py and per-slice
  references in SKILL.md / cli + integration tests

Per RFC ResearAI#64 slim: the closure gate alone provides enough discipline;
forcing distill per slice is overreach.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- validate_experience_metadata now enforces only `claim` (non-empty)
  and `lineage` (non-empty list with quest+run per entry)
- Drop required enforcement of subtype / mechanism / conditions /
  confidence; agents tend to game shallow taxonomy fields and the
  cross-quest patch invariant only ever needed lineage
- Cards may still carry these fields; we just don't reject when absent
- validate_cross_quest_patch unchanged — still enforces lineage chain
  on patched cards

Per RFC ResearAI#64 slim: shallow metadata depth was overreach; claim+lineage
carries real semantic load and the patch invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
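
The slimmed validation rule in the commit above (non-empty `claim`, non-empty `lineage` list with `quest` and `run` per entry) can be sketched like this. The function name matches the commit message, but the return type (a list of error strings) and the error wording are assumptions.

```python
def validate_experience_metadata(metadata: dict) -> list[str]:
    """Return validation errors; an empty list means the card is acceptable."""
    errors = []
    claim = metadata.get("claim")
    if not isinstance(claim, str) or not claim.strip():
        errors.append("claim must be a non-empty string")
    lineage = metadata.get("lineage")
    if not isinstance(lineage, list) or not lineage:
        errors.append("lineage must be a non-empty list")
    else:
        for i, entry in enumerate(lineage):
            if not isinstance(entry, dict) or not entry.get("quest") or not entry.get("run"):
                errors.append(f"lineage[{i}] must include quest and run")
    # Optional fields such as subtype or confidence are deliberately not checked.
    return errors
```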
- src/skills/distill/SKILL.md: remove per-slice trigger framing,
  subtype/mechanism/conditions/confidence as required, and any
  remaining `kind='distill_review'` shape examples; keep finalize-gate
  protocol, neighbor decisions, claim+lineage minimum, decision-shape
  call example
- docs/en/34_EXPERIENCE_DISTILL_GUIDE.md: same trim; drop the
  ds memory normalize-knowledge mention (CLI doesn't exist) and
  per-slice analysis-slice routing references
- docs/zh/34_EXPERIENCE_DISTILL_GUIDE.md: deleted; zh readers follow
  the en guide pointer instead

Per RFC ResearAI#64 slim: docs match the actual surface after S1-S3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The S4+S5+S6 cleanup removed the `ds distill-quest <quest_id>` section
on the assumption it was a non-existent CLI like `ds memory
normalize-knowledge`. It is real (cli.py registers `distill-quest` and
the `distill_quest_command` handler). Restore a tight section
documenting it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The S4 SKILL.md trim collapsed the description and dropped the word
'experience'. test_experience_distill_skill_bundle asserts the
description contains 'experience' as a discoverability anchor;
restore the keyword so the bundle remains the canonical experience-
distillation skill.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lows

- mcp.memory.write: default `scope` flips quest -> global. Explicit
  scope='quest' still works for quest-internal task tracking
- Stage skills (decision/finalize/idea/baseline/experiment/
  analysis-campaign/write/etc) update memory.write examples and
  prose to land reusable lessons globally by default; quest-scope is
  reserved for quest-internal tracking
- Memory + MCP en/zh docs updated to reflect the new default
- New regression test: memory.write without explicit scope writes
  global; existing quest-scope tests pass scope='quest' explicitly

Per RFC ResearAI#64 slim section 6: quest 014 wrote 9 quest-scope cards and 1
global; flipping the default and aligning stage prose makes global
the path of least resistance for reusable lessons.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- ArtifactService.list_recent(quest_root, kind=None, limit=20):
  newest-first aggregation across quest_root/artifacts/_index.jsonl
  and every .ds/worktrees/*/artifacts/_index.jsonl
- MCP `artifact.list_recent` exposes it under the existing namespace;
  read-only annotations; codex auto-approves
- Tests cover service-level (main + worktree aggregation, kind filter,
  limit cap) and MCP-level shape
- en/zh docs updated; experience_distill guide cross-references it

Per RFC ResearAI#64 slim: stage skills can now read 'what did I just do'
from artifacts directly, freeing memory.list_recent from being
co-opted as a runtime-state surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
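
A minimal sketch of the newest-first view described in the commit above, operating on already-loaded index records. The `kind` and `limit` parameters and the default of 20 come from the commit message; the `created_at` field and its ISO-8601 format are assumptions about the record shape.

```python
def list_recent(records, kind=None, limit=20):
    """Return up to `limit` records, newest first, optionally filtered by kind."""
    matching = [r for r in records if kind is None or r.get("kind") == kind]
    # ISO-8601 timestamps in one timezone sort correctly as plain strings.
    matching.sort(key=lambda r: r.get("created_at", ""), reverse=True)
    return matching[: max(limit, 0)]


records = [
    {"artifact_id": "run-1", "kind": "main_experiment", "created_at": "2026-04-26T10:00:00Z"},
    {"artifact_id": "d-1", "kind": "decision", "created_at": "2026-04-27T09:00:00Z"},
    {"artifact_id": "run-2", "kind": "main_experiment", "created_at": "2026-04-28T08:00:00Z"},
]
latest = list_recent(records, limit=2)
```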
…ures

- Fold 7 small/topical files into 3 broader topical files plus a new
  config_cli file:
  - schema (+ artifact_schemas_distill_review)
  - gate (+ finalize_gate + closure_guards)
  - integration (+ candidates)
  - config_cli (config + cli + skill_bundle)
- Lift shared setup helpers (write_quest_yaml, make_quest_root,
  seed_runs, seed_pending_run, seed_worktree_pending_run,
  seed_distill_review, seed_distill_review_in_quest,
  make_artifact_quest, enable_distill_in_quest, make_decision_record,
  make_baseline_guidance) into tests/_distill_fixtures.py — eliminates
  ~200 lines of fixture duplication across 10 files
- All 102 test function names preserved; assertions byte-identical to
  pre-consolidation behavior; no test dropped
- Net: 10 files -> 4 + 1 helper; 2002 -> 1992 lines (line trim is small
  because savings from collapsed helpers were offset by the helper
  module + per-file docstrings; the visible-file-count drop is the
  win the PR reviewer was asking for)

Per PR feedback: 10 distill test files with duplicated fixtures was
test-file proliferation. Same coverage in fewer files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>