feat: add doc-digest skill with context-aware analysis and change tracking #36
New skill (SKILL.md) with layered section chunking, per-section analysis with cross-document context flags (Inconsistency, Clarity, Completeness, Redundancy), session-level feedback disposition (edit vs suggestion inferred from PR context), cross-section change propagation, 2000-line guardrail, and change log summary. Rewrite agent with disposition detection via gh CLI, PR comment posting, and skill delegation. Update command to reference both. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
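As a rough illustration of the first chunking layer and the guardrail named in this commit, the sketch below lists Markdown heading boundaries and checks a file against a 2000-line limit. The function names, the heading regex, and the idea of a simple line-count comparison are assumptions for illustration. SKILL.md defines the actual behavior, and the semantic fallback is not shown.

```shell
# Hypothetical helpers; names and behavior are illustrative, not taken from SKILL.md.

# Print "line_number:heading" for each ATX heading (levels 1-6) in a file.
# Note: this naive regex also matches '#' lines inside fenced code blocks.
list_heading_boundaries() {
  grep -n -E '^#{1,6} ' "$1" || true
}

# Succeed (exit 0) when the file has more lines than the limit (default 2000).
exceeds_line_guardrail() {
  local file="$1" limit="${2:-2000}" count
  # tr strips the padding that some wc implementations add.
  count=$(wc -l < "$file" | tr -d ' ')
  [ "$count" -gt "$limit" ]
}
```

A caller could run `exceeds_line_guardrail doc.md` before chunking and decide how to proceed; what the skill does past the threshold is not specified here.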
Smoke tests: register doc-digest in skills, agents, and structure validation. Add dedicated consistency test (22 checks) validating status codes, disposition types, analysis flags, cross-file linkage, and guardrail thresholds. Behavioral tests: 6 API-based tests with 4 fixtures covering section presentation, inconsistency detection, summary format, non-markdown chunking, large-section sub-chunking, and PR disposition detection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
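One way a cross-file consistency check of this kind could work is to assert that each expected term appears in every file that should mention it. The sketch below is an assumption about the approach, not the contents of the actual consistency test. The function name and the example file paths are hypothetical.

```shell
# Hypothetical check: succeed only if the fixed string appears in every listed file.
assert_term_in_all() {
  local term="$1" file missing=0
  shift
  for file in "$@"; do
    if ! grep -qF -- "$term" "$file"; then
      echo "FAIL: '$term' not found in $file" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
}

# Example invocation (flag names from this PR; file paths are placeholders):
# for flag in Inconsistency Clarity Completeness Redundancy; do
#   assert_term_in_all "$flag" path/to/SKILL.md path/to/agent.md
# done
```

Reporting every missing file before returning, rather than stopping at the first miss, keeps a single run informative when several files drift at once.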
Add design document to dockyard/docs/plans/. Update CLAUDE.md to list doc-digest in skills and update smoke test suite count to 5. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New doc-digest feature requires version bump for cache invalidation so existing users receive the update. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d tests
- Improve skill description for better triggering accuracy
- Clarify analysis output template with explicit example
- Align disposition 'uncertain' language between skill and agent
- Add sub-chunk header format specification
- Remove unmarked inconsistency from test-markdown.md fixture
- Use per-test variables in behavioral tests to prevent cross-test coupling
- Simplify command to remove redundant setup section
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add JSON validity check in call_claude to handle non-JSON API responses (429, proxy errors) gracefully under set -e
- Add dedicated assertion in Test 5 for sub-chunk letter notation (e.g., Section 2a) to avoid false passes on normal section headers
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
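The JSON validity check described in this commit could look roughly like the sketch below. It is not the code in call_claude: the helper names are hypothetical, and the choice of python3 with a jq fallback is an assumption about the environment.

```shell
# Hypothetical helper: succeed only if the argument parses as JSON.
# Uses python3 when available and falls back to jq; if neither is installed,
# the body is treated as invalid.
is_valid_json() {
  local body="$1"
  # Reject empty or whitespace-only bodies up front.
  case "$body" in
    *[![:space:]]*) ;;
    *) return 1 ;;
  esac
  if command -v python3 >/dev/null 2>&1; then
    printf '%s' "$body" | python3 -c 'import json, sys; json.load(sys.stdin)' >/dev/null 2>&1
  else
    printf '%s' "$body" | jq empty >/dev/null 2>&1
  fi
}

# Print the body if it is JSON; otherwise report on stderr and return 1.
# Calling is_valid_json inside `if` keeps a failure from aborting a script
# that runs under `set -e`.
check_response() {
  local body="$1"
  if is_valid_json "$body"; then
    printf '%s\n' "$body"
  else
    echo "non-JSON API response (rate limit or proxy error?)" >&2
    return 1
  fi
}
```

Testing the validator inside an `if` condition is the relevant design point: under `set -e`, a bare failing command would end the whole test run, while a condition lets the script report the bad response and continue or fail deliberately.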
Doc Digest Design Review: Feedback

Reviewed

1. Overall: The design conflates two jobs

The skill tries to be both a reading companion (chunk and present documents) and an automated reviewer (analyze, flag issues, track changes, post PR comments). Neither experience is great because they fight each other. The user gets the section content, then unsolicited analysis telling them what to think, then a prompt asking for feedback they may not have.

The primary job should be making documents easy to consume: chunk, present, move through. The secondary job is being available for questions when asked. Feedback tracking should happen if/when the user gives feedback, not as a prompted workflow. "Summarize my feedback" at the end, not a mandatory four-part output ceremony.

This affects per-section analysis, the disposition system, the change tracker, and PR integration, which together make up most of the design doc's complexity.

2. File Structure: Drop the agent file

Doc-digest is interactive (user responds per-section), which is a poor fit for subagents (autonomous execution, separate context, no mid-task user interaction). Skills are the right abstraction for interactive/guided workflows. The agent file adds a maintenance surface (keeping agent and skill consistent) without architectural payoff. Its only unique content is PR detection/posting, which folds naturally into the skill. Collapse to command + skill.

3. Section Chunking: Remove code file chunking

The non-markdown chunking rules include code → function/class boundaries. Reviewing a code file section-by-section with analysis and feedback is code review, not doc-digest. The skill should stay scoped to documents. Keep YAML/JSON and plain text, drop code.

4. Per-Section Analysis: Ambiguous prompt wording

The prompt "Does this look right, or do you have feedback?" is ambiguous: answering "yes" could mean "yes, it looks right" or "yes, I have feedback". The wrap-up prompt has the same issue. If prompting survives the redesign, use something like "Any feedback, or ready to move on?"
Why
The dockyard plugin lacked a document review tool. Reviewing design docs, RFCs, and policies section-by-section with cross-document awareness, feedback tracking, and PR integration fills a real gap in the engineering workflow.
What
- Skill (skills/doc-digest/SKILL.md): Layered section chunking (headings → semantic fallback), per-section analysis with cross-document context flags (Inconsistency, Clarity, Completeness, Redundancy), session-level feedback disposition (edit vs suggestion inferred from PR context), cross-section change propagation, 2000-line guardrail, and change log summary.
- Agent (agents/doc-digest.md): Disposition detection via the gh CLI, PR comment posting in suggestion mode, skill delegation.
- Command (commands/doc-digest.md): Thin dispatcher referencing agent and skill.

How to review
- Design doc (docs/plans/2026-03-01-doc-digest-design.md) for full context on decisions
- skills/doc-digest/SKILL.md: this is the core logic (chunking, analysis, feedback, output format)
- agents/doc-digest.md: orchestration layer (disposition detection, PR integration)
- tests/smoke/validate-doc-digest-consistency.sh for structural invariants, tests/behavioral/doc-digest/run-tests.sh for API-based assertions

Pre-submit review
Local 3-pass code review (correctness, conventions, test quality) found 2 warnings, both fixed:
- Handling in call_claude for non-JSON API responses under set -e
- Assertion for sub-chunk letter notation (e.g., Section 2a) in Test 5

All smoke tests pass: 5/5 suites, 167/167 assertions.
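For the sub-chunk notation fix, a dedicated assertion might be sketched as follows. The function name and the exact header pattern are assumptions: only the example "Section 2a" comes from this PR, and the real format is whatever the skill's sub-chunk header specification says.

```shell
# Hypothetical assertion: succeed only if the text contains a sub-chunk header
# such as "Section 2a" (digits followed immediately by one lowercase letter).
has_subchunk_header() {
  printf '%s\n' "$1" | grep -Eq 'Section [0-9]+[a-z]([^[:alnum:]]|$)'
}
```

Requiring the letter directly after the digits is what separates this from a generic section-header match: "Section 2: Details" fails the check, so an ordinary header cannot produce a false pass.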