feat(review): implement deep PR analysis command#170
Conversation
|
We still need something handling github comments and the reviewing of github comments when making milestones, running reviews, automating pipelines, etc. I like having a deep pr analysis process, but we can't break the previous review implementation, we just need to make it more efficient at what it's actually doing and let something like this handle the role it's designed for. |
8eb4ce2 to
d65c3a0
Compare
snipcodeit
left a comment
There was a problem hiding this comment.
Deep Review — feat(review): implement deep PR analysis command
1. TEST THIS PR
No automated tests exist in MGW (by design — it's a command system). Manual verification: the changed files are markdown command definitions parsed by Claude Code at invocation time.
Build/parse status: The YAML frontmatter in milestone.md is broken. The key HB|argument-hint is not valid — it won't be recognized as argument-hint. Claude Code will silently ignore the --auto-review hint.
2. WHY DO WE NEED THIS?
The rationale is sound and necessary. MGW has been execution-only; having a "think deeply" mode for PRs is a real gap. The state separation concept (.mgw/reviews/ vs .mgw/active/) is architecturally correct — giving the reviewer a separate context space is a good idea.
Priority: High. This is the right feature to build.
3. STATED INTENT vs ACTUAL CHANGES
The PR claims: dual-mode review with five-dimension PR analysis, state separation, and review checkpoints in run.md and milestone.md.
| Claim | Delivered? | Notes |
|---|---|---|
| Mode detection (issue vs PR) | Partially | Regex is broken — always enters PR mode |
| Issue comment review (Mode 1) | Regressed | issue_present_and_act lost all action logic |
| PR deep review (Mode 2) | Scaffolded | Missing the PR diff — agent reviews blind |
State separation (.mgw/reviews/) |
Yes | Correctly designed |
Checkpoint in run.md |
Partially | Step tags broken, no closing </step> |
Checkpoint in milestone.md |
Partially | Auto-review mode still prompts user |
--auto-review flag |
Partially | Parsed but implementation incomplete in both files |
4. IMPACT ANALYSIS
Bug: Mode Detection Regex — Always True (CRITICAL)
# Line 60, review.md
if [[ "$REFERENCE" =~ --pr[[:space:]]|$ ]]; thenIn bash ERE, | is alternation. This pattern matches "either --pr OR end-of-string". End-of-string matches every string. PR_MODE will be true for every invocation, including plain issue numbers. Mode 1 is completely unreachable.
Fix: [[ "$REFERENCE" == *"--pr"* ]]
Bug: Bash Syntax Error in pr_validate (CRITICAL)
# Line 256, review.md
elif [[ "$REFERENCE" =~ ^[0-9]+$ ]] && "$PR_MODE" = true; then"$PR_MODE" = true is not a valid test — it tries to execute $PR_MODE as a command. Should be [[ "$PR_MODE" = true ]].
Regression: issue_present_and_act gutted (HIGH)
The original had complete logic for all four classification paths with AskUserQuestion calls and state update instructions. The new version:
Actions based on classification:
- informational: Update last_comment_count
- material: Update with new requirements
- blocking: Option to block pipeline
- resolution: Option to re-triage
This is a bullet list with no instructions for Claude to follow. Mode 1 now classifies comments and then does nothing. The state file never gets updated. This is a functional regression in existing behavior.
Missing: PR Diff Not Passed to Reviewer Agent (HIGH)
pr_fetch_details fetches title, author, base/head, and file count — but not the diff. The review agent cannot compare stated intent vs actual changes, identify scope creep, or do meaningful architectural analysis without it. The five-dimension framework is hollow without the diff.
Fix: add to pr_fetch_details:
PR_DIFF=$(gh pr diff "$PR_NUMBER" 2>/dev/null)And pass $PR_DIFF to the agent prompt.
Noise: Line Prefixes Baked Into Files (MEDIUM)
Multiple lines have 2-character prefixes that appear to be generation artifacts:
milestone.md:HB|argument-hint:,SR|The --dry-run...,NM|The --auto-review...,WP|DRY_RUN=false,RX|if ["$AUTO_REVIEW"...run.md:SJ|</step>,KV|<step name="pr_review_checkpoint">,VR|,NP|,VB|,NM|, etc.review.md:KR|<files_to_read>,ST|-in the agent promptgsd.md:TB|## PR Review Pattern
In run.md, SJ|</step> is not a valid XML close tag — the create_pr step is never properly closed. KV|<step name="pr_review_checkpoint"> is also not a valid step opening tag. The HB|argument-hint in YAML frontmatter is the most impactful — it silently breaks the --auto-review argument hint in milestone.md.
Design Issue: reviewer Field Points to PR Author (MEDIUM)
{ "reviewer": "${PR_AUTHOR}" }PR_AUTHOR is the person who wrote the PR, not who is reviewing it. Should be:
REVIEWER=$(gh api user -q .login)Documentation: Duplicate ## Consumers in gsd.md (LOW)
The PR appends a full new ## Consumers section after the existing one. The original should be removed or merged — the new table is the correct full version.
Incomplete sentence in gsd.md (LOW)
The PR Deep Review pattern below is for Mode 2. This is problem-solving orchestration
Used by `/mgw:review` for deep PR analysis. This is problem-solving orchestration
First sentence is a fragment; content duplicated on the next paragraph.
5. ARCHITECTURAL REVIEW
Approach: Correct. Dual-mode dispatch, state separation, and problem-solving vs execution orchestration are all the right abstractions. The checkpoint injection points in run.md and milestone.md are placed at the right lifecycle moments.
Three things to reconsider:
A. Number ambiguity is unresolved. When the user passes a bare number, the code comments say "check if it's a PR or issue by testing" but then immediately defaults without testing. The comment is a lie. Actual detection should try gh pr view first:
if gh pr view "$NUMBER" >/dev/null 2>&1; then PR_MODE=true; fiUsers shouldn't need --pr when passing a PR number directly.
B. The reviewer agent prompt needs domain awareness. The five-dimension framework maps to thin instructions ("Run tests, build, verify functionality works"). For an MGW command that reviews markdown command files, not application code, "run tests" is meaningless. The prompt should tell the agent what it's actually reviewing.
C. --auto-review in milestone.md contradicts itself. When AUTO_REVIEW=true, it still prompts the user instead of running automatically. The flag is parsed but ignored.
Verdict: Needs Changes Before Merge
The core concept is right and worth landing. But there are two critical bugs, one significant regression, and one significant gap.
Blockers:
- Fix
--pr[[:space:]]|$regex — use[[ "$REFERENCE" == *"--pr"* ]] - Fix
"$PR_MODE" = true— use[[ "$PR_MODE" = true ]] - Restore full
issue_present_and_actaction logic (all four classification types withAskUserQuestioncalls and state update instructions) - Pass
gh pr diffoutput to the reviewer agent prompt - Strip
HB|,SJ|,KV|,TB|,KR|,RX|etc. prefixes from all file content
Non-blocking concerns:
- Detect PR vs issue by number proactively (try
gh pr viewfirst before defaulting to issue mode) - Fix
reviewerfield to use current user, not PR author - Deduplicate
gsd.md's Consumers section - Fix
--auto-reviewinmilestone.mdto actually skip the prompt
|
All review feedback addressed in commit eaba4c9. BlockersRegex always true — Bash syntax error in
Reviewer agent was blind — Added Line prefix artifacts — Stripped all Non-blocking concernsBare number detection — For bare numbers, the code now probes
Reviewer agent prompt domain-awareness — The prompt now explains that MGW is a markdown command system, not compiled source code. "Test this PR" maps to: verify
|
Replaces the lightweight comment-classification review with a comprehensive PR reviewer that mimics senior engineer code review process. Key features: - Deep PR analysis addressing five core questions: 1. Test this PR - runs tests, builds, verifies functionality 2. Why do we need this? - analyzes rationale vs linked issue 3. Stated intent vs actual changes - compares claim vs implementation 4. Impact analysis - side effects, dependencies, patterns, security 5. Architectural review - alternative approaches, design consistency - Problem-solving orchestration (not execution): - High autonomy for the reviewer agent - Quality assurance focus over code execution - Think tank approach for mission-critical reviews - Context separation: - .mgw/reviews/ for review-specific state (new) - .mgw/active/ for pipeline state - .planning/ for GSD execution state - This gives reviewer space to handle larger context - Added PR Review Pattern to gsd.md workflow
… as separate mode Revises implementation based on PR feedback: - Keep original issue comment classification (default behavior) - Add PR deep analysis as separate mode (--pr flag or PR URL) - Dual-mode command handles both use cases Usage: /mgw:review 42 - Review comments on issue #42 /mgw:review 42 --pr - Deep PR analysis on PR #42 /mgw:review https://github.com/owner/repo/pull/42 - Deep PR review
Blockers resolved: - fix: regex `--pr[[:space:]]|$` → `[[ "$REFERENCE" == *"--pr"* ]]` (was always true due to ERE |$ matching every string) - fix: `"$PR_MODE" = true` bash syntax error → `[[ "$PR_MODE" = true ]]` - fix: restore full issue_present_and_act with all four classification paths (informational/material/blocking/resolution) with AskUserQuestion calls and state update instructions — was gutted to a bullet list - fix: fetch `gh pr diff` and pass to reviewer agent — agent was blind without the diff, making 5-dimension analysis hollow - fix: strip all XX| line prefix artifacts from review.md, milestone.md, run.md, gsd.md (generation artifacts that broke YAML, bash, XML tags) Non-blocking concerns addressed: - fix: bare number detection now probes `gh pr view` first; sets PR_MODE=true if the number is a PR — users no longer need --pr for PR numbers - fix: reviewer field now uses `gh api user -q .login` (current user), not PR_AUTHOR (the person who wrote the PR) - fix: milestone.md AUTO_REVIEW=true now skips the prompt entirely and invokes /mgw:review automatically (was prompting user regardless) - fix: run.md pr_review_checkpoint step tags properly opened/closed - fix: gsd.md duplicate Consumers section removed; PR Review Pattern heading stripped of TB| prefix; duplicate fragment paragraph removed - fix: reviewer agent prompt made domain-aware (markdown command files, not compiled code; structural integrity checks replace "run tests") - fix: pr_present step now shows a structured report and offers follow-up actions (post to PR, create issue, done) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
eaba4c9 to
374fca9
Compare
Summary
Replaces the lightweight comment-classification
mgw:reviewwith a comprehensive PR reviewer that mimics senior engineer code review process. This addresses the need for more robust "problem-solving orchestration" as opposed to the existing "execution orchestration" focus.Problem
MGW has focused almost exclusively on execution orchestration - getting issues through the pipeline to PRs. But processes like PR review, architectural guidance, and deep analysis require a different approach: more autonomy, deeper thinking, and space to handle larger context.
Solution
Implements a new
mgw:reviewcommand that performs deep PR analysis across five core dimensions:1. Test This PR
2. Why Do We Need This?
3. Stated Intent vs Actual Changes
4. Impact Analysis
5. Architectural Review
Key Architectural Decisions
Problem-Solving vs Execution Orchestration
Unlike execution orchestration (which focuses on producing code), this is problem-solving orchestration - the reviewer agent has high autonomy to analyze, question assumptions, and provide architectural guidance.
Context Separation
This separation gives the reviewer space to handle larger context for mission-critical review processes without polluting pipeline or execution state.
Agent Pattern
The reviewer agent writes results directly to
.mgw/reviews/- MGW then reads these files to present results and offer follow-up actions (post to PR, create issue, link to issue, etc.).Changes
Usage
The reviewer will:
.mgw/reviews/