Skip to content

Preflight execution evidence before review#685

Merged
sungjunlee merged 2 commits into
mainfrom
issue-681-preflight
Jun 8, 2026
Merged

Preflight execution evidence before review#685
sungjunlee merged 2 commits into
mainfrom
issue-681-preflight

Conversation

@sungjunlee

@sungjunlee sungjunlee commented Jun 8, 2026

Copy link
Copy Markdown
Owner

Fixes #681
Refs #678

Summary

Adds execution-evidence preflight before primary reviewer invocation for mechanical evidence failures.

Relay

  • Run: issue-681-20260607235606942-5c342363
  • Request: req-20260607235336010
  • Leaf: issue-681-execution-evidence-preflight
  • Commit: bb24997d8dd8431bb199a2bc6c89aafae0aefab3

Completion Audit

  • Missing execution-evidence.json blocks before primary reviewer invocation.
  • Stale evidence blocks when artifact head_sha differs from reviewed HEAD.
  • Invalid evidence, including schema-invalid and symlink artifact cases, blocks before primary reviewer invocation.
  • Hardened/strict failure is covered by strict evidence preflight test coverage.
  • Existing --review-file fallback/fail-closed behavior remains intact.
  • runner-notes.md documents script-level mechanical proof vs semantic AI review.

Verification

  • node --test tests/relay-review/scripts/review-runner-execution-evidence.test.js tests/relay-review/scripts/review-runner.test.js tests/relay-review/scripts/review-runner-verdict.test.js -> 165/165 pass
  • execution-evidence.json rebound by orchestrator recovery to HEAD bb24997d8dd8431bb199a2bc6c89aafae0aefab3 with test_exit_code=0 and verification_runs[].

Notes

The initial relay-dispatch publication failed because the dispatch prompt said to commit to issue-681 while the manifest branch was issue-681-preflight. The implementation commit was preserved, issue-681-preflight was fast-forwarded to the implementation HEAD, and this PR was created as recovery.

Summary by CodeRabbit

  • 새로운 기능

    • 코드 리뷰 이전에 실행 증거 파일을 기계적으로 검증하는 사전검사(Preflight) 단계 추가
    • 검증 실패 시 리뷰어 호출을 차단하고 검토 상태를 보류로 유지
  • 행동/출력 변경

    • 실패 시 JSON 출력에 상태, 사유, 관련 SHA 및 권장 다음 조치(nextAction)를 포함
    • 특정 파일 기반 옵션은 기존 동작(사전검사 우회)을 유지
  • 테스트

    • 다양한 증거 실패 모드를 검증하는 통합 테스트 추가
  • 잡일

    • 사전검사 실패 이벤트 기록 항목 추가

@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7632cb93-a874-42a4-bc3b-bd350a7213dc

📥 Commits

Reviewing files that changed from the base of the PR and between bb24997 and bf17f6c.

📒 Files selected for processing (4)
  • skills/relay-dispatch/scripts/relay-events.js
  • skills/relay-review/scripts/review-runner.js
  • skills/relay-review/scripts/review-runner/preflight.js
  • tests/relay-review/scripts/review-runner.test.js
🚧 Files skipped from review as they are similar to previous changes (3)
  • skills/relay-review/scripts/review-runner/preflight.js
  • tests/relay-review/scripts/review-runner.test.js
  • skills/relay-review/scripts/review-runner.js

Walkthrough

primary reviewer 호출 전에 실행 증거를 기계적으로 검증하는 preflight를 추가하고, 실패 시 executionEvidencePreflight 메타데이터와 REVIEW_PREFLIGHT_FAILED 이벤트를 남기며 reviewer 호출을 차단합니다.

개요

이 PR은 primary reviewer 호출 전에 실행 증거(execution-evidence.json)를 기계적으로 검증하는 사전검사 단계를 추가합니다. 증거 부재/구식/무효 상태는 불필요한 리뷰 라운드 소비를 방지하고 별도의 재설정 지시를 제공합니다.

변경사항

실행 증거 사전검사 메커니즘

Layer / File(s) 요약
Execution Evidence Preflight Validation Function
skills/relay-review/scripts/review-runner/execution-evidence.js, tests/relay-review/scripts/review-runner-execution-evidence.test.js
buildExecutionEvidencePreflight는 아티팩트를 로드하고 품질 실행 상태를 계산하여 pass 또는 blocked 상태와 메타데이터(reason, head SHA들, nextAction)를 반환합니다. 테스트는 head 일치/불일치 시나리오를 검증합니다.
Preflight Blocking Guard Implementation
skills/relay-review/scripts/review-runner/preflight.js
maybeBlockForExecutionEvidencePreflight 헬퍼가 사전검사를 실행하고 reviewFile 존재 여부와 사전검사 상태를 기준으로 조건부 차단합니다. 차단 시 JSON 또는 콘솔 로그를 출력하고 process.exitCode = 2를 설정합니다.
Review Runner Preflight Integration
skills/relay-review/scripts/review-runner.js
maybeBlockForExecutionEvidencePreflight를 import하고 prepareOnly 통과 후 호출합니다. hardenedAssurance를 조기에 한 번만 계산하여 이후 분기에서 재사용하고, 기존 중복 계산을 제거합니다.
Event type and serialization updates
skills/relay-dispatch/scripts/relay-events.js
REVIEW_PREFLIGHT_FAILED 이벤트를 EVENTS에 추가하고 appendRunEvent가 preflight 관련 선택적 필드를 제공될 때만 events 기록에 포함하도록 조건부 필드를 추가했습니다.
Documentation and Integration Test Coverage
skills/relay-review/references/runner-notes.md, tests/relay-review/scripts/review-runner.test.js
runner-notes.md는 사전검사 단계와 JSON 출력 구조를 문서화합니다. 통합 테스트는 missing/stale/schema-invalid/symlink/strict-failing 증거 상태에서 사전검사가 primary reviewer를 차단하고 리뷰어가 호출되지 않음을 검증합니다.

Sequence Diagram(s)

sequenceDiagram
  participant Runner
  participant PreflightGuard
  participant buildExecutionEvidencePreflight
  participant relayEvents
  Runner->>PreflightGuard: prepareOnly done -> maybeBlockForExecutionEvidencePreflight(runDir, reviewedHead, strict)
  PreflightGuard->>buildExecutionEvidencePreflight: buildExecutionEvidencePreflight(runDir, reviewedHead, strict)
  buildExecutionEvidencePreflight-->>PreflightGuard: { status, qualityExecutionStatus, reason, evidenceHeadSha, artifactPath, nextAction }
  alt status == "pass"
    PreflightGuard-->>Runner: allow primary reviewer invocation
  else status == "blocked"
    PreflightGuard->>relayEvents: appendRunEvent(REVIEW_PREFLIGHT_FAILED, { preflight_type, quality_execution_status, evidence_head_sha, ... })
    PreflightGuard-->>Runner: block, emit JSON result, exitCode = 2
  end
Loading

🎯 3 (Moderate) | ⏱️ ~25 분

가능한 관련 PR:

  • sungjunlee/dev-relay#331: 이벤트 이름/스키마 관련 변경과 연관 — 본 PR은 REVIEW_PREFLIGHT_FAILED 이벤트 추가와 이벤트 레코드 필드 직렬화 업데이트를 포함합니다.

"나는 작은 토끼, 증거를 먼저 본다
파일이 사라지면 킁, 멈추고 소리친다
헤드가 맞는지, 심볼릭인지, 명령이 온전한지
한 번 더 확인하고, 리뷰어를 부르지 않네 🐇
고장난 증거는 수리로 되돌려라"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed 제목 "Preflight execution evidence before review"는 PR의 주요 변경사항을 명확하고 간결하게 요약합니다. 실행 증거 검증 프리플라이트 추가라는 핵심 기능을 직접적으로 설명합니다.
Linked Issues check ✅ Passed 모든 연결된 이슈 #681의 요구사항이 충족되었습니다. 실행 증거 프리플라이트가 구현되었고, 누락/오래된/유효하지 않은 증거를 차단하며, 리뷰어 호출 전에 작동하고, JSON 출력과 문서화가 포함되었습니다.
Out of Scope Changes check ✅ Passed 모든 변경사항이 이슈 #681의 범위 내에 있습니다. 프리플라이트 검증, 테스트, 문서화, 이벤트 기록 등이 모두 실행 증거 검증 기능과 직접 관련됩니다.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch issue-681-preflight

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@sungjunlee

Copy link
Copy Markdown
Owner Author

Relay Review

Verdict: LGTM
Summary: VERIFIED: missing execution-evidence.json preflight blocks before reviewer invocation. VERIFIED: stale artifact head_sha mismatch blocks. VERIFIED: invalid evidence covers schema-invalid and symlink/non-regular handling through shared artifact reader. VERIFIED: strict hardened checks reuse existing command/hash/exit and verification_runs semantics. VERIFIED: preflight JSON includes status, reason, reviewedHeadSha, evidenceHeadSha when valid, and nextAction. VERIFIED: --review-file fallback still bypasses preflight and keeps fail-closed execution evidence override. VERIFIED: tests assert mechanical failures do not create reviewer marker output. VERIFIED: runner-notes documents mechanical script proof vs AI semantic review and repair guidance. No blocking issues found by inspection; tests were not executed in this review context.
Contract: PASS
Quality Review: PASS
Quality Execution: PASS
Rounds: 1

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@skills/relay-review/scripts/review-runner/execution-evidence.js`:
- Around line 266-267: The code reads the execution evidence artifact into
artifactLoad then calls computeQualityExecutionStatus which re-reads the same
artifact, risking inconsistent snapshots; modify computeQualityExecutionStatus
to accept the already-read artifact (artifactLoad) or add a new parameter (e.g.,
evidenceArtifact) and use that instead of reading from disk, then update the
call site where artifactLoad is created (and any callers of
computeQualityExecutionStatus) so reason/qualityExecutionStatus and
evidenceHeadSha derive from the same in-memory snapshot.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 77070411-10c2-421a-8c09-130e7a73e6e9

📥 Commits

Reviewing files that changed from the base of the PR and between c50f743 and bb24997.

📒 Files selected for processing (6)
  • skills/relay-review/references/runner-notes.md
  • skills/relay-review/scripts/review-runner.js
  • skills/relay-review/scripts/review-runner/execution-evidence.js
  • skills/relay-review/scripts/review-runner/preflight.js
  • tests/relay-review/scripts/review-runner-execution-evidence.test.js
  • tests/relay-review/scripts/review-runner.test.js

Comment on lines +266 to +267
const artifactLoad = readExecutionEvidenceArtifact(runDir);
const executionStatus = computeQualityExecutionStatus({ runDir, reviewedHead, strict });

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

프리플라이트가 증거 파일을 두 번 읽어 결과 스냅샷이 엇갈릴 수 있습니다.

Line 266에서 아티팩트를 읽은 뒤, Line 267에서 computeQualityExecutionStatus가 같은 파일을 다시 읽습니다. 두 호출 사이에 파일이 바뀌면 reason/qualityExecutionStatusevidenceHeadSha가 서로 다른 스냅샷 기준으로 섞일 수 있습니다.

🔧 제안 패치
-function computeQualityExecutionStatus({ runDir, reviewedHead, strict = false }) {
-  const artifactLoad = readExecutionEvidenceArtifact(runDir);
+function computeQualityExecutionStatus({
+  runDir,
+  reviewedHead,
+  strict = false,
+  artifactLoad: preloadedArtifact = null,
+}) {
+  const artifactLoad = preloadedArtifact || readExecutionEvidenceArtifact(runDir);
   if (artifactLoad.state === "missing") {
     return {
       status: "missing",
@@
 function buildExecutionEvidencePreflight({ runDir, reviewedHead, strict = false }) {
   const artifactLoad = readExecutionEvidenceArtifact(runDir);
-  const executionStatus = computeQualityExecutionStatus({ runDir, reviewedHead, strict });
+  const executionStatus = computeQualityExecutionStatus({
+    runDir,
+    reviewedHead,
+    strict,
+    artifactLoad,
+  });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/relay-review/scripts/review-runner/execution-evidence.js` around lines
266 - 267, The code reads the execution evidence artifact into artifactLoad then
calls computeQualityExecutionStatus which re-reads the same artifact, risking
inconsistent snapshots; modify computeQualityExecutionStatus to accept the
already-read artifact (artifactLoad) or add a new parameter (e.g.,
evidenceArtifact) and use that instead of reading from disk, then update the
call site where artifactLoad is created (and any callers of
computeQualityExecutionStatus) so reason/qualityExecutionStatus and
evidenceHeadSha derive from the same in-memory snapshot.

@sungjunlee

Copy link
Copy Markdown
Owner Author

Relay Review

Verdict: LGTM
Summary: VERIFIED: missing execution-evidence.json preflight blocks before primary reviewer invocation. VERIFIED: stale artifact head_sha mismatch blocks. VERIFIED: invalid evidence covers schema failure and symlink/non-regular handling through the shared artifact reader. VERIFIED: strict checks reuse existing command/hash/exit and verification_runs semantics via computeQualityExecutionStatus. VERIFIED: JSON output is additive and includes status, reason, reviewedHeadSha, evidenceHeadSha when valid, and nextAction. VERIFIED: --review-file fallback still bypasses preflight blocking and keeps fail-closed execution evidence override. VERIFIED: tests assert mechanical failures do not create reviewer marker output. VERIFIED: runner-notes documents script-level mechanical proof vs AI semantic review and repair guidance. Tests were not executed in this review context; this is inspection-only.
Contract: PASS
Quality Review: PASS
Quality Execution: PASS
Rounds: 2

@sungjunlee sungjunlee merged commit 038f7a5 into main Jun 8, 2026
2 checks passed
@sungjunlee sungjunlee deleted the issue-681-preflight branch June 8, 2026 12:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

relay-review: preflight execution evidence before spending reviewer rounds

1 participant