feat: add 3-pass adversarial review protocol to CEO prompt #515

Open
RobotSail wants to merge 3 commits into akashgit:main from RobotSail:feature/3-pass-review

@RobotSail

Contributor

Closes #1

Changes

  • Added new ## Mode: PR Review (--mode review --pr <N>) section to the CEO agent prompt
  • Defines a 6-step protocol (PR1–PR6) for 3-pass adversarial PR review:
    • PR1: Read PR context (metadata + diff)
    • PR2: Detect language and select review lenses (code quality, language-specific, security)
    • PR3: Round 1 — independent blind review by 3 reviewers
    • PR4: Round 2 — cross-pollinated deep review (each reviewer sees others' round 1 findings)
    • PR5: Round 3 — adversarial stress test (each reviewer sees all prior findings)
    • PR6: Consolidate findings and post KEEP/REVERT verdict
  • Inserted between the existing "Mode: Refine" section and "CEO Self-Learning Protocol" section
  • No other parts of the file were modified
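
Step PR2 above (detect language, then pick the three review lenses) could look roughly like the sketch below. It is illustrative only: the extension table, `detect_language`, and `select_lenses` are hypothetical names, and the actual prompt may detect the language in a different way.

```python
from pathlib import PurePosixPath

# Hypothetical extension-to-language table (an assumption, not taken from the PR).
EXTENSION_LANGUAGES = {
    ".py": "python",
    ".go": "go",
    ".rs": "rust",
    ".ts": "typescript",
}


def detect_language(changed_files):
    """Return the most common known language among the changed files."""
    counts = {}
    for path in changed_files:
        language = EXTENSION_LANGUAGES.get(PurePosixPath(path).suffix)
        if language:
            counts[language] = counts.get(language, 0) + 1
    if not counts:
        return "unknown"
    return max(counts, key=counts.get)


def select_lenses(changed_files):
    """Return the three review lenses named in the PR description."""
    language = detect_language(changed_files)
    return ["code quality", f"language-specific ({language})", "security"]


lenses = select_lenses(["factory/cli.py", "factory/agents/prompts/ceo.md"])
```

For the two files listed in this PR, only `factory/cli.py` matches the table, so the language-specific lens would be the Python one.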

@codecov

codecov Bot commented Jun 9, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.77%. Comparing base (5985563) to head (52464b1).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #515   +/-   ##
=======================================
  Coverage   86.77%   86.77%           
=======================================
  Files          64       64           
  Lines       10027    10027           
=======================================
  Hits         8701     8701           
  Misses       1326     1326           


Upgrade the --mode review pipeline from a single-reviewer, single-pass
approach to a 3-reviewer, 3-round adversarial protocol with cross-pollination.

Changes:
- factory/agents/prompts/ceo.md: Add Mode: PR Review section with full
  3-pass protocol (PR1-PR6 steps, language detection, cross-pollination,
  adversarial stress test, consolidated verdict)
- factory/cli.py: Update --mode review task string to reference the new
  protocol instead of the old single-reviewer instructions
- skills/3-pass-review/SKILL.md: Add standalone skill for manual
  invocation via /3-pass-review <PR#>

The protocol runs 3 independent reviewers (code quality, language-specific,
security) across 3 rounds. Each round, reviewers see the others' prior
findings — forcing them to challenge, validate, and go deeper. Round 3 is
explicitly adversarial (try to break the PR). The CEO consolidates all
findings into a structured KEEP/REVERT verdict.
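
A minimal sketch of the round structure described in this commit message, assuming each reviewer is a callable that receives the other reviewers' prior findings. The `Finding` type, the `blocking` flag, and the rule "REVERT if any finding is blocking" are assumptions made for illustration; the PR does not state how the CEO actually reaches the verdict.

```python
from dataclasses import dataclass

LENSES = ("code quality", "language-specific", "security")


@dataclass
class Finding:
    lens: str
    round_no: int
    text: str
    blocking: bool = False  # hypothetical severity flag


def run_review(review_fn, rounds=3):
    """Run all lenses for each round.

    review_fn(lens, round_no, others_prior) must return a list of Finding.
    """
    history = {lens: [] for lens in LENSES}
    for round_no in range(1, rounds + 1):
        # Snapshot first so every reviewer in a round sees the same prior state.
        snapshot = {lens: list(found) for lens, found in history.items()}
        for lens in LENSES:
            others_prior = [
                finding
                for other, found in snapshot.items()
                if other != lens
                for finding in found
            ]
            # Round 1 is blind: others_prior is empty because nothing exists yet.
            history[lens].extend(review_fn(lens, round_no, others_prior))
    return history


def consolidate(history):
    """Merge all findings and apply the assumed verdict rule."""
    all_findings = [f for found in history.values() for f in found]
    verdict = "REVERT" if any(f.blocking for f in all_findings) else "KEEP"
    return verdict, all_findings
```

Taking the snapshot before each round keeps the reviewers within a round independent of one another, which matches the "independent" framing of round 1.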
@RobotSail force-pushed the feature/3-pass-review branch from fb5c013 to 7f923ae on June 9, 2026 22:57
@colehurwitz

Collaborator

@akashgit can you take a look at this one? It will change the review logic significantly.

Each reviewer sees its own full prior trajectory in subsequent rounds
but only receives concise summaries from the other reviewers. This
preserves depth without anchoring or context bloat.

CEO writes summaries after each round as a mandatory step.
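
The visibility rule in this commit message (own full trajectory, summaries only from the others) might be assembled as in the sketch below. `build_context` and the shape of its two dictionary arguments are hypothetical and are not taken from the PR.

```python
def build_context(reviewer, trajectories, summaries):
    """Assemble the next-round context for one reviewer.

    trajectories: reviewer name -> list of that reviewer's full round outputs
    summaries:    reviewer name -> list of CEO-written per-round summaries
    """
    parts = ["## Your prior rounds (full)"]
    parts.extend(trajectories[reviewer])
    parts.append("## Other reviewers (summaries only)")
    for other in sorted(summaries):
        if other == reviewer:
            continue  # the reviewer already has its own full output above
        for round_no, summary in enumerate(summaries[other], start=1):
            parts.append(f"[{other}, round {round_no}] {summary}")
    return "\n".join(parts)


context = build_context(
    "security",
    trajectories={
        "security": ["FULL security round 1 output"],
        "code quality": ["FULL code quality round 1 output"],
    },
    summaries={
        "security": ["no injection risk found"],
        "code quality": ["two naming issues"],
    },
)
```

Here the security reviewer's context contains its own full round 1 output and only the one-line summary from the code quality reviewer, which is what keeps the context small.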
@osilkin98 added the extends-functionality and enhancement (Improves existing feature/functionality or code quality, does not change behavior of codebase) labels and removed the extends-functionality label on Jun 10, 2026
@osilkin98

Collaborator

@ceo-review

@osilkin98

Collaborator

@ceo-review you did not actually look, can you please review?

@RobotSail

Contributor Author

@ceo-review can you check this out?

@osilkin98 added the kind:hardening (Same behavior, fewer failure modes) and stage:judgment (Deciding what is true/kept: eval, gates, review, state detection) labels on Jun 11, 2026

Labels

  • enhancement (Improves existing feature/functionality or code quality, does not change behavior of codebase)
  • kind:hardening (Same behavior, fewer failure modes)
  • stage:judgment (Deciding what is true/kept: eval, gates, review, state detection)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Validate factory pipeline end-to-end on cloud-gateway

3 participants