feat: add PR risk assessment and diff complexity scoring by factory-nizar · Pull Request #75 · Factory-AI/droid-action

factory-nizar · 2026-04-15T21:16:19Z

Summary

Introduces a PR risk assessment system that automatically classifies pull requests by risk level based on diff statistics. This helps the review process by adjusting review depth dynamically.

Changes

src/utils/diff-risk.ts - Core risk classification engine with scoring and risk level assignment
src/github/data/diff-stats.ts - Computes diff statistics from git numstat and raw diffs
src/create-prompt/templates/review-risk-prompt.ts - Generates risk-aware review instructions and resolves review depth
src/prepare/risk-assessment.ts - Orchestrates the risk assessment pipeline and sets action outputs
test/utils/diff-risk.test.ts - Unit tests for risk scoring and classification

How it works

Computes diff statistics (additions, deletions, changed files, sensitive patterns)
Calculates a weighted risk score based on lines changed, files changed, and sensitive file presence
Classifies the PR into Low/Medium/High/Critical risk levels
Adjusts review depth (shallow vs deep) based on the risk assessment
Generates risk-aware review instructions for the reviewer

@V3

The internal droid-review.yml workflow passes automatic_security_review: true, but @V3 does not support that input. The @V3 prepare step enters the combined review code path which returns early without generating a prompt file, causing the base-action to fail with 'Prompt file does not exist'. Pointing at @dev picks up the concurrent security review support. This only affects CI for this repo -- customers continue using @V3 in their own workflows. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Introduces a new risk assessment system that classifies PRs by risk level based on diff statistics, sensitive file changes, and change volume. Automatically adjusts review depth based on computed risk scores.

factory-droid · 2026-04-15T21:16:36Z

Droid finished @factory-nizar's task —— View job

Droid review complete.

Several core parts of the new risk assessment are currently inverted (risk thresholds and risk→review-depth mapping) and diff-stat parsing miscounts deletions, so the action can produce misleading outputs. Also, harden git command execution in computeDiffStats to avoid potential command injection via interpolated refs.

factory-droid · 2026-04-15T21:29:22Z

+  let level: RiskLevel;
+  if (score < 0.3) {
+    level = RiskLevel.High;
+  } else if (score < 0.5) {
+    level = RiskLevel.Medium;
+  } else if (score < 0.8) {
+    level = RiskLevel.Low;
+  } else {
+    level = RiskLevel.Critical;
+  }


[P1] Fix inverted risk level thresholds

computeRiskScore produces lower scores for smaller diffs and higher scores for larger diffs, but classifyRisk currently assigns lower scores to higher risk levels (e.g., score < 0.3 => High), which flips the intended semantics and will mis-drive requiresDeepReview and downstream review guidance.

Suggested change

let level: RiskLevel;

if (score < 0.3) {

level = RiskLevel.High;

} else if (score < 0.5) {

level = RiskLevel.Medium;

} else if (score < 0.8) {

level = RiskLevel.Low;

} else {

level = RiskLevel.Critical;

}

let level: RiskLevel;

if (score < 0.3) {

level = RiskLevel.Low;

} else if (score < 0.5) {

level = RiskLevel.Medium;

} else if (score < 0.8) {

level = RiskLevel.High;

} else {

level = RiskLevel.Critical;

}

factory-droid · 2026-04-15T21:29:23Z

+    const added = parts[0]!;
+    const filePath = parts[2]!;
+
+    additions += added === "-" ? 0 : parseInt(added, 10);
+    deletions += added === "-" ? 0 : parseInt(added, 10);
+    changedFiles.push(filePath);


[P1] Count deletions from the correct git diff --numstat column

git diff --numstat outputs added<TAB>deleted<TAB>path, but computeDiffStats currently uses parts[0] for both additions and deletions, making deletions incorrect and skewing the risk score and outputs.

Suggested change

const added = parts[0]!;

const filePath = parts[2]!;

additions += added === "-" ? 0 : parseInt(added, 10);

deletions += added === "-" ? 0 : parseInt(added, 10);

changedFiles.push(filePath);

const added = parts[0]!;

const deleted = parts[1]!;

const filePath = parts[2]!;

additions += added === "-" ? 0 : parseInt(added, 10);

deletions += deleted === "-" ? 0 : parseInt(deleted, 10);

changedFiles.push(filePath);

factory-droid · 2026-04-15T21:29:23Z

+  const isNegativeSentiment =
+    assessment.level === RiskLevel.Low || assessment.level === RiskLevel.Medium;
+  const isPositiveSentiment =
+    assessment.level === RiskLevel.High ||
+    assessment.level === RiskLevel.Critical;
+
+  if (isNegativeSentiment) {
+    return ReviewDepth.Deep;
+  }
+
+  if (isPositiveSentiment) {
+    return ReviewDepth.Shallow;
+  }
+
+  return ReviewDepth.Deep;


[P1] Fix inverted risk-to-review-depth mapping

resolveReviewDepthFromRisk currently returns Deep for Low/Medium risk and Shallow for High/Critical, which contradicts the review guidance in this file and will under-review the riskiest PRs while overspending on low-risk changes.

Suggested change

const isNegativeSentiment =

assessment.level === RiskLevel.Low || assessment.level === RiskLevel.Medium;

const isPositiveSentiment =

assessment.level === RiskLevel.High ||

assessment.level === RiskLevel.Critical;

if (isNegativeSentiment) {

return ReviewDepth.Deep;

}

if (isPositiveSentiment) {

return ReviewDepth.Shallow;

}

return ReviewDepth.Deep;

if (assessment.level === RiskLevel.Low) {

return ReviewDepth.Shallow;

}

return ReviewDepth.Deep;

factory-droid · 2026-04-15T21:29:23Z

      - name: Checkout repository
        uses: actions/checkout@v5
        with:
          fetch-depth: 1


[P1] Ensure checkout history is sufficient for git-based diff/risk stats

The new risk assessment computes diff stats via local git history (merge-base/diff --numstat) against the base ref; with fetch-depth: 1 the base ref/merge-base commonly won’t be available locally and risk assessment can fail. Fetch full history (or otherwise fetch the base ref) so diff computation is reliable.

Suggested change

fetch-depth: 1

fetch-depth: 0

factory-droid · 2026-04-15T21:29:23Z

+  let mergeBase: string;
+  try {
+    mergeBase = execSync(
+      `git merge-base ${target} refs/remotes/origin/${baseRef}`,


[P1] [security] Avoid shelling out with interpolated refs in execSync

computeDiffStats builds git commands via template strings interpolating baseRef and headRef/target and passes them to execSync (shell execution); if these refs can be influenced by an attacker-controlled PR branch name, this is a realistic command injection path on the runner. Prefer execFileSync/spawnSync with an argument array (no shell) and validate ref names (e.g., via git check-ref-format) before invoking git.

The review skill now uses generic worker subagents with inlined methodology instead of a named file-group-reviewer droid. Update the security review prompt template to match.

factory-nizar and others added 2 commits April 14, 2026 10:38

feat: add PR risk assessment and diff complexity scoring

dc0be10

Introduces a new risk assessment system that classifies PRs by risk level based on diff statistics, sensitive file changes, and change volume. Automatically adjusts review depth based on computed risk scores.

factory-droid bot reviewed Apr 15, 2026

View reviewed changes

factory-nizar added 2 commits April 15, 2026 14:53

feat: add diff stats logging and risk threshold helpers

3bd45bb

refactor: remove file-group-reviewer references from CI prompt template

6f883cc

The review skill now uses generic worker subagents with inlined methodology instead of a named file-group-reviewer droid. Update the security review prompt template to match.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add PR risk assessment and diff complexity scoring#75

feat: add PR risk assessment and diff complexity scoring#75
factory-nizar wants to merge 4 commits intodevfrom
feat/pr-risk-assessment

factory-nizar commented Apr 15, 2026

Uh oh!

factory-droid bot commented Apr 15, 2026 •

edited

Loading

Uh oh!

factory-droid bot Apr 15, 2026

Uh oh!

factory-droid bot Apr 15, 2026

Uh oh!

factory-droid bot Apr 15, 2026

Uh oh!

factory-droid bot Apr 15, 2026

Uh oh!

factory-droid bot Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

factory-nizar commented Apr 15, 2026

Summary

Changes

How it works

Uh oh!

factory-droid bot commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

factory-droid bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

factory-droid bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

factory-droid bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

factory-droid bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

factory-droid bot Apr 15, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

factory-droid bot commented Apr 15, 2026 •

edited

Loading