Skip to content

fix: address 6 confirmed review issues on PR #514#520

Merged
osilkin98 merged 7 commits into
mainfrom
factory/run-d8c85613
Jun 13, 2026
Merged

fix: address 6 confirmed review issues on PR #514#520
osilkin98 merged 7 commits into
mainfrom
factory/run-d8c85613

Conversation

@colehurwitz

Copy link
Copy Markdown
Collaborator

Closes #519. Addresses review feedback from @osilkin98 on PR #514.

Summary

Fixes all 6 issues confirmed in the code review:

  1. EvalFragment.passed semantic overload → add coverage_pct field
  2. Undisclosed scoring changes → stub new dimensions to return None (behavioral parity with main)
  3. cargo test --workspace → cargo test (restore main behavior)
  4. TypeScript detection regression → NodeEvaluator.name returns typescript unconditionally
  5. NodeEvaluator.name statefulness → resolved by fix 4 (no more self._project_path reference)
  6. Characterization tests updated to reflect all fixes

This PR is a behavior-preserving architecture refactor. New eval dimensions for Go/Rust/Node will be enabled in a follow-up PR with explicit disclosure.

Refs: PR #514, #513

colehurwitz and others added 5 commits June 9, 2026 17:10
…ompat

The multi-language eval refactor required source files in detect() methods,
breaking discovery which only needs marker files (package.json, Cargo.toml,
etc.) to identify project language. Phantom detection prevention is already
handled by each run_*() method returning None when no source files exist.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…or paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- EvalFragment: add coverage_pct field to fix passed semantic overload
- NodeEvaluator.name: return typescript unconditionally (fix detection regression + statefulness)
- Rust: revert cargo test --workspace to cargo test
- Go/Rust/Node: stub new eval dimensions to None (behavioral parity with main)
- Tests: update characterization tests to reflect all fixes

Closes #519

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 12, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 97.76952% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.92%. Comparing base (5985563) to head (e161326).

Files with missing lines Patch % Lines
factory/eval/languages/node.py 95.12% 2 Missing ⚠️
factory/eval/languages/rust.py 94.11% 2 Missing ⚠️
factory/eval/languages/go.py 96.29% 1 Missing ⚠️
factory/eval/languages/python.py 98.18% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #520      +/-   ##
==========================================
+ Coverage   86.77%   87.92%   +1.14%     
==========================================
  Files          64       70       +6     
  Lines       10027    10120      +93     
==========================================
+ Hits         8701     8898     +197     
+ Misses       1326     1222     -104     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

colehurwitz and others added 2 commits June 12, 2026 18:23
Filter out EvalFragment entries where coverage_pct is None before
summing, preventing a TypeError crash if a future evaluator returns
a coverage fragment without setting coverage_pct. Also handle the
empty-list edge case to avoid ZeroDivisionError.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…priority

Swap node and rust registration so detection priority is
python → node/typescript → rust → go, matching the original
_detect_language behavior. Without this, projects with both
Cargo.toml and package.json would incorrectly detect as 'rust'.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@colehurwitz colehurwitz marked this pull request as ready for review June 12, 2026 23:02
@colehurwitz

Copy link
Copy Markdown
Collaborator Author

✅ Factory Review: KEEP

Verdict: KEEP
Reason: Fixes all 6 confirmed review issues on PR #514: EvalFragment semantic overload, undisclosed scoring changes, cargo test --workspace, TypeScript detection regression, NodeEvaluator statefulness, circular tests. Score +0.0054. All checks pass.

Experiment: #20
Hypothesis: Fix all 6 confirmed review issues on PR #514

Score Comparison

Metric Value
Before 0.7356
After 0.7410
Delta +0.0054
Threshold 0.6000

Guard Checks

Check Result
scope ✅ PASS
eval_immutable ✅ PASS

Posted by Factory CEO

@colehurwitz

Copy link
Copy Markdown
Collaborator Author

✅ Factory Review: KEEP

Verdict: KEEP
Reason: All 6 confirmed review issues from PR #514 resolved with behavioral parity to main

Experiment: #20
Hypothesis: Address all 6 confirmed review issues on PR #514: coverage_pct semantic fix, None returns for unsupported dimensions, cargo test revert, stateless NodeEvaluator.name, simplified detect(), registration order fix

Score Comparison

Metric Value
Before 0.7356
After 0.7410
Delta +0.0054
Threshold 0.0100

Guard Checks

Check Result
scope ✅ PASS
eval_immutable ✅ PASS

Posted by Factory CEO

@colehurwitz

Copy link
Copy Markdown
Collaborator Author

@osilkin98 it made a new PR. I filed a bug report about this. But please take a look

@RobotSail

Copy link
Copy Markdown
Contributor

@colehurwitz Thank you for addressing the changes, sorry it's been such a hassle. Let's merge this and fix the other bug.

@osilkin98 osilkin98 merged commit 17e224f into main Jun 13, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: address all 6 confirmed review issues on PR #514

3 participants