fix: address osilkin98 review on PR #445#485
Conversation
- growth.py: Add LANG_CONFIG dict, _detect_project_language helper, multi-language function counting via regex, and fix eval_observability to pass detected language instead of hardcoded "python" - hygiene.py: Add coverage support for Rust (cargo tarpaulin), Go (go test -cover), Node.js (jest --coverage); lint for Go (go vet); type_check for Rust (cargo check) and Go (go build) - profile.py: Remove Python gate on coverage dimension, add per-language coverage commands via _coverage_command helper - introspect.py: Extend _detect_framework for Rust (actix-web, axum, rocket) and Go (gin, echo, fiber); extend _detect_project_evals to scan .sh scripts alongside .py - 41 new tests covering all multi-language eval paths Closes #444 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Check go.mod before go.sum for framework detection to avoid false positives from transitive dependencies - Use 'npx --no-install jest' to prevent auto-installing Jest into target projects that don't use it Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tifacts go build writes compiled binaries into the project directory as a side effect. go vet performs type checking as a prerequisite without producing any output files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Closes #446 - Add LANG_CONFIG["java"] with extensions, skip_dirs, function_regex (public/protected methods excluding constructors), and entry_point_patterns - Add Java detection to growth.py fallback and introspect.py _detect_language - Add framework detection for Spring Boot, Quarkus, Micronaut - Add test/lint/type_check command detection for Maven and Gradle projects - Add Java syntax_check and coverage commands to profile.py - Add ".java" extension mapping to study.py - Add comprehensive test coverage for all Java paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Kotlin DSL Gradle projects use build.gradle.kts instead of build.gradle. Without this check, framework detection silently misses all dependencies in those projects. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use re.findall + sum instead of re.search to correctly count passed/failed tests across all crates in a Rust workspace and all suites in a Node monorepo. Also adds --workspace flag to cargo test and a shutil.which guard for cargo. Closes #447 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Go type_check: use `go build -o /dev/null ./...` instead of `go vet ./...` which was duplicating the lint dimension. `go build` does actual type checking via the compiler, distinct from `go vet` static analysis. - Java syntax_check: infer build tool from test_command (mvn/gradlew/gradle) instead of returning no-op `javac -version` for all Java projects. - Java coverage: infer build tool from test_command to select the correct jacoco command (gradlew/gradle) instead of unconditionally using Maven. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nal tools - Change Java fallback in _syntax_check_command from 'javac -version' (no-op that only prints version) to 'true' for consistency with the unknown-language fallback - Add shutil.which guards in eval_type_check before cargo check (Rust) and go build (Go) - Add shutil.which guards in eval_coverage before cargo tarpaulin (Rust), go test -cover (Go), and npx jest --coverage (Node) - Follows the same pattern used in eval_tests for Rust: log.warning + continue Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rust workspace triple-counting: detect [workspace] in root Cargo.toml and filter out member sub-crates from _find_sub_projects() roots - Java hygiene eval: add _detect_java_project(), _java_build_tool(), and Java branches to eval_tests, eval_lint, eval_type_check, eval_coverage with shutil.which guards matching the cargo/go pattern - Jest over-counting: anchor regex to ^Tests: with re.MULTILINE so "Test Suites: N passed" lines are not double-counted - Add Java build markers (pom.xml, build.gradle, build.gradle.kts) to _find_sub_projects() markers list Closes #448 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove redundant if/else in _java_test_cmd where both branches
returned identical output
- Add shutil.which('go') guard in eval_lint Go section to match
the pattern used in all other language sections
- Replace hardcoded '/dev/null' with os.devnull in eval_type_check
Go section for cross-platform portability
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- eval_type_check: distinguish Maven ('compile') vs Gradle ('compileJava')
to match how eval_lint already handles this
- eval_tests: use re.findall + sum for Java 'Tests run:' lines so
multi-module Maven builds aggregate all modules, not just the first
- Add test for multi-module Maven test result aggregation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 34 new tests across 5 test classes covering all untested code paths in factory/eval/hygiene.py from the multi-language eval work: - TestJavaBuildTool: gradlew priority, gradle, mvn fallback, no-tool - TestJavaTestCmd: cmd delegation and None return - TestEvalLintLanguages: Go/Java pass/fail/no-tool branches - TestEvalTypeCheckLanguages: Rust/Go/Java pass/fail/no-tool branches - TestEvalCoverageLanguages: Rust/Go/Node/Java parse/fallback/no-tool Closes #450 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…growth) Cover 48 uncovered lines in PR #445's diff to raise codecov/patch above 87.08%: - Java _detect_test_command: pom.xml, gradlew, build.gradle, build.gradle.kts - Java _detect_lint_command: pom.xml → checkstyle, no pom → None - Java _detect_type_check_command: mvn/gradlew/gradle compile variants - Top-level .sh eval script discovery: evaluate.sh, benchmark.sh, bench.sh - Java _syntax_check_command: mvn/gradlew/gradle compile mappings, None → 'true' - Java _coverage_command: gradlew/gradle jacocoTestReport - _detect_project_language ImportError fallback for all 6 languages + unknown Closes #451 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaces cargo tarpaulin with cargo llvm-cov as the primary Rust coverage tool (faster, cross-platform). Falls back to tarpaulin if llvm-cov is not installed. Increases timeout to 600s and logs warnings for timeouts and tool failures instead of silently skipping. Closes #452 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs prevented Rust coverage from working on real projects: 1. _run_cmd now adds ~/.cargo/bin to PATH if the directory exists, ensuring cargo-llvm-cov and cargo-tarpaulin are discoverable. 2. Tarpaulin fallback now triggers on any non-zero llvm-cov exit code, not just 'no such command'/'no such subcommand' error strings. This handles failures like missing llvm-tools-preview. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The test asserted '.cargo/bin' was absent from PATH entirely, but on CI runners ~/.cargo/bin is already in the system PATH. Assert on the specific temp directory path instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace continue with else blocks in all hygiene eval functions to avoid skipping remaining language checks in polyglot projects - Add pom.xml existence guard to _java_build_tool mvn branch - Log warning instead of fabricating '1 passed' for unparsed Java test output - Parse JaCoCo XML for actual Java coverage instead of always reporting 0% - Add missing shutil.which guards for go (eval_tests) and cargo (eval_lint) - Add coverage_output_unrecognized warnings for Go and Node coverage - Detect javascript vs typescript based on tsconfig.json presence - Add javascript to LANG_CONFIG in growth.py - Log warning on ImportError in growth.py _detect_project_language - Add log.debug to bare except OSError in workspace dedup - Add polyglot, cargo --workspace, and java no-build-tool tests Closes #467 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
XML attribute order is not guaranteed by the spec. The previous regex required type, missed, covered in exact order. Now we match the LINE counter element first, then extract missed/covered attributes separately. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In multi-package JaCoCo reports, package-level counters appear before the report-level aggregate. Using re.search captured the first (package-level) match instead of the last (report-level) one. Switch to re.findall and take [-1] to get the correct aggregate. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hygiene fixes: - Add Gradle test output regex and log unparsed failures - Preserve both llvm-cov and tarpaulin stderr on Rust coverage fallback - Catch ValueError in JaCoCo XML int() parsing - Guard Path.home() against RuntimeError - Add shutil.which guard for npm/npx in Node test/lint/type_check Introspect fix: - Wrap bare .read_text() calls in _detect_framework with try/except Tests: - Add 6 new tests: polyglot independence, Go not-on-PATH, Java unparsed output, Rust lint missing cargo, shutil.which side_effect routing, Go type_check command assertion - Update Rust coverage test to match concatenated stderr format Closes #470 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…st tests Add 'javascript' to _coverage_command() alongside 'typescript' for Jest coverage. Mock shutil.which in 4 Node/Jest hygiene tests so the npm guard doesn't skip the Node branch on machines without npm installed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ing) Add Rust/Go type check detection, OSError catch in growth eval, remove stale --passWithNoTests from coverage, add log.error to exception blocks, add Go test fallback warning, and add Rust/Go type check tests. Closes #476 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…error handling) - growth.py: guard _find_src_dirs against double-counting src/ subdirs - hygiene.py: credit 1 pass for Java rc==0 with unparsed output - hygiene.py: warn on Rust coverage rc==0 with unparseable output - hygiene.py: broaden Cargo PATH injection catch to KeyError/OSError - hygiene.py: check gradlew executability in _java_build_tool - hygiene.py: split JaCoCo except into OSError/ValueError with distinct logs - introspect.py: use os.devnull instead of /dev/null for Go type check - introspect.py: add Gradle checkstyleMain support for Java lint - introspect.py: add log.debug to _detect_framework file read failures - profile.py: document hygiene.py tarpaulin fallback for Rust coverage - tests: add Gradle test output parsing test - tests: add javascript coverage command test, update lint command tests Closes #478 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Align _detect_lint_command with _java_build_tool in hygiene.py by checking os.access(gradlew, os.X_OK) in addition to exists(). A non-executable gradlew should not be used as the lint command. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…gth, error attribution) - Anchor Rust test regex to 'test result:' lines to avoid false matches - Java unparsed test output (rc=0) no longer credits a pass - Add tool_missing flag to all 4 eval functions for precise neutral messages - Fix all ruff E501 line-length violations across hygiene.py, profile.py, introspect.py - Go eval_tests: add else branch for non-zero rc without FAIL in output - Rust coverage: track cov_tool variable for accurate error messages - profile.py: improve Rust coverage fallback comment - introspect.py: add newline separator in Java file concatenation Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The else branch after the Go `go test` result parsing could theoretically match when rc==0 with no 'ok' lines and no FAIL keyword, producing a misleading "returned non-zero" warning. Change the bare else to an explicit `elif rc != 0` so the warning only fires for actual non-zero exits. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… _safe_read)
- Use tool[-1].endswith("mvn") instead of == "mvn" at all 3 locations in
hygiene.py for robustness against absolute paths
- Standardize all ~34 shutil.which mock patches in test_hygiene.py to use
module-level patching (factory.eval.hygiene.shutil.which)
- Extract _safe_read(path) helper in introspect.py, replacing 7 repetitive
try/except blocks in _detect_framework
- Fix PR description: go build -o /dev/null for type_check, not go vet
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…overage - Add go-chi/chi framework detection in _detect_framework before stdlib fallback - Change Go dep reading from or-based to concat-based so both go.mod and go.sum are searched - Remove --passWithNoTests from TS/JS coverage command to match hygiene.py Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
✅ Factory Review: KEEPVerdict: KEEP Experiment: #18 Score Comparison
Guard Checks
Precheck GateCode Review Notes
Posted by Factory CEO |
✅ Factory Review: KEEPVerdict: KEEP Experiment: #18 Score Comparison
Guard Checks
Posted by Factory CEO |
colehurwitz
left a comment
There was a problem hiding this comment.
position test
colehurwitz
left a comment
There was a problem hiding this comment.
position test
| """Detect project language, reusing introspect logic when available.""" | ||
| try: | ||
| from factory.discovery.introspect import _detect_language | ||
| return _detect_language(project_path) |
colehurwitz
left a comment
There was a problem hiding this comment.
position test
| (tmp_path / "go.mod").write_text("module example.com/x\n") | ||
| which_map = {"cargo": "/usr/bin/cargo", "go": None} | ||
| with ( | ||
| patch("factory.eval.hygiene.shutil.which", side_effect=lambda cmd: which_map.get(cmd)), |
colehurwitz
left a comment
There was a problem hiding this comment.
position test
| with ( | ||
| patch("factory.eval.hygiene.shutil.which", side_effect=lambda cmd: "/usr/bin/go" if cmd == "go" else None), | ||
| patch("factory.eval.hygiene._run_cmd", return_value=(0, "", "")) as mock_run, | ||
| ): |
colehurwitz
left a comment
There was a problem hiding this comment.
position test line-based
| return "./gradlew jacocoTestReport" | ||
| if "gradle" in tc: | ||
| return "gradle jacocoTestReport" | ||
| return "mvn verify -Djacoco.skip=false" |
There was a problem hiding this comment.
test line-based comment
colehurwitz
left a comment
There was a problem hiding this comment.
Review: Multi-language eval support
Solid multi-language expansion with good defensive coding (tool-missing guards, workspace dedup, symlink safety). However, 3 tests are failing and there are a few issues to address before merge.
Failing tests
TestCoverageCommandJavaGradlew::test_mvn_returns_mvn_jacoco— test expects"mvn jacoco:report"but code returns"mvn verify -Djacoco.skip=false". Test and implementation disagree.TestPolyglotShutilWhichSideEffect::test_which_side_effect_routes_correctly—_detect_rust_projectnow requires.rssource files, but the test only createsCargo.toml. Needs a source file fixture.TestGoTypeCheckCommand::test_go_type_check_uses_go_build— same issue:go.modexists but no.gofiles, so_detect_go_projectreturnsFalseand_run_cmdis never called.
What's good
- Extensive test coverage (195 passing across new/modified test files)
- Symlink boundary check prevents path traversal outside project
_safe_readwith proper exception handling- Tool-missing guards with
shutil.whichand structured warnings - Workspace dedup for Rust/Maven/Gradle multi-module projects
- Coverage score clamped to 1.0
See inline comments for:
- Symlink boundary check bug risk
_has_source_filesperformance concern- Maven coverage command inconsistency
- Duplicated detection logic tech debt
- Missing source file fixtures in failing tests
🤖 Generated with Claude Code
| continue | ||
| # Also follow symlinks | ||
| resolved = child.resolve() | ||
| if not str(resolved).startswith(str(project_boundary) + os.sep): |
There was a problem hiding this comment.
Bug risk: String prefix comparison will incorrectly skip the project root itself when resolved == project_boundary (no trailing sep after os.sep is appended). Consider using Path.is_relative_to() (Python 3.9+) which handles edge cases correctly:
if not resolved.is_relative_to(project_boundary):| return roots or [project_path] | ||
|
|
||
|
|
||
| def _has_source_files(project_path: Path, extensions: tuple[str, ...]) -> bool: |
There was a problem hiding this comment.
Perf: rglob("*") scans the entire directory tree just to check if one matching file exists. For large projects this is expensive. Consider early-exit per extension:
def _has_source_files(project_path: Path, extensions: tuple[str, ...]) -> bool:
return any(
next(project_path.rglob(f"*{ext}"), None) is not None
for ext in extensions
)| return "./gradlew jacocoTestReport" | ||
| if "gradle" in tc: | ||
| return "gradle jacocoTestReport" | ||
| return "mvn verify -Djacoco.skip=false" |
There was a problem hiding this comment.
Inconsistency: This returns "mvn verify -Djacoco.skip=false" but eval_coverage in hygiene.py runs [*tool, "verify", "-q", "-Djacoco.skip=false"] (with -q). The test test_mvn_returns_mvn_jacoco expects "mvn jacoco:report" — a third variant. These need alignment. This test is currently failing.
| } | ||
|
|
||
|
|
||
| def _detect_project_language(project_path: Path) -> str: |
There was a problem hiding this comment.
Tech debt: This ~20-line fallback reimplementation of _detect_language from introspect.py is effectively dead code — both modules live in the same package, so ImportError won't fire in production. If someone updates detection logic in introspect.py, this fallback will silently drift. Consider importing unconditionally or at minimum documenting when this path would be hit.
|
|
||
|
|
||
| class TestPolyglotShutilWhichSideEffect: | ||
| def test_which_side_effect_routes_correctly(self, tmp_path): |
There was a problem hiding this comment.
Failing test: _detect_rust_project now requires actual .rs source files (via _has_source_files), but this test only creates Cargo.toml and go.mod — no source files for either language. Add fixtures:
(tmp_path / "src").mkdir()
(tmp_path / "src" / "lib.rs").write_text("pub fn x() {}")
(tmp_path / "main.go").write_text("package main")|
|
||
|
|
||
| class TestGoTypeCheckCommand: | ||
| def test_go_type_check_uses_go_build(self, tmp_path): |
There was a problem hiding this comment.
Failing test: Same issue — _detect_go_project requires .go source files but this test only creates go.mod. Add:
(tmp_path / "main.go").write_text("package main")|
@colehurwitz Did you mean to open a new PR? I see #445 is still open |
|
@oxsilkin yeah, this was something refactory did here. I think the idea behind this PR is good (expanding evals to different languages), but probably we need to redo it with all the context that was learned from both PRs. So point refactory at these prs and distill a new version. |
Closes #484
Changes
🤖 Generated with Claude Code