Hardening: per-module audit + deferred-gaps remediation#93
Merged
Conversation
…ometry batch_opt/fire_optimizer.py: clamp the force-norm denominator in the velocity mix. A conformer can be 'progressing' (v.f > 0) while its float32 force norm underflows to exactly 0 (forces ~1e-24, squares round to zero), so forces / f_norm produced inf/NaN that was selected into the velocity and permanently corrupted that conformer (and could even pass the fmax > tol convergence test as NaN). clamp_min(1e-30) is a no-op for any real force. batch_opt/optimization_engine.py: recompute fmax at the final geometry alongside the already-recomputed energy. The loop measures forces at the pre-step geometry then always takes one more FIRE step, so the reported fmax lagged the reported coordinates by a full step (verified: 6.73 vs 6.58 on a still-moving conformer). Padded-atom slots are zeroed before the reduction, matching the in-loop convergence check. Adds regression tests for both; also clears pre-existing lint in the two touched test files.
…etup failure workflow_workers.py: wrap optim_rank_wrapper's per-chunk loop body in try/except. Previously a single failing chunk (a molecule the optimizer chokes on, a CUDA OOM, an mkdir collision, or an isomer step that produced an empty/missing SDF -> RDKit OSError when ranking opens it) killed the whole optimizer process and silently dropped every chunk still queued behind it, while the parent went on to finalize the partial output as success. Each chunk is now isolated and logged; the worker drains the rest. Also skip ranking with a clear warning when optimizing produced no output, rather than letting RDKit raise on the missing file. workflow.py: move phase-1 setup (validate/job-dir/logging) inside run()'s try so the *_encoded temp file written by encode_ids is cleaned up when a later setup step raises; cleanup now uses is_file() so the Path() default (the cwd) is never unlinked. Validate the input suffix before encoding so an unsupported extension raises FileFormatError instead of a generic ValueError and leaves no orphaned encoded file. Adds regression tests for chunk isolation, pre-encode format rejection, and encoded-file cleanup on setup failure (all three fail without the fix).
… .smi rows, accept aimnet registry names config.py: promote input_format to a declared Auto3DOptions field. It was set as a dynamic attribute via __setitem__, which dataclasses.replace() silently dropped -- a latent AttributeError for any consumer reading it off a replace()'d copy. Now it survives replace()/pickling and is part of the dict-like API. utils/validation.py: - check_smi_format: tolerate ragged rows. The line was unpacked with 'smiles, id = line.split()', which crashed with 'too many values to unpack' on any line with a trailing column even though the chunk loader deliberately reads only the first two columns (usecols=[0,1]). Now take the first two tokens and raise the structured InputValidationError (not a raw ValueError) when a line lacks an ID. Also fix the SMILES count, which counted blank lines. - check_input: add an else-branch raising FileFormatError for an unexpected input_format instead of falling through to an UnboundLocalError. - check_valid_configuration: accept aimnet registry names (aimnet2-2025, aimnet2-nse, ...), which model_factory and the CLI schema already accept; the validator was wrongly rejecting them as unknown engines. Updates two tests that asserted the old buggy contract (extra columns must raise; missing-ID raised ValueError) to the corrected behavior.
…mbed against bad SMILES isomers/parallel_embed.py: - Broaden the per-future except from (ValueError, RuntimeError, KeyError, PicklingError) to Exception. RDKit raises Boost.Python.ArgumentError (a TypeError), which escaped the narrow except, propagated out of the generator, and aborted the whole batch -- silently dropping every not-yet-completed molecule (a nondeterministic set under as_completed). This is a per-molecule boundary, so the wider except cannot mask unrelated bugs. - Iterate futures in submission order instead of as_completed so the emitted molecule order is deterministic and matches the serial path; all futures are already running concurrently, so no parallelism is lost. isomer_engine.py: embed_conformer now returns None on an unparseable SMILES (instead of crashing on AddHs(None)) and _run_serial_embedding skips+warns, mirroring the parallel worker so a single bad SMILES no longer truncates the serial run. Adds regression tests for input-order preservation and the serial None-guard (all fail without the fix). The broadened except is exercised by the existing mixed valid/invalid test; direct subprocess fault injection is omitted as too fragile across mp start methods.
…ral centers and aborting enumeration utils/stereochemistry.py: - get_stereo_info: exclude multi-letter stereo-class descriptors (@sp, @th, @oh, @tb, @al) from the @ match via a negative lookahead. The old regex treated the '@' of @SP1/@th1 as a bare tetrahedral '@', and create_enantiomer then spliced in a second '@' to make the unparseable token @@sp1 -- which RDKit rejects, silently aborting the molecule's whole stereo enumeration. Such tokens (square-planar/octahedral/etc. metal centers) are now left intact; plain tetrahedral @/@@ inversion is unchanged. - no_enantiomer: skip comparisons between stereo-info lists of different lengths instead of calling no_enantiomer_helper (which raises on a length mismatch). A differing marker count means a different set of stereocenters, so the two cannot be enantiomers; the previous ValueError bubbled up and made amend_configuration abandon the molecule's enumeration. no_enantiomer_helper keeps its strict (and separately tested) contract. Adds regression tests for non-tetrahedral token preservation and the length-mismatch path (both fail without the fix).
logging_config.py attached the root Auto3D logger's StreamHandler to stdout. The workflow emits INFO lines (Output path, energy unit, timing) through child loggers during every run, so 'auto3d run --json' produced stdout that interleaved those log lines with the JSON document and was not parseable. Diagnostics belong on stderr regardless of --json; move the handler there. Adds a regression test asserting the handler targets stderr.
… validate on missing file cli/commands/models.py: 'auto3d models info aimnet2' (the canonical default engine name) reported 'Unknown engine' because ENGINE_INFO keys it as AIMNET; map AIMNET2 -> AIMNET. cli/commands/config.py: 'config validate' on a missing file fell through to 'Check YAML syntax and field values', a misleading hint for a not-found file; add an existence check mirroring 'config show'. tautomer.py: select_tautomers(k=0) silently dropped every tautomer (out_mols0[:0]); reject tauto_k < 1. Also use splitext for the output basename so an input like 'mol.v2.sdf' is not truncated to 'mol' (collision risk). Note: the '@' separator in select_tautomers/count_from_output is the real tautomer-grouping separator (id@tautN), not stale -- left intact (see test_select_tautomers_groups_by_id). Adds regression tests for all three (each fails without the fix).
…ometry re-read
torch_config.py: the deterministic flags were only ever set to True, so a
process that enabled a reproducible run could never restore fast mode. Set
them unconditionally from config.deterministic, and pass warn_only=True to
torch.use_deterministic_algorithms so AIMNet2/ANI scatter and masked
index-put ops -- which have no deterministic CUDA kernel -- warn instead of
raising and aborting the optimization loop the option is meant to make
reproducible.
ASE/geometry.py: the post-optimization eV->Hartree re-read called
mol.GetProp('E_tot') with no guard, so a None record (failed re-parse) or one
missing E_tot raised and discarded the entire already-completed optimization.
Skip such records. Also use splitext for the output stem so 'batch.v2.sdf' is
not truncated to 'batch'.
Adds regression tests for both (both fail without the fix).
…lowing it The broad per-future 'except Exception' (added to catch RDKit's Boost.Python.ArgumentError) also caught concurrent.futures BrokenProcessPool. If a worker is killed (e.g. OOM on a large embedding) the pool breaks and every remaining future.result() raises BrokenProcessPool, each logged as a per-molecule warning -- silently dropping the whole tail of the batch with no failure signal. Catch BrokenProcessPool separately and re-raise so pool-level failures surface loudly; the broad except still isolates per-molecule errors. Adds a fork-guarded regression test that kills a worker and asserts the pool break propagates (fails without the fix: both molecules swallowed as warnings).
This was referenced Jun 12, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two-phase hardening pass over the Auto3D pipeline. Every change is a bug fix, robustness guard, or non-destructive perf improvement; no functional API changes. Each fix ships a regression test that fails without it.
Phase 1 — per-module audit (8 commits)
fmaxat the final geometry instead of one FIRE step stale.input_formatas a realAuto3DOptionsfield (survivesdataclasses.replace()); tolerate ragged.smirows; accept aimnet registry engine names.ArgumentErrorno longer aborts the batch) and deterministic (submission order); guard the serial path against unparseable SMILES.@SP/@TH/@OH/@TB) into invalid SMILES that aborted enumeration; tolerate differing stereo-marker counts.auto3d run --jsonstdout stays clean, parseable JSON.aimnet2inmodels info; rejecttauto_k < 1(silently dropped all tautomers); add a not-found guard toconfig validate.warn_only=True); guard the post-optimization ASE geometry re-read againstNone/missing-E_totrecords.Phase 2 — deferred-gaps remediation (8 commits)
torchaniextra: lazy import inANI2xt_no_repsoAuto3D.ASE.thermo(and the pure-Python thermo helpers) import without torchani;pytest.importorskipon the genuinely-ANI tests. The suite now runs clean without the extra (previously 11 failures + 1 collection abort).fp32_precisionAPI on torch ≥2.9 (legacyallow_tf32is deprecated), guarded byhasattr.filtering.pyandchemistry.py. Non-destructive — returned conformers keep explicit hydrogens and exact H coordinates (verified independently and asserted in tests, since the MLIP and final geometries need explicit H).housekeeping.Implementation plan:
docs/superpowers/plans/2026-06-11-hardening-gaps-remediation.md.Test Plan
pytest tests/ -q→ 623 passed, 8 skipped (the ANI/torchani tests skip gracefully when the optionalaniextra is absent), 0 failed, 0 errors.ruff check src/Auto3D/— no new lint introduced by these changes.aniextra installed, the 8 skipped tests run and pass (ANI2xt construction/validation paths).