Hardening: per-module audit + deferred-gaps remediation by isayev · Pull Request #93 · isayevlab/Auto3D_pkg

isayev · 2026-06-11T22:33:56Z

Summary

Two-phase hardening pass over the Auto3D pipeline. Every change is a bug fix, robustness guard, or non-destructive perf improvement; no functional API changes. Each fix ships a regression test that fails without it.

Phase 1 — per-module audit (8 commits)

Optimizer core: guard FIRE against float32 force-norm underflow (was injecting inf/NaN); report fmax at the final geometry instead of one FIRE step stale.
Orchestration: isolate each optimizer chunk so one bad chunk no longer kills the worker and silently drops all remaining chunks; stop leaking the encoded temp file when a setup phase fails; reject unsupported extensions before encoding.
Config/validation: declare input_format as a real Auto3DOptions field (survives dataclasses.replace()); tolerate ragged .smi rows; accept aimnet registry engine names.
Isomer/embedding: make parallel embedding robust (broaden the per-worker except so RDKit ArgumentError no longer aborts the batch) and deterministic (submission order); guard the serial path against unparseable SMILES.
Stereochemistry: stop SMILES string-surgery from corrupting non-tetrahedral centers (@SP/@TH/@OH/@TB) into invalid SMILES that aborted enumeration; tolerate differing stereo-marker counts.
Logging: route library logging to stderr so auto3d run --json stdout stays clean, parseable JSON.
CLI/tautomer: resolve aimnet2 in models info; reject tauto_k < 1 (silently dropped all tautomers); add a not-found guard to config validate.
torch_config/ASE: make deterministic mode reversible and non-fatal (warn_only=True); guard the post-optimization ASE geometry re-read against None/missing-E_tot records.

Phase 2 — deferred-gaps remediation (8 commits)

Decouple the test suite from the optional torchani extra: lazy import in ANI2xt_no_rep so Auto3D.ASE.thermo (and the pure-Python thermo helpers) import without torchani; pytest.importorskip on the genuinely-ANI tests. The suite now runs clean without the extra (previously 11 failures + 1 collection abort).
TF32 precision: pin via the modern fp32_precision API on torch ≥2.9 (legacy allow_tf32 is deprecated), guarded by hasattr.
O(n²) → O(n) RMSD dedup: strip Hs once per conformer instead of per comparison, in both filtering.py and chemistry.py. Non-destructive — returned conformers keep explicit hydrogens and exact H coordinates (verified independently and asserted in tests, since the MLIP and final geometries need explicit H).
Concurrency: per-file guard on the omega/flipper temp-file sweep in housekeeping.
Docs: clarify ANI2xt float32 energy precision (self-energies cancel across conformers, so ranking is unaffected).

Implementation plan: docs/superpowers/plans/2026-06-11-hardening-gaps-remediation.md.

Test Plan

pytest tests/ -q → 623 passed, 8 skipped (the ANI/torchani tests skip gracefully when the optional ani extra is absent), 0 failed, 0 errors.
ruff check src/Auto3D/ — no new lint introduced by these changes.
With the ani extra installed, the 8 skipped tests run and pass (ANI2xt construction/validation paths).

…ometry batch_opt/fire_optimizer.py: clamp the force-norm denominator in the velocity mix. A conformer can be 'progressing' (v.f > 0) while its float32 force norm underflows to exactly 0 (forces ~1e-24, squares round to zero), so forces / f_norm produced inf/NaN that was selected into the velocity and permanently corrupted that conformer (and could even pass the fmax > tol convergence test as NaN). clamp_min(1e-30) is a no-op for any real force. batch_opt/optimization_engine.py: recompute fmax at the final geometry alongside the already-recomputed energy. The loop measures forces at the pre-step geometry then always takes one more FIRE step, so the reported fmax lagged the reported coordinates by a full step (verified: 6.73 vs 6.58 on a still-moving conformer). Padded-atom slots are zeroed before the reduction, matching the in-loop convergence check. Adds regression tests for both; also clears pre-existing lint in the two touched test files.

…etup failure workflow_workers.py: wrap optim_rank_wrapper's per-chunk loop body in try/except. Previously a single failing chunk (a molecule the optimizer chokes on, a CUDA OOM, an mkdir collision, or an isomer step that produced an empty/missing SDF -> RDKit OSError when ranking opens it) killed the whole optimizer process and silently dropped every chunk still queued behind it, while the parent went on to finalize the partial output as success. Each chunk is now isolated and logged; the worker drains the rest. Also skip ranking with a clear warning when optimizing produced no output, rather than letting RDKit raise on the missing file. workflow.py: move phase-1 setup (validate/job-dir/logging) inside run()'s try so the *_encoded temp file written by encode_ids is cleaned up when a later setup step raises; cleanup now uses is_file() so the Path() default (the cwd) is never unlinked. Validate the input suffix before encoding so an unsupported extension raises FileFormatError instead of a generic ValueError and leaves no orphaned encoded file. Adds regression tests for chunk isolation, pre-encode format rejection, and encoded-file cleanup on setup failure (all three fail without the fix).

… .smi rows, accept aimnet registry names config.py: promote input_format to a declared Auto3DOptions field. It was set as a dynamic attribute via __setitem__, which dataclasses.replace() silently dropped -- a latent AttributeError for any consumer reading it off a replace()'d copy. Now it survives replace()/pickling and is part of the dict-like API. utils/validation.py: - check_smi_format: tolerate ragged rows. The line was unpacked with 'smiles, id = line.split()', which crashed with 'too many values to unpack' on any line with a trailing column even though the chunk loader deliberately reads only the first two columns (usecols=[0,1]). Now take the first two tokens and raise the structured InputValidationError (not a raw ValueError) when a line lacks an ID. Also fix the SMILES count, which counted blank lines. - check_input: add an else-branch raising FileFormatError for an unexpected input_format instead of falling through to an UnboundLocalError. - check_valid_configuration: accept aimnet registry names (aimnet2-2025, aimnet2-nse, ...), which model_factory and the CLI schema already accept; the validator was wrongly rejecting them as unknown engines. Updates two tests that asserted the old buggy contract (extra columns must raise; missing-ID raised ValueError) to the corrected behavior.

…mbed against bad SMILES isomers/parallel_embed.py: - Broaden the per-future except from (ValueError, RuntimeError, KeyError, PicklingError) to Exception. RDKit raises Boost.Python.ArgumentError (a TypeError), which escaped the narrow except, propagated out of the generator, and aborted the whole batch -- silently dropping every not-yet-completed molecule (a nondeterministic set under as_completed). This is a per-molecule boundary, so the wider except cannot mask unrelated bugs. - Iterate futures in submission order instead of as_completed so the emitted molecule order is deterministic and matches the serial path; all futures are already running concurrently, so no parallelism is lost. isomer_engine.py: embed_conformer now returns None on an unparseable SMILES (instead of crashing on AddHs(None)) and _run_serial_embedding skips+warns, mirroring the parallel worker so a single bad SMILES no longer truncates the serial run. Adds regression tests for input-order preservation and the serial None-guard (all fail without the fix). The broadened except is exercised by the existing mixed valid/invalid test; direct subprocess fault injection is omitted as too fragile across mp start methods.

@sp

…ral centers and aborting enumeration utils/stereochemistry.py: - get_stereo_info: exclude multi-letter stereo-class descriptors (@sp, @th, @oh, @tb, @al) from the @ match via a negative lookahead. The old regex treated the '@' of @SP1/@th1 as a bare tetrahedral '@', and create_enantiomer then spliced in a second '@' to make the unparseable token @@sp1 -- which RDKit rejects, silently aborting the molecule's whole stereo enumeration. Such tokens (square-planar/octahedral/etc. metal centers) are now left intact; plain tetrahedral @/@@ inversion is unchanged. - no_enantiomer: skip comparisons between stereo-info lists of different lengths instead of calling no_enantiomer_helper (which raises on a length mismatch). A differing marker count means a different set of stereocenters, so the two cannot be enantiomers; the previous ValueError bubbled up and made amend_configuration abandon the molecule's enumeration. no_enantiomer_helper keeps its strict (and separately tested) contract. Adds regression tests for non-tetrahedral token preservation and the length-mismatch path (both fail without the fix).

logging_config.py attached the root Auto3D logger's StreamHandler to stdout. The workflow emits INFO lines (Output path, energy unit, timing) through child loggers during every run, so 'auto3d run --json' produced stdout that interleaved those log lines with the JSON document and was not parseable. Diagnostics belong on stderr regardless of --json; move the handler there. Adds a regression test asserting the handler targets stderr.

… validate on missing file cli/commands/models.py: 'auto3d models info aimnet2' (the canonical default engine name) reported 'Unknown engine' because ENGINE_INFO keys it as AIMNET; map AIMNET2 -> AIMNET. cli/commands/config.py: 'config validate' on a missing file fell through to 'Check YAML syntax and field values', a misleading hint for a not-found file; add an existence check mirroring 'config show'. tautomer.py: select_tautomers(k=0) silently dropped every tautomer (out_mols0[:0]); reject tauto_k < 1. Also use splitext for the output basename so an input like 'mol.v2.sdf' is not truncated to 'mol' (collision risk). Note: the '@' separator in select_tautomers/count_from_output is the real tautomer-grouping separator (id@tautN), not stale -- left intact (see test_select_tautomers_groups_by_id). Adds regression tests for all three (each fails without the fix).

…ometry re-read torch_config.py: the deterministic flags were only ever set to True, so a process that enabled a reproducible run could never restore fast mode. Set them unconditionally from config.deterministic, and pass warn_only=True to torch.use_deterministic_algorithms so AIMNet2/ANI scatter and masked index-put ops -- which have no deterministic CUDA kernel -- warn instead of raising and aborting the optimization loop the option is meant to make reproducible. ASE/geometry.py: the post-optimization eV->Hartree re-read called mol.GetProp('E_tot') with no guard, so a None record (failed re-parse) or one missing E_tot raised and discarded the entire already-completed optimization. Skip such records. Also use splitext for the output stem so 'batch.v2.sdf' is not truncated to 'batch'. Adds regression tests for both (both fail without the fix).

…ni extra

…O(n))

…ally

…ation

…lowing it The broad per-future 'except Exception' (added to catch RDKit's Boost.Python.ArgumentError) also caught concurrent.futures BrokenProcessPool. If a worker is killed (e.g. OOM on a large embedding) the pool breaks and every remaining future.result() raises BrokenProcessPool, each logged as a per-molecule warning -- silently dropping the whole tail of the batch with no failure signal. Catch BrokenProcessPool separately and re-raise so pool-level failures surface loudly; the broad except still isolates per-molecule errors. Adds a fork-guarded regression test that kills a worker and asserts the pool break propagates (fails without the fix: both molecules swallowed as warnings).

isayev added 17 commits June 11, 2026 14:21

fix: import torchani lazily in ANI2xt so thermo imports without the a…

b55b9dd

…ni extra

test: skip ANI2xt/ANI2x tests when the optional torchani extra is absent

8f436f6

fix: pin TF32 precision via the modern fp32_precision API on torch>=2.9

1705998

perf: strip Hs once per conformer in cluster dedup (O(n^2) -> O(n))

89ce710

perf: strip Hs once per conformer in legacy filter_unique (O(n^2) -> …

c117dea

…O(n))

fix: guard each omega/flipper temp-file move in housekeeping individu…

2c64097

…ally

docs: clarify ANI2xt float32 energy precision and self-energy cancell…

75e5f4a

…ation

style: sort imports and modernize Optional annotation in chemistry.py

d19f6d8

isayev merged commit b6c138a into main Jun 11, 2026
2 of 3 checks passed

isayev deleted the hardening/audit-round2 branch June 11, 2026 22:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hardening: per-module audit + deferred-gaps remediation#93

Hardening: per-module audit + deferred-gaps remediation#93
isayev merged 17 commits into
mainfrom
hardening/audit-round2

isayev commented Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

isayev commented Jun 11, 2026

Summary

Phase 1 — per-module audit (8 commits)

Phase 2 — deferred-gaps remediation (8 commits)

Test Plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant