test: cut fast-gate runtime ~104s->27s by removing redundant NNP model loads#98

Merged: isayev merged 1 commit into main from test/fast-gate-speedup on Jun 12, 2026

@isayev isayev commented Jun 12, 2026

Speeds up the default test gate (pytest -m "not slow") from ~104s to ~27s (3.85×) and eliminates the run-to-run variance.

Root cause (from pytest --durations)

~12 tests accounted for ~93 of the 104s, all of them loading the real AIMNet2 NNP (~4–10s each). The cache-semantics tests call ModelFactory.clear_cache(), which evicts the shared model, so the redundant reloads landed on different tests each run (the source of the 32–38s run-to-run variance).
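The eviction effect can be illustrated with a toy model. This is only a sketch: `ToyFactory` and `run_tests` are hypothetical stand-ins, not the project's `ModelFactory` or its test suite, and the "load" is a counter rather than a real model load. It shows that the number of expensive loads depends on where a cache-clearing test falls relative to the tests that need the model.

```python
class ToyFactory:
    """Hypothetical stand-in for ModelFactory (not the project's class)."""

    _cache = {}
    loads = 0  # counts simulated expensive model loads

    @classmethod
    def create_model(cls, name):
        if name not in cls._cache:
            cls.loads += 1  # stands in for a multi-second real load
            cls._cache[name] = object()
        return cls._cache[name]

    @classmethod
    def clear_cache(cls):
        cls._cache.clear()


def run_tests(order):
    """Run toy 'tests' in the given order and return how many loads occurred."""
    ToyFactory.clear_cache()
    ToyFactory.loads = 0
    for test in order:
        if test == "uses_model":
            ToyFactory.create_model("aimnet2")
        elif test == "clears_cache":
            ToyFactory.clear_cache()
    return ToyFactory.loads


# Clearing after the last consumer costs nothing extra ...
print(run_tests(["uses_model", "uses_model", "clears_cache"]))
# ... while clearing between two consumers forces a second load.
print(run_tests(["uses_model", "clears_cache", "uses_model"]))
```

In this toy setup the first ordering performs one load and the second performs two, which is the same mechanism, at small scale, as the redundant reloads described above.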

Changes (one decision per slow test)

  • test_model_caching — fake AIMNet2Adapter via monkeypatch; caching is generic dict logic and the identity/size assertions stay meaningful (44s → 0.33s).
  • test_model_adapter — the 5 AIMNet2 correctness tests share the session aimnet_model fixture (one load, not five); verified no in-place mutation of the shared adapter.
  • test_model_factory — factory return-type tests reuse the session fixture (it survives clear_cache(), unlike a fresh create_model), so exactly one AIMNet2 load happens in the whole gate — this removes the variance.
  • test_workflow / test_batchopt — stub create_model in tests that only exercise file-handling, bucketing, or already-faked optimization (run() returns before the model is used, or the optimization is monkeypatched).
  • test_thermo_helpers — mark the two AIMNET Hessian-load checks @pytest.mark.slow (a separate ~9–10s model load, the single biggest remaining item; optional thermo path).
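As a rough illustration of the test_model_caching change, the sketch below swaps an expensive adapter for a fake through pytest's `monkeypatch` fixture and then checks identity and cache size. `Factory`, `RealAdapter`, `FakeAdapter`, and the `adapter_cls` attribute are assumptions made for the example; the real `ModelFactory` and `AIMNet2Adapter` may be wired differently.

```python
class RealAdapter:
    """Placeholder for an expensive adapter (hypothetical)."""

    def __init__(self, name):
        # In the real project this is where the slow model load would happen.
        raise RuntimeError("real model load should not happen in the fast gate")


class FakeAdapter:
    """Cheap fake with the same constructor signature."""

    def __init__(self, name):
        self.name = name


class Factory:
    """Hypothetical dict-backed factory standing in for ModelFactory."""

    adapter_cls = RealAdapter  # assumed patch point for this sketch
    _cache = {}

    @classmethod
    def create_model(cls, name):
        if name not in cls._cache:
            cls._cache[name] = cls.adapter_cls(name)
        return cls._cache[name]

    @classmethod
    def clear_cache(cls):
        cls._cache.clear()


def test_model_caching(monkeypatch):
    # Replace the expensive adapter; pytest undoes this after the test.
    monkeypatch.setattr(Factory, "adapter_cls", FakeAdapter)
    Factory.clear_cache()

    first = Factory.create_model("aimnet2")
    second = Factory.create_model("aimnet2")
    assert first is second            # a cache hit returns the same object
    assert len(Factory._cache) == 1   # exactly one entry was stored

    Factory.clear_cache()
    assert Factory.create_model("aimnet2") is not first  # eviction forces a new object
```

Because the assertions only concern dictionary behaviour (identity, size, eviction), they stay meaningful with a fake in place of the real adapter.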

Coverage note

The fast gate now does exactly one real AIMNet2 load, kept deliberately for the adapter energy/forces correctness smoke test — so real-NNP numerical coverage still runs in CI (tests.yml runs -m "not slow"). Only the heaviest, lowest-value item (the Hessian load) moved to the slow suite.
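The split between the fast gate and the slow suite might be wired up along these lines in a `conftest.py`. This is a sketch under assumptions: the fixture name `aimnet_model` comes from the description above, but `load_expensive_model`, the marker description text, and the two example tests are hypothetical, and the project's actual conftest may differ.

```python
import pytest

LOADS = {"count": 0}  # illustration only: tracks simulated loads


def load_expensive_model():
    """Hypothetical stand-in for the real AIMNet2 load."""
    LOADS["count"] += 1
    return {"name": "aimnet2"}


def pytest_configure(config):
    # Register the marker so `-m "not slow"` and `-m slow` select cleanly.
    config.addinivalue_line(
        "markers", "slow: expensive tests excluded from the fast gate"
    )


@pytest.fixture(scope="session")
def aimnet_model():
    # Evaluated once per session. pytest holds the returned object itself,
    # so it stays alive even if a factory cache is cleared by another test.
    return load_expensive_model()


def test_energy_smoke(aimnet_model):
    # Runs in the fast gate and shares the single session-wide load.
    assert aimnet_model["name"] == "aimnet2"


@pytest.mark.slow
def test_hessian_load():
    # Deselected by `pytest -m "not slow"`; still runs under `pytest -m slow`.
    assert load_expensive_model() is not None
```

With this layout, every fast-gate test that requests `aimnet_model` reuses one object, and only the slow-marked test triggers a separate load.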

Verification

  • pytest -m "not slow": 629 passed, 8 skipped, 47 deselected — deterministic at 26.4 / 27.0 / 27.3s across three runs.
  • Slow-marked Hessian tests still pass under -m slow.

…l loads

Profiling (pytest --durations) showed ~12 tests accounted for ~93 of the 104s
fast-gate runtime, all from loading the real AIMNet2 NNP (~4-10s each). The
cache-semantics tests also call ModelFactory.clear_cache(), evicting the shared
model so redundant reloads landed on different tests run-to-run (the source of
the 32-38s variance).

Fixes, one decision per slow test:
- test_model_caching: fake AIMNet2Adapter via monkeypatch -- caching is generic
  dict logic; the identity/size assertions stay meaningful (44s -> 0.33s).
- test_model_adapter: the 5 AIMNet2 correctness tests share the session
  aimnet_model fixture (one load, not five); verified no in-place mutation.
- test_model_factory: the factory return-type tests reuse the session fixture
  (which survives clear_cache(), unlike a fresh create_model), so exactly one
  AIMNet2 load happens in the whole gate -- this removes the run-to-run variance.
- test_workflow / test_batchopt: stub create_model in tests that only exercise
  file-handling, bucketing, or already-faked optimization (run() returns before
  the model is used, or the optimization is monkeypatched), so they never load
  the real model.
- test_thermo_helpers: mark the two AIMNET Hessian-load checks slow (a separate
  ~9-10s model load, the single biggest remaining item; optional thermo path).

The fast gate now performs exactly one real AIMNet2 load (kept for the adapter
energy/forces correctness smoke test, which still runs in CI) and is
deterministic at ~27s. 629 passed, 8 skipped.
@isayev isayev merged commit c59d4f2 into main Jun 12, 2026
4 of 5 checks passed
@isayev isayev deleted the test/fast-gate-speedup branch June 12, 2026 05:09