test: cut fast-gate runtime ~104s->27s by removing redundant NNP model loads #98
Merged
…l loads

Profiling (pytest --durations) showed ~12 tests accounted for ~93 of the 104s fast-gate runtime, all from loading the real AIMNet2 NNP (~4-10s each). The cache-semantics tests also call ModelFactory.clear_cache(), evicting the shared model so redundant reloads landed on different tests run-to-run (the source of the 32-38s variance).

Fixes, one decision per slow test:

- test_model_caching: fake AIMNet2Adapter via monkeypatch -- caching is generic dict logic; the identity/size assertions stay meaningful (44s -> 0.33s).
- test_model_adapter: the 5 AIMNet2 correctness tests share the session aimnet_model fixture (one load, not five); verified no in-place mutation.
- test_model_factory: the factory return-type tests reuse the session fixture (which survives clear_cache(), unlike a fresh create_model), so exactly one AIMNet2 load happens in the whole gate -- this removes the run-to-run variance.
- test_workflow / test_batchopt: stub create_model in tests that only exercise file-handling, bucketing, or already-faked optimization (run() returns before the model is used, or the optimization is monkeypatched), so they never load the real model.
- test_thermo_helpers: mark the two AIMNET Hessian-load checks slow (a separate ~9-10s model load, the single biggest remaining item; optional thermo path).

The fast gate now performs exactly one real AIMNet2 load (kept for the adapter energy/forces correctness smoke test, which still runs in CI) and is deterministic at ~27s. 629 passed, 8 skipped.
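The test_model_caching decision (fake the adapter, keep the identity/size assertions) can be illustrated with a minimal stand-alone sketch. Everything here is an assumption for illustration: the `ModelFactory` below is a toy dict-backed factory, not the project's class, `RealAdapter`, `FakeAdapter`, and the `adapter_cls` attribute are hypothetical names, and `unittest.mock.patch.object` stands in for pytest's `monkeypatch` fixture so the block runs without pytest.

```python
from unittest import mock


class RealAdapter:
    """Hypothetical stand-in for an expensive adapter (not the project's AIMNet2Adapter)."""

    def __init__(self, name):
        # In a real suite this is where the multi-second model load would happen.
        raise RuntimeError("expensive load; should never run in the fast gate")


class ModelFactory:
    """Toy factory with a dict cache keyed by model name (assumed design)."""

    adapter_cls = RealAdapter
    _cache = {}

    @classmethod
    def create_model(cls, name):
        # Construct the adapter only on a cache miss.
        if name not in cls._cache:
            cls._cache[name] = cls.adapter_cls(name)
        return cls._cache[name]

    @classmethod
    def clear_cache(cls):
        cls._cache.clear()


class FakeAdapter:
    """Cheap fake that counts constructions instead of loading weights."""

    loads = 0

    def __init__(self, name):
        type(self).loads += 1
        self.name = name


def check_cache_semantics():
    """Exercise the cache with the fake adapter and return how many 'loads' occurred."""
    with mock.patch.object(ModelFactory, "adapter_cls", FakeAdapter):
        ModelFactory.clear_cache()
        FakeAdapter.loads = 0

        first = ModelFactory.create_model("aimnet2")
        second = ModelFactory.create_model("aimnet2")
        assert first is second                # identity: cache hit returns the same object
        assert len(ModelFactory._cache) == 1  # size: one entry per model name

        ModelFactory.clear_cache()
        third = ModelFactory.create_model("aimnet2")
        assert third is not first             # eviction forces a fresh construction

        loads = FakeAdapter.loads
        ModelFactory.clear_cache()            # leave no fake behind in the cache
    return loads
```

Calling `check_cache_semantics()` returns 2 (one construction before the eviction, one after), and the real adapter's constructor is never reached. The point of the sketch is that the cache assertions depend only on the dict logic, so a fake adapter leaves them meaningful.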
Speeds up the default test gate (pytest -m "not slow") from ~104s to ~27s (3.85×) and eliminates the run-to-run variance.

Root cause (from pytest --durations)

~12 tests accounted for ~93 of the 104s, all loading the real AIMNet2 NNP (~4–10s each). The cache-semantics tests call ModelFactory.clear_cache(), evicting the shared model so redundant reloads landed on different tests each run (the 32–38s variance).

Changes (one decision per slow test)
- test_model_caching: fake AIMNet2Adapter via monkeypatch; caching is generic dict logic and the identity/size assertions stay meaningful (44s → 0.33s).
- test_model_adapter: the 5 AIMNet2 correctness tests share the session aimnet_model fixture (one load, not five); verified no in-place mutation of the shared adapter.
- test_model_factory: factory return-type tests reuse the session fixture (it survives clear_cache(), unlike a fresh create_model), so exactly one AIMNet2 load happens in the whole gate; this removes the variance.
- test_workflow / test_batchopt: stub create_model in tests that only exercise file-handling, bucketing, or already-faked optimization (run() returns before the model is used, or the optimization is monkeypatched).
- test_thermo_helpers: mark the two AIMNET Hessian-load checks @pytest.mark.slow (a separate ~9–10s model load, the single biggest remaining item; optional thermo path).

Coverage note
The fast gate now does exactly one real AIMNet2 load, kept deliberately for the adapter energy/forces correctness smoke test, so real-NNP numerical coverage still runs in CI (tests.yml runs -m "not slow"). Only the heaviest, lowest-value item (the Hessian load) moved to the slow suite.

Verification
pytest -m "not slow": 629 passed, 8 skipped, 47 deselected; deterministic at 26.4 / 27.0 / 27.3s across three runs.
-m slow.
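Why a session-held model survives clear_cache() while a fresh create_model call does not can be shown with a small stdlib-only sketch. All names are hypothetical: `load_model`, `create_model`, and `clear_cache` are toy module-level functions rather than the project's API, and `functools.lru_cache` plays the role of a session-scoped fixture by keeping a reference outside the factory cache.

```python
import functools

LOADS = {"count": 0}  # counts pretend-expensive loads
_CACHE = {}           # toy factory cache (assumed to be a plain dict)


def load_model(name):
    """Pretend-expensive load; it only bumps a counter."""
    LOADS["count"] += 1
    return {"name": name}


def create_model(name):
    """Return the cached model, loading it on a cache miss."""
    if name not in _CACHE:
        _CACHE[name] = load_model(name)
    return _CACHE[name]


def clear_cache():
    _CACHE.clear()


@functools.lru_cache(maxsize=None)
def shared_model():
    """Stand-in for a session fixture: loaded once, reference held outside _CACHE."""
    return create_model("aimnet2")


def run_gate(use_shared):
    """Simulate three tests, the middle one evicting the factory cache; return the load count."""
    LOADS["count"] = 0
    clear_cache()
    shared_model.cache_clear()
    for step in ("adapter", "cache_semantics", "factory"):
        if step == "cache_semantics":
            clear_cache()  # the cache test evicts the factory's entry
            continue
        model = shared_model() if use_shared else create_model("aimnet2")
        assert model["name"] == "aimnet2"
    return LOADS["count"]
```

In this toy gate, `run_gate(use_shared=False)` returns 2 because the test after the eviction reloads the model, while `run_gate(use_shared=True)` returns 1 because the shared reference is unaffected by the eviction. Which test pays for the reload in the first case depends on ordering, which is consistent with the run-to-run variance described above.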