Make model tests more scalable by kmaziarz · Pull Request #154 · microsoft/syntheseus

kmaziarz · 2026-06-09T12:03:01Z

With the upcoming integration of new models from RetroChimera, I found that the way we currently run single-step model tests does not scale, especially once larger models are included. First, we have been running the tests for all models in a single process, so all models are loaded simultaneously; this is bounded by the total memory available on the CI runners. Second, downloading many heavy model checkpoints is slow, and the download time varies a lot between runs. This PR attempts to resolve both issues so that we can scale to more single-step models in the future.

To avoid keeping all models in memory simultaneously, one could try forcing garbage collection, but this does not clean up all state, especially for models that use multiprocessing or call into non-Python-native libraries. This PR instead runs the tests in separate processes, which ensures all state is cleaned up when each process exits.

Previously, test_models.py tested every model twice, to also check that loading one model does not make another model unusable. In the new setup this no longer makes sense, as the models are now completely separate. We do lose a bit of test coverage because we no longer test interactions between models, but doing so exhaustively would not scale anyway; also, in the majority of use cases only one model is used at a time.

Finally, to reduce the burden of downloading model checkpoints, I use the cache action to cache the checkpoint directory (keyed on the contents of default_checkpoint_ids.yml), so that CI runs that do not add new models or change existing ones get a much faster checkpoint download.
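As a rough illustration of the process-isolation idea (not the mechanism this PR actually uses, which is not spelled out in the description), the sketch below runs one check per model in a fresh Python interpreter via the standard library. The model names and the helpers run_isolated and run_model_checks are hypothetical, and the child snippet is only a stand-in for loading and calling a real model.

```python
import subprocess
import sys

# Hypothetical model names, used purely for illustration.
MODEL_NAMES = ["model_a", "model_b"]


def run_isolated(code: str, timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a fresh interpreter.

    All memory and global state held by the child is released when it exits,
    regardless of what native libraries or worker processes it used.
    """
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def run_model_checks(model_names):
    """Run one child process per model and collect the exit codes."""
    results = {}
    for name in model_names:
        # Stand-in for "load the model and call it"; a real test would
        # import and exercise the model inside the child process instead.
        code = f"import sys; print('checked {name}'); sys.exit(0)"
        results[name] = run_isolated(code).returncode
    return results


if __name__ == "__main__":
    print(run_model_checks(MODEL_NAMES))
```

Because each child starts from a clean interpreter, only one model is resident in memory at a time, at the cost of paying interpreter start-up and import time once per model.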

kmaziarz added 3 commits June 5, 2026 13:29

feat(test_models): Merge test_misc into test_call

6ec31f6

feat(tests): Run each model and CLI test in a separate process

604d66e

feat(ci): Cache model checkpoints

f8ee181

kmaziarz requested a review from AustinT June 9, 2026 12:03

jla-gardner approved these changes Jun 9, 2026

kmaziarz requested a review from jla-gardner June 9, 2026 15:34

jla-gardner approved these changes Jun 9, 2026

kmaziarz requested review from fiberleif, jla-gardner and mrwnmsr June 9, 2026 15:39

jla-gardner approved these changes Jun 9, 2026

mrwnmsr approved these changes Jun 9, 2026

kmaziarz merged commit 5e26cdb into main Jun 10, 2026
37 checks passed

kmaziarz deleted the kmaziarz/test-models-in-separate-process branch June 10, 2026 10:01
