feat: add in-memory seismogram data cache #220
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #220 +/- ##
==========================================
+ Coverage 92.96% 93.61% +0.64%
==========================================
Files 45 45
Lines 1862 1879 +17
==========================================
+ Hits 1731 1759 +28
+ Misses 131 120 -11
Pull request overview
This PR introduces an in-memory cache for seismogram waveform reads (keyed by datasource path + datatype), ties cache invalidation to active-event switching, and adjusts settings documentation generation to reflect true field defaults (not env-influenced runtime settings).
Changes:
- Add an in-memory waveform cache to `read_seismogram_data`, with explicit cache clear + invalidation on write.
- Update `set_active_event` to no-op (and preserve cache) when re-activating the currently active event by ID, and clear the cache on genuine event switches.
- Generate the settings defaults markdown table from a settings instance that ignores external sources (env / dotenv).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/integration/test_active_event.py | Adds integration coverage asserting cache is preserved on re-activating the active event and cleared when switching events. |
| src/aimbat/io/_base.py | Implements the in-memory seismogram data cache, plus clear/invalidate hooks. |
| src/aimbat/core/_active_event.py | Clears seismogram cache on active-event changes and avoids clearing for same-event re-activation (ID-based comparison). |
| src/aimbat/_config.py | Adjusts settings markdown generation to use true defaults by ignoring external settings sources. |
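The `_config.py` row above describes building the defaults table from declared field defaults rather than from a runtime settings instance. The PR text does not name the settings library, so the sketch below uses standard-library dataclasses as a stand-in; the `Settings` fields, the `DEMO_LOG_LEVEL` variable, and `defaults_table` are all hypothetical.

```python
import os
from dataclasses import MISSING, dataclass, fields


@dataclass
class Settings:
    """Hypothetical settings model; field names are illustrative only."""

    log_level: str = "INFO"
    window_pre: float = -15.0

    @classmethod
    def from_env(cls) -> "Settings":
        """Runtime instance: the environment may override field defaults."""
        return cls(log_level=os.environ.get("DEMO_LOG_LEVEL", cls.log_level))


def defaults_table() -> str:
    """Render a markdown table from declared field defaults only."""
    rows = ["| Name | Default |", "|---|---|"]
    for f in fields(Settings):
        if f.default is not MISSING:
            rows.append(f"| {f.name} | {f.default!r} |")
    return "\n".join(rows)
```

Because `defaults_table` reads the field declarations and never instantiates from the environment, a developer's local overrides cannot leak into generated documentation.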
Comments suppressed due to low confidence (3)
tests/integration/test_active_event.py:6
- Import grouping/order in this test is inconsistent with the pattern used elsewhere in the integration suite (stdlib imports like `uuid` typically come before third-party imports like `pytest`). Consider reordering into standard library → third-party → local imports to keep the file consistent and avoid potential import-order linting if enabled later.
import pytest
import uuid
from unittest.mock import patch
from aimbat.core import set_active_event, set_active_event_by_id, get_active_event
src/aimbat/io/_base.py:140
- The new global in-memory cache will retain waveform arrays for every unique (datasource, datatype) read until either an active-event switch occurs or the specific datasource is written. This can cause significant memory growth in code paths that read seismograms across multiple events without switching the active event (e.g. listing/printing seismograms for all events reads `seismogram.data`), because nothing clears the cache afterwards. Consider scoping the cache to the active event (e.g. include active event id in the key and clear on activation), adding an upper bound/LRU eviction, or providing an opt-out/bypass for bulk "all events" operations.
key = (str(datasource), datatype)
if key not in _cache:
    _cache[key] = data_reader_fn(datasource)
return _cache[key]
src/aimbat/core/_active_event.py:89
- The early-return branch skips setting `event.active = True` / committing when the passed-in event has the same ID as the currently-active DB row. If a caller passes an instance with `active=False` (e.g. a detached/stale object with the same id), this becomes a no-op and leaves the caller’s object in an incorrect state relative to the DB. Consider still synchronising the passed instance (set `event.active = True` and optionally `session.add(event)` / `session.refresh(event)`) before returning, while keeping the cache intact.
with suppress(NoResultFound):
    if event.id == get_active_event(session).id:
        return
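The behaviour suggested in this comment can be sketched without an ORM. The `Event` and `FakeSession` classes below are hypothetical stand-ins for the real model and database session, and `cache_clears` only counts where a cache clear would happen; the sketch shows the control flow, not the project's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """Hypothetical stand-in for the ORM event model."""

    id: int
    active: bool = False


@dataclass
class FakeSession:
    """Hypothetical stand-in for a DB session; tracks the active event id."""

    active_id: Optional[int] = None
    cache_clears: int = 0  # how often the cache would have been cleared


def set_active_event(session: FakeSession, event: Event) -> None:
    """Activate `event`, clearing the cache only on a genuine switch."""
    if session.active_id == event.id:
        # Same event by ID: keep the cache, but bring the caller's
        # possibly stale instance in line with the stored state.
        event.active = True
        return
    session.active_id = event.id
    event.active = True
    session.cache_clears += 1  # genuine switch: cache would be cleared here
```

With this shape, passing a stale instance of the already-active event leaves the cache untouched but still returns the caller an object whose `active` flag matches the stored state.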
bfb8986 to da6c434
Cache seismogram data in memory on read, keyed by source path and datatype. The cache is cleared when the active event changes, so it effectively holds only the active event's data at any given time. Re-activating the already-active event is a no-op (cache preserved). Also fixes set_active_event to compare events by id rather than identity, and generates the settings docs table from true field defaults rather than the env-influenced module-level settings instance.
da6c434 to fbe28f8
📚 Documentation preview 📚: https://aimbat--220.org.readthedocs.build/en/220/