test: eliminate flaky time.sleep() tests with time-machine#94
test: eliminate flaky time.sleep() tests with time-machine#94
Conversation
Adds time-machine (thread-safe C extension) to eliminate CI flakes from time-dependent test assertions. Circuit breaker tests (14 sleeps → instant shifts): - test_circuit_breaker.py, test_circuit_breaker_state_machine.py, test_circuit_breaker_race_conditions.py In-process TTL tests (4 sleeps → instant shifts): - test_file_backend_critical.py, test_head_to_head.py, test_file_backend.py (replaces unittest.mock approach) External-backend TTL tests (8 files, wider margins): - Redis/Memcached tests must sleep (server owns the TTL clock) - Margins widened from TTL+0.5s to TTL+2s
📝 WalkthroughWalkthroughThe pull request replaces real-time sleep-based waits in test suites with deterministic time mocking using the Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
cargo-binstall action installs binaries to a temp PATH that doesn't persist across steps on self-hosted runners. cargo install puts them in $CARGO_HOME/bin which is already on PATH.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/competitive/test_lru_cache_comparison.py (1)
94-114:⚠️ Potential issue | 🟡 MinorTest has no post-sleep assertion — the 3s sleep is wasted and the "TTL support" claim is not actually validated.
After
time.sleep(3)there is no second invocation ofget_data(1)and no assertion comparing call counts or results. The test trailing comment ("Data should be expired (in production with Redis)") is aspirational, not verified. This test neither proves the claim in the docstring nor justifies the real-time sleep that was just widened from 1.2s to 3s — it pure overhead per run.Either assert expiration, or drop the sleep.
🛠️ Proposed fix: actually validate TTL expiration
def test_cachekit_ttl_support(self): """ CLAIM: cachekit has TTL support. Items expire after configured TTL. This is TRUE and is a feature advantage over lru_cache. """ + call_count = 0 `@cache`(ttl=1) # 1 second TTL def get_data(x: int) -> str: + nonlocal call_count + call_count += 1 return f"value_{x}" # First call - cache miss result = get_data(1) assert result == "value_1" + assert call_count == 1 + + # Immediate second call - cache hit + assert get_data(1) == "value_1" + assert call_count == 1 # Wait for TTL to expire # Server-side TTL — must sleep (time-machine can't mock Redis/Memcached clock) time.sleep(3) - # Data should be expired (in production with Redis) - # Note: Exact behavior depends on Redis persistence + # Data should be expired — function must execute again + assert get_data(1) == "value_1" + assert call_count == 2🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/competitive/test_lru_cache_comparison.py` around lines 94 - 114, The test test_cachekit_ttl_support currently sleeps but never verifies expiration; after time.sleep(3) invoke get_data(1) again and assert that the cached entry was recomputed (i.e., TTL expired) by verifying the underlying function ran again—implement this by turning get_data into a closure that increments a local counter or by using a mock/spy to assert it was called twice; also consider reducing the sleep to just over the TTL (e.g., ~1.1s) to avoid unnecessary delay while keeping ttl=1 in the cache(...) decorator.
🧹 Nitpick comments (2)
.github/workflows/security-deep.yml (1)
49-50: Consider pinningcargo-fuzz(and the othercargo installtools in this PR) to a specific version.
cargo install --locked cargo-fuzzwith no--versionwill pick up whatever version is latest on crates.io at job time, so security-tool behavior can drift silently between runs. Thekani-verifierstep at line 29 already demonstrates the pinning pattern (--version 0.67.0). Worth applying the same to:
.github/workflows/security-deep.ymlL50 —cargo-fuzz.github/workflows/fuzz-smoke.ymlL35 —cargo-fuzz.github/workflows/security-fast.ymlL70 —cargo-machete.github/workflows/security-medium.ymlL29 —cargo-geigerPinning also lets cargo's "already installed" fast-path kick in reliably on the persisted
$CARGO_HOME/bin, avoiding occasional from-source rebuilds when upstream cuts a new release.♻️ Example for this file
- name: Install cargo-fuzz - run: cargo install --locked cargo-fuzz + run: cargo install --locked cargo-fuzz --version <pinned-version>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/security-deep.yml around lines 49 - 50, The cargo install invocations should be pinned to fixed versions to avoid drift; update the `cargo install --locked cargo-fuzz`, `cargo install --locked cargo-machete`, and `cargo install --locked cargo-geiger` commands to include a concrete `--version <x.y.z>` flag (follow the same pattern used for `kani-verifier` like `--version 0.67.0`), choosing the appropriate stable version for each tool so the workflow reliably reuses the cached binary and prevents silent upstream changes.tests/integration/test_redis_integration.py (1)
91-98: Margin is TTL+3s here vs. TTL+2s elsewhere — consider normalizing.The PR description states external-backend TTL margins were widened to
TTL+2s, but this one usestime.sleep(4)againstttl=1(TTL+3s). Harmless, but aligning withtime.sleep(3)keeps the suite consistent and shaves ~1s off CI. If3swas chosen intentionally because this path is slower/flakier than other Redis tests, a brief comment would help future maintainers.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_redis_integration.py` around lines 91 - 98, This test currently waits time.sleep(4) before asserting the TTL-expiration behavior for short_lived()/call_count, which is TTL+3s for ttl=1; change the sleep to time.sleep(3) to match the repo-wide TTL+2s margin used elsewhere (or, if the extra second was intentional due to flakiness, replace the hardcoded sleep with time.sleep(3) and add a short comment on why this path needs extra slack). Update the sleep in the test surrounding short_lived() and keep references to short_lived and call_count intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/critical/test_file_backend_critical.py`:
- Around line 64-72: The test test_ttl_enforced in
tests/critical/test_file_backend_critical.py no longer checks that the
"temporary" key expired; restore an assertion that verifies expiry (e.g., assert
backend.get("temporary") is None or assert not backend.exists("temporary"))
immediately after traveller.shift(timedelta(seconds=5)), and remove or update
the stale "file handle bug in FileBackend" workaround/comment (or link a bug if
one actually exists) so the test actually covers TTL enforcement using
backend.get or backend.exists as in the unit backend tests.
---
Outside diff comments:
In `@tests/competitive/test_lru_cache_comparison.py`:
- Around line 94-114: The test test_cachekit_ttl_support currently sleeps but
never verifies expiration; after time.sleep(3) invoke get_data(1) again and
assert that the cached entry was recomputed (i.e., TTL expired) by verifying the
underlying function ran again—implement this by turning get_data into a closure
that increments a local counter or by using a mock/spy to assert it was called
twice; also consider reducing the sleep to just over the TTL (e.g., ~1.1s) to
avoid unnecessary delay while keeping ttl=1 in the cache(...) decorator.
---
Nitpick comments:
In @.github/workflows/security-deep.yml:
- Around line 49-50: The cargo install invocations should be pinned to fixed
versions to avoid drift; update the `cargo install --locked cargo-fuzz`, `cargo
install --locked cargo-machete`, and `cargo install --locked cargo-geiger`
commands to include a concrete `--version <x.y.z>` flag (follow the same pattern
used for `kani-verifier` like `--version 0.67.0`), choosing the appropriate
stable version for each tool so the workflow reliably reuses the cached binary
and prevents silent upstream changes.
In `@tests/integration/test_redis_integration.py`:
- Around line 91-98: This test currently waits time.sleep(4) before asserting
the TTL-expiration behavior for short_lived()/call_count, which is TTL+3s for
ttl=1; change the sleep to time.sleep(3) to match the repo-wide TTL+2s margin
used elsewhere (or, if the extra second was intentional due to flakiness,
replace the hardcoded sleep with time.sleep(3) and add a short comment on why
this path needs extra slack). Update the sleep in the test surrounding
short_lived() and keep references to short_lived and call_count intact.
🪄 Autofix (Beta)
✅ Autofix completed
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 52ccbd8e-1fe5-4612-bff2-d176c0e459c1
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (19)
.github/workflows/fuzz-smoke.yml.github/workflows/security-deep.yml.github/workflows/security-fast.yml.github/workflows/security-medium.ymlpyproject.tomltests/competitive/test_head_to_head.pytests/competitive/test_lru_cache_comparison.pytests/critical/test_basic_cache_works.pytests/critical/test_cache_reliability.pytests/critical/test_circuit_breaker_state_machine.pytests/critical/test_encryption_integration.pytests/critical/test_encryption_wrapper.pytests/critical/test_file_backend_critical.pytests/integration/test_memcached_integration.pytests/integration/test_redis_integration.pytests/integration/test_tdd_cache_implementation.pytests/unit/backends/test_file_backend.pytests/unit/test_circuit_breaker.pytests/unit/test_circuit_breaker_race_conditions.py
| # Advance clock past TTL | ||
| traveller.shift(timedelta(seconds=5)) | ||
|
|
||
| # Permanent still exists, temporary is gone | ||
| assert backend.get("permanent") == b"stays" | ||
| # Skip reading expired key directly due to file handle bug in FileBackend | ||
| # Instead verify by setting a new key (proves cleanup didn't affect backend) | ||
| backend.set("new_key", b"new_value") | ||
| assert backend.get("new_key") == b"new_value" |
There was a problem hiding this comment.
test_ttl_enforced no longer actually verifies TTL enforcement.
After shifting the clock past the TTL, the test only asserts that the non-TTL key still exists and that a fresh set/get works — it never asserts that "temporary" expired. That is the sole behavior this test is meant to cover, so this is a coverage regression for the critical path.
The "file handle bug in FileBackend" comment also looks stale: tests/unit/backends/test_file_backend.py::test_get_expired_ttl_deletes_file and test_exists_expired_ttl_deletes_file in this same PR successfully call backend.get(...) is None / backend.exists(...) is False after an identical traveller.shift(...) on the same class. If a real bug exists, please link the tracking issue; otherwise drop the workaround.
🧪 Proposed fix — restore the expiry assertion
# Permanent still exists, temporary is gone
assert backend.get("permanent") == b"stays"
- # Skip reading expired key directly due to file handle bug in FileBackend
- # Instead verify by setting a new key (proves cleanup didn't affect backend)
- backend.set("new_key", b"new_value")
- assert backend.get("new_key") == b"new_value"
+ assert backend.exists("temporary") is False
+ assert backend.get("temporary") is None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/critical/test_file_backend_critical.py` around lines 64 - 72, The test
test_ttl_enforced in tests/critical/test_file_backend_critical.py no longer
checks that the "temporary" key expired; restore an assertion that verifies
expiry (e.g., assert backend.get("temporary") is None or assert not
backend.exists("temporary")) immediately after
traveller.shift(timedelta(seconds=5)), and remove or update the stale "file
handle bug in FileBackend" workaround/comment (or link a bug if one actually
exists) so the test actually covers TTL enforcement using backend.get or
backend.exists as in the unit backend tests.
|
Note Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. Fixes Applied SuccessfullyFixed 1 file(s) based on 1 unresolved review comment. A stacked PR containing fixes has been created.
Time taken: |
Summary
time-machine(thread-safe C extension) as dev dependencytime.sleep()calls with instant clock shifts in circuit breaker + file backend testsWhat changed
Circuit breaker tests (14 sleeps eliminated):
test_circuit_breaker.py— 5 sleeps →traveller.shift()test_circuit_breaker_state_machine.py— 5 sleeps →traveller.shift()test_circuit_breaker_race_conditions.py— 4 sleeps →traveller.shift()In-process TTL tests (4 sleeps eliminated):
test_file_backend_critical.py— 3.5s sleep → instant shifttest_head_to_head.py— 2.5s sleep → instant shifttest_file_backend.py— 2 unittest.mock patches → time-machineExternal-backend TTL tests (8 files, margins widened):
Impact
Test plan
Summary by CodeRabbit
Tests
Chores