feat(worker): Optimize flake processing by filtering relevant testruns #831

sentry[bot] wants to merge 1 commit into `main`.
Conversation
```python
flaky_pass_filter = Q(outcome="pass") & Q(test_id__in=curr_flakes.keys())
```
Bug: When a new test becomes flaky within a single upload (e.g., a pass then a failure), the initial pass is not counted because the database query for passes is evaluated before the new flake is created.
Severity: MEDIUM
Suggested Fix
Instead of pre-filtering 'pass' testruns at the database level based on `curr_flakes`, fetch all 'pass' testruns for the upload. Rely on the existing in-loop check `if test_id not in curr_flakes: continue` to skip passes for tests that neither are nor become flaky during upload processing. This ensures that if a new flake is created, its preceding passes from the same upload are correctly processed.
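The suggested in-loop approach can be sketched as a simplified, pure-Python simulation. All names here (`Flake`, `process_upload`, the dict-shaped testruns) are illustrative stand-ins for the real models and helpers in `ta_process_flakes.py`, not the actual implementation:

```python
# Sketch of in-loop pass filtering: fetch every pass for the upload,
# then decide per-run whether the test is (or has just become) flaky.
from dataclasses import dataclass

@dataclass
class Flake:
    recent_passes_count: int = 0
    count: int = 0

FAIL_OUTCOMES = {"failure", "flaky_fail", "error"}

def process_upload(testruns, curr_flakes):
    """Process testruns in order. Passes are filtered inside the loop,
    so passes belonging to a flake created mid-upload are still seen."""
    for run in testruns:
        test_id, outcome = run["test_id"], run["outcome"]
        if outcome in FAIL_OUTCOMES:
            # A failure creates (or updates) a flake entry.
            flake = curr_flakes.setdefault(test_id, Flake())
            flake.count += 1
        elif outcome == "pass":
            if test_id not in curr_flakes:
                continue  # not flaky and never became flaky: skip
            curr_flakes[test_id].recent_passes_count += 1
            curr_flakes[test_id].count += 1
    return curr_flakes

# A test that fails, then passes on retry, within a single upload:
runs = [
    {"test_id": b"t1", "outcome": "failure"},
    {"test_id": b"t1", "outcome": "pass"},
]
flakes = process_upload(runs, {})
print(flakes[b"t1"].recent_passes_count)  # → 1
```

Because the pass check happens per-iteration rather than at query-build time, the retry pass is counted even though the flake did not exist when processing began.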
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: apps/worker/services/test_analytics/ta_process_flakes.py#L43-L44
Potential issue: When processing a single upload, the database query to fetch 'pass'
testruns is built using the `curr_flakes` dictionary at the start of the operation. If
an upload contains a 'pass' followed by a 'failure' for a test that is not yet tracked
as flaky, the 'failure' will create a new flake entry. However, the preceding 'pass'
testrun was already omitted from the initial database query because its `test_id` was
not in `curr_flakes` at that time. As a result, the pass is never processed, and the
`recent_passes_count` for the newly created flake is not incremented, which can
interfere with the flake expiry logic.
Codecov Report ✅ All modified and coverable lines are covered by tests.

```diff
@@            Coverage Diff             @@
##             main     #831      +/-   ##
==========================================
- Coverage   92.25%   92.25%   -0.01%
==========================================
  Files        1306     1306
  Lines       48012    48014       +2
  Branches     1636     1636
==========================================
  Hits        44294    44294
- Misses       3407     3409       +2
  Partials      311      311
```

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit aba61aa.
```diff
- Q(timestamp__gte=timezone.now() - timedelta(days=1)) & upload_filter
+ Q(timestamp__gte=timezone.now() - timedelta(days=1))
+ & upload_filter
+ & (FAIL_FILTER | flaky_pass_filter)
```
Query filter misses passes for newly created flakes
Medium Severity
The `flaky_pass_filter` in `get_testruns` captures `curr_flakes.keys()` at query build time, so pass testruns for tests that aren't already active flakes are excluded from the query. When `handle_failure` creates a new flake during iteration and adds it to `curr_flakes`, any subsequent pass testruns for that same test in the same upload were never fetched, so they're silently dropped. This causes `recent_passes_count` and `count` to be undercounted for newly created flakes, a behavioral regression from the old unfiltered `get_testruns`. This is common in CI systems with test retries, where a test fails and then passes within the same upload.
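The capture-at-query-time problem can be illustrated with a simplified, pure-Python analogue of the database filter (the real code builds Django `Q` objects; the dict-shaped testruns below are hypothetical):

```python
# The pre-filtered query is effectively a snapshot taken before iteration:
# passes are only fetched for tests already flaky when the query runs.
FAIL_OUTCOMES = {"failure", "flaky_fail", "error"}

curr_flakes = {}  # no active flakes at query-build time
all_runs = [
    {"test_id": b"t1", "outcome": "failure"},  # will create a flake in the loop
    {"test_id": b"t1", "outcome": "pass"},     # retry pass in the same upload
]
# DB-level filter analogue: failures, plus passes only for tests
# that are *already* in curr_flakes when the query executes.
fetched = [
    r for r in all_runs
    if r["outcome"] in FAIL_OUTCOMES
    or (r["outcome"] == "pass" and r["test_id"] in curr_flakes)
]
print(len(fetched))  # → 1 (the retry pass was never fetched)
```

Even though the in-loop logic would later add `t1` to `curr_flakes`, the excluded pass row can never be recovered, since the SQL query has already run.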
Additional Locations (1)
```diff
  return {
-     bytes(flake.test_id): flake for flake in Flake.objects.filter(repoid=repo_id)
+     bytes(flake.test_id): flake
+     for flake in Flake.objects.filter(repoid=repo_id, end_date__isnull=True)
```
Re-fetch query returns ambiguous results for resolved flakes
Medium Severity
The `fetch_current_flakes` change to filter by `end_date__isnull=True` affects `detect_flakes.py`, which also imports this function. Previously, a test with a resolved flake would find it in `curr_flakes` and update it in-place; no new flake was created. Now, `handle_failure` creates a new `Flake` for tests with resolved flakes. The subsequent re-fetch query (`Flake.objects.filter(repoid=repo_id, test_id__in=new_test_ids)`) then returns both the old resolved flake and the new active one. Without an `order_by` or `end_date__isnull=True` filter, `curr_flakes` non-deterministically ends up referencing either flake, potentially corrupting counts on the wrong record.
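A minimal pure-Python sketch of the ambiguity and one way to resolve it. The dict-shaped rows below loosely mirror the `Flake` model; this is an illustration, not the real ORM code:

```python
# Two Flake rows exist for the same test_id after handle_failure runs:
# one resolved (end_date set) and one newly created (active).
flakes_in_db = [
    {"test_id": b"t1", "end_date": "2024-01-01", "count": 7},  # resolved flake
    {"test_id": b"t1", "end_date": None, "count": 1},          # newly created
]
new_test_ids = {b"t1"}

# Re-fetch without an active-only filter: both rows match, and which one
# a keyed lookup keeps is arbitrary (here it is just dict insertion order;
# in SQL the winner depends on unordered result rows).
ambiguous = {f["test_id"]: f for f in flakes_in_db if f["test_id"] in new_test_ids}

# Adding the equivalent of end_date__isnull=True makes it deterministic:
active_only = {
    f["test_id"]: f
    for f in flakes_in_db
    if f["test_id"] in new_test_ids and f["end_date"] is None
}
print(active_only[b"t1"]["count"])  # → 1, the active record
```

With the active-only condition on the re-fetch, subsequent count updates always land on the live flake rather than the resolved one.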


Fixes WORKER-Y80. The issue was that the `fetch_current_flakes` and `get_testruns` queries lack optimization, causing N+1 database calls during iteration. This PR:

- Adds a `FAIL_FILTER` constant to consolidate failure outcome conditions.
- Updates `fetch_current_flakes` to only retrieve currently active flakes (those without an `end_date`).
- Modifies `get_testruns` to filter for test runs that are either failures (`failure`, `flaky_fail`, `error`) or passes of currently active flakes, improving processing efficiency.
- Passes `curr_flakes` to `get_testruns` to enable targeted filtering.

This fix was generated by Seer in Sentry, triggered automatically. 👁️ Run ID: 13404484
Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.
Note
Medium Risk
Changes the DB filtering criteria for which testruns are processed, so an incorrect filter could cause flakes to be missed or not expired as expected despite being a performance-oriented change.
Overview
Optimizes test-analytics flake processing to reduce unnecessary DB work by only loading active flakes and only scanning testruns that can affect flake state.
`fetch_current_flakes` now excludes ended flakes (`end_date` set), and `get_testruns` adds a consolidated `FAIL_FILTER` plus a targeted pass filter (passes only for currently-flaky test IDs), with `process_single_upload` passing `curr_flakes` into the query.