
feat(worker): Optimize flake processing by filtering relevant testruns#831

Open
sentry[bot] wants to merge 1 commit into main from seer/feat/optimize-flake-processing

Conversation

sentry bot (Contributor) commented Apr 16, 2026

Fixes WORKER-Y80. The issue: the fetch_current_flakes and get_testruns queries lacked optimization, causing N+1 database calls during iteration.

  • Introduced FAIL_FILTER constant to consolidate failure outcome conditions.
  • Modified fetch_current_flakes to only retrieve currently active flakes (those without an end_date).
  • Updated get_testruns to filter for test runs that are either failures (failure, flaky_fail, error) or passes of currently active flakes, improving processing efficiency.
  • Passed current flakes to get_testruns to enable targeted filtering.
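The intended filtering behavior from the bullets above can be sketched in plain Python, modeling testruns and flakes as dicts rather than the actual Django querysets. All names here (FAIL_OUTCOMES, the dict shapes) are illustrative assumptions, not the module's real API:

```python
# Consolidated failure outcomes, standing in for the FAIL_FILTER constant.
FAIL_OUTCOMES = {"failure", "flaky_fail", "error"}

def fetch_current_flakes(flakes):
    """Keep only active flakes (no end_date), keyed by test_id."""
    return {f["test_id"]: f for f in flakes if f["end_date"] is None}

def get_testruns(testruns, curr_flakes):
    """Keep failures, plus passes belonging to currently active flakes."""
    return [
        t for t in testruns
        if t["outcome"] in FAIL_OUTCOMES
        or (t["outcome"] == "pass" and t["test_id"] in curr_flakes)
    ]

flakes = [
    {"test_id": "t1", "end_date": None},          # active flake
    {"test_id": "t2", "end_date": "2026-04-01"},  # resolved, excluded
]
runs = [
    {"test_id": "t1", "outcome": "pass"},     # kept: pass of an active flake
    {"test_id": "t2", "outcome": "pass"},     # dropped: t2 is resolved
    {"test_id": "t3", "outcome": "failure"},  # kept: failures always match
]
curr = fetch_current_flakes(flakes)
print([t["test_id"] for t in get_testruns(runs, curr)])  # ['t1', 't3']
```

The net effect is that passes of non-flaky tests, which cannot change any flake's state, never leave the database.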

This fix was generated by Seer in Sentry, triggered automatically. 👁️ Run ID: 13404484

Not quite right? Continue debugging with Seer.

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.


Note

Medium Risk
Although this is a performance-oriented change, it alters the DB filtering criteria for which testruns are processed; an incorrect filter could cause flakes to be missed or not expired as expected.

Overview
Optimizes test-analytics flake processing to reduce unnecessary DB work by only loading active flakes and only scanning testruns that can affect flake state.

fetch_current_flakes now excludes ended flakes (end_date set), and get_testruns adds a consolidated FAIL_FILTER plus a targeted pass filter (passes only for currently-flaky test IDs), with process_single_upload passing curr_flakes into the query.

Reviewed by Cursor Bugbot for commit aba61aa. Bugbot is set up for automated code reviews on this repo. Configure here.

Comment on lines +43 to 44
flaky_pass_filter = Q(outcome="pass") & Q(test_id__in=curr_flakes.keys())


Bug: When a new test becomes flaky within a single upload (e.g., a pass then a failure), the initial pass is not counted because the database query for passes is evaluated before the new flake is created.
Severity: MEDIUM

Suggested Fix

Instead of pre-filtering 'pass' testruns at the database level based on curr_flakes, fetch all 'pass' testruns for the upload. Rely on the existing in-loop check if test_id not in curr_flakes: continue to skip passes for tests that are not and do not become flaky during the upload processing. This ensures that if a new flake is created, its preceding passes from the same upload are correctly processed.
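The suggested fix can be sketched on plain Python data (illustrative names; the real code works on Django querysets): passes are fetched unfiltered, and the existing in-loop membership check does the skipping, so a flake created by a failure earlier in the same upload still accumulates later passes:

```python
FAIL_OUTCOMES = {"failure", "flaky_fail", "error"}

def process_upload(testruns, curr_flakes):
    # testruns include ALL passes for the upload, not just passes of
    # already-known flakes.
    for run in testruns:
        if run["outcome"] in FAIL_OUTCOMES:
            # A failure may create a brand-new flake mid-upload.
            flake = curr_flakes.setdefault(
                run["test_id"], {"count": 0, "recent_passes_count": 0}
            )
            flake["count"] += 1
        elif run["outcome"] == "pass":
            if run["test_id"] not in curr_flakes:
                continue  # in-loop check filters irrelevant passes
            curr_flakes[run["test_id"]]["recent_passes_count"] += 1
    return curr_flakes

# Fail-then-pass retry within one upload: the pass is still counted,
# because the flake exists in curr_flakes by the time the pass arrives.
state = process_upload(
    [{"test_id": "t9", "outcome": "failure"},
     {"test_id": "t9", "outcome": "pass"}],
    {},
)
print(state["t9"])  # {'count': 1, 'recent_passes_count': 1}
```

The trade-off is that more pass rows are fetched than under the PR's pre-filter, in exchange for correctness when flake state changes during iteration.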

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: apps/worker/services/test_analytics/ta_process_flakes.py#L43-L44

Potential issue: When processing a single upload, the database query to fetch 'pass'
testruns is built using the `curr_flakes` dictionary at the start of the operation. If
an upload contains a 'pass' followed by a 'failure' for a test that is not yet tracked
as flaky, the 'failure' will create a new flake entry. However, the preceding 'pass'
testrun was already omitted from the initial database query because its `test_id` was
not in `curr_flakes` at that time. As a result, the pass is never processed, and the
`recent_passes_count` for the newly created flake is not incremented, which can
interfere with the flake expiry logic.

Did we get this right? 👍 / 👎 to inform future reviews.

sentry bot commented Apr 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.25%. Comparing base (9eed0bb) to head (aba61aa).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #831      +/-   ##
==========================================
- Coverage   92.25%   92.25%   -0.01%     
==========================================
  Files        1306     1306              
  Lines       48012    48014       +2     
  Branches     1636     1636              
==========================================
  Hits        44294    44294              
- Misses       3407     3409       +2     
  Partials      311      311              
Flag               Coverage           Δ
workerintegration  58.54% <50.00%>    (-0.01%) ⬇️
workerunit         90.38% <100.00%>   (-0.02%) ⬇️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@codecov-notifications

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!


cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.


- Q(timestamp__gte=timezone.now() - timedelta(days=1)) & upload_filter
+ Q(timestamp__gte=timezone.now() - timedelta(days=1))
+ & upload_filter
+ & (FAIL_FILTER | flaky_pass_filter)

Query filter misses passes for newly created flakes

Medium Severity

The flaky_pass_filter in get_testruns captures curr_flakes.keys() at query build time, so pass testruns for tests that aren't already active flakes are excluded from the query. When handle_failure creates a new flake during iteration and adds it to curr_flakes, any subsequent pass testruns for that same test in the same upload were never fetched, so they're silently dropped. This causes recent_passes_count and count to be undercounted for newly created flakes — a behavioral regression from the old unfiltered get_testruns. This is common in CI systems with test retries, where a test fails and then passes within the same upload.

Additional Locations (1)


  return {
-     bytes(flake.test_id): flake for flake in Flake.objects.filter(repoid=repo_id)
+     bytes(flake.test_id): flake
+     for flake in Flake.objects.filter(repoid=repo_id, end_date__isnull=True)

Re-fetch query returns ambiguous results for resolved flakes

Medium Severity

The fetch_current_flakes change to filter by end_date__isnull=True affects detect_flakes.py, which also imports this function. Previously, a test with a resolved flake would find it in curr_flakes and update it in-place; no new flake was created. Now, handle_failure creates a new Flake for tests with resolved flakes. The subsequent re-fetch query (Flake.objects.filter(repoid=repo_id, test_id__in=new_test_ids)) then returns both the old resolved flake and the new active one. Without an order_by or end_date__isnull=True filter, curr_flakes non-deterministically ends up referencing either flake, potentially corrupting counts on the wrong record.
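One plausible shape of a fix, sketched on plain data under the assumption that the re-fetch should mirror the active-only filter (all names here are illustrative, not the module's actual API):

```python
def refetch_flakes(all_flakes, repo_id, new_test_ids):
    # Mirroring end_date__isnull=True in the re-fetch keeps the result
    # unambiguous: only the newly created active flake matches a test_id,
    # never an older resolved flake for the same test.
    return {
        f["test_id"]: f
        for f in all_flakes
        if f["repoid"] == repo_id
        and f["test_id"] in new_test_ids
        and f["end_date"] is None
    }

db = [
    # old resolved flake and new active flake for the same test
    {"repoid": 1, "test_id": "t5", "end_date": "2026-03-01", "count": 7},
    {"repoid": 1, "test_id": "t5", "end_date": None, "count": 1},
]
result = refetch_flakes(db, 1, {"t5"})
print(result["t5"]["count"])  # 1 -- the active flake, not the resolved one
```

Without such a filter (or a deterministic order_by), which of the two rows wins depends on iteration order, which is exactly the nondeterminism the review describes.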


