perf(worker): Bulk update ended flakes in test analytics #812
sentry[bot] wants to merge 1 commit into main from
Conversation
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit a165ae1. Configure here.
if ended_flakes:
    Flake.objects.bulk_update(
        ended_flakes, ["end_date", "count", "recent_passes_count"]
    )
Missing fail_count in ended flakes bulk update
High Severity
The bulk_update for ended_flakes updates ["end_date", "count", "recent_passes_count"] but omits "fail_count". A flake can have its fail_count incremented by handle_failure (which also resets recent_passes_count to 0) and then later end via 30 consecutive passes in handle_pass within the same commit processing. The original .save() call persisted all fields including fail_count, so this is a data-loss regression. The bulk_create for non-ended flakes correctly includes "fail_count" in its update_fields, making this an inconsistency.
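A minimal sketch of the corrected call, assuming the field names quoted in this review; the `persist_ended_flakes` wrapper is illustrative, not code from the PR, and `Flake` stands in for the PR's Django model:

```python
# The field list must cover every attribute mutated by handle_failure and
# handle_pass; "fail_count" is the field the review says was omitted.
ENDED_FLAKE_FIELDS = ["end_date", "count", "recent_passes_count", "fail_count"]

def persist_ended_flakes(Flake, ended_flakes):
    # Flake is passed in only to keep this sketch self-contained; in the PR
    # it would be the imported Django model.
    if ended_flakes:
        Flake.objects.bulk_update(ended_flakes, ENDED_FLAKE_FIELDS)
```

This keeps the ended-flakes update consistent with the `update_fields` list the PR already uses for the non-ended `bulk_create` path.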
if ended_flakes:
    Flake.objects.bulk_update(
        ended_flakes, ["end_date", "count", "recent_passes_count"]
    )
Bulk update crashes on unsaved new flake objects
Medium Severity
handle_failure creates a new Flake instance (never saved, id=None) and puts it in curr_flakes. If this flake later accumulates 30 passes and ends in the same commit processing, it gets popped into ended_flakes. Calling bulk_update on objects with pk=None raises a ValueError. The original .save() handled this correctly by performing an INSERT. The flake is also no longer in curr_flakes, so bulk_create won't pick it up either.
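The crash mode can be shown without a database: Django's `bulk_update` rejects any object whose primary key is unset. The guard below is a hypothetical stand-in that mirrors that precondition; it is not Django code or code from the PR:

```python
from types import SimpleNamespace

def assert_safe_for_bulk_update(objs):
    # Mirrors Django's bulk_update precondition: every object must already
    # have a primary key, i.e. must have been saved at least once.
    if any(o.pk is None for o in objs):
        raise ValueError("All bulk_update() objects must have a primary key set.")

saved = SimpleNamespace(pk=1)
unsaved = SimpleNamespace(pk=None)  # e.g. a Flake created by handle_failure, never saved

assert_safe_for_bulk_update([saved])  # passes
try:
    assert_safe_for_bulk_update([saved, unsaved])
except ValueError as e:
    print(e)  # the failure mode this review describes
```

An unsaved flake popped into `ended_flakes` hits exactly this condition, which is why the original per-object `.save()` (which falls back to an INSERT) did not have the problem.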
if ended_flakes:
    Flake.objects.bulk_update(
        ended_flakes, ["end_date", "count", "recent_passes_count"]
    )
Bug: A new Flake object that expires in the same run is lost because bulk_update silently ignores objects with a null primary key, causing data loss.
Severity: HIGH
Suggested Fix
Separate the ended_flakes list into two groups: one for existing flakes with a database id and another for new flakes where id is None. Use Flake.objects.bulk_update() for the existing flakes and Flake.objects.bulk_create() for the new ones to ensure all are persisted correctly.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: apps/worker/services/test_analytics/ta_process_flakes.py#L143-L146
Potential issue: A newly created `Flake` object, which exists only in memory and has
`id=None`, will be permanently lost if it also expires within the same processing batch.
The logic adds the expired flake to the `ended_flakes` list for a `bulk_update`
operation. However, Django's `bulk_update` requires a primary key to identify which row
to update and silently ignores objects where `id=None`. Since the flake was also removed
from the list of current flakes, it is never persisted to the database, resulting in
data loss.
Codecov Report ❌ Patch coverage is
Additional details and impacted files

@@ Coverage Diff @@
## main #812 +/- ##
=======================================
Coverage 92.25% 92.25%
=======================================
Files 1306 1306
Lines 48004 48011 +7
Branches 1636 1636
=======================================
+ Hits 44286 44293 +7
Misses 3407 3407
Partials 311 311
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.


Fixes WORKER-Y7H. The issue was that:

- `ReportSession` objects were fetched without prefetching the related `report` and `commit` objects, causing N+1 queries during iteration.
- `handle_pass` now returns an ended `Flake` object instead of saving it individually and deleting it from the dictionary.
- `process_single_upload` now collects all `Flake` objects that have met their ending criteria into a list.
- A `bulk_update` operation in `process_flakes_for_commit` efficiently updates all ended flakes in a single database transaction.

This fix was generated by Seer in Sentry, triggered automatically. 👁️ Run ID: 13313490
Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.
Note
Medium Risk
Touches flake lifecycle persistence logic and changes how and when `Flake` rows are written, which could affect correctness if the ended-vs-active set is mishandled; scope is contained to the test analytics worker.

Overview

Reduces per-test DB writes during flake processing by stopping individual `Flake.save()` calls when a flake "ends" and instead collecting ended flakes and persisting them via a single `Flake.objects.bulk_update()` in `process_flakes_for_commit`. `handle_pass` now returns the ended `Flake` (popping it from `curr_flakes`) when the pass threshold is reached, `process_single_upload` aggregates these ended flakes, and the remaining active/new flakes continue to be upserted via `bulk_create(..., update_conflicts=True)`. Bugbot is set up for automated code reviews on this repo.