Fix N+1 query performance in system upsert validation#8009
Draft
Fix N+1 query performance in system upsert validation#8009
Conversation
…tion Replace per-label get_resource calls in validate_data_labels with a single batch SELECT ... WHERE fides_key IN (...) query. Collect all labels across declarations in validate_privacy_declarations before validating, reducing the total from D×(1+C+S) queries per system to 3 queries per system. Also skip redundant re-validation in create_system when called from upsert_system which already validated up front. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor
|
The latest updates on your projects. Learn more about Vercel for GitHub. 2 Skipped Deployments
|
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The bare db.execute() fails when a transaction is already active on the session (e.g. load_samples path). Wrapping in db.begin() creates a savepoint in that case, matching the old get_resource behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
❌ Your project check has failed because the head coverage (84.97%) is below the target coverage (85.00%). You can increase the head coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #8009 +/- ##
=======================================
Coverage 84.96% 84.97%
=======================================
Files 631 631
Lines 41239 41252 +13
Branches 4787 4790 +3
=======================================
+ Hits 35040 35054 +14
+ Misses 5113 5112 -1
Partials 1086 1086 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Fixes an N+1 query performance issue in
POST /api/v1/system/upsertthat caused memory spikes and 13+ second response times on client deployments.The problem
validate_data_labelscalledget_resource()individually for every data_use, data_category, and data_subject label on every privacy declaration. Each call opened its own transaction (async_session.begin()), executed aSELECT *, and loaded the full model row. For a typicalfidesctl pushwith 50 systems averaging 3 declarations each, this generated ~1,800 individual queries in a single request.The fix
validate_data_labels: Replace the per-labelget_resourceloop with a singleSELECT fides_key, active WHERE fides_key IN (...)batch query. Same error messages preserved (unknown vs inactive).validate_privacy_declarations: Collect all labels across all declarations first, then make exactly 3 batch queries (one per label type) instead of3 × Dcalls.upsert_system→create_system: Skip redundant re-validation sinceupsert_systemalready validated everything up front. Directcreate_systemcallers (single-system POST endpoint) still validate.Code Changes
src/fides/api/db/system.py— batch query invalidate_data_labels, aggregated collection invalidate_privacy_declarations,skip_validationparam oncreate_systemSteps to Confirm
POST /api/v1/system/upsertwith multiple systems having privacy declarations — should succeedPOST /api/v1/systemstill validates (noskip_validation)Pre-Merge Checklist