Skip to content

Fix N+1 query performance in system upsert validation#8009

Draft
erosselli wants to merge 4 commits intomainfrom
erosselli/fix-system-upsert-n-plus-1
Draft

Fix N+1 query performance in system upsert validation#8009
erosselli wants to merge 4 commits intomainfrom
erosselli/fix-system-upsert-n-plus-1

Conversation

@erosselli
Copy link
Copy Markdown
Contributor

Description

Fixes an N+1 query performance issue in POST /api/v1/system/upsert that caused memory spikes and 13+ second response times on client deployments.

The problem

validate_data_labels called get_resource() individually for every data_use, data_category, and data_subject label on every privacy declaration. Each call opened its own transaction (async_session.begin()), executed a SELECT *, and loaded the full model row. For a typical fidesctl push with 50 systems averaging 3 declarations each, this generated ~1,800 individual queries in a single request.

The fix

  • validate_data_labels: Replace the per-label get_resource loop with a single SELECT fides_key, active WHERE fides_key IN (...) batch query. Same error messages preserved (unknown vs inactive).
  • validate_privacy_declarations: Collect all labels across all declarations first, then make exactly 3 batch queries (one per label type) instead of 3 × D calls.
  • upsert_systemcreate_system: Skip redundant re-validation since upsert_system already validated everything up front. Direct create_system callers (single-system POST endpoint) still validate.

Code Changes

  • src/fides/api/db/system.py — batch query in validate_data_labels, aggregated collection in validate_privacy_declarations, skip_validation param on create_system

Steps to Confirm

  1. POST /api/v1/system/upsert with multiple systems having privacy declarations — should succeed
  2. System with unknown data_category returns 400: "Invalid privacy declaration referencing unknown DataCategory ..."
  3. System with inactive data_category returns 400: "Invalid privacy declaration referencing inactive DataCategory ..."
  4. Single system POST /api/v1/system still validates (no skip_validation)

Pre-Merge Checklist

  • All CI checks pass
  • Existing test suite passes (creates, updates, inactive/unknown label validation)
  • Manual verification of error messages unchanged

erosselli and others added 2 commits April 22, 2026 17:00
…tion

Replace per-label get_resource calls in validate_data_labels with a single
batch SELECT ... WHERE fides_key IN (...) query. Collect all labels across
declarations in validate_privacy_declarations before validating, reducing
the total from D×(1+C+S) queries per system to 3 queries per system.
Also skip redundant re-validation in create_system when called from
upsert_system which already validated up front.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown
Contributor

vercel Bot commented Apr 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Actions Updated (UTC)
fides-plus-nightly Ignored Ignored Preview Apr 22, 2026 8:44pm
fides-privacy-center Ignored Ignored Apr 22, 2026 8:44pm

Request Review

erosselli and others added 2 commits April 22, 2026 17:15
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The bare db.execute() fails when a transaction is already active on the
session (e.g. load_samples path). Wrapping in db.begin() creates a
savepoint in that case, matching the old get_resource behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 91.30435% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.97%. Comparing base (0272c6a) to head (824b5f9).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/fides/api/db/system.py 91.30% 0 Missing and 2 partials ⚠️

❌ Your project check has failed because the head coverage (84.97%) is below the target coverage (85.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8009   +/-   ##
=======================================
  Coverage   84.96%   84.97%           
=======================================
  Files         631      631           
  Lines       41239    41252   +13     
  Branches     4787     4790    +3     
=======================================
+ Hits        35040    35054   +14     
+ Misses       5113     5112    -1     
  Partials     1086     1086           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant