Fix N+1 query performance in system upsert validation by erosselli · Pull Request #8009 · ethyca/fides

erosselli · 2026-04-22T20:11:07Z

Description

Fixes an N+1 query performance issue in POST /api/v1/system/upsert that caused memory spikes and 13+ second response times on client deployments.

The problem

validate_data_labels called get_resource() individually for every data_use, data_category, and data_subject label on every privacy declaration. Each call opened its own transaction (async_session.begin()), executed a SELECT *, and loaded the full model row. For a typical fidesctl push with 50 systems averaging 3 declarations each, this generated ~1,800 individual queries in a single request.

The fix

validate_data_labels: Replace the per-label get_resource loop with a single SELECT fides_key, active WHERE fides_key IN (...) batch query. Same error messages preserved (unknown vs inactive).
validate_privacy_declarations: Collect all labels across all declarations first, then make exactly 3 batch queries (one per label type) instead of 3 × D calls.
upsert_system → create_system: Skip redundant re-validation since upsert_system already validated everything up front. Direct create_system callers (single-system POST endpoint) still validate.

Code Changes

src/fides/api/db/system.py — batch query in validate_data_labels, aggregated collection in validate_privacy_declarations, skip_validation param on create_system

Steps to Confirm

POST /api/v1/system/upsert with multiple systems having privacy declarations — should succeed
System with unknown data_category returns 400: "Invalid privacy declaration referencing unknown DataCategory ..."
System with inactive data_category returns 400: "Invalid privacy declaration referencing inactive DataCategory ..."
Single system POST /api/v1/system still validates (no skip_validation)

Pre-Merge Checklist

All CI checks pass
Existing test suite passes (creates, updates, inactive/unknown label validation)
Manual verification of error messages unchanged

…tion Replace per-label get_resource calls in validate_data_labels with a single batch SELECT ... WHERE fides_key IN (...) query. Collect all labels across declarations in validate_privacy_declarations before validating, reducing the total from D×(1+C+S) queries per system to 3 queries per system. Also skip redundant re-validation in create_system when called from upsert_system which already validated up front. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

vercel · 2026-04-22T20:11:13Z

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments

Project	Deployment	Actions	Updated (UTC)
fides-plus-nightly	Ignored	Preview	Apr 22, 2026 8:44pm
fides-privacy-center	Ignored		Apr 22, 2026 8:44pm

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The bare db.execute() fails when a transaction is already active on the session (e.g. load_samples path). Wrapping in db.begin() creates a savepoint in that case, matching the old get_resource behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov · 2026-04-22T20:53:54Z

Codecov Report

❌ Patch coverage is 91.30435% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.97%. Comparing base (0272c6a) to head (824b5f9).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
src/fides/api/db/system.py	91.30%	0 Missing and 2 partials ⚠️

❌ Your project check has failed because the head coverage (84.97%) is below the target coverage (85.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #8009   +/-   ##
=======================================
  Coverage   84.96%   84.97%           
=======================================
  Files         631      631           
  Lines       41239    41252   +13     
  Branches     4787     4790    +3     
=======================================
+ Hits        35040    35054   +14     
+ Misses       5113     5112    -1     
  Partials     1086     1086

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

erosselli and others added 2 commits April 22, 2026 17:00

Add changelog entry for system upsert N+1 fix

925e1dc

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

erosselli and others added 2 commits April 22, 2026 17:15

Fix changelog PR number to 8009

fc6087b

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix N+1 query performance in system upsert validation#8009

Fix N+1 query performance in system upsert validation#8009
erosselli wants to merge 4 commits intomainfrom
erosselli/fix-system-upsert-n-plus-1

erosselli commented Apr 22, 2026

Uh oh!

vercel Bot commented Apr 22, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Apr 22, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

erosselli commented Apr 22, 2026

Description

The problem

The fix

Code Changes

Steps to Confirm

Pre-Merge Checklist

Uh oh!

vercel Bot commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov Bot commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

vercel Bot commented Apr 22, 2026 •

edited

Loading

codecov Bot commented Apr 22, 2026 •

edited

Loading