Backend: Optimize leaderboard query with two-stage fetch to reduce payload#5087
Backend: Optimize leaderboard query with two-stage fetch to reduce payload#5087akanshajain231999 wants to merge 4 commits into
Conversation
Refactor calculate_distinct_sorted_leaderboard_data to avoid transferring heavy JSON columns (result, error, submission_metadata) for rows that get discarded by per-team deduplication. Stage 1 fetches a lightweight projection used for sort + dedup; stage 2 re-fetches full rows only for the retained IDs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Reformat unique_team_ids set comprehension to satisfy black. - Add "id" key to mocked test_data in five TestReorderSubmissionsComparator cases so stage-2 fetch (filter id__in=...) does not KeyError. - Switch test_multiple_banned_participants_in_team to use _create_mock_leaderboard_chain helper for the stage-2 chain. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5087 +/- ##
==========================================
- Coverage 91.72% 91.69% -0.03%
==========================================
Files 96 96
Lines 8155 8168 +13
==========================================
+ Hits 7480 7490 +10
- Misses 675 678 +3
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 1 file with indirect coverage changes
... and 1 file with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
📝 WalkthroughWalkthroughThis PR refactors leaderboard data fetching from a single annotated queryset into a two-stage pipeline: stage 1 builds a lightweight projection with minimal fields to filter and deduplicate candidates, then stage 2 fetches only full heavy rows for retained IDs. Tests are updated to include ID fields and use a standardized mocking helper for the queryset chain pattern. ChangesLeaderboard Two-Stage Pipeline Optimization
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/unit/jobs/test_utils.py (1)
1228-1245: ⚡ Quick winModel stage-2 queryset separately so tests actually validate
id__infiltering.
_create_mock_leaderboard_chaincurrently returns the same queryset/value payload for both stages, so regressions in stage-2filter(id__in=...)can slip through undetected.Proposed helper tightening
def _create_mock_leaderboard_chain( self, mock_leaderboard_data_objects, test_data ): @@ - mock_qs = MagicMock() - mock_filter_result = MagicMock() - mock_leaderboard_data_objects.filter.return_value = mock_filter_result - mock_filter_result.exclude.return_value = mock_qs - mock_filter_result.annotate.return_value = mock_qs - mock_qs.filter.return_value = mock_qs - mock_qs.order_by.return_value = mock_qs - mock_qs.annotate.return_value = mock_qs - mock_qs.values.return_value = test_data - return mock_qs + stage1_qs = MagicMock() + stage2_filter_qs = MagicMock() + stage2_qs = MagicMock() + captured_stage2_ids = {"value": []} + + def _filter_side_effect(*args, **kwargs): + if "id__in" in kwargs: + captured_stage2_ids["value"] = list(kwargs["id__in"]) + return stage2_filter_qs + return stage1_qs + + mock_leaderboard_data_objects.filter.side_effect = _filter_side_effect + + # Stage 1 chain + stage1_qs.exclude.return_value = stage1_qs + stage1_qs.filter.return_value = stage1_qs + stage1_qs.order_by.return_value = stage1_qs + stage1_qs.annotate.return_value = stage1_qs + stage1_qs.values.return_value = test_data + + # Stage 2 chain + stage2_filter_qs.annotate.return_value = stage2_qs + stage2_qs.values.side_effect = lambda *_, **__: [ + row for row in test_data if row["id"] in set(captured_stage2_ids["value"]) + ] + + return stage1_qs🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/jobs/test_utils.py` around lines 1228 - 1245, The helper _create_mock_leaderboard_chain currently returns the same mock queryset/result for both stage-1 and stage-2, so tests don't validate the stage-2 filter(id__in=...) behavior; change it to return distinct mocks depending on the filter call: keep the existing mock for the first filter chain (mock_filter_result -> mock_qs_stage1 with .exclude(), .annotate(), .order_by(), .values() -> test_data_stage1) and create a separate mock (mock_qs_stage2) for calls that include id__in (mock_leaderboard_data_objects.filter should be configured via side_effect or a small router function that inspects kwargs and returns mock_filter_result for normal filters but mock_qs_stage2 for filter(id__in=...); then wire mock_qs_stage2.annotate().values() to return the stage-2 test_data). Ensure you reference and update mocks named mock_leaderboard_data_objects, mock_filter_result, mock_qs (split into mock_qs_stage1 and mock_qs_stage2) and the .filter side_effect to make stage-2 filtering actually exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/jobs/utils.py`:
- Around line 517-574: The second-stage fetch re-queries LeaderboardData via
LeaderboardData.objects.filter(id__in=retained_ids) which loses the stage-1
predicates (the original leaderboard_data queryset) so rows that became
ineligible between queries can be reintroduced; fix by basing heavy_qs on the
stage-1 queryset (e.g., use leaderboard_data.filter(id__in=retained_ids) or
chain .filter(...) onto the original leaderboard_data) before calling
.annotate(...) and .values(...), ensuring retained_ids and retained_light are
still used but the original visibility/eligibility filters are preserved.
- Around line 432-449: In calculate_distinct_sorted_leaderboard_data change the
stage-1 projection (leaderboard_data_light) to stop including the full "error"
JSON—only include the extracted filtering_error (from RawSQL "error->>%s") and
other lightweight fields—so you don't fetch heavy JSON per row; then ensure
stage-2, where you currently do
LeaderboardData.objects.filter(id__in=retained_ids), re-applies the same stage-1
filters (e.g. submission__is_public, any other visibility/status filters used
when building leaderboard_data) before returning final rows so the final result
set is constrained identically to stage-1 and avoids TOCTOU issues.
---
Nitpick comments:
In `@tests/unit/jobs/test_utils.py`:
- Around line 1228-1245: The helper _create_mock_leaderboard_chain currently
returns the same mock queryset/result for both stage-1 and stage-2, so tests
don't validate the stage-2 filter(id__in=...) behavior; change it to return
distinct mocks depending on the filter call: keep the existing mock for the
first filter chain (mock_filter_result -> mock_qs_stage1 with .exclude(),
.annotate(), .order_by(), .values() -> test_data_stage1) and create a separate
mock (mock_qs_stage2) for calls that include id__in
(mock_leaderboard_data_objects.filter should be configured via side_effect or a
small router function that inspects kwargs and returns mock_filter_result for
normal filters but mock_qs_stage2 for filter(id__in=...); then wire
mock_qs_stage2.annotate().values() to return the stage-2 test_data). Ensure you
reference and update mocks named mock_leaderboard_data_objects,
mock_filter_result, mock_qs (split into mock_qs_stage1 and mock_qs_stage2) and
the .filter side_effect to make stage-2 filtering actually exercised.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 11eda65a-5774-4032-a607-fb8c5e74f12c
📒 Files selected for processing (2)
apps/jobs/utils.pytests/unit/jobs/test_utils.py
| leaderboard_data_light = leaderboard_data.annotate( | ||
| filtering_score=RawSQL( | ||
| "result->>%s", (default_order_by,), output_field=FloatField() | ||
| ), | ||
| filtering_error=RawSQL( | ||
| "error->>%s", | ||
| ("error_{0}".format(default_order_by),), | ||
| output_field=FloatField(), | ||
| ), | ||
| ).values( | ||
| "id", | ||
| "submission__participant_team", | ||
| "submission__participant_team__team_name", | ||
| "submission__is_baseline", | ||
| "error", | ||
| "filtering_score", | ||
| "filtering_error", | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate and print the relevant section in apps/jobs/utils.py
FILE="apps/jobs/utils.py"
echo "== Lines around 420-620 in $FILE =="
sed -n '400,620p' "$FILE" | nl -ba | sed -n '1,240p'
echo
echo "== Extract stage-1 light projection (.annotate/.values) =="
python3 - <<'PY'
import re, itertools
path="apps/jobs/utils.py"
with open(path,'r',encoding='utf-8') as f:
lines=f.readlines()
start=0
for i,l in enumerate(lines):
if "leaderboard_data_light = leaderboard_data.annotate" in l:
start=i-5
break
else:
raise SystemExit("Could not find leaderboard_data_light annotate")
end=start+200
for idx in range(start, min(end,len(lines))):
print(f"{idx+1:4d}: {lines[idx].rstrip()}")
PY
echo
echo "== Check whether stage-2 uses base manager or the filtered leaderboard_data queryset =="
rg -n "LeaderboardData\.objects\.filter|leaderboard_data\.filter\(" "$FILE" || true
rg -n "retained_ids|leaderboard_data_light|leaderboard_data_heavy|leaderboard_data =" "$FILE" || true
echo
echo "== Print the stage-2 heavy fetch block using retained_ids =="
python3 - <<'PY'
import re
path="apps/jobs/utils.py"
with open(path,'r',encoding='utf-8') as f:
s=f.read()
m=re.search(r"retained_ids.*?LeaderboardData\.objects\.filter.*?(\.values|\.annotate|\))", s, flags=re.S)
if not m:
# fallback: print around retained_ids occurrence
lines=s.splitlines()
for i,l in enumerate(lines):
if "retained_ids" in l:
a=max(0,i-40); b=min(len(lines),i+160)
for j in range(a,b):
print(f"{j+1:4d}: {lines[j]}")
break
else:
raise SystemExit("Could not find retained_ids")
else:
# show broader context around match start
start=s[:m.start()].count("\n")
lines=s.splitlines()
a=max(0,start-40); b=min(len(lines), start+220)
for j in range(a,b):
print(f"{j+1:4d}: {lines[j]}")
PY
echo
echo "== Check apps/challenges/models.py for LeaderboardData.error JSONField =="
rg -n "class LeaderboardData|error\s*=\s*models\.JSONField" apps/challenges/models.py
python3 - <<'PY'
import re
path="apps/challenges/models.py"
with open(path,'r',encoding='utf-8') as f:
lines=f.readlines()
for i,l in enumerate(lines):
if re.search(r"error\s*=\s*models\.JSONField", l):
a=max(0,i-10); b=min(len(lines), i+20)
for j in range(a,b):
print(f"{j+1:4d}: {lines[j].rstrip()}")
break
else:
raise SystemExit("Could not find error JSONField")
PYRepository: Cloud-CV/EvalAI
Length of output: 169
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# run the scriptRepository: Cloud-CV/EvalAI
Length of output: 41
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="apps/jobs/utils.py"
echo "== Excerpt apps/jobs/utils.py:420-620 =="
python3 - <<'PY'
path="apps/jobs/utils.py"
lo,hi=420,620
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
if lo <= i <= hi:
print(f"{i:4d}: {line.rstrip()}")
if i > hi:
break
PY
echo
echo "== Find stage-2 heavy fetch block around retained_ids =="
python3 - <<'PY'
import re
path="apps/jobs/utils.py"
s=open(path,'r',encoding='utf-8').read().splitlines()
# locate first occurrence of retained_ids
idx=None
for i,l in enumerate(s):
if "retained_ids" in l:
idx=i
break
if idx is None:
raise SystemExit("No retained_ids found")
start=max(0,idx-60)
end=min(len(s),idx+220)
for j in range(start,end):
print(f"{j+1:4d}: {s[j]}")
PY
echo
echo "== Quick searches for key query sources =="
rg -n "LeaderboardData\.objects\.filter|leaderboard_data\.filter|leaderboard_data_light|leaderboard_data_heavy|retained_ids" "$FILE" || true
echo
echo "== Check apps/challenges/models.py for LeaderboardData.error JSONField =="
python3 - <<'PY'
import re
path="apps/challenges/models.py"
s=open(path,'r',encoding='utf-8').read().splitlines()
# find line containing "error = models.JSONField"
for i,l in enumerate(s):
if re.search(r"\berror\b\s*=\s*models\.JSONField", l):
start=max(0,i-15); end=min(len(s),i+15)
for j in range(start,end):
print(f"{j+1:4d}: {s[j]}")
break
else:
raise SystemExit("Could not find LeaderboardData.error = models.JSONField")
PYRepository: Cloud-CV/EvalAI
Length of output: 19990
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Excerpt apps/jobs/utils.py:330-460 (filters before stage 1) =="
python3 - <<'PY'
path="apps/jobs/utils.py"
lo,hi=330,460
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
if lo <= i <= hi:
print(f"{i:4d}: {line.rstrip()}")
if i > hi:
break
PY
echo
echo "== Locate LeaderboardData model definition and 'error' field in apps/challenges/models.py =="
python3 - <<'PY'
import re
path="apps/challenges/models.py"
lines=open(path,'r',encoding='utf-8').read().splitlines()
# find class LeaderboardData
cls=None
for i,l in enumerate(lines):
if re.search(r'^class\s+LeaderboardData\b', l):
cls=i
break
if cls is None:
raise SystemExit("class LeaderboardData not found")
start=cls
end=min(len(lines), cls+220)
for i in range(start,end):
if 'error' in lines[i]:
pass
print(f"-- class LeaderboardData starts at line {cls+1} --")
for i in range(start, min(end, start+220)):
print(f"{i+1:4d}: {lines[i].rstrip()}")
PY
echo
echo "== Search for JSONField usage with 'error' near it =="
rg -n "error" apps/challenges/models.py || true
rg -n "JSONField" apps/challenges/models.py || trueRepository: Cloud-CV/EvalAI
Length of output: 15273
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="apps/jobs/utils.py"
echo "== Top of apps/jobs/utils.py (imports) =="
python3 - <<'PY'
path="apps/jobs/utils.py"
with open(path,'r',encoding='utf-8') as f:
for i in range(1,120):
line=f.readline()
if not line: break
print(f"{i:4d}: {line.rstrip()}")
PY
echo
echo "== Search for Coalesce import/usages =="
rg -n "Coalesce" "$FILE" || trueRepository: Cloud-CV/EvalAI
Length of output: 5418
Fix stage-1 projection to not fetch full error JSON and ensure stage-2 reapplies stage-1 filters
calculate_distinct_sorted_leaderboard_datastage 1 includes"error"in.values(...), contradicting the “no heavy JSON columns” goal; this fetches the fullLeaderboardData.errorJSON for every candidate row.- Stage 2 starts from
LeaderboardData.objects.filter(id__in=retained_ids)and does not reapply the stage-1 filters (includingsubmission__is_public), so visibility/status constraints are only guaranteed via TOCTOU luck.
Suggested change
@@
leaderboard_data_light = leaderboard_data.annotate(
filtering_score=RawSQL(
"result->>%s", (default_order_by,), output_field=FloatField()
),
filtering_error=RawSQL(
"error->>%s",
("error_{0}".format(default_order_by),),
output_field=FloatField(),
),
).values(
"id",
"submission__participant_team",
"submission__participant_team__team_name",
"submission__is_baseline",
- "error",
"filtering_score",
"filtering_error",
)
@@
for leaderboard_item in leaderboard_data_light:
@@
- if leaderboard_item["error"] is None:
+ if leaderboard_item["filtering_error"] is None:
leaderboard_item["filtering_error"] = 0
if leaderboard_item["filtering_score"] is None:
leaderboard_item["filtering_score"] = 0
@@
# Stage 2: fetch full row data (with heavy JSON columns) only for the
# retained leaderboard rows, then reapply the order from stage 1.
retained_ids = [item["id"] for item in retained_light]
- heavy_qs = LeaderboardData.objects.filter(id__in=retained_ids).annotate(
+ heavy_qs = leaderboard_data.filter(id__in=retained_ids).annotate(
filtering_score=RawSQL(
"result->>%s", (default_order_by,), output_field=FloatField()
),
filtering_error=RawSQL(
"error->>%s",
("error_{0}".format(default_order_by),),
output_field=FloatField(),
),
)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@apps/jobs/utils.py` around lines 432 - 449, In
calculate_distinct_sorted_leaderboard_data change the stage-1 projection
(leaderboard_data_light) to stop including the full "error" JSON—only include
the extracted filtering_error (from RawSQL "error->>%s") and other lightweight
fields—so you don't fetch heavy JSON per row; then ensure stage-2, where you
currently do LeaderboardData.objects.filter(id__in=retained_ids), re-applies the
same stage-1 filters (e.g. submission__is_public, any other visibility/status
filters used when building leaderboard_data) before returning final rows so the
final result set is constrained identically to stage-1 and avoids TOCTOU issues.
| retained_ids = [item["id"] for item in retained_light] | ||
| heavy_qs = LeaderboardData.objects.filter(id__in=retained_ids).annotate( | ||
| filtering_score=RawSQL( | ||
| "result->>%s", (default_order_by,), output_field=FloatField() | ||
| ), | ||
| filtering_error=RawSQL( | ||
| "error->>%s", | ||
| ("error_{0}".format(default_order_by),), | ||
| output_field=FloatField(), | ||
| ), | ||
| ) | ||
| if challenge_phase_split.show_execution_time: | ||
| time_diff_expression = ExpressionWrapper( | ||
| F("submission__completed_at") - F("submission__started_at"), | ||
| output_field=fields.DurationField(), | ||
| ) | ||
| heavy_qs = heavy_qs.annotate( | ||
| submission__execution_time=time_diff_expression, | ||
| ).values( | ||
| "id", | ||
| "submission__participant_team", | ||
| "submission__participant_team__team_name", | ||
| "submission__participant_team__team_url", | ||
| "submission__is_baseline", | ||
| "submission__is_public", | ||
| "challenge_phase_split", | ||
| "result", | ||
| "error", | ||
| "filtering_score", | ||
| "filtering_error", | ||
| "leaderboard__schema", | ||
| "submission__submitted_at", | ||
| "submission__method_name", | ||
| "submission__id", | ||
| "submission__submission_metadata", | ||
| "submission__execution_time", | ||
| "submission__is_verified_by_host", | ||
| ) | ||
| else: | ||
| heavy_qs = heavy_qs.values( | ||
| "id", | ||
| "submission__participant_team", | ||
| "submission__participant_team__team_name", | ||
| "submission__participant_team__team_url", | ||
| "submission__is_baseline", | ||
| "submission__is_public", | ||
| "challenge_phase_split", | ||
| "result", | ||
| "error", | ||
| "filtering_score", | ||
| "filtering_error", | ||
| "leaderboard__schema", | ||
| "submission__submitted_at", | ||
| "submission__method_name", | ||
| "submission__id", | ||
| "submission__submission_metadata", | ||
| "submission__is_verified_by_host", | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate and show the relevant code block
FILE="apps/jobs/utils.py"
python - <<'PY'
import os, itertools
path="apps/jobs/utils.py"
start=480
end=640
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
if start <= i <= end:
print(f"{i:4d}:{line.rstrip()}")
if i > end:
break
PY
# Also capture where the stage-1 queryset is defined (look for likely variable names)
rg -n "leaderboard_data|retained_light|stage.?1|is_disabled|host.*exclude|public" apps/jobs/utils.py || true
# Show definition of any queryset variable used in the snippet (e.g., leaderboard_data)
rg -n "leaderboard_data\s*=" apps/jobs/utils.py || trueRepository: Cloud-CV/EvalAI
Length of output: 9444
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="apps/jobs/utils.py"
python - <<'PY'
path="apps/jobs/utils.py"
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
if 360 <= i <= 470:
print(f"{i:4d}:{line.rstrip()}")
PY
rg -n "only_public_entries|exclude\(|is_disabled|is_public|host|flag|banned|all_banned|participant" apps/jobs/utils.pyRepository: Cloud-CV/EvalAI
Length of output: 9110
Keep stage-2 fetch scoped to the stage-1 filtered leaderboard_data queryset
Stage 1 builds leaderboard_data with is_disabled=False, host exclusion, submission__is_flagged=False, valid submission statuses, challenge_phase_split=..., and (when only_public_entries) submission__is_public=True; stage 2 re-fetches from LeaderboardData.objects by id__in=retained_ids without reapplying those predicates, so DB changes between queries can reintroduce entries that should no longer appear (the code only skips rows that disappeared, not rows whose eligibility/visibility changed).
Suggested change
-retained_ids = [item["id"] for item in retained_light]
-heavy_qs = LeaderboardData.objects.filter(id__in=retained_ids).annotate(
+retained_ids = [item["id"] for item in retained_light]
+heavy_qs = leaderboard_data.filter(id__in=retained_ids).annotate(
filtering_score=RawSQL(
"result->>%s", (default_order_by,), output_field=FloatField()
),🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@apps/jobs/utils.py` around lines 517 - 574, The second-stage fetch re-queries
LeaderboardData via LeaderboardData.objects.filter(id__in=retained_ids) which
loses the stage-1 predicates (the original leaderboard_data queryset) so rows
that became ineligible between queries can be reintroduced; fix by basing
heavy_qs on the stage-1 queryset (e.g., use
leaderboard_data.filter(id__in=retained_ids) or chain .filter(...) onto the
original leaderboard_data) before calling .annotate(...) and .values(...),
ensuring retained_ids and retained_light are still used but the original
visibility/eligibility filters are preserved.
Summary
calculate_distinct_sorted_leaderboard_datainapps/jobs/utils.pyinto a two-stage query to avoid transferring heavy JSON columns (result,error,submission_metadata) for rows discarded by per-team deduplication.submission__execution_timeannotation) only for the retained IDs, preserving stage-1 order.tests/unit/jobs/test_utils.pyto includeidon test rows and extend the mock query chain for the stage-2 fetch.Test plan
pytest tests/unit/jobs/test_utils.pypasses locally🤖 Generated with Claude Code
Summary by CodeRabbit
Refactor
Tests