Skip to content

Backend: Optimize leaderboard query with two-stage fetch to reduce payload#5087

Open
akanshajain231999 wants to merge 4 commits into
Cloud-CV:masterfrom
akanshajain231999:sentry_fix4
Open

Backend: Optimize leaderboard query with two-stage fetch to reduce payload#5087
akanshajain231999 wants to merge 4 commits into
Cloud-CV:masterfrom
akanshajain231999:sentry_fix4

Conversation

@akanshajain231999

@akanshajain231999 akanshajain231999 commented May 20, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Refactor calculate_distinct_sorted_leaderboard_data in apps/jobs/utils.py into a two-stage query to avoid transferring heavy JSON columns (result, error, submission_metadata) for rows discarded by per-team deduplication.
  • Stage 1 fetches a lightweight projection (id + dedup/sort fields) and runs the sort + per-team dedup in Python.
  • Stage 2 re-fetches full row data (including JSON columns and the optional submission__execution_time annotation) only for the retained IDs, preserving stage-1 order.
  • Update tests/unit/jobs/test_utils.py to include id on test rows and extend the mock query chain for the stage-2 fetch.

Test plan

  • pytest tests/unit/jobs/test_utils.py passes locally
  • Manually hit the leaderboard endpoint on a populated challenge and confirm row contents (scores, error, metadata, execution time when enabled) are unchanged
  • Confirm Sentry query duration drops on the previously slow challenge phase split

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Refactor

    • Optimized leaderboard data calculation to use a two-stage data retrieval pipeline, reducing data transfer and improving performance.
  • Tests

    • Updated leaderboard test fixtures and mocking patterns to align with the refactored data calculation logic.

Review Change Stack

akanshajain231999 and others added 2 commits May 19, 2026 23:58
Refactor calculate_distinct_sorted_leaderboard_data to avoid transferring
heavy JSON columns (result, error, submission_metadata) for rows that get
discarded by per-team deduplication. Stage 1 fetches a lightweight
projection used for sort + dedup; stage 2 re-fetches full rows only for
the retained IDs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Reformat unique_team_ids set comprehension to satisfy black.
- Add "id" key to mocked test_data in five TestReorderSubmissionsComparator
  cases so stage-2 fetch (filter id__in=...) does not KeyError.
- Switch test_multiple_banned_participants_in_team to use
  _create_mock_leaderboard_chain helper for the stage-2 chain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@codecov

codecov Bot commented May 20, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 88.46154% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.69%. Comparing base (7d19d31) to head (9e33e6c).

Files with missing lines Patch % Lines
apps/jobs/utils.py 88.46% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5087      +/-   ##
==========================================
- Coverage   91.72%   91.69%   -0.03%     
==========================================
  Files          96       96              
  Lines        8155     8168      +13     
==========================================
+ Hits         7480     7490      +10     
- Misses        675      678       +3     
Flag Coverage Δ
backend 95.16% <88.46%> (-0.04%) ⬇️
frontend 87.41% <ø> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
Accounts & Authentication 97.31% <ø> (ø)
Challenges Management 95.52% <ø> (ø)
Job Processing 89.71% <88.46%> (-0.09%) ⬇️
Participants & Teams 100.00% <ø> (ø)
Challenge Hosts 100.00% <ø> (ø)
Analytics 100.00% <ø> (ø)
Web Interface 100.00% <ø> (ø)
Frontend (Gulp) 87.41% <ø> (-0.03%) ⬇️
All Models 97.80% <ø> (ø)
All Views 100.00% <ø> (ø)
All Serializers 98.60% <ø> (ø)
Utility Functions 96.57% <88.46%> (-0.10%) ⬇️
Core Configuration 82.35% <ø> (ø)
Files with missing lines Coverage Δ
apps/jobs/utils.py 93.20% <88.46%> (-0.48%) ⬇️

... and 1 file with indirect coverage changes

Files with missing lines Coverage Δ
apps/jobs/utils.py 93.20% <88.46%> (-0.48%) ⬇️

... and 1 file with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7d19d31...9e33e6c. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@coderabbitai

coderabbitai Bot commented May 26, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

This PR refactors leaderboard data fetching from a single annotated queryset into a two-stage pipeline: stage 1 builds a lightweight projection with minimal fields to filter and deduplicate candidates, then stage 2 fetches only full heavy rows for retained IDs. Tests are updated to include ID fields and use a standardized mocking helper for the queryset chain pattern.

Changes

Leaderboard Two-Stage Pipeline Optimization

Layer / File(s) Summary
Stage 1 lightweight projection and slicing
apps/jobs/utils.py
Annotates filtering_score and filtering_error from JSON fields using RawSQL; projects minimal fields (id, team, baseline, score/error); slices by MAX_LEADERBOARD_QUERY_LIMIT; derives unique_team_ids for bulk prefetch.
Light data iteration and normalization
apps/jobs/utils.py
Iterates over lightweight records; normalizes filtering_error/filtering_score nulls to 0; selects sort order (latest-submission or score/error-based with conditional ascending/descending); builds retained_light accumulator with team-name deduplication while preserving baseline rows.
Stage 2 heavy data fetch and result reconstruction
apps/jobs/utils.py
Fetches full heavy rows only for retained IDs (with optional execution-time annotation); materializes row lookup; reconstructs distinct_sorted_leaderboard_data in stage-1 order; skips rows removed between stages; fills missing filtering_score/filtering_error defaults.
Test fixture synchronization and mocking helper updates
tests/unit/jobs/test_utils.py
Adds explicit id fields to leaderboard test entries across test_bulk_prefetch_avoids_n_plus_one_queries, test_multiple_banned_participants_in_team, test_empty_banned_email_ids_includes_all_teams, test_none_banned_email_ids_includes_all_teams, test_distinct_team_list_with_many_duplicates, and test_baseline_entries_always_included; refactors tests to use shared _create_mock_leaderboard_chain helper; updates helper's docstring and .annotate() mocking to support two-stage queryset chain.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A leaderboard hops in two swift bounds,
Stage one gathers light, stripped of weighty grounds,
Stage two fetches full, for keepers alone,
Dedup and reorder—a pipeline well known!
Tests now include IDs, mocking chains align,
Two stages dance through the queryset line.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main optimization: a two-stage fetch strategy for leaderboard queries designed to reduce payload by avoiding unnecessary data transfer.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@RishabhJain2018

Copy link
Copy Markdown
Member

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 26, 2026

Copy link
Copy Markdown
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/unit/jobs/test_utils.py (1)

1228-1245: ⚡ Quick win

Model stage-2 queryset separately so tests actually validate id__in filtering.

_create_mock_leaderboard_chain currently returns the same queryset/value payload for both stages, so regressions in stage-2 filter(id__in=...) can slip through undetected.

Proposed helper tightening
 def _create_mock_leaderboard_chain(
     self, mock_leaderboard_data_objects, test_data
 ):
@@
-    mock_qs = MagicMock()
-    mock_filter_result = MagicMock()
-    mock_leaderboard_data_objects.filter.return_value = mock_filter_result
-    mock_filter_result.exclude.return_value = mock_qs
-    mock_filter_result.annotate.return_value = mock_qs
-    mock_qs.filter.return_value = mock_qs
-    mock_qs.order_by.return_value = mock_qs
-    mock_qs.annotate.return_value = mock_qs
-    mock_qs.values.return_value = test_data
-    return mock_qs
+    stage1_qs = MagicMock()
+    stage2_filter_qs = MagicMock()
+    stage2_qs = MagicMock()
+    captured_stage2_ids = {"value": []}
+
+    def _filter_side_effect(*args, **kwargs):
+        if "id__in" in kwargs:
+            captured_stage2_ids["value"] = list(kwargs["id__in"])
+            return stage2_filter_qs
+        return stage1_qs
+
+    mock_leaderboard_data_objects.filter.side_effect = _filter_side_effect
+
+    # Stage 1 chain
+    stage1_qs.exclude.return_value = stage1_qs
+    stage1_qs.filter.return_value = stage1_qs
+    stage1_qs.order_by.return_value = stage1_qs
+    stage1_qs.annotate.return_value = stage1_qs
+    stage1_qs.values.return_value = test_data
+
+    # Stage 2 chain
+    stage2_filter_qs.annotate.return_value = stage2_qs
+    stage2_qs.values.side_effect = lambda *_, **__: [
+        row for row in test_data if row["id"] in set(captured_stage2_ids["value"])
+    ]
+
+    return stage1_qs
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/jobs/test_utils.py` around lines 1228 - 1245, The helper
_create_mock_leaderboard_chain currently returns the same mock queryset/result
for both stage-1 and stage-2, so tests don't validate the stage-2
filter(id__in=...) behavior; change it to return distinct mocks depending on the
filter call: keep the existing mock for the first filter chain
(mock_filter_result -> mock_qs_stage1 with .exclude(), .annotate(), .order_by(),
.values() -> test_data_stage1) and create a separate mock (mock_qs_stage2) for
calls that include id__in (mock_leaderboard_data_objects.filter should be
configured via side_effect or a small router function that inspects kwargs and
returns mock_filter_result for normal filters but mock_qs_stage2 for
filter(id__in=...); then wire mock_qs_stage2.annotate().values() to return the
stage-2 test_data). Ensure you reference and update mocks named
mock_leaderboard_data_objects, mock_filter_result, mock_qs (split into
mock_qs_stage1 and mock_qs_stage2) and the .filter side_effect to make stage-2
filtering actually exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/jobs/utils.py`:
- Around line 517-574: The second-stage fetch re-queries LeaderboardData via
LeaderboardData.objects.filter(id__in=retained_ids) which loses the stage-1
predicates (the original leaderboard_data queryset) so rows that became
ineligible between queries can be reintroduced; fix by basing heavy_qs on the
stage-1 queryset (e.g., use leaderboard_data.filter(id__in=retained_ids) or
chain .filter(...) onto the original leaderboard_data) before calling
.annotate(...) and .values(...), ensuring retained_ids and retained_light are
still used but the original visibility/eligibility filters are preserved.
- Around line 432-449: In calculate_distinct_sorted_leaderboard_data change the
stage-1 projection (leaderboard_data_light) to stop including the full "error"
JSON—only include the extracted filtering_error (from RawSQL "error->>%s") and
other lightweight fields—so you don't fetch heavy JSON per row; then ensure
stage-2, where you currently do
LeaderboardData.objects.filter(id__in=retained_ids), re-applies the same stage-1
filters (e.g. submission__is_public, any other visibility/status filters used
when building leaderboard_data) before returning final rows so the final result
set is constrained identically to stage-1 and avoids TOCTOU issues.

---

Nitpick comments:
In `@tests/unit/jobs/test_utils.py`:
- Around line 1228-1245: The helper _create_mock_leaderboard_chain currently
returns the same mock queryset/result for both stage-1 and stage-2, so tests
don't validate the stage-2 filter(id__in=...) behavior; change it to return
distinct mocks depending on the filter call: keep the existing mock for the
first filter chain (mock_filter_result -> mock_qs_stage1 with .exclude(),
.annotate(), .order_by(), .values() -> test_data_stage1) and create a separate
mock (mock_qs_stage2) for calls that include id__in
(mock_leaderboard_data_objects.filter should be configured via side_effect or a
small router function that inspects kwargs and returns mock_filter_result for
normal filters but mock_qs_stage2 for filter(id__in=...); then wire
mock_qs_stage2.annotate().values() to return the stage-2 test_data). Ensure you
reference and update mocks named mock_leaderboard_data_objects,
mock_filter_result, mock_qs (split into mock_qs_stage1 and mock_qs_stage2) and
the .filter side_effect to make stage-2 filtering actually exercised.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 11eda65a-5774-4032-a607-fb8c5e74f12c

📥 Commits

Reviewing files that changed from the base of the PR and between 7d19d31 and 9e33e6c.

📒 Files selected for processing (2)
  • apps/jobs/utils.py
  • tests/unit/jobs/test_utils.py

Comment thread apps/jobs/utils.py
Comment on lines +432 to +449
leaderboard_data_light = leaderboard_data.annotate(
filtering_score=RawSQL(
"result->>%s", (default_order_by,), output_field=FloatField()
),
filtering_error=RawSQL(
"error->>%s",
("error_{0}".format(default_order_by),),
output_field=FloatField(),
),
).values(
"id",
"submission__participant_team",
"submission__participant_team__team_name",
"submission__is_baseline",
"error",
"filtering_score",
"filtering_error",
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate and print the relevant section in apps/jobs/utils.py
FILE="apps/jobs/utils.py"
echo "== Lines around 420-620 in $FILE =="
sed -n '400,620p' "$FILE" | nl -ba | sed -n '1,240p'

echo
echo "== Extract stage-1 light projection (.annotate/.values) =="
python3 - <<'PY'
import re, itertools
path="apps/jobs/utils.py"
with open(path,'r',encoding='utf-8') as f:
    lines=f.readlines()

start=0
for i,l in enumerate(lines):
    if "leaderboard_data_light = leaderboard_data.annotate" in l:
        start=i-5
        break
else:
    raise SystemExit("Could not find leaderboard_data_light annotate")

end=start+200
for idx in range(start, min(end,len(lines))):
    print(f"{idx+1:4d}: {lines[idx].rstrip()}")
PY

echo
echo "== Check whether stage-2 uses base manager or the filtered leaderboard_data queryset =="
rg -n "LeaderboardData\.objects\.filter|leaderboard_data\.filter\(" "$FILE" || true
rg -n "retained_ids|leaderboard_data_light|leaderboard_data_heavy|leaderboard_data =" "$FILE" || true

echo
echo "== Print the stage-2 heavy fetch block using retained_ids =="
python3 - <<'PY'
import re
path="apps/jobs/utils.py"
with open(path,'r',encoding='utf-8') as f:
    s=f.read()

m=re.search(r"retained_ids.*?LeaderboardData\.objects\.filter.*?(\.values|\.annotate|\))", s, flags=re.S)
if not m:
    # fallback: print around retained_ids occurrence
    lines=s.splitlines()
    for i,l in enumerate(lines):
        if "retained_ids" in l:
            a=max(0,i-40); b=min(len(lines),i+160)
            for j in range(a,b):
                print(f"{j+1:4d}: {lines[j]}")
            break
    else:
        raise SystemExit("Could not find retained_ids")
else:
    # show broader context around match start
    start=s[:m.start()].count("\n")
    lines=s.splitlines()
    a=max(0,start-40); b=min(len(lines), start+220)
    for j in range(a,b):
        print(f"{j+1:4d}: {lines[j]}")
PY

echo
echo "== Check apps/challenges/models.py for LeaderboardData.error JSONField =="
rg -n "class LeaderboardData|error\s*=\s*models\.JSONField" apps/challenges/models.py
python3 - <<'PY'
import re
path="apps/challenges/models.py"
with open(path,'r',encoding='utf-8') as f:
    lines=f.readlines()
for i,l in enumerate(lines):
    if re.search(r"error\s*=\s*models\.JSONField", l):
        a=max(0,i-10); b=min(len(lines), i+20)
        for j in range(a,b):
            print(f"{j+1:4d}: {lines[j].rstrip()}")
        break
else:
    raise SystemExit("Could not find error JSONField")
PY

Repository: Cloud-CV/EvalAI

Length of output: 169


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# run the script

Repository: Cloud-CV/EvalAI

Length of output: 41


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="apps/jobs/utils.py"

echo "== Excerpt apps/jobs/utils.py:420-620 =="
python3 - <<'PY'
path="apps/jobs/utils.py"
lo,hi=420,620
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        if lo <= i <= hi:
            print(f"{i:4d}: {line.rstrip()}")
        if i > hi:
            break
PY

echo
echo "== Find stage-2 heavy fetch block around retained_ids =="
python3 - <<'PY'
import re
path="apps/jobs/utils.py"
s=open(path,'r',encoding='utf-8').read().splitlines()
# locate first occurrence of retained_ids
idx=None
for i,l in enumerate(s):
    if "retained_ids" in l:
        idx=i
        break
if idx is None:
    raise SystemExit("No retained_ids found")
start=max(0,idx-60)
end=min(len(s),idx+220)
for j in range(start,end):
    print(f"{j+1:4d}: {s[j]}")
PY

echo
echo "== Quick searches for key query sources =="
rg -n "LeaderboardData\.objects\.filter|leaderboard_data\.filter|leaderboard_data_light|leaderboard_data_heavy|retained_ids" "$FILE" || true

echo
echo "== Check apps/challenges/models.py for LeaderboardData.error JSONField =="
python3 - <<'PY'
import re
path="apps/challenges/models.py"
s=open(path,'r',encoding='utf-8').read().splitlines()
# find line containing "error = models.JSONField"
for i,l in enumerate(s):
    if re.search(r"\berror\b\s*=\s*models\.JSONField", l):
        start=max(0,i-15); end=min(len(s),i+15)
        for j in range(start,end):
            print(f"{j+1:4d}: {s[j]}")
        break
else:
    raise SystemExit("Could not find LeaderboardData.error = models.JSONField")
PY

Repository: Cloud-CV/EvalAI

Length of output: 19990


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Excerpt apps/jobs/utils.py:330-460 (filters before stage 1) =="
python3 - <<'PY'
path="apps/jobs/utils.py"
lo,hi=330,460
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        if lo <= i <= hi:
            print(f"{i:4d}: {line.rstrip()}")
        if i > hi:
            break
PY

echo
echo "== Locate LeaderboardData model definition and 'error' field in apps/challenges/models.py =="
python3 - <<'PY'
import re
path="apps/challenges/models.py"
lines=open(path,'r',encoding='utf-8').read().splitlines()

# find class LeaderboardData
cls=None
for i,l in enumerate(lines):
    if re.search(r'^class\s+LeaderboardData\b', l):
        cls=i
        break
if cls is None:
    raise SystemExit("class LeaderboardData not found")

start=cls
end=min(len(lines), cls+220)
for i in range(start,end):
    if 'error' in lines[i]:
        pass

print(f"-- class LeaderboardData starts at line {cls+1} --")
for i in range(start, min(end, start+220)):
    print(f"{i+1:4d}: {lines[i].rstrip()}")
PY

echo
echo "== Search for JSONField usage with 'error' near it =="
rg -n "error" apps/challenges/models.py || true
rg -n "JSONField" apps/challenges/models.py || true

Repository: Cloud-CV/EvalAI

Length of output: 15273


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="apps/jobs/utils.py"

echo "== Top of apps/jobs/utils.py (imports) =="
python3 - <<'PY'
path="apps/jobs/utils.py"
with open(path,'r',encoding='utf-8') as f:
    for i in range(1,120):
        line=f.readline()
        if not line: break
        print(f"{i:4d}: {line.rstrip()}")
PY

echo
echo "== Search for Coalesce import/usages =="
rg -n "Coalesce" "$FILE" || true

Repository: Cloud-CV/EvalAI

Length of output: 5418


Fix stage-1 projection to not fetch full error JSON and ensure stage-2 reapplies stage-1 filters

  • calculate_distinct_sorted_leaderboard_data stage 1 includes "error" in .values(...), contradicting the “no heavy JSON columns” goal; this fetches the full LeaderboardData.error JSON for every candidate row.
  • Stage 2 starts from LeaderboardData.objects.filter(id__in=retained_ids) and does not reapply the stage-1 filters (including submission__is_public), so visibility/status constraints are only guaranteed via TOCTOU luck.
Suggested change
@@
     leaderboard_data_light = leaderboard_data.annotate(
         filtering_score=RawSQL(
             "result->>%s", (default_order_by,), output_field=FloatField()
         ),
         filtering_error=RawSQL(
             "error->>%s",
             ("error_{0}".format(default_order_by),),
             output_field=FloatField(),
         ),
     ).values(
         "id",
         "submission__participant_team",
         "submission__participant_team__team_name",
         "submission__is_baseline",
-        "error",
         "filtering_score",
         "filtering_error",
     )
@@
     for leaderboard_item in leaderboard_data_light:
@@
-        if leaderboard_item["error"] is None:
+        if leaderboard_item["filtering_error"] is None:
             leaderboard_item["filtering_error"] = 0
         if leaderboard_item["filtering_score"] is None:
             leaderboard_item["filtering_score"] = 0
@@
     # Stage 2: fetch full row data (with heavy JSON columns) only for the
     # retained leaderboard rows, then reapply the order from stage 1.
     retained_ids = [item["id"] for item in retained_light]
-    heavy_qs = LeaderboardData.objects.filter(id__in=retained_ids).annotate(
+    heavy_qs = leaderboard_data.filter(id__in=retained_ids).annotate(
         filtering_score=RawSQL(
             "result->>%s", (default_order_by,), output_field=FloatField()
         ),
         filtering_error=RawSQL(
             "error->>%s",
             ("error_{0}".format(default_order_by),),
             output_field=FloatField(),
         ),
     )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/jobs/utils.py` around lines 432 - 449, In
calculate_distinct_sorted_leaderboard_data change the stage-1 projection
(leaderboard_data_light) to stop including the full "error" JSON—only include
the extracted filtering_error (from RawSQL "error->>%s") and other lightweight
fields—so you don't fetch heavy JSON per row; then ensure stage-2, where you
currently do LeaderboardData.objects.filter(id__in=retained_ids), re-applies the
same stage-1 filters (e.g. submission__is_public, any other visibility/status
filters used when building leaderboard_data) before returning final rows so the
final result set is constrained identically to stage-1 and avoids TOCTOU issues.

Comment thread apps/jobs/utils.py
Comment on lines +517 to +574
retained_ids = [item["id"] for item in retained_light]
heavy_qs = LeaderboardData.objects.filter(id__in=retained_ids).annotate(
filtering_score=RawSQL(
"result->>%s", (default_order_by,), output_field=FloatField()
),
filtering_error=RawSQL(
"error->>%s",
("error_{0}".format(default_order_by),),
output_field=FloatField(),
),
)
if challenge_phase_split.show_execution_time:
time_diff_expression = ExpressionWrapper(
F("submission__completed_at") - F("submission__started_at"),
output_field=fields.DurationField(),
)
heavy_qs = heavy_qs.annotate(
submission__execution_time=time_diff_expression,
).values(
"id",
"submission__participant_team",
"submission__participant_team__team_name",
"submission__participant_team__team_url",
"submission__is_baseline",
"submission__is_public",
"challenge_phase_split",
"result",
"error",
"filtering_score",
"filtering_error",
"leaderboard__schema",
"submission__submitted_at",
"submission__method_name",
"submission__id",
"submission__submission_metadata",
"submission__execution_time",
"submission__is_verified_by_host",
)
else:
heavy_qs = heavy_qs.values(
"id",
"submission__participant_team",
"submission__participant_team__team_name",
"submission__participant_team__team_url",
"submission__is_baseline",
"submission__is_public",
"challenge_phase_split",
"result",
"error",
"filtering_score",
"filtering_error",
"leaderboard__schema",
"submission__submitted_at",
"submission__method_name",
"submission__id",
"submission__submission_metadata",
"submission__is_verified_by_host",
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate and show the relevant code block
FILE="apps/jobs/utils.py"
python - <<'PY'
import os, itertools
path="apps/jobs/utils.py"
start=480
end=640
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        if start <= i <= end:
            print(f"{i:4d}:{line.rstrip()}")
        if i > end:
            break
PY

# Also capture where the stage-1 queryset is defined (look for likely variable names)
rg -n "leaderboard_data|retained_light|stage.?1|is_disabled|host.*exclude|public" apps/jobs/utils.py || true

# Show definition of any queryset variable used in the snippet (e.g., leaderboard_data)
rg -n "leaderboard_data\s*=" apps/jobs/utils.py || true

Repository: Cloud-CV/EvalAI

Length of output: 9444


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="apps/jobs/utils.py"

python - <<'PY'
path="apps/jobs/utils.py"
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        if 360 <= i <= 470:
            print(f"{i:4d}:{line.rstrip()}")
PY

rg -n "only_public_entries|exclude\(|is_disabled|is_public|host|flag|banned|all_banned|participant" apps/jobs/utils.py

Repository: Cloud-CV/EvalAI

Length of output: 9110


Keep stage-2 fetch scoped to the stage-1 filtered leaderboard_data queryset
Stage 1 builds leaderboard_data with is_disabled=False, host exclusion, submission__is_flagged=False, valid submission statuses, challenge_phase_split=..., and (when only_public_entries) submission__is_public=True; stage 2 re-fetches from LeaderboardData.objects by id__in=retained_ids without reapplying those predicates, so DB changes between queries can reintroduce entries that should no longer appear (the code only skips rows that disappeared, not rows whose eligibility/visibility changed).

Suggested change
-retained_ids = [item["id"] for item in retained_light]
-heavy_qs = LeaderboardData.objects.filter(id__in=retained_ids).annotate(
+retained_ids = [item["id"] for item in retained_light]
+heavy_qs = leaderboard_data.filter(id__in=retained_ids).annotate(
     filtering_score=RawSQL(
         "result->>%s", (default_order_by,), output_field=FloatField()
     ),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/jobs/utils.py` around lines 517 - 574, The second-stage fetch re-queries
LeaderboardData via LeaderboardData.objects.filter(id__in=retained_ids) which
loses the stage-1 predicates (the original leaderboard_data queryset) so rows
that became ineligible between queries can be reintroduced; fix by basing
heavy_qs on the stage-1 queryset (e.g., use
leaderboard_data.filter(id__in=retained_ids) or chain .filter(...) onto the
original leaderboard_data) before calling .annotate(...) and .values(...),
ensuring retained_ids and retained_light are still used but the original
visibility/eligibility filters are preserved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants