Skip to content

Security: Prevent zip slip during archive extraction#5112

Draft
RishabhJain2018 wants to merge 2 commits into
masterfrom
cursor/fix-zip-slip-extraction-9812
Draft

Security: Prevent zip slip during archive extraction#5112
RishabhJain2018 wants to merge 2 commits into
masterfrom
cursor/fix-zip-slip-extraction-9812

Conversation

@RishabhJain2018

@RishabhJain2018 RishabhJain2018 commented Jun 3, 2026

Copy link
Copy Markdown
Member

Summary

Mitigates zip slip (path traversal) when extracting challenge and submission archives.

Changes

  • Add safe_extract_zip_file() in apps/base/utils.py to validate member paths before extraction.
  • Use safe extraction in challenge views, config utilities, and submission workers.

Security impact

Prevents malicious zip archives from writing files outside the intended extraction directory.

Slack Thread

Open in Web Open in Cursor 

Summary by CodeRabbit

  • Bug Fixes
    • Improved zip file extraction security across the application to prevent potential path traversal vulnerabilities.

Validate zip member paths before extractall in the API, config utilities,
and submission workers so archives cannot write outside the target directory.

Co-authored-by: Rishabh Jain <rishabhjain2018@gmail.com>
@coderabbitai

coderabbitai Bot commented Jun 3, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 5c33323d-5d93-4901-9fb6-0315adcec7e1

📥 Commits

Reviewing files that changed from the base of the PR and between 13924d6 and 71416be.

📒 Files selected for processing (5)
  • apps/base/zip_utils.py
  • apps/challenges/challenge_config_utils.py
  • apps/challenges/views.py
  • scripts/workers/remote_submission_worker.py
  • scripts/workers/submission_worker.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • apps/challenges/views.py
  • scripts/workers/remote_submission_worker.py
  • apps/challenges/challenge_config_utils.py
  • scripts/workers/submission_worker.py

Walkthrough

This PR hardens ZIP extraction security by introducing a centralized path-traversal validation utility and routing all ZIP extractions through it. A new safe_extract_zip_file() function validates that each extracted member's absolute path stays within the destination root, preventing zip slip attacks. Four existing modules—challenge configuration utilities, challenge views, and submission workers—are updated to use this safer helper instead of direct extractall() calls.

Changes

ZIP Extraction Path-Traversal Security

Layer / File(s) Summary
ZIP extraction security validation
apps/base/zip_utils.py
safe_extract_zip_file() validates absolute member paths, ensures each normalized path remains within the destination root, and raises BadZipFile for any path-traversal attempts before extraction.
Safe ZIP extraction adoption
apps/challenges/challenge_config_utils.py, apps/challenges/views.py, scripts/workers/remote_submission_worker.py, scripts/workers/submission_worker.py
Four modules import and adopt the safer helper: challenge configuration uses try/finally wrapping, views routes challenge zip creation through the helper, and worker modules replace direct extractall() with validated extraction calls.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Security: Prevent zip slip during archive extraction' directly and clearly summarizes the main change: adding security protection against zip slip vulnerabilities.
Docstring Coverage ✅ Passed Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch cursor/fix-zip-slip-extraction-9812

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/base/utils.py`:
- Around line 503-517: The ZIP extraction is protected but YAML-sourced paths
can still traverse out of the extracted tree; add a helper (e.g.,
safe_join_under_root(root, *parts)) that resolves and verifies the final path is
inside root (similar check to safe_extract_zip_file), then replace every place
that builds filesystem paths from YAML values (references to description,
evaluation_details, terms_and_conditions, submission_guidelines,
evaluation_script, test_annotation_file) to call
safe_join_under_root(extracted_root, yaml_value) before any
os.path.isfile/open/stat calls; ensure the helper raises on unsafe targets and
is used everywhere YAML-derived values are passed to isfile() or open().

In `@apps/challenges/challenge_config_utils.py`:
- Around line 96-103: The extract_zip_file function currently opens
zipfile.ZipFile(file_path, mode) into zip_ref and calls
safe_extract_zip_file(zip_ref, output_path) but only calls zip_ref.close() after
that, so if safe_extract_zip_file raises the handle remains open; wrap the
extraction call in a try/finally (or use a context manager) so zip_ref.close()
is always executed, i.e., ensure zip_ref is closed in the finally block even on
exceptions and return only after closing; reference symbols: extract_zip_file,
safe_extract_zip_file, zipfile.ZipFile, zip_ref.close().

In `@scripts/workers/remote_submission_worker.py`:
- Around line 137-140: The import of safe_extract_zip_file at module top-level
causes apps/base/utils.py (with Django/DRF top-level imports) to execute and
crash this worker; to fix, remove the top-level line "from base.utils import
safe_extract_zip_file" and instead perform a lazy/local import inside the
function that uses it (the code around zip_ref =
zipfile.ZipFile(download_location, "r") and safe_extract_zip_file(zip_ref,
extract_location)), or refactor safe_extract_zip_file into a small standalone
helper module with no Django dependencies and import that helper here; update
references to use the newly imported/local function name so extraction occurs
without importing Django during module import.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: dd256f7a-b836-45ba-9cb2-0e94968634f5

📥 Commits

Reviewing files that changed from the base of the PR and between 4ae73db and 13924d6.

📒 Files selected for processing (5)
  • apps/base/utils.py
  • apps/challenges/challenge_config_utils.py
  • apps/challenges/views.py
  • scripts/workers/remote_submission_worker.py
  • scripts/workers/submission_worker.py

Comment thread apps/base/utils.py Outdated
Comment on lines +503 to +517
def safe_extract_zip_file(zip_ref, destination):
"""
Extract zip archive members while preventing path traversal (zip slip).
"""
destination_path = os.path.abspath(destination)
for member in zip_ref.namelist():
member_path = os.path.abspath(os.path.join(destination_path, member))
if not (
member_path == destination_path
or member_path.startswith(destination_path + os.sep)
):
raise zipfile.BadZipFile(
"Zip archive contains unsafe file paths."
)
zip_ref.extractall(destination_path)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

This still leaves a traversal path via YAML-referenced files.

safe_extract_zip_file() only validates archive member paths. In the challenge import flows, the extracted YAML is still used to build paths for files like description, evaluation_details, terms_and_conditions, submission_guidelines, evaluation_script, and test_annotation_file with plain join(..., yaml_value) calls. A config value containing ../ can still escape the extracted tree and turn this into an arbitrary file read even though extraction itself is now bounded. Please add a shared safe_join_under_root()-style helper and use it everywhere a YAML value becomes a filesystem path.

🔐 Suggested direction
+def safe_join_under_root(root, relative_path):
+    candidate = os.path.abspath(os.path.join(root, relative_path))
+    if not (
+        candidate == root
+        or candidate.startswith(root + os.sep)
+    ):
+        raise ValueError("Path escapes extraction root.")
+    return candidate

Use that helper before every isfile() / open() on YAML-derived paths.

As per coding guidelines, apps/**: Django backend. Pay attention to: Safe handling of user-submitted files and Docker image references.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def safe_extract_zip_file(zip_ref, destination):
"""
Extract zip archive members while preventing path traversal (zip slip).
"""
destination_path = os.path.abspath(destination)
for member in zip_ref.namelist():
member_path = os.path.abspath(os.path.join(destination_path, member))
if not (
member_path == destination_path
or member_path.startswith(destination_path + os.sep)
):
raise zipfile.BadZipFile(
"Zip archive contains unsafe file paths."
)
zip_ref.extractall(destination_path)
def safe_extract_zip_file(zip_ref, destination):
"""
Extract zip archive members while preventing path traversal (zip slip).
"""
destination_path = os.path.abspath(destination)
for member in zip_ref.namelist():
member_path = os.path.abspath(os.path.join(destination_path, member))
if not (
member_path == destination_path
or member_path.startswith(destination_path + os.sep)
):
raise zipfile.BadZipFile(
"Zip archive contains unsafe file paths."
)
zip_ref.extractall(destination_path)
def safe_join_under_root(root, relative_path):
candidate = os.path.abspath(os.path.join(root, relative_path))
if not (
candidate == root
or candidate.startswith(root + os.sep)
):
raise ValueError("Path escapes extraction root.")
return candidate
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/base/utils.py` around lines 503 - 517, The ZIP extraction is protected
but YAML-sourced paths can still traverse out of the extracted tree; add a
helper (e.g., safe_join_under_root(root, *parts)) that resolves and verifies the
final path is inside root (similar check to safe_extract_zip_file), then replace
every place that builds filesystem paths from YAML values (references to
description, evaluation_details, terms_and_conditions, submission_guidelines,
evaluation_script, test_annotation_file) to call
safe_join_under_root(extracted_root, yaml_value) before any
os.path.isfile/open/stat calls; ensure the helper raises on unsafe targets and
is used everywhere YAML-derived values are passed to isfile() or open().

Comment thread apps/challenges/challenge_config_utils.py
Comment thread scripts/workers/remote_submission_worker.py Outdated
Move safe_extract_zip_file into base.zip_utils for workers without Django
setup and close zip handles on extraction failure.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants