fix(security): sanitize CheckpointStore seed_id to prevent path traversal#413
fix(security): sanitize CheckpointStore seed_id to prevent path traversal#413
Conversation
…aversal (Q00#397) Add _sanitize_seed_id() that strips null bytes, removes '..' sequences, replaces path separators with underscores, and limits length to 255 chars. Add _validate_path_containment() to verify resolved paths stay within the checkpoint base directory. Both checks are applied in _get_checkpoint_path(), which is used by save(), load(), and all rollback operations. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: APPROVE
Reviewing commit
72485a7for PR #413
Blocking Findings
No in-scope blocking findings remained after policy filtering.
Non-blocking Suggestions
| # | File:Line | Category | Finding |
|---|
Design Notes
The path-traversal goal is valid, but folding raw identifiers directly into sanitized filenames changes the storage key contract. This needs a collision-resistant mapping with an explicit filename-length budget, rather than lossy replacement plus truncation.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: APPROVE
Reviewing commit
38b3aa4for PR #413
Blocking Findings
No in-scope blocking findings remained after policy filtering.
Non-blocking Suggestions
| # | File:Line | Category | Finding |
|---|
Design Notes
The traversal hardening is directionally correct, but it was added at the filename layer without preserving a one-to-one mapping between logical seed IDs and on-disk keys. For persistence code, the key derivation has to remain collision-free or be explicitly rejected.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
|
Labels: |
Q00
left a comment
There was a problem hiding this comment.
Requesting changes because the filename-length hardening is still incomplete.
_sanitize_seed_id() truncates the raw seed id to 255 characters, but _get_checkpoint_path() then wraps it in checkpoint_<seed>.json / .json.<level>. That still produces basenames longer than 255 bytes on common filesystems (len("checkpoint_") + 255 + len(".json") = 271), so long seed ids can still fail with ENAMETOOLONG on save/load.
Please cap the sanitized id to the remaining filename budget after the prefix/suffix, and add a regression test that actually writes a long-id checkpoint instead of only asserting on the generated path string.
_sanitize_seed_id() previously truncated to 255 characters, but _get_checkpoint_path() wraps the seed in "checkpoint_<seed>.json[.N]" and file_lock appends ".lock", producing basenames up to 273 bytes which triggers ENAMETOOLONG on common filesystems. Now the seed is capped to 232 characters (255 minus worst-case prefix+suffix+lock overhead). When truncation is needed, a SHA-256 hash fragment is appended to keep the mapping collision-resistant. Replaces the path-string-only long-id test with regression tests that actually write and read back checkpoints with long seed ids. Addresses review feedback from Q00 on PR Q00#413. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Review — ouroboros-agent[bot]
Verdict: APPROVE
Reviewing commit
17b5c70for PR #413
Blocking Findings
No in-scope blocking findings remained after policy filtering.
Non-blocking Suggestions
| # | File:Line | Category | Finding |
|---|
Design Notes
The hardening is heading in the right direction, but it currently treats filename safety as a lossy string rewrite. For persistence keys, the safer shape is a reversible or collision-resistant mapping from raw seed_id to filename, with byte-based length enforcement rather than character-count enforcement.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Summary
CheckpointStore._get_checkpoint_path()directly interpolatedseed_idinto filenames without sanitization, allowing crafted IDs likex/../../PWNEDto read, write, or delete files outside the checkpoint directory._sanitize_seed_id()static method that strips null bytes, removes..traversal sequences, replaces/and\with underscores, limits length to 255 chars, and raisesValueErrorfor empty results._validate_path_containment()that verifies the resolved checkpoint path stays withinself._base_path._get_checkpoint_path(), which is used bysave(),load(), and all rollback operations.Test plan
ValueErrorValueError..-only seed_id raisesValueErrorload()with traversal seed_id resolves consistently withsave()ruff checkpasses clean