fix(snapshot): truncate oversized symbol detail instead of panicking #306

Open
justrach wants to merge 1 commit into `main` from `fix/snapshot-u16-detail-overflow`
Conversation

@justrach
Owner

Summary

`writeSnapshot` at `snapshot.zig:213` crashes when a symbol's `detail` field exceeds 65,535 bytes (u16 length-prefix limit). Truncate with a `std.log.warn` instead of panicking.

Repro

  • Clone `vercel/next.js` and run `codedb snapshot`:
    ```
    thread panic: integer does not fit in destination type
    /.../snapshot.zig:213:60: writeSnapshot
    std.mem.writeInt(u16, &detail_len_buf, @intCast(detail.len), .little);
    ^
    ```
  • Same bug kills `nodejs/node`.
  • Minified JS / bundled monorepos produce synthetic "symbols" whose `detail` holds an entire inline body, often far larger than 64 KB. Observed `detail.len` values up to 290,774 bytes during the next.js reproducer.
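
The overflow itself can be checked in isolation. The sketch below is not code from the PR; it only assumes the standard library's `std.math.cast`, which returns `null` for an out-of-range value where `@intCast` would trip a safety panic in safe builds. Run it with `zig test`.

```zig
const std = @import("std");

test "observed detail length does not fit a u16 length prefix" {
    // Largest detail.len reported in the next.js reproducer.
    const observed_len: usize = 290_774;

    // std.math.cast is the checked counterpart of @intCast:
    // it yields null when the value is out of range for the target type.
    try std.testing.expect(std.math.cast(u16, observed_len) == null);

    // The largest length a u16 prefix can describe still fits.
    try std.testing.expect(std.math.cast(u16, @as(usize, 65_535)) != null);
}
```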

Fix

Single-site clamp in `writeSnapshot`:

```zig
const max_detail: usize = std.math.maxInt(u16);
const clipped = if (detail.len > max_detail) detail[0..max_detail] else detail;
if (detail.len > max_detail) {
    std.log.warn("snapshot: truncating symbol detail from {d}B to {d}B (u16 length-prefix limit)", .{ detail.len, max_detail });
}
```

No format-version bump is needed: the reader at `snapshot.zig` L644 already caps `readSectionString` at `std.math.maxInt(u16)`, so the truncated payload round-trips fine.
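
To illustrate the round-trip claim, here is a minimal, self-contained sketch of a length-prefixed write followed by a read. `clampDetail` and `writePrefixed` are hypothetical helpers written for this example; they are not functions from `snapshot.zig`, and the real reader and writer may be structured differently. It assumes a Zig version where `std.mem.writeInt` takes `.little`, as in the snippet above.

```zig
const std = @import("std");

const max_detail: usize = std.math.maxInt(u16);

/// Hypothetical helper: clamp to what a u16 length prefix can describe.
fn clampDetail(detail: []const u8) []const u8 {
    return if (detail.len > max_detail) detail[0..max_detail] else detail;
}

/// Hypothetical helper: write [u16 little-endian length][payload] into `out`
/// and return the number of bytes written.
fn writePrefixed(out: []u8, detail: []const u8) usize {
    const clipped = clampDetail(detail);
    std.debug.assert(out.len >= 2 + clipped.len);
    // Safe after the clamp: clipped.len is at most 65,535.
    std.mem.writeInt(u16, out[0..2], @intCast(clipped.len), .little);
    @memcpy(out[2..][0..clipped.len], clipped);
    return 2 + clipped.len;
}

test "oversized detail round-trips as a truncated payload" {
    const allocator = std.testing.allocator;

    // Same size as the largest detail seen in the next.js reproducer.
    const detail = try allocator.alloc(u8, 290_774);
    defer allocator.free(detail);
    @memset(detail, 'x');

    const buf = try allocator.alloc(u8, 2 + max_detail);
    defer allocator.free(buf);

    const written = writePrefixed(buf, detail);

    // Read side: decode the prefix, then take exactly that many bytes.
    const len = std.mem.readInt(u16, buf[0..2], .little);
    try std.testing.expectEqual(@as(usize, 2 + max_detail), written);
    try std.testing.expectEqual(@as(u16, 65_535), len);
    try std.testing.expectEqualSlices(u8, detail[0..max_detail], buf[2..][0..len]);
}
```

Because the prefix and the payload are clamped together, a reader that trusts the prefix never sees a length it cannot represent.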

Scope kept minimal

The same u16 length-prefix pattern also applies to `sym.name` (`snapshot.zig` L196), imports (L186), and the OUTLINE_STATE path (L167). Those haven't crashed in practice, and truncating `name` would break symbol identity; handling them properly needs a bigger refactor. Filed as a follow-up concern, not included here.
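
For fields that cannot be truncated safely, one possible direction is to surface an error instead of clamping. The sketch below is only an illustration of that idea; `encodeLenPrefix` and `error.FieldTooLong` are hypothetical names, not part of this PR or of any planned follow-up.

```zig
const std = @import("std");

/// Hypothetical: encode a u16 little-endian length prefix,
/// failing instead of truncating when the length does not fit.
fn encodeLenPrefix(len: usize) error{FieldTooLong}![2]u8 {
    const n = std.math.cast(u16, len) orelse return error.FieldTooLong;
    var buf: [2]u8 = undefined;
    std.mem.writeInt(u16, &buf, n, .little);
    return buf;
}

test "identity-bearing fields error out rather than truncate" {
    // 65,535 is the largest representable length: 0xFFFF in little-endian bytes.
    try std.testing.expectEqual([2]u8{ 0xFF, 0xFF }, try encodeLenPrefix(65_535));
    // One byte more is rejected, so the caller decides how to handle the symbol.
    try std.testing.expectError(error.FieldTooLong, encodeLenPrefix(65_536));
}
```

Returning an error keeps the caller in control: it could skip the symbol, log it, or abort the snapshot, without silently changing a name.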

Test plan

  • Local `zig build` clean
  • `codedb snapshot` on a vercel/next.js clone: 176 ms, 26,530 files, 21 MB, 12 truncation warnings (previously crashed at 71 s)
  • Re-ingest `nodejs/node` and `vercel/next.js` through codedb-cloud once binary is redeployed there
  • No regressions on already-working repos (react, godot, llvm, torvalds/linux)

🤖 Generated with Claude Code

The snapshot format uses a u16 length prefix for symbol.detail in the
OUTLINE_STATE section. writeSnapshot at snapshot.zig:213 did
@intCast(detail.len) → u16 with no bounds check, so any symbol whose
detail exceeds 64KB crashed the whole snapshot with:

  thread panic: integer does not fit in destination type
  snapshot.zig:213:60: ... std.mem.writeInt(u16, ..., @intCast(detail.len), ...)

Found reproducing against vercel/next.js: the parser produced
symbols with details up to 290,774 bytes — bundled/minified JS
where a single "symbol" holds an entire inline body.
nodejs/node hits the same crash.

Fix: when detail exceeds u16 max, truncate to 65535 bytes and
emit a std.log.warn. Keeps format compatibility — reader at L644
already caps readSectionString at maxInt(u16) — so no snapshot
version bump is needed. The only cost is a truncated detail preview
on pathological symbols, which is strictly better than losing the
whole snapshot.

Verified locally: next.js now snapshots in 176ms (26,530 files,
21MB) with 12 truncation warnings; previously crashed at 71s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions

Benchmark Regression Report

Threshold: 10.00%

| Tool | Base (ns) | Head (ns) | Delta | Status |
|---|---:|---:|---:|---|
| codedb_bundle | 484910 | 474702 | -2.11% | OK |
| codedb_changes | 54366 | 56512 | +3.95% | OK |
| codedb_deps | 8739 | 12440 | +42.35% | FAIL |
| codedb_edit | 5740 | 6527 | +13.71% | FAIL |
| codedb_find | 63308 | 59651 | -5.78% | OK |
| codedb_hot | 95173 | 102011 | +7.18% | OK |
| codedb_outline | 236434 | 236548 | +0.05% | OK |
| codedb_read | 83147 | 90781 | +9.18% | OK |
| codedb_search | 173206 | 176918 | +2.14% | OK |
| codedb_snapshot | 2557458 | 2508029 | -1.93% | OK |
| codedb_status | 214556 | 216511 | +0.91% | OK |
| codedb_symbol | 55631 | 59515 | +6.98% | OK |
| codedb_tree | 75765 | 84806 | +11.93% | FAIL |
| codedb_word | 67989 | 68711 | +1.06% | OK |
