feat(watcher): walk directory symlinks with cycle + escape detection (#405)#420
feat(watcher): walk directory symlinks with cycle + escape detection (#405)#420
Conversation
#389 only handled file symlinks; sym_link entries whose target is a directory still hit `if (target_stat.kind != .file) continue;` at src/watcher.zig:319-323 and are silently skipped. Real repos use directory symlinks (monorepo package links, vendored deps, dotfile configs), so the indexer must walk them — but only safely. The fixture pins three behaviours: 1. A file inside a symlinked subdirectory is indexed under the symlinked path (`app/linked_pkg/inside.zig`). 2. A symlink that introduces a cycle (`app/loop -> ..`) does not produce duplicated paths or hang the scan. 3. The scan terminates in bounded time on the fixture. This test fails on main because directory-symlinks are dropped, so `app/linked_pkg/inside.zig` is never yielded by FilteredWalker.next(). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…405) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Benchmark Regression ReportThresholds: 10.00% and 50,000 ns absolute delta
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c016404d47
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (shouldSkipDir(entry.name)) continue; | ||
| if (self.ignore_patterns.items.len > 0) { | ||
| var check_buf: [std.fs.max_path_bytes]u8 = undefined; |
There was a problem hiding this comment.
Enforce skip_dirs on symlink target directories
When handling a directory symlink, the walker only applies shouldSkipDir to the link name (entry.name) before descending, so a link like app/cache -> ../.git or deps -> ../node_modules bypasses the skip list and gets fully traversed. Because this commit newly enables directory-symlink traversal, that introduces a regression where excluded/sensitive trees can be indexed and large directories can be scanned, hurting both data-scope guarantees and performance.
Useful? React with 👍 / 👎.
| if (!std.mem.startsWith(u8, real_target, self.real_root)) continue; | ||
| if (real_target.len != self.real_root.len and real_target[self.real_root.len] != '/') continue; |
There was a problem hiding this comment.
Handle native separator in real-root boundary check
The escape guard assumes / as the path boundary (real_target[self.real_root.len] != '/'), but realPathFile can produce native-separated paths (e.g., C:\repo\sub on Windows). In that case, valid in-root symlink targets fail this check and are skipped, so directory symlink traversal silently does not work on Windows environments where symlinks are permitted.
Useful? React with 👍 / 👎.
If openDir succeeds but a subsequent name_buffer or stack.append errors, the new sub handle was never closed. errdefer the close until the frame lands on the stack. Applies to both the .directory branch and the new symlink-points-to-directory branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Benchmark Regression ReportThresholds: 10.00% and 50,000 ns absolute delta
|
Closes #405. Extends the file-symlink fix from #389 to also walk directory symlinks. Cycle detection via realpath-keyed visited set; escape detection via real_root prefix check. Tests pass; suite green.