Incremental sync (git-status fast path) bypasses the ignore matcher — tracked files in excluded dirs leak back into the index #766

@monochrome3694

Summary

getGitVisibleFiles filters the file list through buildDefaultIgnore (built-in defaults + root .gitignore) — that's the #407 design: committed dependency dirs are excluded even though they're tracked. But the incremental path doesn't apply the same matcher: getChangedFiles' git fast path consumes git status --porcelain output unfiltered, on the assumption that "git status already omits .gitignored paths".

That assumption has two holes:

  1. The built-in default excludes aren't gitignore at all. A committed vendor/ or node_modules/ dir is excluded from a full index by DEFAULT_IGNORE_PATTERNS, but git status happily reports changes to tracked files inside it.
  2. gitignore is a no-op for tracked files. The documented way to exclude a committed dependency dir is a .gitignore entry (tracking-blind by design, per #407, "v0.9.4 indexes node_modules in non-git projects, causing noisy context results"), but git status still reports modifications to those tracked files, because ignore rules only apply to untracked paths.

In both cases the changed file isn't in the DB (!tracked) → pushed to added → parsed and indexed. The index now disagrees with what a full index --force produces; the next force re-index silently drops the file again (flip-flop).
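The flip-flop can be reproduced in miniature without codegraph at all. The sketch below is a toy model, not the project's code: isIgnored, fullIndex and syncUnfiltered are hypothetical stand-ins, and the prefix check only approximates what buildDefaultIgnore does.

```javascript
// Toy model of the disagreement between full index and incremental sync.
// All names here are hypothetical; the real matcher is buildDefaultIgnore.
const EXCLUDED_DIRS = ["vendor/", "node_modules/"]; // assumed subset of the defaults
const isIgnored = (p) => EXCLUDED_DIRS.some((dir) => p.startsWith(dir));

// Full index: enumerate tracked files, then filter through the matcher.
function fullIndex(trackedFiles) {
  return new Set(trackedFiles.filter((p) => !isIgnored(p)));
}

// Incremental sync as described in this issue: the git status result is
// consumed unfiltered, so any changed path missing from the DB is "added".
function syncUnfiltered(db, gitChanges) {
  const next = new Set(db);
  for (const p of [...gitChanges.modified, ...gitChanges.added]) {
    if (!next.has(p)) next.add(p);
  }
  return next;
}

const tracked = ["src/app.js", "vendor/lib.js"];
const afterForce = fullIndex(tracked);
const afterSync = syncUnfiltered(afterForce, { modified: ["vendor/lib.js"], added: [] });
const afterSecondForce = fullIndex(tracked);

console.log(afterForce.has("vendor/lib.js"));       // false: excluded by the matcher
console.log(afterSync.has("vendor/lib.js"));        // true: leaked in via git status
console.log(afterSecondForce.has("vendor/lib.js")); // false: dropped again
```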

Repro

  1. Project with a committed vendor/ dir (tracked files). Full codegraph index --force → vendor excluded ✓ (per #407).
  2. Modify a tracked file inside vendor/.
  3. codegraph sync → the file is hashed, found missing from the DB, and indexed.
  4. codegraph index --force → it disappears again.

Same repro with a tracked dir listed in .gitignore instead of the built-in defaults.

(The FSEvents watcher path is NOT affected — watcher.js builds this.ignoreMatcher from buildDefaultIgnore and filters events. It's specifically the git-status fast path in getChangedFiles.)

Suggested fix (verified locally)

Build the shared matcher in getChangedFiles and skip matched paths in the modified+added loop:

const ig = buildDefaultIgnore(this.rootDir);
for (const filePath of [...gitChanges.modified, ...gitChanges.added]) {
    if (ig.ignores(filePath)) continue;
    // ... existing hash/compare logic
}

Deleted files can stay unfiltered — they only act when the path is already tracked in the DB, where removal is always correct (and it lets a newly-excluded dir's stale entries clean up).
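To make the asymmetry concrete, here is a self-contained sketch of the filtering rule. It rests on assumptions: classifyChanges and prefixMatcher are hypothetical names, the DB is modelled as a Set of paths, and the matcher only needs an ignores(path) method like the ig object above.

```javascript
// Hypothetical stand-in for the shared matcher: anything exposing ignores(path).
const prefixMatcher = (dirs) => ({
  ignores: (p) => dirs.some((dir) => p.startsWith(dir)),
});

// Modified and added paths pass through the matcher; deleted paths do not,
// because a deletion only acts when the path is already in the DB.
function classifyChanges(db, gitChanges, ig) {
  const toIndex = [];
  const toRemove = [];
  for (const filePath of [...gitChanges.modified, ...gitChanges.added]) {
    if (ig.ignores(filePath)) continue; // excluded dirs never enter the index
    toIndex.push(filePath);
  }
  for (const filePath of gitChanges.deleted) {
    if (db.has(filePath)) toRemove.push(filePath); // cleans up stale entries too
  }
  return { toIndex, toRemove };
}

// vendor/stale.js was indexed before vendor/ became excluded.
const db = new Set(["src/a.js", "vendor/stale.js"]);
const result = classifyChanges(
  db,
  {
    modified: ["vendor/lib.js", "src/a.js"],
    added: ["vendor/new.js"],
    deleted: ["vendor/stale.js", "vendor/never-indexed.js"],
  },
  prefixMatcher(["vendor/"])
);

console.log(result.toIndex);  // [ 'src/a.js' ]
console.log(result.toRemove); // [ 'vendor/stale.js' ]
```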

We're running this patch locally on a ~1,100-file Swift + TS workspace and verified: a new file in an excluded tracked dir shows up in git status but does not enter the index via codegraph sync; full index and sync now agree.

Relevance to #699

Any implementation of the proposed .ignore overlay inherits this leak unless the sync path consults the same matcher as enumeration — with an extra ignore layer that git knows nothing about, every exclusion (not just tracked-file edge cases) would leak back in through git status. Worth fixing together.
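One way to picture "the same matcher" is a single composed object that both enumeration and sync receive. This is only a sketch under assumptions: the overlay's real semantics are whatever #699 settles on, composeMatchers and prefixLayer are hypothetical helpers, and plain OR-composition ignores gitignore-style negation rules.

```javascript
// Hypothetical layer: matches paths under any of the given directory prefixes.
const prefixLayer = (dirs) => ({
  ignores: (p) => dirs.some((dir) => p.startsWith(dir)),
});

// A path is ignored if any layer ignores it (simplification: no negation).
const composeMatchers = (...layers) => ({
  ignores: (p) => layers.some((layer) => layer.ignores(p)),
});

const defaults = prefixLayer(["vendor/"]);   // stands in for buildDefaultIgnore
const overlay = prefixLayer(["generated/"]); // assumed content of an .ignore overlay
const shared = composeMatchers(defaults, overlay);

// Both code paths consult the one shared matcher, so they cannot disagree.
const enumerate = (files) => files.filter((p) => !shared.ignores(p));
const syncCandidates = (changed) => changed.filter((p) => !shared.ignores(p));

const indexed = enumerate(["src/a.js", "vendor/x.js", "generated/y.js"]);
const resynced = syncCandidates(["generated/y.js", "src/a.js"]);

console.log(indexed);  // [ 'src/a.js' ]
console.log(resynced); // [ 'src/a.js' ]
```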

🤖 Generated with Claude Code
