Commit fd76b52

zlav and claude committed
Surface walker-pruned subtrees and add walker filter test coverage
Walker-level path-filter prunes were previously silent: pathFilterIntercept short-circuits before any strategy reaches the directory, and there was no log trail explaining why a project the user expected didn't appear in the analyze summary. The post-summary note even pointed at `fossa list-targets` as a workaround, which deliberately ignores all filters.

Wire `Has Logger sig m` through `walkWithFilters'` and `pathFilterIntercept` so the walker can speak. Per-prune log lines fire at debug level (one per strategy, ~28 strategies = noisy at info).

Add `enumeratePrunedSubtrees`, a one-shot pre-discovery walk that returns the list of subtrees the filter will reject; analyze invokes it once before strategies run and logs each pruned path at info level.

Result for a `.fossa.yml` with `paths.exclude: ["**/zip/**"]`:

    Active exclude glob filters: **/zip/**
    Skipping path "zip/" (excluded by paths filter)

The Has Logger ripple touches every strategy that uses walkWithFilters' (~32 single-line constraint sites, ~7 multi-line). Each carrier already provides Logger via DiscoverTaskEffs, so the change is purely a constraint propagation: no new effects, no runtime cost.

Add three Walker spec tests (test/Discovery/WalkSpec.hs):
- include-path filter: mirror of the existing exclude test, asserts the walker accepts ancestors + included subtree and prunes siblings.
- WalkSkipSome merge: strategy returns WalkSkipSome ["a"], filter excludes "b", both should be pruned. Catches the `pathFilterIntercept`/`skipDisallowed` merge logic.
- YAML-to-walker end-to-end: parses a YAML config string with a glob exclude, runs it through `collectConfigFileFilters`, executes the walker, asserts pruning. Catches "globs parse but never reach the walker" wiring regressions, the exact class of bug we hit earlier.

Cannot exercise the new tests locally because the test binary's startup reads test/Container/testdata/emptypath.tar (a git-LFS pointer not materialized in this environment); CI's Linux/macOS/Windows jobs will validate. Library and test-binary builds pass with no warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 39b220b commit fd76b52

45 files changed

Lines changed: 317 additions & 74 deletions

Changelog.md

Lines changed: 2 additions & 1 deletion
@@ -2,7 +2,8 @@

 ## 3.17.4

-- Config: `paths.only` and `paths.exclude` in `.fossa.yml` now accept glob patterns (e.g. `**/vendor/**`, `node_modules/*`). Entries containing `*`, `?`, or `[` are parsed as globs; other entries keep their existing directory-tree semantics. ([#1703](https://github.com/fossas/fossa-cli/pull/1703))
+- Config: `paths.only` and `paths.exclude` in `.fossa.yml` now accept glob patterns (e.g. `**/vendor/**`, `node_modules/*`). An entry is treated as a glob if it contains `*`; other entries keep their existing directory-tree semantics. Glob matching follows [`System.FilePattern`](https://hackage.haskell.org/package/filepattern) semantics: `*` matches any sequence of characters within a single path segment, and `**` matches any number of segments. Patterns use forward slashes; backslashes are normalized so Windows-native patterns also work. ([#1703](https://github.com/fossas/fossa-cli/pull/1703))
+- Analyze: At startup, `fossa analyze` now prints (a) the active `paths.only`/`paths.exclude` filters from `.fossa.yml` and (b) the directories the walker will prune as a result. Each pruned subtree is reported once at info level so users can correlate a missing project with a configured filter. Per-prune trace logging during discovery is at debug level and visible with `--debug`. ([#1703](https://github.com/fossas/fossa-cli/pull/1703))

 ## 3.17.3
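
The classification rule in the new changelog entry can be sketched in a few lines. This is a minimal illustration, assuming plain `String` entries; `PathEntry`, `classify`, and `normalizeSlashes` are hypothetical names, not the fossa-cli implementation, and the actual glob matching (which the changelog attributes to `System.FilePattern`) is not reproduced here.

```haskell
module Main where

import Data.List (isInfixOf)

-- | Hypothetical classification of a @paths.only@ / @paths.exclude@ entry.
data PathEntry
  = GlobEntry String -- contains '*', so it is matched as a glob
  | DirEntry String -- plain entry, keeps directory-tree semantics
  deriving (Eq, Show)

-- | Rewrite backslashes as forward slashes so Windows-style patterns
-- take the same form as POSIX-style ones.
normalizeSlashes :: String -> String
normalizeSlashes = map (\c -> if c == '\\' then '/' else c)

-- | An entry is a glob only if it contains '*'. Per the changelog,
-- '?' and '[' no longer trigger glob parsing.
classify :: String -> PathEntry
classify raw
  | "*" `isInfixOf` raw = GlobEntry (normalizeSlashes raw)
  | otherwise = DirEntry raw

main :: IO ()
main =
  mapM_
    (print . classify)
    ["**/vendor/**", "node_modules/*", "**\\vendor\\**", "src/[gen]", "vendor"]
```

Under this sketch, `src/[gen]` stays a plain directory entry, which matches the narrowed rule in the added changelog line.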

src/App/Fossa/Analyze.hs

Lines changed: 21 additions & 1 deletion
@@ -121,7 +121,7 @@ import Data.Aeson qualified as Aeson
 import Data.ByteString.Lazy qualified as BL
 import Data.Error (createBody)
 import Data.Flag (Flag, fromFlag)
-import Data.Foldable (traverse_)
+import Data.Foldable (for_, traverse_)
 import Data.Functor (($>))
 import Data.Glob (unGlob)
 import Data.List.NonEmpty qualified as NE
@@ -137,6 +137,7 @@ import Diag.Result (resultToMaybe)
 import Discovery.Archive qualified as Archive
 import Discovery.Filters (AllFilters (..), MavenScopeFilters, applyFilters, combinedPathGlobs, combinedPaths, filterIsVSIOnly, ignoredPaths, isDefaultNonProductionPath)
 import Discovery.Projects (withDiscoveredProjects)
+import Discovery.Walk (enumeratePrunedSubtrees)
 import Effect.Exec (Exec)
 import Effect.Logger (
   Logger,
@@ -145,6 +146,7 @@ import Effect.Logger (
   logInfo,
   logStdout,
   renderIt,
+  viaShow,
  )
 import Effect.ReadFS (ReadFS)
 import Errata (Errata (..))
@@ -317,6 +319,23 @@ logActivePathFilters AllFilters{includeFilters = include, excludeFilters = exclu
       logInfo $
         "Active " <> pretty label <> " filters: " <> pretty (Text.intercalate ", " items)

+-- | Walk the tree once at startup and surface every directory the path
+-- filters will prune. Each prune is logged once at info level here, instead
+-- of emitting per-strategy duplicates from inside the walker (~28 strategies
+-- would otherwise each report the same prune).
+logPrunedSubtrees ::
+  ( Has Logger sig m
+  , Has ReadFS sig m
+  , Has Diag.Diagnostics sig m
+  ) =>
+  AllFilters ->
+  Path Abs Dir ->
+  m ()
+logPrunedSubtrees filters basedir = do
+  pruned <- enumeratePrunedSubtrees filters basedir
+  for_ pruned $ \p ->
+    logInfo $ "Skipping path " <> viaShow p <> " (excluded by paths filter)"
+
 analyze ::
   ( Has Debug sig m
   , Has Diag.Diagnostics sig m
@@ -353,6 +372,7 @@ analyze cfg = Diag.context "fossa-analyze" $ do
       enableVendetta = Config.xVendetta cfg

   logActivePathFilters filters
+  logPrunedSubtrees filters basedir

   manualDepsResult <-
     Diag.errorBoundaryIO . diagToDebug $
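
The "report each pruned subtree once" behaviour that `logPrunedSubtrees` relies on can be modelled without the effect stack. The following is a rough sketch over an in-memory tree, not the real walker: `Dir`, `prunedSubtrees`, `skipLine`, `notZip`, and `exampleTree` are hypothetical names, and `notZip` only stands in for the configured path filter.

```haskell
module Main where

import Data.List (isSuffixOf)

-- | Toy directory tree (hypothetical): a directory name and its subdirectories.
data Dir = Dir String [Dir]

-- | Relative paths of the topmost rejected directories. A rejected directory
-- is reported once and never descended into, so its children are not listed.
prunedSubtrees :: (FilePath -> Bool) -> Dir -> [FilePath]
prunedSubtrees allowed (Dir _ children) = go "" children
  where
    go prefix = concatMap (visit prefix)
    visit prefix (Dir name subdirs)
      | allowed rel = go rel subdirs
      | otherwise = [rel]
      where
        rel = prefix ++ name ++ "/"

-- | One line per pruned subtree, in the format quoted in the commit message.
skipLine :: FilePath -> String
skipLine p = "Skipping path " ++ show p ++ " (excluded by paths filter)"

-- | Stand-in for the path filter: reject any directory named "zip".
notZip :: FilePath -> Bool
notZip rel = not (rel == "zip/" || "/zip/" `isSuffixOf` rel)

exampleTree :: Dir
exampleTree =
  Dir "." [Dir "src" [Dir "zip" [Dir "inner" []]], Dir "zip" [Dir "deep" []], Dir "docs" []]

main :: IO ()
main = mapM_ (putStrLn . skipLine) (prunedSubtrees notZip exampleTree)
```

For this tree the sketch prints one line for `src/zip/` and one for `zip/`; `inner` and `deep` never appear because their parents were already pruned.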

src/Discovery/Walk.hs

Lines changed: 53 additions & 10 deletions
@@ -5,6 +5,7 @@ module Discovery.Walk (
   walkWithFilters',
   WalkStep (..),
   findFileInAncestor,
+  enumeratePrunedSubtrees,

   -- * Helpers
   fileName,
@@ -19,7 +20,7 @@ import Control.Effect.Reader (Reader, ask)
 import Control.Monad.Trans
 import Control.Monad.Trans.Maybe
 import Data.Bifunctor (second)
-import Data.Foldable (find)
+import Data.Foldable (find, traverse_)
 import Data.Functor (void)
 import Data.Glob qualified as Glob
 import Data.List ((\\))
@@ -28,6 +29,7 @@ import Data.Set qualified as Set
 import Data.String.Conversion (toString, toText)
 import Data.Text (Text)
 import Discovery.Filters (AllFilters, pathAllowed)
+import Effect.Logger (Logger, logDebug, viaShow)
 import Effect.ReadFS
 import Path

@@ -68,7 +70,8 @@ walk f = walkDir $ \dir subdirs files -> do
     WalkStop -> pure WalkFinish

 pathFilterIntercept ::
-  ( Applicative m
+  ( Has Logger sig m
+  , Monad m
   , Monoid o
   ) =>
   AllFilters ->
@@ -84,17 +87,34 @@ pathFilterIntercept filters base dir subdirs act = do
     Nothing -> act
     Just relative ->
       if pathAllowed filters relative
-        then (fmap . second) skipDisallowed act
-        else pure (mempty, WalkSkipAll)
+        then do
+          traverse_ logSkip disallowedRelativeSubdirs
+          (fmap . second) skipDisallowed act
+        else do
+          logSkip relative
+          pure (mempty, WalkSkipAll)
   where
-    disallowedSubdirs :: [Text]
-    disallowedSubdirs = do
+    -- Returns the list of immediate subdirectories that the filter rejects,
+    -- paired with their relative-to-base paths (for logging) and their bare
+    -- directory names (for the WalkStep skip list the walker consumes).
+    disallowedRelativeSubdirs :: [Path Rel Dir]
+    disallowedRelativeSubdirs = do
       subdir <- subdirs
       stripped <- stripProperPrefix base subdir
-      let isAllowed = pathAllowed filters stripped
-      if isAllowed
+      if pathAllowed filters stripped
         then mempty
-        else pure $ (toText . toFilePath . dirname) subdir
+        else pure stripped
+
+    disallowedSubdirs :: [Text]
+    disallowedSubdirs = map (toText . toFilePath . dirname) disallowedRelativeSubdirs
+
+    -- Per-prune events fire once per strategy walk, so emitting at info-level
+    -- here would surface N copies of every prune (one per strategy that walks
+    -- the tree). 'enumeratePrunedSubtrees' surfaces each prune once at info
+    -- before discovery begins; this debug line is for trace-level diagnostics.
+    logSkip :: Has Logger sig m => Path Rel Dir -> m ()
+    logSkip relPath =
+      logDebug $ "Skipping " <> viaShow relPath <> " (excluded by paths filter)"

     -- skipDisallowed needs to look at either:
     -- * WalkStep.WalkContinue
@@ -130,10 +150,12 @@ walk' f base = do
     tell res
     pure step

--- | Like @walk'@, but ignores paths that don't match the provided filters.
+-- | Like @walk'@, but ignores paths that don't match the provided filters and
+-- emits a log line for each subdirectory pruned by the filters.
 walkWithFilters' ::
   ( Has ReadFS sig m
   , Has Diagnostics sig m
+  , Has Logger sig m
   , Has (Reader AllFilters) sig m
   , Monoid o
   ) =>
@@ -145,6 +167,27 @@ walkWithFilters' f root = do
   let f' dir subdirs files = pathFilterIntercept filters root dir subdirs $ f dir subdirs files
   walk' f' root

+-- | Return the relative-to-root paths of every directory pruned by the
+-- 'AllFilters' include/exclude path rules. Useful for surfacing pruned dirs
+-- to the user once at startup, before any per-strategy walks (which would
+-- otherwise emit one log per strategy that reaches the same prune).
+enumeratePrunedSubtrees ::
+  ( Has ReadFS sig m
+  , Has Diagnostics sig m
+  ) =>
+  AllFilters ->
+  Path Abs Dir ->
+  m [Path Rel Dir]
+enumeratePrunedSubtrees filters root = walk' visit root
+  where
+    visit _dir subdirs _files = do
+      let pruned = do
+            subdir <- subdirs
+            stripped <- maybe [] pure (stripProperPrefix root subdir)
+            if pathAllowed filters stripped then [] else [stripped]
+          skipNames = map (toText . toFilePath . dirname) pruned
+      pure (pruned, WalkSkipSome skipNames)
+
 -- | Search upwards in the directory tree for the existence of the supplied file.
 findFileInAncestor :: (Has ReadFS sig m, Has Diagnostics sig m) => Path Abs Dir -> Text -> m (Path Abs File)
 findFileInAncestor dir file = do
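
The commit message's "WalkSkipSome merge" case (strategy skips "a", filter rejects "b", both end up pruned) can be illustrated with a simplified model. This is a sketch only: the constructor names follow the diff, but the `String` payload and the `mergeSkips` function are assumptions for illustration and may differ from the real `skipDisallowed`.

```haskell
module Main where

import Data.List (union)

-- | Simplified copy of the walker's step type. Constructor names are taken
-- from the diff; the String payload is an assumption (the diff uses Text).
data WalkStep
  = WalkContinue
  | WalkSkipSome [String]
  | WalkSkipAll
  | WalkStop
  deriving (Eq, Show)

-- | Hypothetical merge: fold the filter-rejected directory names into
-- whatever step the strategy returned.
mergeSkips :: [String] -> WalkStep -> WalkStep
mergeSkips [] step = step
mergeSkips rejected WalkContinue = WalkSkipSome rejected
mergeSkips rejected (WalkSkipSome dirs) = WalkSkipSome (dirs `union` rejected)
mergeSkips _ step = step -- WalkSkipAll and WalkStop already skip every subdirectory

main :: IO ()
main = do
  print (mergeSkips ["b"] (WalkSkipSome ["a"]))
  print (mergeSkips ["b"] WalkContinue)
  print (mergeSkips ["b"] WalkSkipAll)
```

Using `union` keeps the strategy's own skips first and avoids listing a directory twice when both the strategy and the filter reject it.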

src/Strategy/ApkDatabase.hs

Lines changed: 3 additions & 0 deletions
@@ -18,6 +18,7 @@ import Discovery.Walk (
   findFileNamed,
   walkWithFilters',
  )
+import Effect.Logger (Logger)
 import Effect.ReadFS (Has, ReadFS)
 import GHC.Generics (Generic)
 import Path (Abs, Dir, File, Path)
@@ -44,6 +45,7 @@ instance AnalyzeProject AlpineDatabase where
 discover ::
   ( Has ReadFS sig m
   , Has Diagnostics sig m
+  , Has Logger sig m
   , Has (Reader AllFilters) sig m
   ) =>
   OsInfo ->
@@ -54,6 +56,7 @@ discover osInfo = simpleDiscover (findProjects osInfo) mkProject AlpineDatabaseP
 findProjects ::
   ( Has ReadFS sig m
   , Has Diagnostics sig m
+  , Has Logger sig m
   , Has (Reader AllFilters) sig m
   ) =>
   OsInfo ->

src/Strategy/BerkeleyDB.hs

Lines changed: 2 additions & 0 deletions
@@ -53,6 +53,7 @@ instance AnalyzeProject BerkeleyDatabase where
 discover ::
   ( Has ReadFS sig m
   , Has Diagnostics sig m
+  , Has Logger sig m
   , Has (Reader AllFilters) sig m
   ) =>
   OsInfo ->
@@ -63,6 +64,7 @@ discover osInfo = simpleDiscover (findProjects osInfo) mkProject BerkeleyDBProje
 findProjects ::
   ( Has ReadFS sig m
   , Has Diagnostics sig m
+  , Has Logger sig m
   , Has (Reader AllFilters) sig m
   ) =>
   OsInfo ->

src/Strategy/Bundler.hs

Lines changed: 3 additions & 2 deletions
@@ -37,6 +37,7 @@ import Discovery.Walk (
   walkWithFilters',
  )
 import Effect.Exec (Exec, GetDepsEffs, Has)
+import Effect.Logger (Logger)
 import Effect.ReadFS (ReadFS, readContentsParser)
 import GHC.Generics (Generic)
 import Path (Abs, Dir, File, Path, toFilePath)
@@ -58,10 +59,10 @@ import Types (
   LicenseType (UnknownType),
  )

-discover :: (Has ReadFS sig m, Has Diagnostics sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [DiscoveredProject BundlerProject]
+discover :: (Has ReadFS sig m, Has Diagnostics sig m, Has Logger sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [DiscoveredProject BundlerProject]
 discover = simpleDiscover findProjects mkProject BundlerProjectType

-findProjects :: (Has ReadFS sig m, Has Diagnostics sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [BundlerProject]
+findProjects :: (Has ReadFS sig m, Has Diagnostics sig m, Has Logger sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [BundlerProject]
 findProjects = walkWithFilters' $ \dir _ files -> do
   let maybeGemfile = findFileNamed "Gemfile" files
       gemfileLock = findFileNamed "Gemfile.lock" files

src/Strategy/Cargo.hs

Lines changed: 3 additions & 2 deletions
@@ -80,6 +80,7 @@ import Effect.Grapher (
   label,
   withLabeling,
  )
+import Effect.Logger (Logger)
 import Effect.ReadFS (ReadFS, doesFileExist, readContentsToml)
 import Errata (Errata (..))
 import GHC.Generics (Generic)
@@ -220,10 +221,10 @@ instance FromJSON CargoMetadata where
       <*> (obj .: "workspace_members" >>= traverse parsePkgId)
       <*> obj .: "resolve"

-discover :: (Has ReadFS sig m, Has Diagnostics sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [DiscoveredProject CargoProject]
+discover :: (Has ReadFS sig m, Has Diagnostics sig m, Has Logger sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [DiscoveredProject CargoProject]
 discover = simpleDiscover findProjects mkProject CargoProjectType

-findProjects :: (Has ReadFS sig m, Has Diagnostics sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [CargoProject]
+findProjects :: (Has ReadFS sig m, Has Diagnostics sig m, Has Logger sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [CargoProject]
 findProjects = walkWithFilters' $ \dir _ files -> do
   case findFileNamed "Cargo.toml" files of
     Nothing -> pure ([], WalkContinue)

src/Strategy/Carthage.hs

Lines changed: 3 additions & 2 deletions
@@ -46,6 +46,7 @@ import Discovery.Walk (
   walkWithFilters',
  )
 import Effect.Grapher (Grapher, direct, edge, evalGrapher)
+import Effect.Logger (Logger)
 import Effect.ReadFS (ReadFS, readContentsParser)
 import Errata (Errata (..))
 import GHC.Generics (Generic)
@@ -81,10 +82,10 @@ import Types (
   GraphBreadth (Complete),
  )

-discover :: (Has ReadFS sig m, Has Diagnostics sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [DiscoveredProject CarthageProject]
+discover :: (Has ReadFS sig m, Has Diagnostics sig m, Has Logger sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [DiscoveredProject CarthageProject]
 discover = simpleDiscover findProjects mkProject CarthageProjectType

-findProjects :: (Has ReadFS sig m, Has Diagnostics sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [CarthageProject]
+findProjects :: (Has ReadFS sig m, Has Diagnostics sig m, Has Logger sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [CarthageProject]
 findProjects = walkWithFilters' $ \dir _ files -> do
   case findFileNamed "Cartfile.resolved" files of
     Nothing -> pure ([], WalkContinue)

src/Strategy/Cocoapods.hs

Lines changed: 2 additions & 2 deletions
@@ -48,10 +48,10 @@ import Types (
   LicenseType (UnknownType),
  )

-discover :: (Has ReadFS sig m, Has Diagnostics sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [DiscoveredProject CocoapodsProject]
+discover :: (Has ReadFS sig m, Has Diagnostics sig m, Has Logger sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [DiscoveredProject CocoapodsProject]
 discover = simpleDiscover findProjects mkProject CocoapodsProjectType

-findProjects :: (Has ReadFS sig m, Has Diagnostics sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [CocoapodsProject]
+findProjects :: (Has ReadFS sig m, Has Diagnostics sig m, Has Logger sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [CocoapodsProject]
 findProjects = walkWithFilters' $ \dir _ files -> do
   let podfile = findFileNamed "Podfile" files
       podfileLock = findFileNamed "Podfile.lock" files

src/Strategy/Composer.hs

Lines changed: 3 additions & 2 deletions
@@ -52,6 +52,7 @@ import Effect.Grapher (
   label,
   withLabeling,
  )
+import Effect.Logger (Logger)
 import Effect.ReadFS (ReadFS, readContentsJson)
 import GHC.Generics (Generic)
 import Graphing (Graphing)
@@ -66,10 +67,10 @@ import Types (
   LicenseType (LicenseSPDX),
  )

-discover :: (Has ReadFS sig m, Has Diagnostics sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [DiscoveredProject ComposerProject]
+discover :: (Has ReadFS sig m, Has Diagnostics sig m, Has Logger sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [DiscoveredProject ComposerProject]
 discover = simpleDiscover findProjects mkProject ComposerProjectType

-findProjects :: (Has ReadFS sig m, Has Diagnostics sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [ComposerProject]
+findProjects :: (Has ReadFS sig m, Has Diagnostics sig m, Has Logger sig m, Has (Reader AllFilters) sig m) => Path Abs Dir -> m [ComposerProject]
 findProjects = walkWithFilters' $ \dir _ files -> do
   case findFileNamed "composer.lock" files of
     Nothing -> pure ([], WalkContinue)
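
The strategy hunks above all make the same mechanical change: once the walker logs, every caller that stays polymorphic in `m` has to carry the logging constraint too. A minimal sketch of that ripple follows, using a plain type class instead of fused-effects; `MonadLog`, `logDebugMsg`, and `walkDirs` are hypothetical stand-ins for `Has Logger sig m`, `logDebug`, and `walkWithFilters'`, and the tuple writer monad from base serves as a pure interpreter.

```haskell
{-# LANGUAGE FlexibleInstances #-}

module Main where

-- | Hypothetical stand-in for the @Has Logger sig m@ constraint.
class Monad m => MonadLog m where
  logDebugMsg :: String -> m ()

-- | Pure interpreter: collect log lines in the tuple (writer) monad from base.
instance MonadLog ((,) [String]) where
  logDebugMsg msg = ([msg], ())

-- | The walker logs each rejected directory, so it needs the constraint.
walkDirs :: MonadLog m => (FilePath -> Bool) -> [FilePath] -> m [FilePath]
walkDirs allowed dirs = do
  mapM_
    (\d -> logDebugMsg ("Skipping " ++ show d ++ " (excluded by paths filter)"))
    (filter (not . allowed) dirs)
  pure (filter allowed dirs)

-- | A caller that stays polymorphic in @m@ must repeat the constraint,
-- even though it never logs anything itself.
findProjects :: MonadLog m => [FilePath] -> m [FilePath]
findProjects = walkDirs (/= "zip/")

main :: IO ()
main = do
  let (logs, kept) = findProjects ["src/", "zip/"] :: ([String], [FilePath])
  mapM_ putStrLn logs
  print kept
```

Dropping `MonadLog m` from the signature of `findProjects` makes the sketch fail to type-check, which is the same reason each `discover` and `findProjects` signature in the diff gains `Has Logger sig m`.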
