fix(sieve): depth-limited file discovery via opt-in max_depth (closes #221) #259
Merged
mlieberman85 merged 1 commit on May 14, 2026
…usari-oss#221)

The file_exists handler used to check only the repo root for non-glob patterns. Projects with nested manifests (microservices under app-code/<service>/, frontend/ + backend/ splits, monorepos with distinct components) failed dependency-detection controls like OSPS-BR-05.01 even when standard manifests were clearly present, just one or two directories deep.

Fix introduces an opt-in `max_depth: int = 0` config field on the file_exists handler. When > 0, the handler walks up to that many levels deep, checking each non-glob pattern. Glob patterns are unchanged. Default 0 preserves backward compatibility for every other control that does not opt in.

The walk prunes well-known noise directories (.git, node_modules, __pycache__, .venv, target, build, dist, .tox, .mypy_cache, etc.) so performance stays bounded on monorepos. A misconfigured deep walk through a 50k-file node_modules tree was the biggest "make this the default" risk; explicit opt-in + pruning sidesteps that. Issue kusari-oss#226 (make depth-limited the default) remains a separate discussion.

TOML side: OSPS-BR-05.01 (StandardizedDependencyTools) opts in with max_depth = 2, finding manifests in conventional nested layouts without descending into unrelated subtrees.

Adds 7 regression tests in TestFileExistsHandlerDepthLimited covering:
- default-still-root-only (bug regression)
- depth-2-finds-nested
- root-preferred-over-nested
- noise-directory-pruning
- depth-respects-limit
- explicit-zero-equivalent-to-default
- globs-still-unaffected

Verification:
- ruff check: clean
- full suite: 2115 passed (7 new) / 6 skipped / 0 failed
- validate_sync.py: PASS

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Closes #221. Projects with nested manifests (microservices under `app-code/<service>/`, `frontend/` + `backend/` splits, monorepos with distinct components) used to fail dependency-detection controls like OSPS-BR-05.01 even when standard manifests were clearly present one or two directories deep. The handler only checked the repo root.
What changed
Handler (`packages/darnit/src/darnit/sieve/builtin_handlers.py`)
New opt-in config field on `file_exists` handler:
```toml
[[controls."OSPS-BR-05.01".passes]]
handler = "file_exists"
max_depth = 2 # NEW — walks up to N levels deep for non-glob patterns
files = ["go.mod", "pyproject.toml", "package.json", ...]
```
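For readers who want a feel for what a depth-limited, pruned walk looks like, here is a minimal Python sketch. It is not the code in `builtin_handlers.py`: the function name `find_within_depth`, its signature, and the level-by-level search order are assumptions made for illustration, and `NOISE_DIRS` only contains the directory names listed in the commit message (the real list may be longer).

```python
from pathlib import Path

# Assumed prune list: only the names mentioned in the commit message.
NOISE_DIRS = frozenset({
    ".git", "node_modules", "__pycache__", ".venv",
    "target", "build", "dist", ".tox", ".mypy_cache",
})


def find_within_depth(root, names, max_depth=0):
    """Hypothetical helper: return the first file matching one of `names`.

    Searches `root` first, then one directory level at a time, down to
    `max_depth` levels below `root`. With max_depth=0 only `root` is checked.
    """
    max_depth = max(max_depth, 0)  # treat negative values as root-only
    frontier = [Path(root)]
    for depth in range(max_depth + 1):
        next_frontier = []
        for directory in frontier:
            # Check every literal (non-glob) name in this directory.
            for name in names:
                candidate = directory / name
                if candidate.is_file():
                    return candidate
            # Queue subdirectories only while the depth limit allows it.
            if depth < max_depth:
                try:
                    children = sorted(
                        p for p in directory.iterdir()
                        if p.is_dir()
                        and not p.is_symlink()
                        and p.name not in NOISE_DIRS
                    )
                except OSError:
                    children = []  # unreadable directory: skip it
                next_frontier.extend(children)
        frontier = next_frontier
    return None
```

Searching one level at a time means a manifest at the repo root always wins over a nested one, and the pruned directories are never listed at all, which is what keeps the cost bounded on large trees.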
TOML (`packages/darnit-baseline/openssf-baseline.toml`)
OSPS-BR-05.01 (`StandardizedDependencyTools`) opts in with `max_depth = 2`. No other control's behavior changes.
Issue #226 still open
#226 ("Evaluate making depth-limited file discovery the default in sieve handlers") remains a separate discussion. This PR is deliberately the smallest change that fixes the user-reported bug: opt-in only, conservative default. Flipping the global default would change behavior for every control in every implementation and deserves its own evaluation.
Test plan
7 new regression tests in `TestFileExistsHandlerDepthLimited`:
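- default-still-root-only (bug regression for #221)
- depth-2-finds-nested
- root-preferred-over-nested
- noise-directory-pruning
- depth-respects-limit
- explicit-zero-equivalent-to-default
- globs-still-unaffected

Verification:
- ruff check: clean
- full suite: 2115 passed (7 new) / 6 skipped / 0 failed
- validate_sync.py: PASS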
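Several of the scenarios in the test plan can be sketched as plain assertions against a throwaway directory tree. This is an illustration only, not the contents of `TestFileExistsHandlerDepthLimited`: `exists_within_depth` is a hypothetical stand-in for the handler's lookup, `make_repo` and the `check_*` functions are invented names, and the prune list is abbreviated.

```python
import os
import tempfile
from pathlib import Path

# Abbreviated, assumed prune list; the PR names more directories than this.
NOISE_DIRS = {".git", "node_modules", "__pycache__", ".venv"}


def exists_within_depth(root, name, max_depth=0):
    """Hypothetical stand-in for the handler's lookup, not the darnit API."""
    root = os.path.abspath(root)
    for current, dirs, files in os.walk(root):
        # Depth of `current` below `root` (root itself is depth 0).
        depth = 0 if current == root else os.path.relpath(current, root).count(os.sep) + 1
        if name in files:
            return True
        # Stop descending at the limit; otherwise skip noise directories.
        dirs[:] = [] if depth >= max_depth else [d for d in dirs if d not in NOISE_DIRS]
    return False


def make_repo(base):
    """Build a tiny repo: one nested manifest, one buried in node_modules."""
    root = Path(base)
    (root / "app-code" / "svc").mkdir(parents=True)
    (root / "app-code" / "svc" / "go.mod").write_text("module example\n")
    (root / "node_modules" / "dep").mkdir(parents=True)
    (root / "node_modules" / "dep" / "package.json").write_text("{}\n")
    return root


def check_default_still_root_only(root):
    assert not exists_within_depth(root, "go.mod")


def check_explicit_zero_equivalent_to_default(root):
    assert exists_within_depth(root, "go.mod", 0) == exists_within_depth(root, "go.mod")


def check_depth_2_finds_nested(root):
    assert exists_within_depth(root, "go.mod", max_depth=2)


def check_depth_respects_limit(root):
    assert not exists_within_depth(root, "go.mod", max_depth=1)


def check_noise_directory_pruning(root):
    assert not exists_within_depth(root, "package.json", max_depth=2)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = make_repo(tmp)
        for check in (
            check_default_still_root_only,
            check_explicit_zero_equivalent_to_default,
            check_depth_2_finds_nested,
            check_depth_respects_limit,
            check_noise_directory_pruning,
        ):
            check(repo)
```

The nested `go.mod` sits exactly two levels down, so the same fixture exercises both the depth-1 miss and the depth-2 hit.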
🤖 Generated with Claude Code