Skip to content

Workspace Discovery#50

Merged
mors119 merged 1 commit into
FrilLab:mainfrom
mors119:feature/Generalize-project-discovery
Jun 2, 2026
Merged

Workspace Discovery#50
mors119 merged 1 commit into
FrilLab:mainfrom
mors119:feature/Generalize-project-discovery

Conversation

@mors119
Copy link
Copy Markdown
Collaborator

@mors119 mors119 commented Jun 2, 2026

What

Discover Rust artifacts across the entire system.
Add path parameter to scan command
Find Cargo.toml recursively
Add custom scan path

Closes #41
Closes #48
Closes #42

Checklist

Required

  • cargo check passes
  • cargo fmt --check passes
  • cargo clippy --workspace --all-targets -- -D warnings passes
  • cargo test passes

Functional Validation

  • I verified the behavior locally
  • I added or updated tests when necessary

Documentation

  • README or docs were updated if needed
  • New configuration or behavior is documented

Safety

  • Cleanup behavior was reviewed for safety
  • No destructive behavior was introduced without confirmation

Summary by CodeRabbit

  • New Features
    • Scan and Analyze commands now accept optional path arguments to target specific directories
    • Added global scanning mode to choose between system-wide and workspace-scoped operations
    • Enhanced support for detecting and scanning multiple projects within a workspace
    • Clean command now supports the same path selection options

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 2, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

The PR enhances the dustfril CLI and detector to accept custom scan paths and recursively discover Cargo projects within workspaces. Path arguments (optional path, global flag) are added to Scan and Analyze commands and to CleanArgs; path utilities validate and resolve paths to concrete locations; a new Project-based discovery system replaces cargo_project helpers; scan_workspace aggregates artifacts from multiple discovered projects; command handlers wire path arguments through resolution logic and routing based on the global flag.

Changes

Path Arguments and Workspace Scanning

Layer / File(s) Summary
CLI argument contract and PathArgs struct
apps/dustfril-cli/src/cli.rs
Commands::Scan and Commands::Analyze enum variants now carry PathArgs payloads; PathArgs struct adds optional path: Option<PathBuf> and global: bool flag; CleanArgs extends with flattened path_args: PathArgs field.
Path resolution and validation helpers
apps/dustfril-cli/src/shared/mod.rs, apps/dustfril-cli/src/shared/path.rs
New shared module exports validate_path (checks Path existence, prints stderr on error) and resolve_path (converts Option<PathBuf> to concrete PathBuf, defaulting to std::env::current_dir()).
Project struct and recursive discovery
crates/dustfril-core/src/detector/project.rs
Introduces Project struct holding root PathBuf; find_projects recursively traverses directories collecting Cargo project roots; is_cargo_project checks for Cargo.toml presence; private visit function implements depth-first traversal.
Workspace aggregation and scan refactoring
crates/dustfril-core/src/detector/mod.rs, crates/dustfril-core/src/detector/rust/mod.rs, crates/dustfril-core/src/detector/scan.rs
Module declarations reorganized: cargo_project replaced by project and rust modules; scan_workspace added to public API; scan_workspace function recursively finds projects and aggregates artifacts; scan(root) refactored to compose scan_workspace and scan_global results.
Command handlers wiring path arguments
apps/dustfril-cli/src/main.rs, apps/dustfril-cli/src/commands/scan.rs, apps/dustfril-cli/src/commands/analyze.rs, apps/dustfril-cli/src/commands/clean.rs
Main dispatch passes command arguments into handlers; Scan and Analyze execute functions accept PathArgs, resolve/validate path, and conditionally call scan_global() or scan_workspace(); Clean handlers accept &CleanArgs and apply same path routing.
Test coverage for discovery and workspace scanning
crates/dustfril-core/src/detector/tests.rs
New test for find_projects discovers two projects recursively; new test for scan_workspace aggregates target artifacts from multiple detected projects.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • FrilLab/dustfril#45: Both modify apps/dustfril-cli/src/commands/clean.rs and extend CleanArgs to accept structured arguments instead of zero-arg entry points.
  • FrilLab/dustfril#39: Both update apps/dustfril-cli/src/commands/analyze.rs execute signature and the initial scan invocation flow within the analyze command.
  • FrilLab/dustfril#34: Main PR extends PR #34's artifact discovery by adding scan_workspace to crates/dustfril-core/src/detector/scan.rs and wiring recursive project aggregation.

Poem

🐰 A path through projects, now more clear,
Recursion finds what workspaces hold dear,
From global scans to focused ground,
All artifacts shall now be found! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 23.81% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Workspace Discovery' is concise and relates to the main objective of enabling recursive project discovery across directory structures.
Linked Issues check ✅ Passed All three linked issues are addressed: #41 adds path arguments to Scan/Analyze/Clean commands via PathArgs; #48 implements recursive Cargo.toml discovery via project::find_projects; #42 allows custom scan paths through CLI parameter handling and path resolution.
Out of Scope Changes check ✅ Passed All changes are directly scoped to workspace discovery objectives: CLI argument handling, path resolution utilities, recursive project detection, and scan reorganization. No unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (3)
apps/dustfril-cli/src/shared/path.rs (1)

13-16: 💤 Low value

Consider graceful handling instead of panicking in resolve_path.

expect("Failed to get current directory") aborts the CLI with a raw panic/backtrace if current_dir() fails (e.g., the cwd was deleted or lacks permissions). A user-facing error and clean exit would be friendlier, though this path is rare.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/dustfril-cli/src/shared/path.rs` around lines 13 - 16, The function
resolve_path currently panics via expect when std::env::current_dir() fails;
change resolve_path to return Result<PathBuf, std::io::Error> (or your crate
error type) and propagate the error instead of calling expect so callers can
handle it gracefully. In the new implementation, if path is Some return
Ok(path.clone()), otherwise return the Result from std::env::current_dir() so
failures are returned to the caller for user-friendly reporting; update any
callers of resolve_path to handle the Result accordingly.
crates/dustfril-core/src/detector/project.rs (1)

27-27: 💤 Low value

Misplaced doc comment.

This /// doc comment describes module-level "detection and artifact scanning" but is attached to the is_cargo_project predicate, which only checks for Cargo.toml. Reword to describe the function or drop it.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/dustfril-core/src/detector/project.rs` at line 27, The doc comment
"Cargo project detection and artifact scanning." is misplaced on the
is_cargo_project function; update it to accurately describe that
is_cargo_project only checks for the presence of Cargo.toml (or remove it) and
move the original module-level description to a module-level comment (e.g., at
the top of the file) or create a new module-level doc comment. Specifically edit
the is_cargo_project function's docstring to state it checks for Cargo.toml, and
put the broader "detection and artifact scanning" description in a module-level
/// comment above the module or file header.
apps/dustfril-cli/src/commands/analyze.rs (1)

58-69: ⚡ Quick win

Extract the duplicated resolve/validate/route block.

The same resolve_pathvalidate_pathscan_global/scan_workspace sequence is repeated in scan::execute, analyze::execute, and clean::build_cleanup_plan. Consider a shared helper (e.g. in shared::path) that returns the chosen ScanResult (or Option) from PathArgs. A helper can also short-circuit validate_path when args.global is set, since the resolved path is unused in global mode and an invalid path would otherwise abort a valid global scan.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/dustfril-cli/src/commands/analyze.rs` around lines 58 - 69, Extract the
duplicated resolve_path → validate_path → scan_global/scan_workspace logic into
a new helper (e.g., shared::path::choose_scan_result) that accepts PathArgs and
returns a ScanResult or Option<ScanResult>; replace the repeated blocks in
analyze::execute, scan::execute and clean::build_cleanup_plan to call this
helper. The helper should: 1) if args.global is true, call detector::scan_global
and return that without resolving/validating the path, 2) otherwise resolve the
path using resolve_path(&args.path), call validate_path(&path) and if valid call
detector::scan_workspace(&path) (returning None or an Err variant when
validation fails). Update callers to handle the helper’s return type accordingly
and remove the duplicated code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/dustfril-cli/src/commands/analyze.rs`:
- Around line 65-69: The flag semantics are inverted: when args.global is true
the code currently calls detector::scan_global() which only finds registry/git
artifacts under $HOME/.cargo and omits per-project artifacts; to fix, either (A)
make the global flag include both project and global artifacts by calling
detector::scan(&path) when args.global is true (so use detector::scan(&path)
instead of detector::scan_global()), or (B) if global should remain
registry-only, update the PathArgs.global docstring to explicitly state “global
registry/git artifacts only” and keep calling detector::scan_global(); update
the call site that now chooses between detector::scan_global() and
detector::scan_workspace(&path) accordingly to reflect the chosen behavior.

In `@apps/dustfril-cli/src/commands/clean.rs`:
- Around line 25-32: build_cleanup_plan currently resolves the path but never
calls validate_path, causing invalid/non-existent paths to silently produce an
empty CleanupPlan; either (A) add a validate_path(&path) guard before scanning
inside build_cleanup_plan and change its signature to return Option<CleanupPlan>
or Result<CleanupPlan, Error> so you can early-return on invalid paths, or (B)
keep build_cleanup_plan as-is and call validate_path(&path) in the callers
(dry_run and execute) before invoking build_cleanup_plan so the same
validation/error behavior is enforced as in scan::execute and analyze::execute;
update references to CleanArgs, resolve_path, and CleanupPlan accordingly to
propagate the validation outcome.

In `@apps/dustfril-cli/src/shared/path.rs`:
- Around line 3-11: The validate_path function currently accepts any existing
filesystem entry; change it to also require a directory by checking
path.is_dir() inside validate_path and returning false (with a clear eprintln!
message like "Path is not a directory: {}") when it is not a directory so
detector::scan_workspace and downstream project::find_projects(root) never
receive a non-directory workspace root.

In `@crates/dustfril-core/src/detector/project.rs`:
- Around line 32-40: The current visit function returns immediately when
is_cargo_project(dir) is true, which prevents discovering nested Cargo projects;
change the logic in visit so it still pushes Project { root: dir... } but does
not return unconditionally—only stop descent if the directory is a Cargo
workspace root (implement or use a helper like is_cargo_workspace(dir) that
checks Cargo.toml for a [workspace] table or workspace members). In practice:
update visit to call is_cargo_project(dir) to add the Project, then call
is_cargo_workspace(dir) and only return early when that helper is true;
otherwise continue recursing into subdirectories to detect nested projects.
Ensure references to Project, visit, is_cargo_project, and the new
is_cargo_workspace helper are updated accordingly.
- Around line 46-52: The recursion in visit currently uses path.is_dir() which
follows symlinks and can enter cycles; change the check to use the DirEntry's
file_type to detect directories without following symlinks (e.g., call
entry.file_type() and test file_type.is_dir()) and skip entries where
file_type.is_symlink() or where file_type indicates not a dir before recursing
into visit(&path, projects); update the loop that iterates entries.flatten() to
obtain and inspect entry.file_type() and only recurse for true directory file
types.

---

Nitpick comments:
In `@apps/dustfril-cli/src/commands/analyze.rs`:
- Around line 58-69: Extract the duplicated resolve_path → validate_path →
scan_global/scan_workspace logic into a new helper (e.g.,
shared::path::choose_scan_result) that accepts PathArgs and returns a ScanResult
or Option<ScanResult>; replace the repeated blocks in analyze::execute,
scan::execute and clean::build_cleanup_plan to call this helper. The helper
should: 1) if args.global is true, call detector::scan_global and return that
without resolving/validating the path, 2) otherwise resolve the path using
resolve_path(&args.path), call validate_path(&path) and if valid call
detector::scan_workspace(&path) (returning None or an Err variant when
validation fails). Update callers to handle the helper’s return type accordingly
and remove the duplicated code.

In `@apps/dustfril-cli/src/shared/path.rs`:
- Around line 13-16: The function resolve_path currently panics via expect when
std::env::current_dir() fails; change resolve_path to return Result<PathBuf,
std::io::Error> (or your crate error type) and propagate the error instead of
calling expect so callers can handle it gracefully. In the new implementation,
if path is Some return Ok(path.clone()), otherwise return the Result from
std::env::current_dir() so failures are returned to the caller for user-friendly
reporting; update any callers of resolve_path to handle the Result accordingly.

In `@crates/dustfril-core/src/detector/project.rs`:
- Line 27: The doc comment "Cargo project detection and artifact scanning." is
misplaced on the is_cargo_project function; update it to accurately describe
that is_cargo_project only checks for the presence of Cargo.toml (or remove it)
and move the original module-level description to a module-level comment (e.g.,
at the top of the file) or create a new module-level doc comment. Specifically
edit the is_cargo_project function's docstring to state it checks for
Cargo.toml, and put the broader "detection and artifact scanning" description in
a module-level /// comment above the module or file header.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f131ac20-1b84-4839-ac58-319ee6492d8f

📥 Commits

Reviewing files that changed from the base of the PR and between c8d556b and 467456c.

📒 Files selected for processing (23)
  • apps/dustfril-cli/src/cli.rs
  • apps/dustfril-cli/src/commands/analyze.rs
  • apps/dustfril-cli/src/commands/clean.rs
  • apps/dustfril-cli/src/commands/scan.rs
  • apps/dustfril-cli/src/main.rs
  • apps/dustfril-cli/src/shared/mod.rs
  • apps/dustfril-cli/src/shared/path.rs
  • crates/dustfril-core/src/detector/cargo_project.rs
  • crates/dustfril-core/src/detector/java/build.rs
  • crates/dustfril-core/src/detector/java/gradle.rs
  • crates/dustfril-core/src/detector/java/maven.rs
  • crates/dustfril-core/src/detector/java/mod.rs
  • crates/dustfril-core/src/detector/mod.rs
  • crates/dustfril-core/src/detector/node/mod.rs
  • crates/dustfril-core/src/detector/node/node_modules.rs
  • crates/dustfril-core/src/detector/node/package_json.rs
  • crates/dustfril-core/src/detector/project.rs
  • crates/dustfril-core/src/detector/rust/git.rs
  • crates/dustfril-core/src/detector/rust/mod.rs
  • crates/dustfril-core/src/detector/rust/registry.rs
  • crates/dustfril-core/src/detector/rust/target.rs
  • crates/dustfril-core/src/detector/scan.rs
  • crates/dustfril-core/src/detector/tests.rs
💤 Files with no reviewable changes (1)
  • crates/dustfril-core/src/detector/cargo_project.rs

Comment on lines +65 to +69
let scan_result = if args.global {
detector::scan_global()
} else {
detector::scan_workspace(&path)
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect scan_global / scan_workspace / scan to confirm what each mode returns.
fd -t f 'scan.rs' --exec cat {}

Repository: FrilLab/dustfril

Length of output: 2097


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Locate PathArgs and the doc comment for `global`
fd -t f '.*\.rs$' apps/dustfril-cli/src --exec rg -n "struct PathArgs|global:|global\\b" {}

# Narrow further to the file likely containing PathArgs
fd -t f 'PathArgs' apps/dustfril-cli/src --exec sed -n '1,200p' {} 

# 2) Find all call sites of detector::scan and detector::scan_global / scan_workspace
rg -n "detector::scan\\b|detector::scan_global\\b|detector::scan_workspace\\b" -S apps/dustfril-cli/src

Repository: FrilLab/dustfril

Length of output: 346


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 3) Inspect dustfril_core detector module entry points (scan/scan_global/scan_workspace)
# Find where detector is re-exported from / defines scan()
fd -t f 'detector.*\.rs$|mod\.rs$' . --exec rg -n "pub fn scan\\b|pub fn scan_global\\b|pub fn scan_workspace\\b" {}

Repository: FrilLab/dustfril

Length of output: 42


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 4) Inspect the CLI docs/README/help text for the `--global` flag
rg -n "--global|global\\b" apps/dustfril-cli -S

Repository: FrilLab/dustfril

Length of output: 502


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find PathArgs definition + any doc comment for `global`
rg -n "pub struct PathArgs" -S apps/dustfril-cli/src
rg -n "pub\\s+global:\\s+bool" -S apps/dustfril-cli/src
rg -n "global\\s*:\\s*bool|--global" -S apps/dustfril-cli/src

# Show the PathArgs struct block with context
rg -n "pub struct PathArgs" -S apps/dustfril-cli/src -C 6

# Find all detector calls in the CLI
rg -n "detector::scan\\b|detector::scan_global\\b|detector::scan_workspace\\b" -S apps/dustfril-cli/src

# Also search other crates (in case old behavior lived outside CLI)
rg -n "detector::scan\\b|detector::scan_global\\b|detector::scan_workspace\\b" -S . | head -n 80

Repository: FrilLab/dustfril

Length of output: 1827


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate the CLI file that defines PathArgs and print it (or a bounded range)
# (Using fd to find likely file names, then cat with line bounds.)
path_args_file="$(fd -t f -g '*PathArgs*' apps/dustfril-cli/src | head -n 1 || true)"
echo "PathArgs file guess: ${path_args_file:-<none>}"

# If we didn't find it by name, derive from rg output by locating the file directly:
# (fallback) list matching rust files and then dump those small ones partially.
files="$(rg -l "pub struct PathArgs" -S apps/dustfril-cli/src || true)"
echo "$files"

for f in $files; do
  echo "---- $f (first 120 lines) ----"
  sed -n '1,120p' "$f"
done

Repository: FrilLab/dustfril

Length of output: 1010


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate detector module entry points in dustfril_core and print relevant parts
rg -n "pub fn scan\\b|pub fn scan_global\\b|pub fn scan_workspace\\b" -S . --glob '*detector*' --glob '*scan*.rs' --glob '*mod*.rs' | head -n 80

Repository: FrilLab/dustfril

Length of output: 330


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect CLI command routing
sed -n '1,200p' apps/dustfril-cli/src/commands/scan.rs
sed -n '1,220p' apps/dustfril-cli/src/commands/analyze.rs
sed -n '1,220p' apps/dustfril-cli/src/commands/clean.rs

# Find whether combined detector::scan(...) is used anywhere
rg -n "detector::scan\\(" -S . 

# Search docs/help for --global wording
rg -n "--global" -S . 
rg -n "Scan the entire system" -S .

Repository: FrilLab/dustfril

Length of output: 5937


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect dustfril_core detector scan implementation
sed -n '1,140p' crates/dustfril-core/src/detector/scan.rs

# Inspect how detector::scan is re-exported (if needed)
rg -n "mod scan|pub mod scan|pub use .*scan" crates/dustfril-core/src/detector -S

# Inspect registry + git detection logic (to see whether they discover projects too)
fd -t f 'registry*.rs|git*.rs' crates/dustfril-core/src/detector --exec sed -n '1,220p'

# Inspect path resolution/validation in CLI shared module
rg -n "fn resolve_path|fn validate_path" -S apps/dustfril-cli/src/shared/path.rs apps/dustfril-cli/src/shared -S
fd -t f 'path.rs' apps/dustfril-cli/src/shared --exec sed -n '1,220p'

Repository: FrilLab/dustfril

Length of output: 3395


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Also search for README/help text about --global semantics
rg -n "dustfril .*--global|--global|Scan the entire system|Scan the entire" -S .

Repository: FrilLab/dustfril

Length of output: 158


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Search old combined usage of detector::scan( ... ) in CLI (might still exist elsewhere)
rg -n "detector::scan\\(" -S apps/dustfril-cli/src crates/dustfril-core/src

Repository: FrilLab/dustfril

Length of output: 42


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Repo-wide: is detector::scan(...) called anywhere?
rg -n "detector::scan\\(" -S .

# Repo-wide: where is --global mentioned?
rg -n "(--global\\b|Scan the entire system)" -S .

Repository: FrilLab/dustfril

Length of output: 42


Fix --global vs workspace semantics to match flag documentation

  • --global currently dispatches to detector::scan_global() only (Cargo registry + git under $HOME/.cargo), while non---global dispatches to detector::scan_workspace(&path) only (per-project artifacts discovered under the provided path).
  • PathArgs.global says “Scan the entire system…”, but scan_global() doesn’t discover Cargo projects across the system; project artifacts only appear when --global is false.
  • If --global is meant to include both project + global artifacts, route to detector::scan(&path); otherwise update the docstring to reflect “global registry/git artifacts only”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/dustfril-cli/src/commands/analyze.rs` around lines 65 - 69, The flag
semantics are inverted: when args.global is true the code currently calls
detector::scan_global() which only finds registry/git artifacts under
$HOME/.cargo and omits per-project artifacts; to fix, either (A) make the global
flag include both project and global artifacts by calling detector::scan(&path)
when args.global is true (so use detector::scan(&path) instead of
detector::scan_global()), or (B) if global should remain registry-only, update
the PathArgs.global docstring to explicitly state “global registry/git artifacts
only” and keep calling detector::scan_global(); update the call site that now
chooses between detector::scan_global() and detector::scan_workspace(&path)
accordingly to reflect the chosen behavior.

Comment on lines +25 to +32
fn build_cleanup_plan(args: &CleanArgs) -> CleanupPlan {
let path = resolve_path(&args.path_args.path);

let scan_result = if args.path_args.global {
detector::scan_global()
} else {
detector::scan_workspace(&path)
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Inconsistent path validation in clean.

Unlike scan::execute and analyze::execute, build_cleanup_plan resolves the path but never calls validate_path. An invalid/non-existent path silently yields an empty plan ("No cleanup candidates found.") instead of the clearer validation error the other commands emit. Add the same validate_path guard for consistent UX.

🛠️ Proposed fix
-use crate::{cli::CleanArgs, shared::path::resolve_path};
+use crate::{
+    cli::CleanArgs,
+    shared::path::{resolve_path, validate_path},
+};

Note: build_cleanup_plan returns CleanupPlan, so propagating an early-return on invalid paths will require either changing its signature (e.g. to Option<CleanupPlan>) or validating in dry_run/execute before building the plan.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/dustfril-cli/src/commands/clean.rs` around lines 25 - 32,
build_cleanup_plan currently resolves the path but never calls validate_path,
causing invalid/non-existent paths to silently produce an empty CleanupPlan;
either (A) add a validate_path(&path) guard before scanning inside
build_cleanup_plan and change its signature to return Option<CleanupPlan> or
Result<CleanupPlan, Error> so you can early-return on invalid paths, or (B) keep
build_cleanup_plan as-is and call validate_path(&path) in the callers (dry_run
and execute) before invoking build_cleanup_plan so the same validation/error
behavior is enforced as in scan::execute and analyze::execute; update references
to CleanArgs, resolve_path, and CleanupPlan accordingly to propagate the
validation outcome.

Comment on lines +3 to +11
pub fn validate_path(path: &Path) -> bool {
if !path.exists() {
eprintln!("Path does not exist: {}", path.display());

return false;
}

true
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

❓ Verification inconclusive

🏁 Script executed:

#!/bin/bash
ast-grep --pattern 'pub fn scan_workspace($$$) { $$$ }'
rg -nP '\bscan_workspace\b' -C3

Repository: FrilLab/dustfril

Length of output: 3776


Reject non-directory paths in validate_path.

validate_path returns true for any existing filesystem entry (including regular files). The CLI then passes that path into detector::scan_workspace, which forwards it to project::find_projects(root), so failing fast on non-directories would prevent passing an invalid “workspace root” to the scanner.

🛡️ Proposed fix to also require a directory
 pub fn validate_path(path: &Path) -> bool {
     if !path.exists() {
         eprintln!("Path does not exist: {}", path.display());

         return false;
     }

+    if !path.is_dir() {
+        eprintln!("Path is not a directory: {}", path.display());
+
+        return false;
+    }
+
     true
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub fn validate_path(path: &Path) -> bool {
if !path.exists() {
eprintln!("Path does not exist: {}", path.display());
return false;
}
true
}
pub fn validate_path(path: &Path) -> bool {
if !path.exists() {
eprintln!("Path does not exist: {}", path.display());
return false;
}
if !path.is_dir() {
eprintln!("Path is not a directory: {}", path.display());
return false;
}
true
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/dustfril-cli/src/shared/path.rs` around lines 3 - 11, The validate_path
function currently accepts any existing filesystem entry; change it to also
require a directory by checking path.is_dir() inside validate_path and returning
false (with a clear eprintln! message like "Path is not a directory: {}") when
it is not a directory so detector::scan_workspace and downstream
project::find_projects(root) never receive a non-directory workspace root.

Comment on lines +32 to +40
fn visit(dir: &Path, projects: &mut Vec<Project>) {
if is_cargo_project(dir) {
projects.push(Project {
root: dir.to_path_buf(),
// ecosystem: Ecosystem::Rust,
});

return;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Confirm that stopping descent at the first Cargo.toml is intended.

visit returns as soon as a Cargo.toml is found, so nested projects below a detected root are never discovered. This is correct for a Cargo workspace (members share the root target/), but a directory that itself contains a Cargo.toml and also holds unrelated nested Cargo projects would have those nested projects (and their own target/ dirs) silently skipped. Please confirm this matches the intended discovery semantics for issue #48.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/dustfril-core/src/detector/project.rs` around lines 32 - 40, The
current visit function returns immediately when is_cargo_project(dir) is true,
which prevents discovering nested Cargo projects; change the logic in visit so
it still pushes Project { root: dir... } but does not return
unconditionally—only stop descent if the directory is a Cargo workspace root
(implement or use a helper like is_cargo_workspace(dir) that checks Cargo.toml
for a [workspace] table or workspace members). In practice: update visit to call
is_cargo_project(dir) to add the Project, then call is_cargo_workspace(dir) and
only return early when that helper is true; otherwise continue recursing into
subdirectories to detect nested projects. Ensure references to Project, visit,
is_cargo_project, and the new is_cargo_workspace helper are updated accordingly.

Comment on lines +46 to +52
for entry in entries.flatten() {
let path = entry.path();

if path.is_dir() {
visit(&path, projects);
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Does Rust std::fs::DirEntry::file_type FileType::is_dir return false for symlinks (i.e. it does not follow the link)?

💡 Result:

Yes, std::fs::DirEntry::file_type does not follow symlinks [1][2]. When called on a DirEntry that points to a symbolic link, it returns the file type of the link itself, rather than the file type of the link's target [1][2]. Because FileType::is_dir is mutually exclusive to FileType::is_symlink [3][4], it will return false if the entry is a symbolic link, even if that link points to a directory [3][4]. In summary: 1. DirEntry::file_type returns the file type of the symlink itself, not the target [1][2]. 2. FileType::is_dir checks if the file type is a directory [3]. Since a symlink is classified as a symlink (and not a directory) by the FileType methods, is_dir returns false for symlinks [3][4]. 3. To determine if a symlink points to a directory, you would typically need to check if the entry is a symlink, and if so, potentially resolve the path or use alternative methods to inspect the target [5].

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="crates/dustfril-core/src/detector/project.rs"
echo "---- $FILE ----"
nl -ba "$FILE" | sed -n '1,140p'

echo "---- search for is_dir/metadata usage in file ----"
rg -n "is_dir\(|file_type\(|symlink|read_link|follow" "$FILE" || true

echo "---- locate visit() definition and how recursion is handled ----"
rg -n "fn visit|visit\(" "$FILE"

Repository: FrilLab/dustfril

Length of output: 176


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="crates/dustfril-core/src/detector/project.rs"
echo "---- $FILE (first 200 lines with line numbers) ----"
cat -n "$FILE" | sed -n '1,220p'

echo "---- search for is_dir/file_type/symlink usage in file ----"
rg -n "is_dir\(|file_type\(|symlink|read_link|follow" "$FILE" || true

echo "---- locate visit() definition and its recursion ----"
rg -n "fn visit|visit\(" "$FILE" || true

Repository: FrilLab/dustfril

Length of output: 1784


🌐 Web query:

Rust std::path::Path::is_dir follow symlinks? uses metadata vs symlink_metadata

💡 Result:

Rust’s std::path::Path::is_dir does follow symlinks. In the Rust docs for Path::is_dir, it states that the function “will traverse symbolic links to query information about the destination file” and returns true when the path exists and is pointing at a directory [1]. By contrast, std::path::Path::symlink_metadata queries metadata “without following symlinks” [1]. If you want to test whether a symlink itself (not its target) is a directory, the usual approach is: - call symlink_metadata, then inspect whether the resulting metadata represents a symlink (via Metadata::is_symlink / file_type), rather than using is_dir which follows the link [1][2]. Summary: - Path::is_dir -> follows symlinks (checks target) [1] - Path::symlink_metadata -> does not follow symlinks (checks link itself) [1]

Citations:


Avoid symlink-recursion in system-wide project scanning
visit recurses via path.is_dir(), and Path::is_dir traverses symlinks; a symlinked directory cycle (e.g., pointing to an ancestor) can cause unbounded recursion/stack overflow.
Skip symlinks by checking the DirEntry’s own file type instead of following the link.

🛡️ Proposed fix to skip symlinks
     for entry in entries.flatten() {
-        let path = entry.path();
-
-        if path.is_dir() {
-            visit(&path, projects);
-        }
+        let Ok(file_type) = entry.file_type() else {
+            continue;
+        };
+
+        if file_type.is_dir() {
+            visit(&entry.path(), projects);
+        }
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for entry in entries.flatten() {
let path = entry.path();
if path.is_dir() {
visit(&path, projects);
}
}
for entry in entries.flatten() {
let Ok(file_type) = entry.file_type() else {
continue;
};
if file_type.is_dir() {
visit(&entry.path(), projects);
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/dustfril-core/src/detector/project.rs` around lines 46 - 52, The
recursion in visit currently uses path.is_dir() which follows symlinks and can
enter cycles; change the check to use the DirEntry's file_type to detect
directories without following symlinks (e.g., call entry.file_type() and test
file_type.is_dir()) and skip entries where file_type.is_symlink() or where
file_type indicates not a dir before recursing into visit(&path, projects);
update the loop that iterates entries.flatten() to obtain and inspect
entry.file_type() and only recurse for true directory file types.

@mors119 mors119 merged commit 78beb67 into FrilLab:main Jun 2, 2026
2 checks passed
@mors119 mors119 deleted the feature/Generalize-project-discovery branch June 2, 2026 23:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[TASK] Find Cargo.toml recursively [FEATURE] Add custom scan path [TASK] Add path argument to scan command

1 participant