
Add: Disk Usage Analysis #39

Merged: mors119 merged 2 commits into FrilLab:main from mors119:feature/early-stage-analyzer on Jun 1, 2026

Conversation

mors119 (Collaborator) commented Jun 1, 2026

Description

Analyzes artifact size and disk usage, and recommends files worth deleting based on their last-used date.

Expected Behavior

Running 'cargo run -p dustfril-cli -- analyze' prints the per-file analysis followed by the 'DustFril Analysis Summary'.

Additional Notes

Closes #16
Closes #17
Closes #18
Closes #19
Closes #36
Closes #37
Closes #38

Checklist

Required

  • cargo check passes
  • cargo fmt --check passes
  • cargo clippy --workspace --all-targets -- -D warnings passes
  • cargo test passes

Functional Validation

  • I verified the behavior locally
  • I added or updated tests when necessary

Documentation

  • README or docs were updated if needed
  • New configuration or behavior is documented

Safety

  • Cleanup behavior was reviewed for safety
  • No destructive behavior was introduced without confirmation

Summary by CodeRabbit

  • New Features

    • Analyze command now performs a scan→analyze→report flow with per-artifact details (type, path, size, modified time, age, recommendation), aggregated summary counts and sizes, and a clear message when no artifacts are found.
  • New Features (UX)

    • Human-friendly size and timestamp formatting and age-based cleanup recommendations (Keep / Review / Safe To Clean).
  • Documentation

    • Issue template updated with a structured verification checklist and artifact size/disk-usage guidance.
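
As an illustration of the human-friendly size formatting mentioned in the summary, here is a minimal sketch. It assumes binary units (1 KiB = 1024 bytes) and two decimal places; the PR's actual `format_size` may use different units, precision, or signature.

```rust
/// Hypothetical sketch of a human-friendly size formatter.
/// Assumption: binary units and two decimals; the real `format_size`
/// in dustfril-core may differ.
pub fn format_size(bytes: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KiB", "MiB", "GiB", "TiB"];

    let mut value = bytes as f64;
    let mut unit = 0;

    // Step up one unit at a time until the value drops below 1024
    // or the largest known unit is reached.
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }

    if unit == 0 {
        // Plain byte counts are printed without a fractional part.
        format!("{} B", bytes)
    } else {
        format!("{:.2} {}", value, UNITS[unit])
    }
}
```

Under these assumptions, `format_size(1536)` yields `1.50 KiB` and `format_size(1023)` yields `1023 B`.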


coderabbitai Bot commented Jun 1, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 2bd022e9-9754-46cb-b35c-dd1ed1d2b841

📥 Commits

Reviewing files that changed from the base of the PR and between f18b68e and d0d419b.

📒 Files selected for processing (7)
  • .github/ISSUE_TEMPLATE/FEATURE_REQUEST.md
  • apps/dustfril-cli/src/commands/analyze.rs
  • crates/dustfril-core/src/analyzer/age.rs
  • crates/dustfril-core/src/analyzer/analyze.rs
  • crates/dustfril-core/src/analyzer/metadata.rs
  • crates/dustfril-core/src/analyzer/mod.rs
  • crates/dustfril-core/src/analyzer/tests.rs
✅ Files skipped from review due to trivial changes (1)
  • .github/ISSUE_TEMPLATE/FEATURE_REQUEST.md
🚧 Files skipped from review as they are similar to previous changes (5)
  • crates/dustfril-core/src/analyzer/analyze.rs
  • crates/dustfril-core/src/analyzer/metadata.rs
  • crates/dustfril-core/src/analyzer/mod.rs
  • crates/dustfril-core/src/analyzer/age.rs
  • apps/dustfril-cli/src/commands/analyze.rs

📝 Walkthrough

Walkthrough

Implements artifact analysis: new models and recommendation enum; utilities for directory size, latest-modified, age calculation, formatting; analyze orchestration that produces sorted AnalysisResult; CLI command to scan/analyze/report; tests and a Chrono dependency; and an updated feature-request template.

Changes

Artifact Analysis Pipeline

Layer / File(s) / Summary

Analysis Data Models
  • File(s): crates/dustfril-core/src/models/cleanup_recommendation.rs, crates/dustfril-core/src/models/artifact_analysis.rs, crates/dustfril-core/src/models/analysis_result.rs, crates/dustfril-core/src/models/mod.rs
  • Summary: Introduces CleanupRecommendation enum (with Display), ArtifactAnalysis struct, AnalysisResult struct, and re-exports to expose them publicly.

Analyzer Utilities & Dependencies
  • File(s): crates/dustfril-core/Cargo.toml, crates/dustfril-core/src/analyzer/age.rs, crates/dustfril-core/src/analyzer/metadata.rs, crates/dustfril-core/src/analyzer/size.rs, crates/dustfril-core/src/analyzer/format.rs, crates/dustfril-core/src/analyzer/recommendation.rs
  • Summary: Adds chrono = "0.4" and implements helpers: calculate_age_days, find_latest_modified, calculate_directory_size, format_size, format_modified, and recommend_cleanup with unit tests for recommendation ranges.

Analysis Orchestration
  • File(s): crates/dustfril-core/src/analyzer/analyze.rs
  • Summary: Adds analyze(scan_result: ScanResult) -> AnalysisResult that computes per-artifact size/mtime/age/recommendation, accumulates total_size_bytes, and sorts artifacts by descending size.

Analyzer Module & Test Coverage
  • File(s): crates/dustfril-core/src/analyzer/mod.rs, crates/dustfril-core/src/analyzer/tests.rs
  • Summary: Structures the analyzer module with re-exports and comprehensive tests covering empty results, directory size computation, per-artifact analysis, metadata extraction, formatting precision, artifact sorting, and age calculation.

CLI Command Implementation
  • File(s): apps/dustfril-cli/src/commands/analyze.rs
  • Summary: Replaces placeholder with a scan→analyze→report flow: scans .; prints per-artifact details (type, path, size, modified, age, recommendation); prints aggregated counts and sizes via print_summary; early-returns when no artifacts.

Feature Request Template
  • File(s): .github/ISSUE_TEMPLATE/FEATURE_REQUEST.md
  • Summary: Adds description guidance for artifact size/disk usage and a structured “Checklist” requiring cargo check, cargo fmt --check, cargo clippy ... -D warnings, cargo test, plus functional, docs, and safety checkboxes.
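
The aggregation step described under Analysis Orchestration (accumulate total_size_bytes, then sort by descending size) can be sketched roughly as follows. The structs here are simplified stand-ins with only the fields needed for the illustration, and `summarize` is a hypothetical helper name, not a function from the PR.

```rust
use std::path::PathBuf;

/// Simplified stand-in for the PR's ArtifactAnalysis (assumption:
/// the real struct carries more fields, such as age and recommendation).
#[derive(Debug, Clone, PartialEq)]
pub struct ArtifactAnalysis {
    pub path: PathBuf,
    pub size_bytes: u64,
}

/// Simplified stand-in for the PR's AnalysisResult.
#[derive(Debug, Default, PartialEq)]
pub struct AnalysisResult {
    pub artifacts: Vec<ArtifactAnalysis>,
    pub total_size_bytes: u64,
}

/// Hypothetical helper: totals the sizes and orders artifacts largest first.
pub fn summarize(mut artifacts: Vec<ArtifactAnalysis>) -> AnalysisResult {
    // Sum all per-artifact sizes into a single total.
    let total_size_bytes: u64 = artifacts.iter().map(|a| a.size_bytes).sum();

    // Compare b against a so that larger artifacts come first.
    artifacts.sort_by(|a, b| b.size_bytes.cmp(&a.size_bytes));

    AnalysisResult {
        artifacts,
        total_size_bytes,
    }
}
```

Sorting largest first puts the artifacts that reclaim the most disk space at the top of the report.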

Sequence Diagrams

sequenceDiagram
    participant CLI as execute()
    participant Scanner as Detector
    participant Analyzer
    participant FS as FileSystem
    participant Output
    CLI->>Scanner: scan(current directory)
    Scanner-->>CLI: ScanResult
    alt artifacts found
        CLI->>Analyzer: analyze(ScanResult)
        Analyzer->>FS: calculate_directory_size(path)
        FS-->>Analyzer: size_bytes
        Analyzer->>FS: find_latest_modified(path)
        FS-->>Analyzer: last_modified
        Analyzer->>Analyzer: calculate_age_days(last_modified)
        Analyzer->>Analyzer: recommend_cleanup(age_days)
        Analyzer-->>CLI: AnalysisResult
        CLI->>Output: print per-artifact report
        CLI->>CLI: print_summary(artifacts)
    else empty result
        CLI->>Output: print "No Rust artifacts found."
    end
sequenceDiagram
    participant Analyzer as analyze()
    participant Size as calculate_directory_size
    participant Metadata as find_latest_modified
    participant Age as calculate_age_days
    participant Recommend as recommend_cleanup
    Analyzer->>Size: path
    Size-->>Analyzer: size_bytes
    Analyzer->>Metadata: path
    Metadata-->>Analyzer: last_modified
    Analyzer->>Age: last_modified
    Age-->>Analyzer: age_days
    Analyzer->>Recommend: age_days
    Recommend-->>Analyzer: recommendation
    Analyzer->>Analyzer: build ArtifactAnalysis
    Analyzer-->>Analyzer: sort artifacts by size
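
The age step in the diagrams (last_modified in, age_days out) could look something like the sketch below. It uses only the standard library, whereas the PR adds chrono, and it assumes the age is an optional whole number of days. `age_days_at` is a hypothetical helper that takes the current time as a parameter so the logic can be tested deterministically.

```rust
use std::time::SystemTime;

const SECONDS_PER_DAY: u64 = 24 * 60 * 60;

/// Hypothetical helper: whole days elapsed between `modified` and `now`.
/// Returns None when the timestamp is missing or lies after `now`.
pub fn age_days_at(now: SystemTime, modified: Option<SystemTime>) -> Option<u64> {
    // `duration_since` fails if `modified` is later than `now`.
    let elapsed = now.duration_since(modified?).ok()?;
    Some(elapsed.as_secs() / SECONDS_PER_DAY)
}

/// Assumed shape of calculate_age_days; the PR's real signature may differ.
pub fn calculate_age_days(modified: Option<SystemTime>) -> Option<u64> {
    age_days_at(SystemTime::now(), modified)
}
```

Partial days are truncated, so an artifact modified 41 days and 23 hours ago is reported as 41 days old.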

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly Related PRs

  • FrilLab/dustfril#34: Implements the scanner/detector that produces ScanResult, which this PR's analyze command consumes.

Poem

🐰 In burrows of bytes I nibble and peek,

I count all the crumbs, both tiny and meek.
Old builds I mark, some safe, some to keep,
I whisper suggestions while you sleep.
Hop on—clean or keep—your disk's now sleek.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 56.52%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title 'Add: Disk Usage Analysis' directly and concisely summarizes the main change: adding functionality to analyze disk usage across artifacts. It is clear, specific, and accurately represents the primary purpose of the changeset.
  • Linked Issues check: ✅ Passed. All code requirements from linked issues are met: directory size calculation [#16], total size aggregation [#17], artifact sorting by size [#18], human-readable formatting [#19], timestamp collection and formatting [#36], recursive latest-modification detection [#37], and cleanup recommendations [#38].
  • Out of Scope Changes check: ✅ Passed. All changes align with the stated objectives. The pull request includes artifact analysis infrastructure, CLI integration, and supporting utilities, all directly supporting the linked issue requirements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
crates/dustfril-core/src/analyzer/recommendation.rs (1)

17-38: ⚡ Quick win

Add tests for None and boundary values.

Current tests miss explicit checks for None, 30/31, and 90/91, which are the exact policy edges in this matcher.

Proposed test additions
 #[cfg(test)]
 mod tests {
     use super::*;
@@
     fn safe_to_clean_when_old() {
         assert_eq!(
             recommend_cleanup(Some(180)),
             CleanupRecommendation::SafeToClean
         );
     }
+
+    #[test]
+    fn review_when_age_unknown() {
+        assert_eq!(recommend_cleanup(None), CleanupRecommendation::Review);
+    }
+
+    #[test]
+    fn boundary_checks() {
+        assert_eq!(recommend_cleanup(Some(30)), CleanupRecommendation::Keep);
+        assert_eq!(recommend_cleanup(Some(31)), CleanupRecommendation::Review);
+        assert_eq!(recommend_cleanup(Some(90)), CleanupRecommendation::Review);
+        assert_eq!(recommend_cleanup(Some(91)), CleanupRecommendation::SafeToClean);
+    }
 }
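
For context, a matcher consistent with the boundary values asserted in the proposed tests might look like the sketch below. The thresholds (30 and 90 days) and the `None` case are taken from those assertions and the display labels from the PR summary; the `u64` age type and the exact body are assumptions, since the real `recommend_cleanup` is not shown in this thread.

```rust
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CleanupRecommendation {
    Keep,
    Review,
    SafeToClean,
}

impl fmt::Display for CleanupRecommendation {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Labels follow the "Keep / Review / Safe To Clean" wording in the summary.
        let label = match self {
            CleanupRecommendation::Keep => "Keep",
            CleanupRecommendation::Review => "Review",
            CleanupRecommendation::SafeToClean => "Safe To Clean",
        };
        f.write_str(label)
    }
}

/// Sketch of the age policy implied by the proposed tests.
/// Assumption: age is an Option<u64> number of days.
pub fn recommend_cleanup(age_days: Option<u64>) -> CleanupRecommendation {
    match age_days {
        // Up to and including 30 days: recent enough to keep.
        Some(0..=30) => CleanupRecommendation::Keep,
        // 31 through 90 days: worth a manual look.
        Some(31..=90) => CleanupRecommendation::Review,
        // 91 days and older: considered safe to clean.
        Some(_) => CleanupRecommendation::SafeToClean,
        // Unknown age: fall back to manual review rather than deletion.
        None => CleanupRecommendation::Review,
    }
}
```

Mapping an unknown age to Review is the conservative choice, because it never recommends deletion without evidence that the artifact is stale.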
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/dustfril-core/src/analyzer/recommendation.rs` around lines 17 - 38,
Add unit tests covering the missing edge cases for recommend_cleanup: assert
that recommend_cleanup(None) returns the correct CleanupRecommendation, and add
explicit boundary tests for 30 vs 31 days and 90 vs 91 days (e.g., test
functions named keep_when_none, review_at_30_and_31, safe_to_clean_at_90_and_91)
that assert recommend_cleanup(Some(30)), recommend_cleanup(Some(31)),
recommend_cleanup(Some(90)), and recommend_cleanup(Some(91)) return the expected
CleanupRecommendation variants; place them in the existing tests mod alongside
the other test functions (referencing recommend_cleanup and
CleanupRecommendation).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/dustfril-cli/src/commands/analyze.rs`:
- Around line 82-88: The current print for artifact.age_days builds a string
that yields "Unknown days" when age_days is None; change the logic in analyze.rs
(the block that references artifact.age_days and the println! producing "Age: {}
days") to handle the None branch separately so you print either "Age: X days"
when age_days is Some(X) or "Age: Unknown" when None (use a match, if let, or
map_or to produce either the numeric + " days" string or the distinct "Unknown"
string).
- Around line 63-66: The early return in analyze.rs (the block checking
analysis_result.artifacts.is_empty() that prints "No Rust artifacts found." and
returns) prevents the "DustFril Analysis Summary" from being printed; instead,
remove the return and ensure the summary is always printed: keep the "No Rust
artifacts found." message when artifacts is empty but proceed to print the
summary (and any subsequent summary/reporting logic) so the summary appears in
both empty and non-empty cases.

In `@crates/dustfril-core/src/analyzer/metadata.rs`:
- Around line 13-17: get_latest_modified currently uses path.is_dir() which
follows symlinks and can recurse infinitely for directory symlinks; change the
directory check to use non-following metadata (e.g., path.symlink_metadata() and
file_type().is_dir()) and explicitly skip or treat symlinks as non-directories
(file_type().is_symlink()) to avoid recursing into symlinked directories. Update
the logic in get_latest_modified to only recurse when the symlink_metadata
indicates a real directory and/or maintain a visited set of canonicalized
paths/inodes to detect cycles; also review analogous uses of entry.metadata()
(e.g., in size.rs) and switch to non-following metadata or similar
cycle-avoidance logic where appropriate. Ensure references to the symbols
get_latest_modified, path.is_dir, entry.metadata, and size.rs are used to locate
the changes.

---

Nitpick comments:
In `@crates/dustfril-core/src/analyzer/recommendation.rs`:
- Around line 17-38: Add unit tests covering the missing edge cases for
recommend_cleanup: assert that recommend_cleanup(None) returns the correct
CleanupRecommendation, and add explicit boundary tests for 30 vs 31 days and 90
vs 91 days (e.g., test functions named keep_when_none, review_at_30_and_31,
safe_to_clean_at_90_and_91) that assert recommend_cleanup(Some(30)),
recommend_cleanup(Some(31)), recommend_cleanup(Some(90)), and
recommend_cleanup(Some(91)) return the expected CleanupRecommendation variants;
place them in the existing tests mod alongside the other test functions
(referencing recommend_cleanup and CleanupRecommendation).
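
The "Unknown days" issue raised for apps/dustfril-cli/src/commands/analyze.rs can be avoided by building the whole label in one match, as in this minimal sketch. `format_age` is a hypothetical helper name, and the `Option<u64>` type for `age_days` is an assumption.

```rust
/// Hypothetical helper: builds the age line so that the " days" suffix
/// is only attached when a numeric age is available.
pub fn format_age(age_days: Option<u64>) -> String {
    match age_days {
        Some(days) => format!("Age: {} days", days),
        None => "Age: Unknown".to_string(),
    }
}
```

The CLI could then call `println!("{}", format_age(artifact.age_days))` in place of the single format string that appends " days" unconditionally.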

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 2da8353c-736b-489d-b5c7-c63c2e30c49f

📥 Commits

Reviewing files that changed from the base of the PR and between 13169da and f18b68e.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (15)
  • .github/ISSUE_TEMPLATE/FEATURE_REQUEST.md
  • apps/dustfril-cli/src/commands/analyze.rs
  • crates/dustfril-core/Cargo.toml
  • crates/dustfril-core/src/analyzer/age.rs
  • crates/dustfril-core/src/analyzer/analyze.rs
  • crates/dustfril-core/src/analyzer/format.rs
  • crates/dustfril-core/src/analyzer/metadata.rs
  • crates/dustfril-core/src/analyzer/mod.rs
  • crates/dustfril-core/src/analyzer/recommendation.rs
  • crates/dustfril-core/src/analyzer/size.rs
  • crates/dustfril-core/src/analyzer/tests.rs
  • crates/dustfril-core/src/models/analysis_result.rs
  • crates/dustfril-core/src/models/artifact_analysis.rs
  • crates/dustfril-core/src/models/cleanup_recommendation.rs
  • crates/dustfril-core/src/models/mod.rs

Comment on lines +63 to +66
    if analysis_result.artifacts.is_empty() {
        println!("No Rust artifacts found.");
        return;
    }

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Always print the analysis summary, even when no artifacts are found.

This early return skips "DustFril Analysis Summary" in the empty case, which conflicts with the expected command behavior.

Proposed fix
     if analysis_result.artifacts.is_empty() {
         println!("No Rust artifacts found.");
-        return;
+        print_summary(&analysis_result);
+        return;
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    if analysis_result.artifacts.is_empty() {
-        println!("No Rust artifacts found.");
-        return;
-    }
+    if analysis_result.artifacts.is_empty() {
+        println!("No Rust artifacts found.");
+        print_summary(&analysis_result);
+        return;
+    }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/dustfril-cli/src/commands/analyze.rs` around lines 63 - 66, The early
return in analyze.rs (the block checking analysis_result.artifacts.is_empty()
that prints "No Rust artifacts found." and returns) prevents the "DustFril
Analysis Summary" from being printed; instead, remove the return and ensure the
summary is always printed: keep the "No Rust artifacts found." message when
artifacts is empty but proceed to print the summary (and any subsequent
summary/reporting logic) so the summary appears in both empty and non-empty
cases.

Comment thread apps/dustfril-cli/src/commands/analyze.rs Outdated
Comment on lines +13 to +17
        let current = if path.is_dir() {
            get_latest_modified(&path)
        } else {
            entry.metadata().ok().and_then(|m| m.modified().ok())
        };

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Locate and inspect metadata.rs around get_latest_modified usage/definition
FILE="crates/dustfril-core/src/analyzer/metadata.rs"
echo "== $FILE =="
stat "$FILE" || true
rg -n "get_latest_modified|path\.is_dir\(" "$FILE" || true
# Print a small window around the relevant lines
sed -n '1,220p' "$FILE" | nl -ba | sed -n '1,220p'

# 2) Find size.rs and inspect for symlink-aware metadata usage
# Try common locations
echo "== find likely size.rs =="
fd -a "size.rs" crates | head -n 20 || true
# Search for "entry.metadata()" vs file_type in rust analyzer module
rg -n "entry\.metadata\(\)|file_type\(|symlink_metadata\(" crates/dustfril-core/src/analyzer || true

# If size.rs exists, print it
SIZEFILE="$(fd -a "size.rs" crates | head -n 1 || true)"
if [[ -n "${SIZEFILE:-}" ]]; then
  echo "== $SIZEFILE =="
  stat "$SIZEFILE" || true
  rg -n "get_latest_modified|path\.is_dir\(|file_type\(|symlink_metadata\(\)|entry\.metadata\(\)" "$SIZEFILE" || true
  sed -n '1,260p' "$SIZEFILE" | nl -ba | sed -n '1,260p'
fi

# 3) Search for any existing symlink recursion guard in this module
rg -n "symlink_metadata|is_symlink|follow_symlink|recurs|visited|inode|file_type\(\)\.is_dir" crates/dustfril-core/src/analyzer || true

Repository: FrilLab/dustfril

Length of output: 369


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="crates/dustfril-core/src/analyzer/metadata.rs"
echo "== $FILE =="
ls -l "$FILE" || true

echo "== get_latest_modified occurrences =="
rg -n "pub fn get_latest_modified|path\.is_dir\(" "$FILE" || true

echo "== metadata.rs (first 220 lines with line numbers via awk) =="
sed -n '1,220p' "$FILE" | awk '{printf "%5d\t%s\n", NR, $0}'

echo "== Look for symlink-aware APIs in analyzer module =="
rg -n "symlink_metadata|is_symlink|follow_symlink|visited|inode|file_type\(\)|read_link|is_dir\(\)" crates/dustfril-core/src/analyzer || true

echo "== find likely size.rs in crates =="
# Prefer fd if available; otherwise fallback to find
if command -v fd >/dev/null 2>&1; then
  fd -a "size.rs" crates | head -n 20
else
  find crates -name "size.rs" | head -n 20
fi

Repository: FrilLab/dustfril

Length of output: 1675


🏁 Script executed:

#!/bin/bash
set -euo pipefail

SIZEFILE="crates/dustfril-core/src/analyzer/size.rs"
echo "== $SIZEFILE =="
ls -l "$SIZEFILE" || true
sed -n '1,240p' "$SIZEFILE" | awk '{printf "%5d\t%s\n", NR, $0}'

echo "== Search within size.rs for metadata / file_type / symlink_metadata =="
rg -n "metadata\(|file_type\(|symlink_metadata\(|is_dir\(" "$SIZEFILE" || true

echo "== Search across analyzer module for file_type usage =="
rg -n "file_type\(\)|symlink_metadata\(\)" crates/dustfril-core/src/analyzer || true

echo "== Inspect metadata.rs for symlink-related calls =="
rg -n "symlink_metadata\(\)|read_link\(|file_type\(\)|is_symlink|follow" crates/dustfril-core/src/analyzer/metadata.rs || true

Repository: FrilLab/dustfril

Length of output: 1131


Prevent infinite recursion for directory symlinks in get_latest_modified.

get_latest_modified uses path.is_dir(), which follows symlinks; a directory symlink pointing to an ancestor or to itself can cause unbounded recursion and a stack overflow. Note that crates/dustfril-core/src/analyzer/size.rs calls entry.metadata(); if entry is a std::fs::DirEntry, that call does not traverse symlinks, so size.rs should be verified separately rather than assumed to have the same problem.

🛡️ Proposed fix using non-following file type
-        let current = if path.is_dir() {
-            get_latest_modified(&path)
-        } else {
-            entry.metadata().ok().and_then(|m| m.modified().ok())
-        };
+        let is_dir = entry
+            .file_type()
+            .map(|ft| ft.is_dir())
+            .unwrap_or(false);
+        let current = if is_dir {
+            get_latest_modified(&path)
+        } else {
+            entry.metadata().ok().and_then(|m| m.modified().ok())
+        };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        let current = if path.is_dir() {
-            get_latest_modified(&path)
-        } else {
-            entry.metadata().ok().and_then(|m| m.modified().ok())
-        };
+        let is_dir = entry
+            .file_type()
+            .map(|ft| ft.is_dir())
+            .unwrap_or(false);
+        let current = if is_dir {
+            get_latest_modified(&path)
+        } else {
+            entry.metadata().ok().and_then(|m| m.modified().ok())
+        };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/dustfril-core/src/analyzer/metadata.rs` around lines 13 - 17,
get_latest_modified currently uses path.is_dir() which follows symlinks and can
recurse infinitely for directory symlinks; change the directory check to use
non-following metadata (e.g., path.symlink_metadata() and file_type().is_dir())
and explicitly skip or treat symlinks as non-directories
(file_type().is_symlink()) to avoid recursing into symlinked directories. Update
the logic in get_latest_modified to only recurse when the symlink_metadata
indicates a real directory and/or maintain a visited set of canonicalized
paths/inodes to detect cycles; also review analogous uses of entry.metadata()
(e.g., in size.rs) and switch to non-following metadata or similar
cycle-avoidance logic where appropriate. Ensure references to the symbols
get_latest_modified, path.is_dir, entry.metadata, and size.rs are used to locate
the changes.
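
Putting the suggestion together, a symlink-safe traversal might look like the following sketch. It is a standalone illustration rather than the repository's code: `latest_modified_no_follow` is a hypothetical name, and returning `None` for an empty or unreadable directory is an assumption about the desired behavior.

```rust
use std::fs;
use std::path::Path;
use std::time::SystemTime;

/// Sketch: newest modification time of any entry under `dir`,
/// never recursing through symlinks.
pub fn latest_modified_no_follow(dir: &Path) -> Option<SystemTime> {
    let mut latest: Option<SystemTime> = None;

    // Unreadable directories yield None; unreadable entries are skipped.
    for entry in fs::read_dir(dir).ok()?.flatten() {
        // DirEntry::file_type does not traverse symlinks, so a symlink to a
        // directory is reported as a symlink and is never recursed into.
        let file_type = match entry.file_type() {
            Ok(ft) => ft,
            Err(_) => continue,
        };

        let current = if file_type.is_dir() {
            latest_modified_no_follow(&entry.path())
        } else {
            // DirEntry::metadata also does not traverse symlinks, so for a
            // symlink this is the modification time of the link itself.
            entry.metadata().ok().and_then(|m| m.modified().ok())
        };

        // Keep whichever timestamp is newer; tolerate missing values.
        latest = match (latest, current) {
            (Some(a), Some(b)) => Some(a.max(b)),
            (a, b) => a.or(b),
        };
    }

    latest
}
```

Because only real directories are recursed into, a link that points back at an ancestor contributes its own timestamp and the traversal still terminates.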

@mors119 mors119 merged commit 0b7892d into FrilLab:main Jun 1, 2026
2 checks passed
@mors119 mors119 deleted the feature/early-stage-analyzer branch June 1, 2026 18:40
@coderabbitai coderabbitai Bot mentioned this pull request Jun 2, 2026
10 tasks