diff --git a/apps/desktop/src-tauri/.config/nextest.toml b/.config/nextest.toml similarity index 62% rename from apps/desktop/src-tauri/.config/nextest.toml rename to .config/nextest.toml index ffdb1356..33d50c28 100644 --- a/apps/desktop/src-tauri/.config/nextest.toml +++ b/.config/nextest.toml @@ -1,6 +1,18 @@ # cargo-nextest configuration # See https://nexte.st/docs/configuration/ for full documentation +# Tests in the `virtual-mtp` group share a single backing-dir fixture root at +# `/tmp/cmdr-mtp-e2e-fixtures` (see `mtp::virtual_device::setup_virtual_mtp_device`), +# so running them concurrently races on a shared filesystem resource and one +# wins with `DirectoryNotEmpty`. Cap concurrency at 1 to serialize them across +# nextest's per-process forks. +[test-groups] +virtual-mtp = { max-threads = 1 } + +[[profile.default.overrides]] +filter = 'test(mtp_scan) + test(listing_is_watched_flips_with_connection)' +test-group = 'virtual-mtp' + [profile.default] # Show test output for failing tests only failure-output = "immediate" diff --git a/apps/desktop/src-tauri/src/commands/file_system/write_ops.rs b/apps/desktop/src-tauri/src/commands/file_system/write_ops.rs index 9c9ea639..052788be 100644 --- a/apps/desktop/src-tauri/src/commands/file_system/write_ops.rs +++ b/apps/desktop/src-tauri/src/commands/file_system/write_ops.rs @@ -301,7 +301,15 @@ pub async fn start_scan_preview( }; let progress_interval = progress_interval_ms.unwrap_or(500); - ops_start_scan_preview(app, sources, source_volume, sort_column, sort_order, progress_interval) + ops_start_scan_preview( + app, + sources, + source_volume, + volume_id, + sort_column, + sort_order, + progress_interval, + ) } #[tauri::command] diff --git a/apps/desktop/src-tauri/src/commands/ui.rs b/apps/desktop/src-tauri/src/commands/ui.rs index 9f3744a2..3afd86a7 100644 --- a/apps/desktop/src-tauri/src/commands/ui.rs +++ b/apps/desktop/src-tauri/src/commands/ui.rs @@ -118,6 +118,24 @@ pub fn show_breadcrumb_context_menu(window: Window, shortcut: Str #[tauri::command] #[specta::specta] pub fn show_main_window(window: Window) -> Result<(), String> { + // E2E: on macOS, use `orderFront:` instead of `makeKeyAndOrderFront:` so the + // window appears without stealing focus from whatever the user is currently + // working in. `window.show()` calls the latter on macOS, which always grabs + // OS focus. Linux/Windows test runs happen in headless containers, so the + // standard show is fine there. + #[cfg(target_os = "macos")] + if crate::test_mode::is_e2e_mode() { + use objc2::msg_send; + use objc2::runtime::AnyObject; + let ns_window = window.ns_window().map_err(|e| e.to_string())? as *mut AnyObject; + if ns_window.is_null() { + return Err("NSWindow pointer is null".into()); + } + unsafe { + let _: () = msg_send![ns_window, orderFront: std::ptr::null_mut::()]; + } + return Ok(()); + } window.show().map_err(|e| e.to_string()) } diff --git a/apps/desktop/src-tauri/src/file_system/listing/CLAUDE.md b/apps/desktop/src-tauri/src/file_system/listing/CLAUDE.md index 87ed242a..086516a6 100644 --- a/apps/desktop/src-tauri/src/file_system/listing/CLAUDE.md +++ b/apps/desktop/src-tauri/src/file_system/listing/CLAUDE.md @@ -128,6 +128,7 @@ since it only does cache lookups and `stat` calls via `get_single_entry`. Used by the watcher's incremental path and synthetic mkdir to patch listings without full re-reads: - `find_listings_for_path(path)`: returns all listing IDs whose directory matches the given path (multiple panes/tabs may show the same directory) - `find_listings_for_path_on_volume(volume_id, path)`: same, but also filters by volume ID. Prevents false matches when two volumes serve overlapping paths. +- `try_get_watched_listing(volume_id, path)`: the **fresh-listing oracle** for write-op pre-flight scans. Returns `Some(entries)` when a cached listing exists for `(volume_id, path)` and the volume reports `listing_is_watched(path) == true` (delegated to the backend via the `Volume` trait). Otherwise `None`. When multiple listings exist for the same `(volume_id, path)` pair (two panes), picks the most-recently-updated one deterministically: highest `sequence` (an `AtomicU64` on `CachedListing`), ties broken by latest `created_at`. Entries are cloned out under the cache `RwLock`, then the lock is released before the volume call (cheap clone for a flat `Vec` — < 5 ms for 15k entries; matters because the volume call would otherwise hold the listing-cache lock across an await and block pane navigation). See the freshness-contract section in `volume/CLAUDE.md` for the per-backend debounce windows callers must tolerate. - `insert_entry_sorted(listing_id, entry)`: inserts an entry in sorted position, returns the insertion index - `remove_entry_by_path(listing_id, path)`: removes an entry by its file path, returns the removed index and entry - `update_entry_sorted(listing_id, entry)`: updates an existing entry (remove + re-insert if sort position changed), returns `ModifyResult` diff --git a/apps/desktop/src-tauri/src/file_system/listing/caching.rs b/apps/desktop/src-tauri/src/file_system/listing/caching.rs index e1c7aa3a..fe010481 100644 --- a/apps/desktop/src-tauri/src/file_system/listing/caching.rs +++ b/apps/desktop/src-tauri/src/file_system/listing/caching.rs @@ -512,3 +512,69 @@ pub(crate) fn increment_sequence(listing_id: &str) -> Option { let seq = listing.sequence.fetch_add(1, Ordering::Relaxed) + 1; Some(seq) } + +/// Returns cached entries for `(volume_id, path)` when the volume reports that this +/// listing is being kept in sync by an active watcher. Otherwise `None`. +/// +/// **Freshness contract (read carefully)**: a `Some(_)` result means the volume has +/// an active change-notification channel and the cache reflects the volume's most +/// recently observed state. It does NOT mean the cache is byte-perfect with the +/// device right now: every backend has a debounce or settling window between a real +/// change and the cache reflecting it. +/// +/// - Local FS: FSEvents coalesce window (~10 ms). +/// - SMB: 200 ms watcher debounce; > 50 events per directory triggers a +/// `FullRefresh` which arrives via a real re-read. +/// - MTP: 500 ms event debouncer plus per-device polling. Many MTP devices +/// (cameras especially) never emit per-object events, so "watched" there means +/// only "the device is reachable and would forward changes if it sent any." +/// +/// Callers must treat the result as "fresh as our most recent observation," which +/// is the same guarantee a `list_directory` call gives: it sees the device's state +/// at the moment the call returned, not at the moment the caller reads its result. +/// The contract intentionally accepts this window; a tighter one would force us to +/// re-validate every walk, defeating the whole point of the oracle. +/// +/// When multiple cached listings exist for the same `(volume_id, path)` pair (two +/// panes browsing the same directory), the picker is deterministic: highest +/// `sequence`, ties broken by the latest `created_at`. Both listings receive watcher +/// events, so they're equally fresh; the tiebreaker is just to keep the result +/// stable across calls. +#[allow(dead_code, reason = "M1 plumbing: callers (scan walker, scan-preview) wire up in M2")] +pub fn try_get_watched_listing(volume_id: &str, path: &Path) -> Option> { + // Step 1: find all listings on this (volume_id, path) and pick the most-recently-updated + // one (highest sequence, ties broken by latest created_at). Read the entries out + // under the cache lock and drop the lock before crossing any async / volume boundary. + let entries: Vec = { + let cache = LISTING_CACHE.read().ok()?; + let mut best: Option<(&String, &CachedListing, u64, Instant)> = None; + for (id, listing) in cache.iter() { + if listing.volume_id != volume_id || listing.path != path { + continue; + } + let seq = listing.sequence.load(Ordering::Relaxed); + let created = listing.created_at; + best = match best { + None => Some((id, listing, seq, created)), + Some((_, _, best_seq, best_created)) + if seq > best_seq || (seq == best_seq && created > best_created) => + { + Some((id, listing, seq, created)) + } + Some(other) => Some(other), + }; + } + let (_, listing, ..) = best?; + listing.entries.clone() + }; + + // Step 2: ask the volume whether this listing is being kept fresh by a watcher. + // VolumeManager::get returns an Arc which we hold for the duration of + // the sync `listing_is_watched` call. No await between this and the entries return. + let volume = crate::file_system::get_volume_manager().get(volume_id)?; + if volume.listing_is_watched(path) { + Some(entries) + } else { + None + } +} diff --git a/apps/desktop/src-tauri/src/file_system/listing/caching_test.rs b/apps/desktop/src-tauri/src/file_system/listing/caching_test.rs index 452dcbc6..51f67a21 100644 --- a/apps/desktop/src-tauri/src/file_system/listing/caching_test.rs +++ b/apps/desktop/src-tauri/src/file_system/listing/caching_test.rs @@ -604,3 +604,321 @@ fn test_find_listings_on_volume_empty_for_unknown_volume() { let results = find_listings_on_volume("nonexistent-volume-id"); assert!(results.is_empty()); } + +// ============================================================================ +// try_get_watched_listing tests (M1 oracle) +// ============================================================================ +// +// These tests use a small `WatchedFlagVolume` wrapper around `InMemoryVolume` +// because `InMemoryVolume::listing_is_watched` always returns false (the +// default). The wrapper lets tests pin the watcher flag to `true` or `false` +// without touching `WATCHER_MANAGER` (which would require an `AppHandle`). + +mod oracle_tests { + use std::future::Future; + use std::path::{Path, PathBuf}; + use std::pin::Pin; + use std::sync::Arc; + use std::sync::atomic::{AtomicBool, AtomicU64, Ordering}; + + use super::super::caching::{CachedListing, LISTING_CACHE, try_get_watched_listing}; + use super::super::metadata::FileEntry; + use super::super::sorting::{DirectorySortMode, SortColumn, SortOrder}; + use crate::file_system::get_volume_manager; + use crate::file_system::volume::{ + BatchScanResult, CopyScanResult, InMemoryVolume, ScanConflict, SourceItemInfo, SpaceInfo, Volume, VolumeError, + VolumeReadStream, + }; + + /// A test-only volume wrapper that overrides `listing_is_watched` with a + /// flag controlled per test. Delegates every other method to the inner + /// `InMemoryVolume`. + struct WatchedFlagVolume { + inner: InMemoryVolume, + watched: AtomicBool, + } + + impl WatchedFlagVolume { + fn new(name: &str, watched: bool) -> Self { + Self { + inner: InMemoryVolume::new(name), + watched: AtomicBool::new(watched), + } + } + + fn set_watched(&self, v: bool) { + self.watched.store(v, Ordering::Relaxed); + } + } + + impl Volume for WatchedFlagVolume { + fn name(&self) -> &str { + self.inner.name() + } + + fn root(&self) -> &Path { + self.inner.root() + } + + fn list_directory<'a>( + &'a self, + path: &'a Path, + on_progress: Option<&'a (dyn Fn(usize) + Sync)>, + ) -> Pin, VolumeError>> + Send + 'a>> { + self.inner.list_directory(path, on_progress) + } + + fn get_metadata<'a>( + &'a self, + path: &'a Path, + ) -> Pin> + Send + 'a>> { + self.inner.get_metadata(path) + } + + fn exists<'a>(&'a self, path: &'a Path) -> Pin + Send + 'a>> { + self.inner.exists(path) + } + + fn is_directory<'a>( + &'a self, + path: &'a Path, + ) -> Pin> + Send + 'a>> { + self.inner.is_directory(path) + } + + fn listing_is_watched(&self, _path: &Path) -> bool { + self.watched.load(Ordering::Relaxed) + } + + fn get_space_info<'a>(&'a self) -> Pin> + Send + 'a>> { + self.inner.get_space_info() + } + + fn scan_for_copy<'a>( + &'a self, + path: &'a Path, + ) -> Pin> + Send + 'a>> { + self.inner.scan_for_copy(path) + } + + fn scan_for_copy_batch<'a>( + &'a self, + paths: &'a [PathBuf], + ) -> Pin> + Send + 'a>> { + self.inner.scan_for_copy_batch(paths) + } + + fn scan_for_conflicts<'a>( + &'a self, + source_items: &'a [SourceItemInfo], + dest_path: &'a Path, + ) -> Pin, VolumeError>> + Send + 'a>> { + self.inner.scan_for_conflicts(source_items, dest_path) + } + + fn open_read_stream<'a>( + &'a self, + path: &'a Path, + ) -> Pin, VolumeError>> + Send + 'a>> { + self.inner.open_read_stream(path) + } + } + + fn make_test_entry(name: &str) -> FileEntry { + FileEntry { + size: Some(123), + permissions: 0o644, + owner: "test".to_string(), + group: "staff".to_string(), + extended_metadata_loaded: true, + ..FileEntry::new(name.to_string(), format!("/oracle/{}", name), false, false) + } + } + + /// Inserts a `CachedListing` directly into `LISTING_CACHE` with a controllable + /// sequence. Returns the listing_id. + fn insert_listing_with_sequence( + id: &str, + volume_id: &str, + path: &str, + entries: Vec, + sequence: u64, + ) -> String { + let listing_id = id.to_string(); + let mut cache = LISTING_CACHE.write().unwrap(); + cache.insert( + listing_id.clone(), + CachedListing { + volume_id: volume_id.to_string(), + path: PathBuf::from(path), + entries, + sort_by: SortColumn::Name, + sort_order: SortOrder::Ascending, + directory_sort_mode: DirectorySortMode::LikeFiles, + sequence: AtomicU64::new(sequence), + created_at: std::time::Instant::now(), + }, + ); + listing_id + } + + fn remove_listing(id: &str) { + let mut cache = LISTING_CACHE.write().unwrap(); + cache.remove(id); + } + + fn unique(suffix: &str) -> String { + use std::sync::atomic::AtomicU64; + static N: AtomicU64 = AtomicU64::new(0); + format!( + "oracle_{}_{}_{}", + suffix, + std::process::id(), + N.fetch_add(1, Ordering::Relaxed) + ) + } + + #[test] + fn try_get_watched_listing_hit_when_watcher_reports_true() { + let vid = unique("hit_vid"); + let lid = unique("hit_lid"); + let path = "/oracle/hit"; + + let vol = Arc::new(WatchedFlagVolume::new("hit-vol", true)); + get_volume_manager().register(&vid, vol); + + let entries = vec![make_test_entry("a.txt"), make_test_entry("b.txt")]; + let lid_inserted = insert_listing_with_sequence(&lid, &vid, path, entries.clone(), 0); + + let result = try_get_watched_listing(&vid, Path::new(path)); + assert!(result.is_some(), "expected Some(entries) on watched listing"); + let returned = result.unwrap(); + assert_eq!(returned.len(), entries.len()); + assert_eq!(returned[0].name, "a.txt"); + assert_eq!(returned[1].name, "b.txt"); + + remove_listing(&lid_inserted); + get_volume_manager().unregister(&vid); + } + + #[test] + fn try_get_watched_listing_miss_when_watcher_reports_false() { + let vid = unique("miss_watch_vid"); + let lid = unique("miss_watch_lid"); + let path = "/oracle/miss_watch"; + + let vol = Arc::new(WatchedFlagVolume::new("miss-vol", false)); + get_volume_manager().register(&vid, vol); + + let entries = vec![make_test_entry("a.txt")]; + let lid_inserted = insert_listing_with_sequence(&lid, &vid, path, entries, 0); + + let result = try_get_watched_listing(&vid, Path::new(path)); + assert!(result.is_none(), "expected None when watcher is dead"); + + remove_listing(&lid_inserted); + get_volume_manager().unregister(&vid); + } + + #[test] + fn try_get_watched_listing_miss_when_no_listing_exists() { + let vid = unique("miss_no_listing_vid"); + let vol = Arc::new(WatchedFlagVolume::new("no-listing-vol", true)); + get_volume_manager().register(&vid, vol); + + let result = try_get_watched_listing(&vid, Path::new("/oracle/nothing_here")); + assert!(result.is_none(), "expected None when no listing matches"); + + get_volume_manager().unregister(&vid); + } + + #[test] + fn try_get_watched_listing_miss_when_volume_not_registered() { + let vid = unique("no_vol"); + let lid = unique("no_vol_lid"); + let path = "/oracle/no_vol"; + + // Listing exists in cache, but no volume is registered for this ID. + let lid_inserted = insert_listing_with_sequence(&lid, &vid, path, vec![make_test_entry("a.txt")], 0); + + let result = try_get_watched_listing(&vid, Path::new(path)); + assert!(result.is_none(), "expected None when volume isn't registered"); + + remove_listing(&lid_inserted); + } + + #[test] + fn try_get_watched_listing_picks_highest_sequence() { + // Two listings on the same (volume_id, path) with different sequence + // numbers. The oracle must return the entries from the higher-sequence + // listing, deterministically — never the lower-sequence one. + let vid = unique("seq_vid"); + let lid_lo = unique("seq_lo"); + let lid_hi = unique("seq_hi"); + let path = "/oracle/seq_path"; + + let vol = Arc::new(WatchedFlagVolume::new("seq-vol", true)); + get_volume_manager().register(&vid, vol); + + let lid_lo_inserted = insert_listing_with_sequence(&lid_lo, &vid, path, vec![make_test_entry("low.txt")], 1); + let lid_hi_inserted = insert_listing_with_sequence(&lid_hi, &vid, path, vec![make_test_entry("high.txt")], 9); + + let result = try_get_watched_listing(&vid, Path::new(path)); + assert!(result.is_some()); + let returned = result.unwrap(); + assert_eq!(returned.len(), 1); + assert_eq!(returned[0].name, "high.txt", "expected the higher-sequence listing"); + + remove_listing(&lid_lo_inserted); + remove_listing(&lid_hi_inserted); + get_volume_manager().unregister(&vid); + } + + #[test] + fn try_get_watched_listing_miss_for_start_streaming_watcher_gap() { + // Simulates the documented race window between + // `list_directory_start_streaming` populating LISTING_CACHE and + // `start_watching` inserting into WATCHER_MANAGER: the listing exists + // in cache, but the volume reports no watcher yet (here: by reporting + // `false` from the test volume's `listing_is_watched`). The oracle + // must miss in that window so write ops fall through to a real read. + let vid = unique("race_vid"); + let lid = unique("race_lid"); + let path = "/oracle/race"; + + // `watched=false` mirrors "WATCHER_MANAGER has no entry yet" on the + // local backend without needing an AppHandle. + let vol = Arc::new(WatchedFlagVolume::new("race-vol", false)); + get_volume_manager().register(&vid, vol); + + let lid_inserted = insert_listing_with_sequence(&lid, &vid, path, vec![make_test_entry("a.txt")], 0); + + let result = try_get_watched_listing(&vid, Path::new(path)); + assert!(result.is_none(), "expected None during the streaming->watcher gap"); + + remove_listing(&lid_inserted); + get_volume_manager().unregister(&vid); + } + + #[test] + fn try_get_watched_listing_reflects_flip_to_unwatched() { + // Sanity check: flipping the watcher flag flips the oracle's verdict + // on subsequent calls. Documents that the oracle is a live query and + // doesn't memoize per-listing. + let vid = unique("flip_vid"); + let lid = unique("flip_lid"); + let path = "/oracle/flip"; + + let vol: Arc = Arc::new(WatchedFlagVolume::new("flip-vol", true)); + get_volume_manager().register(&vid, vol.clone() as Arc); + + let lid_inserted = insert_listing_with_sequence(&lid, &vid, path, vec![make_test_entry("x.txt")], 0); + + assert!(try_get_watched_listing(&vid, Path::new(path)).is_some()); + vol.set_watched(false); + assert!(try_get_watched_listing(&vid, Path::new(path)).is_none()); + + remove_listing(&lid_inserted); + get_volume_manager().unregister(&vid); + } +} diff --git a/apps/desktop/src-tauri/src/file_system/volume/CLAUDE.md b/apps/desktop/src-tauri/src/file_system/volume/CLAUDE.md index a6ed10ac..283ef142 100644 --- a/apps/desktop/src-tauri/src/file_system/volume/CLAUDE.md +++ b/apps/desktop/src-tauri/src/file_system/volume/CLAUDE.md @@ -49,6 +49,7 @@ Optional methods default to `Err(VolumeError::NotSupported)` or `false`, so new - `on_unmount()`: lifecycle hook called before unregistration. `SmbVolume` uses it to disconnect its smb2 session. Default is no-op. - `scanner()` / `watcher()`: drive indexing hooks; `None` by default. - `space_poll_interval()`: recommended interval for the live disk-space poller (`space_poller.rs`). Default 2 s (local volumes). `SmbVolume` and `MtpVolume` override to 5 s. `InMemoryVolume` returns `None` (no polling). The poller uses this to tick each volume at its own cadence. +- `listing_is_watched(path)`: returns `true` when this volume's cached listing for `path` is being kept in sync by a live watcher. Used by `file_system::listing::caching::try_get_watched_listing` (the "fresh-listing oracle") to decide whether write-op pre-flight scans can reuse a cached listing instead of re-reading. Default `false` so a new backend without a real watcher won't accidentally claim freshness. **Freshness contract**: a `true` result does NOT mean the cache is byte-perfect with the device right now. Every backend has a debounce or settling window between a real change and the cache reflecting it: local FS ≈ 10 ms (FSEvents coalesce), SMB 200 ms (watcher debounce; > 50 events/dir triggers a `FullRefresh`), MTP 500 ms (event debouncer plus per-device polling; many cameras emit no events at all, so on those `true` means only "the device is reachable"). Callers must treat the result as "fresh as our most recent observation" — the same guarantee a `list_directory` call gives. The MTP and SMB checks are volume-level, not path-level: when the gate flips true, every path on that volume becomes oracle-eligible. ## Building a new volume @@ -83,6 +84,7 @@ Everything below is optional per the trait (methods default to `Err(NotSupported ### Tier 3: integrate with the wider app (optional, but mostly expected) - [ ] If the backend has its own change-notification channel, set `supports_watching() = true` and implement a watcher task that calls `notify_directory_changed` when things move. If you rely on the OS mount's FSEvents (like SmbVolume currently does), leave it `false`. +- [ ] Implement `listing_is_watched(path)` if your backend can answer "is the cached listing for this path being kept in sync by a live watcher?" cheaply. Returning `true` from this gate opts the volume into the fresh-listing oracle: write-op pre-flight scans (copy/move scan preview) reuse cached entries from `LISTING_CACHE` for any path your volume reports as watched, skipping the redundant `list_directory` round-trip. Default `false` is the safe choice — without a real watcher, the cache may be arbitrarily stale. Path-level (LocalPosixVolume) is the most accurate signal; volume-level (MTP "device connected", SMB "Direct + watcher running") is fine when the underlying notification channel is volume-wide. Be honest about the per-backend debounce window in the doc comment; see `try_get_watched_listing` for the freshness contract. - [ ] If `std::fs` operations work on the volume's paths (you're a local FS with extra flavor), leave `supports_local_fs_access()` at the default `true`. Otherwise override to `false` so the legacy synthetic-diff path is skipped. - [ ] If `std::fs::copy` can target this volume's paths directly, return `Some(root)` from `local_path()`. The copy path will prefer `copyfile(3)` / `copy_file_range(2)` for same-device copies. Otherwise return `None` (the default). - [ ] Override `space_poll_interval()` to whatever polling cadence your backend can afford (local 2 s, network 5 s, none = don't poll). @@ -110,6 +112,7 @@ At-a-glance view of which capabilities each current volume opts into. Use this w | `open_read_stream` | ✅ spawn_blocking | ✅ owned download | ✅ channel-backed | ✅ in-memory | | `write_from_stream` | ✅ spawn_blocking | ✅ streaming | ✅ streaming | ✅ in-memory | | `supports_watching` | ✅ FSEvents/inotify | ❌ (own USB watcher) | ❌ (OS-mount FSEvents) | ❌ | +| `listing_is_watched` | ✅ path-level (WATCHER_MANAGER) | ✅ volume-level (device connected) | ✅ volume-level (watcher + Direct) | ❌ (default) | | `supports_local_fs_access` | ✅ (default) | ❌ | ❌ | ❌ | | `local_path` | ✅ `Some(root)` | `None` | `None` | `None` | | `notify_mutation` | default (std::fs) | ✅ MTP `get_metadata` | ✅ smb2 `get_metadata` | ✅ in-memory | @@ -414,7 +417,10 @@ spawned detached task. This is safe because the stream always lives in an async **Why**: `SmbClient::create_file_writer` / `Tree::create_file_writer` both return `FileWriter<'a>` which borrows `&'a mut Connection`. We can't release the client mutex while the writer is alive, so the streaming (large-file) write path serializes concurrent writes on one `SmbVolume`. This is acceptable because the compound fast-path in `write_from_stream` handles every file ≤ `max_write_size` (typically 1 MB on QNAP, the SMB2 spec ceiling is 8 MB) without touching the client mutex for the actual write. Large files are rare in the hot copy path; if this ever becomes a bottleneck, the fix is a future `smb2` release that exposes a `FileWriter` built from a cloned `Connection` + `Arc` (both owned inside the writer) rather than borrowing. **Decision**: `SmbVolume` overrides `scan_for_copy_batch` to pipeline per-path stats over a single SMB session -**Why**: The copy pipeline's scan phase used to loop `scan_for_copy` per top-level source, N sequential RTTs on the wire before the copy phase could even start. For a 100-file copy over a ~60 ms Tailscale link that's ~5 s of serial stats. Fix 4 overrides `scan_for_copy_batch` to clone `smb2::Connection` per path under a brief client-mutex acquire (cheap `Arc::clone`, all clones multiplex over the same SMB session), release the lock, then drive `tree.stat(&mut conn, path)` on each clone inside a `FuturesUnordered`. Empty root paths skip the stat. Single-path batches fall through to `scan_recursive` so one-file drag-drops don't pay the batch machinery cost. Directories found during the stat phase recurse sequentially afterward. Parallel directory recursion is a future "Fix 5". Measured 6.5× wall-clock win at 100 × 10 KB: 6.11 s → 947 ms. See `docs/notes/phase4-rtt-investigation.md` for the wire trace. +**Why**: The copy pipeline's scan phase used to loop `scan_for_copy` per top-level source, N sequential RTTs on the wire before the copy phase could even start. For a 100-file copy over a ~60 ms Tailscale link that's ~5 s of serial stats. Fix 4 overrides `scan_for_copy_batch` to clone `smb2::Connection` per path under a brief client-mutex acquire (cheap `Arc::clone`, all clones multiplex over the same SMB session), release the lock, then drive `tree.stat(&mut conn, path)` on each clone inside a `FuturesUnordered`. Empty root paths skip the stat. Single-path batches fall through to `scan_recursive` so one-file drag-drops don't pay the batch machinery cost. Directories found during the stat phase recurse sequentially afterward. Parallel directory recursion is a future "Fix 5". Measured 6.5× wall-clock win at 100 × 10 KB: 6.11 s → 947 ms. See `docs/notes/phase4-rtt-investigation.md` for the wire trace. **Oracle layered on top (M2b of fresh-listing-reuse)**: before the pipelined-stat block runs, every input path's parent is checked against the fresh-listing oracle (`try_get_watched_listing(volume_id, parent)`). Oracle-served paths get their size + `is_directory` from the cached `FileEntry` and are removed from the leftover set; only the leftover paths go through the pipelined stat. Decision is per-parent: one batch can mix oracle-served and pipelined-stat paths, and if every path resolves via the oracle the stat pipeline is skipped entirely. + +**Decision**: `MtpVolume` overrides `scan_for_copy_batch_with_progress` to group selected paths by parent and list each parent once +**Why**: MTP has no single-file stat call: `get_metadata(path)` lists the parent directory and searches by name. A naive scan that called `get_metadata` per path would re-list `/DCIM/Camera` (15k entries, ~17 s over USB) for every selected photo. The override groups the input paths by parent, calls `list_directory(parent, on_progress)` once per unique parent, and indexes the entries by name for O(1) lookups. **Oracle layered on top (M2b of fresh-listing-reuse)**: before listing a parent, the override consults `try_get_watched_listing(volume_id, parent)`; on hit, the cached entries replace the listing call entirely (no USB I/O for that parent). On miss the existing single-listing-per-parent path runs, so cold-cache perf is preserved. Decision is per-parent; one batch can mix watcher-fresh and cold parents. **Decision**: `Volume::scan_for_copy_batch` returns `BatchScanResult { aggregate, per_path }` (changed in Phase 4 Fix 4) **Why**: The copy engine needs per-source type+size hints (`is_directory`, `total_bytes`) for its `source_hints` map, which seeds conflict detection and feeds the SMB compound fast-path's size hint. Pre-Fix-4 it paid N separate `scan_for_copy` calls to collect both aggregate stats and per-path info. Returning a `BatchScanResult` lets the batch scan surface both at once: one trait call, one round-trip to each backend. Scan-preview callers that only want the aggregate just read `.aggregate`. `LocalPosixVolume` and `InMemoryVolume` inherit the default (serial per-path loop, cheap); `MtpVolume` preserves its "group by parent dir" batch; `SmbVolume` overrides with the pipelined stat path. diff --git a/apps/desktop/src-tauri/src/file_system/volume/friendly_error/kinds.rs b/apps/desktop/src-tauri/src/file_system/volume/friendly_error/kinds.rs index fceff939..0c7708d9 100644 --- a/apps/desktop/src-tauri/src/file_system/volume/friendly_error/kinds.rs +++ b/apps/desktop/src-tauri/src/file_system/volume/friendly_error/kinds.rs @@ -206,6 +206,29 @@ pub(super) fn not_supported(raw_detail: String) -> FriendlyError { } } +/// `STATUS_DELETE_PENDING`: the file has been marked for deletion on the server +/// but at least one open handle is keeping it alive. The file disappears the +/// moment the last handle closes, so retry-after-a-moment is the right hint. +pub(super) fn delete_pending(path_display: &str, raw_detail: String) -> FriendlyError { + FriendlyError { + category: ErrorCategory::Transient, + title: "File is being removed".into(), + explanation: format!( + "`{}` is on its way out. The server marked it for deletion, but another \ + open handle is keeping it around until that handle closes.", + path_display + ), + suggestion: "Here's what to try:\n\ + - Wait a moment and try again — once the last handle closes, the file disappears\n\ + - Close any other apps that might have this file open\n\ + - If it sticks around, restart Cmdr to drop any handles it might still hold" + .into(), + raw_detail, + retry_hint: true, + action_kind: None, + } +} + pub(super) fn io_serious(path_display: &str, message: &str, raw_detail: String) -> FriendlyError { FriendlyError { category: ErrorCategory::Serious, diff --git a/apps/desktop/src-tauri/src/file_system/volume/friendly_error/mod.rs b/apps/desktop/src-tauri/src/file_system/volume/friendly_error/mod.rs index f9826227..075df64c 100644 --- a/apps/desktop/src-tauri/src/file_system/volume/friendly_error/mod.rs +++ b/apps/desktop/src-tauri/src/file_system/volume/friendly_error/mod.rs @@ -211,6 +211,7 @@ mod tests { true, ), (VolumeError::Cancelled("x".into()), ErrorCategory::Transient, true), + (VolumeError::DeletePending("x".into()), ErrorCategory::Transient, true), ( VolumeError::IoError { message: "x".into(), @@ -296,6 +297,7 @@ mod tests { VolumeError::NotFound("x".into()), VolumeError::PermissionDenied("x".into()), VolumeError::ConnectionTimeout("x".into()), + VolumeError::DeletePending("x".into()), VolumeError::IoError { message: "x".into(), raw_os_error: None, @@ -357,6 +359,36 @@ mod tests { } } + #[test] + fn delete_pending_uses_dedicated_copy() { + let path = Path::new("/Volumes/share/photo.jpg"); + let err = VolumeError::DeletePending("Protocol error: STATUS_DELETE_PENDING during Create".into()); + let friendly = friendly_error_from_volume_error(&err, path); + + assert_eq!(friendly.category, ErrorCategory::Transient); + assert!( + friendly.retry_hint, + "DeletePending is transient — user should see a retry hint" + ); + assert!( + friendly.title.contains("being removed"), + "DeletePending title should say the file is being removed, got: {:?}", + friendly.title, + ); + // The path is interpolated into the explanation so the user knows which file. + assert!( + friendly.explanation.contains("photo.jpg"), + "DeletePending explanation should include the path, got: {:?}", + friendly.explanation, + ); + // raw_detail preserves the underlying NTSTATUS for the technical-details disclosure. + assert!( + friendly.raw_detail.contains("DELETE_PENDING"), + "raw_detail should preserve the NTSTATUS code, got: {:?}", + friendly.raw_detail, + ); + } + // ── action_kind tests ─────────────────────────────────────────────── #[cfg(target_os = "macos")] diff --git a/apps/desktop/src-tauri/src/file_system/volume/friendly_error/volume_error.rs b/apps/desktop/src-tauri/src/file_system/volume/friendly_error/volume_error.rs index 3cab2ef3..3b21ed46 100644 --- a/apps/desktop/src-tauri/src/file_system/volume/friendly_error/volume_error.rs +++ b/apps/desktop/src-tauri/src/file_system/volume/friendly_error/volume_error.rs @@ -49,6 +49,7 @@ pub fn friendly_error_from_volume_error(err: &VolumeError, path: &Path) -> Frien VolumeError::StorageFull { .. } => kinds::storage_full(raw), VolumeError::ConnectionTimeout(_) => kinds::connection_timeout(raw), VolumeError::Cancelled(_) => kinds::cancelled(raw), + VolumeError::DeletePending(_) => kinds::delete_pending(&path_display, raw), VolumeError::IsADirectory(_) => FriendlyError { category: ErrorCategory::NeedsAction, title: "This is a folder, not a file".into(), diff --git a/apps/desktop/src-tauri/src/file_system/volume/friendly_error/write_error.rs b/apps/desktop/src-tauri/src/file_system/volume/friendly_error/write_error.rs index 39a152a3..8e607dcb 100644 --- a/apps/desktop/src-tauri/src/file_system/volume/friendly_error/write_error.rs +++ b/apps/desktop/src-tauri/src/file_system/volume/friendly_error/write_error.rs @@ -142,6 +142,7 @@ pub fn friendly_from_write_error(err: &crate::file_system::write_operations::Wri retry_hint: false, action_kind: None, }, + W::DeletePending { path } => kinds::delete_pending(path, raw), W::IoError { path, message } => kinds::io_serious(path, message, raw), } } diff --git a/apps/desktop/src-tauri/src/file_system/volume/local_posix.rs b/apps/desktop/src-tauri/src/file_system/volume/local_posix.rs index 2b52a74d..26027c64 100644 --- a/apps/desktop/src-tauri/src/file_system/volume/local_posix.rs +++ b/apps/desktop/src-tauri/src/file_system/volume/local_posix.rs @@ -245,6 +245,29 @@ impl Volume for LocalPosixVolume { true } + fn listing_is_watched(&self, path: &Path) -> bool { + // Resolve relative-to-volume paths to their absolute form so the comparison + // against `LISTING_CACHE` (which stores absolute paths) lines up. + let abs_path = self.resolve(path); + // Find any listing on this path (volume-agnostic: the listing cache is keyed + // by listing_id and tagged with a volume_id, but LocalPosixVolume doesn't + // store its own volume_id — the manager assigns it at registration time). + let listings = crate::file_system::listing::caching::find_listings_for_path_on_volume(None, &abs_path); + if listings.is_empty() { + return false; + } + // A listing exists; check whether an FSEvents watcher is attached to any + // matching listing_id. There's a race window between the listing being + // populated and the watcher being registered, during which we deliberately + // return false (the listing exists but isn't being kept fresh yet). + match crate::file_system::watcher::WATCHER_MANAGER.read() { + Ok(manager) => listings + .iter() + .any(|(lid, ..)| manager.watches.contains_key(lid.as_str())), + Err(_) => false, + } + } + fn local_path(&self) -> Option { Some(self.root.clone()) } diff --git a/apps/desktop/src-tauri/src/file_system/volume/local_posix_test.rs b/apps/desktop/src-tauri/src/file_system/volume/local_posix_test.rs index 13c0b785..9e22de89 100644 --- a/apps/desktop/src-tauri/src/file_system/volume/local_posix_test.rs +++ b/apps/desktop/src-tauri/src/file_system/volume/local_posix_test.rs @@ -548,3 +548,75 @@ async fn test_list_directory_includes_symlinks() { // Cleanup let _ = fs::remove_dir_all(&test_dir); } + +/// `listing_is_watched` flips with the watcher lifecycle: false before a listing +/// opens, true after `start_watching` succeeds, false after `stop_watching`. +#[test] +fn test_listing_is_watched_flips_with_watcher_lifecycle() { + use crate::file_system::listing::caching::{CachedListing, LISTING_CACHE}; + use crate::file_system::listing::sorting::{DirectorySortMode, SortColumn, SortOrder}; + use crate::file_system::watcher::{start_watching, stop_watching}; + use std::fs; + use std::path::PathBuf; + use std::sync::atomic::AtomicU64; + + let test_dir = std::env::temp_dir().join(format!( + "cmdr_listing_is_watched_test_{}_{}", + std::process::id(), + std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() + )); + let _ = fs::remove_dir_all(&test_dir); + fs::create_dir_all(&test_dir).unwrap(); + + let volume = LocalPosixVolume::new("Test", &test_dir); + let path = PathBuf::from(&test_dir); + + // No listing yet, no watcher: false. + assert!(!volume.listing_is_watched(&path), "expected false before listing opens"); + + // Seed a listing in the cache (the watcher reads its path via LISTING_CACHE). + let listing_id = format!("test_liw_{}", std::process::id()); + { + let mut cache = LISTING_CACHE.write().unwrap(); + cache.insert( + listing_id.clone(), + CachedListing { + volume_id: "root".to_string(), + path: path.clone(), + entries: Vec::new(), + sort_by: SortColumn::Name, + sort_order: SortOrder::Ascending, + directory_sort_mode: DirectorySortMode::LikeFiles, + sequence: AtomicU64::new(0), + created_at: std::time::Instant::now(), + }, + ); + } + + // Listing exists but no watcher: still false (the race-window contract). + assert!( + !volume.listing_is_watched(&path), + "expected false during the listing->watcher gap" + ); + + // Start the watcher: now true. + start_watching(&listing_id, &path).expect("start_watching should succeed on a real temp dir"); + assert!( + volume.listing_is_watched(&path), + "expected true once watcher is attached" + ); + + // Stop the watcher: back to false. + stop_watching(&listing_id); + assert!(!volume.listing_is_watched(&path), "expected false after watcher stops"); + + // Cleanup + { + let mut cache = LISTING_CACHE.write().unwrap(); + cache.remove(&listing_id); + } + let _ = fs::remove_dir_all(&test_dir); +} diff --git a/apps/desktop/src-tauri/src/file_system/volume/mod.rs b/apps/desktop/src-tauri/src/file_system/volume/mod.rs index 05070d75..000dfbe0 100644 --- a/apps/desktop/src-tauri/src/file_system/volume/mod.rs +++ b/apps/desktop/src-tauri/src/file_system/volume/mod.rs @@ -166,6 +166,11 @@ pub enum VolumeError { Cancelled(String), /// The path is a directory, not a file (for example, SMB STATUS_FILE_IS_A_DIRECTORY). IsADirectory(String), + /// The file is in `STATUS_DELETE_PENDING`: a delete has been requested on the server + /// but at least one open handle is keeping the file alive. The file will disappear + /// once the last handle closes; any new `Create` (stat, open, write) on the path + /// fails with this status in the meantime. SMB-only today. + DeletePending(String), IoError { message: String, raw_os_error: Option, @@ -194,6 +199,7 @@ impl std::fmt::Display for VolumeError { Self::ConnectionTimeout(msg) => write!(f, "Connection timed out: {}", msg), Self::Cancelled(msg) => write!(f, "Cancelled: {}", msg), Self::IsADirectory(path) => write!(f, "Is a directory: {}", path), + Self::DeletePending(path) => write!(f, "Delete pending: {}", path), Self::IoError { message, .. } => write!(f, "I/O error: {}", message), Self::FriendlyGit(err) => write!(f, "git: {}", err), } @@ -513,6 +519,22 @@ pub trait Volume: Send + Sync { true } + /// Returns `true` when the listing at `path` is currently being kept in sync + /// by a live watcher on this volume. Used by + /// `file_system::listing::caching::try_get_watched_listing` to decide whether + /// a cached listing can replace a real read in write-op pre-flight. + /// + /// "Live watcher" is intentionally coarse for non-local backends; the + /// returned `true` does NOT mean the cache is byte-perfect with the device + /// right now. Every backend has a debounce or settling window between a real + /// change and the cache reflecting it. See the freshness contract on + /// `try_get_watched_listing` for the per-backend windows callers must tolerate. + /// + /// Default `false`: new backends without an active watcher opt in explicitly. + fn listing_is_watched(&self, _path: &Path) -> bool { + false + } + // ======================================== // Indexing: Optional, default None // ======================================== @@ -796,3 +818,7 @@ mod in_memory_test; mod inmemory_test; #[cfg(test)] mod local_posix_test; +#[cfg(all(test, any(target_os = "macos", target_os = "linux")))] +mod mtp_scan_oracle_tests; +#[cfg(all(test, any(target_os = "macos", target_os = "linux")))] +mod smb_scan_oracle_tests; diff --git a/apps/desktop/src-tauri/src/file_system/volume/mtp.rs b/apps/desktop/src-tauri/src/file_system/volume/mtp.rs index 8d6610b4..99543267 100644 --- a/apps/desktop/src-tauri/src/file_system/volume/mtp.rs +++ b/apps/desktop/src-tauri/src/file_system/volume/mtp.rs @@ -8,6 +8,7 @@ use super::{ VolumeReadStream, }; use crate::file_system::listing::FileEntry; +use crate::file_system::listing::caching::try_get_watched_listing; use crate::mtp::connection::{MtpConnectionError, connection_manager}; use log::debug; use std::future::Future; @@ -107,6 +108,9 @@ impl Volume for MtpVolume { on_progress: Option<&'a (dyn Fn(usize) + Sync)>, ) -> Pin, VolumeError>> + Send + 'a>> { Box::pin(async move { + #[cfg(test)] + test_hooks::bump_list_directory_call_count(); + let mtp_path = self.to_mtp_path(path); debug!( @@ -212,6 +216,16 @@ impl Volume for MtpVolume { false } + fn listing_is_watched(&self, _path: &Path) -> bool { + // MTP "watching" is volume-level, not path-level. The MTP event loop is + // per-device and would report any changes the device emits to any path. + // So as long as the device is connected, treat every cached listing on + // this volume as oracle-eligible. Caveat: many MTP devices (cameras + // especially) never emit per-object events, so `true` means only "the + // device is reachable and would forward changes if it sent any". + connection_manager().is_connected(&self.device_id) + } + fn notify_mutation<'a>( &'a self, _volume_id: &'a str, @@ -466,6 +480,23 @@ impl Volume for MtpVolume { self.scan_for_copy_batch_with_progress(paths, None) } + /// Batch scan with parent-grouping + fresh-listing oracle. + /// + /// Decision flow: + /// 1. Group selected paths by their parent directory (one MTP listing per + /// parent on the cold path is the load-bearing optimization: selecting + /// 135 photos in `/DCIM/Camera` should produce ONE `list_directory` + /// call, not 135 `get_metadata` calls each of which lists the parent). + /// 2. For each unique parent, ask `try_get_watched_listing(volume_id, + /// parent)` first. On hit, every child entry's size + `is_directory` + /// comes from the cached `FileEntry`, no MTP I/O. On miss, fall through + /// to the existing single `list_directory(parent)` per group. + /// + /// The oracle decision is per-parent: different parents in the same call + /// can resolve different ways (one watched, one cold). On oracle hit no + /// `list_directory_with_progress` callbacks fire for that parent, so the + /// FE's scan-preview counter doesn't tick for those entries; the final + /// `BatchScanResult.aggregate` still reflects them. fn scan_for_copy_batch_with_progress<'a>( &'a self, paths: &'a [PathBuf], @@ -484,19 +515,42 @@ impl Volume for MtpVolume { }); } - // Group paths by parent directory so we list each parent at most once - let mut by_parent: std::collections::HashMap> = std::collections::HashMap::new(); + // Group paths by parent. Two keys per group: + // - `original_parent`: the path the FE/cache uses as a listing + // key (typically `/DCIM/Camera`-style absolute). This is what + // the oracle is looked up against. + // - `mtp_parent`: the MTP-relative form used by `list_directory` + // on the cold-cache fallthrough. Stored so we don't call + // `to_mtp_path` twice per group. + #[derive(Default)] + struct ParentGroup<'p> { + original_parent: PathBuf, + mtp_parent: String, + children: Vec<&'p PathBuf>, + } + let mut groups: std::collections::HashMap> = std::collections::HashMap::new(); for path in paths { - let mtp_path = self.to_mtp_path(path); - let mtp_path_buf = PathBuf::from(&mtp_path); - let parent = mtp_path_buf.parent().unwrap_or(Path::new("")).to_path_buf(); - by_parent.entry(parent).or_default().push(path); + let original_parent = path.parent().unwrap_or(Path::new("")).to_path_buf(); + let entry = groups.entry(original_parent.clone()).or_insert_with(|| { + let mtp_path = self.to_mtp_path(path); + let mtp_path_buf = PathBuf::from(&mtp_path); + let mtp_parent = mtp_path_buf + .parent() + .map(|p| p.to_string_lossy().to_string()) + .unwrap_or_default(); + ParentGroup { + original_parent, + mtp_parent, + children: Vec::new(), + } + }); + entry.children.push(path); } debug!( "MtpVolume::scan_for_copy_batch: {} paths across {} unique parent dirs", paths.len(), - by_parent.len() + groups.len() ); // Stage per-path results in a map so the final per_path vec @@ -512,15 +566,35 @@ impl Volume for MtpVolume { top_level_is_directory: false, }; - for (parent, children) in &by_parent { - // List the parent directory once (goes through the listing cache). - // The MTP listing is what dominates wall-clock on a cold cache - // (17 s for 1047 entries via USB), so forward `on_progress` to - // `list_directory_with_progress` (via the trait method) so the - // scan-preview dialog sees a climbing file count instead of a - // frozen 0/0/0 spinner. - let parent_str = parent.to_string_lossy(); - let entries = self.list_directory(Path::new(parent_str.as_ref()), on_progress).await?; + for group in groups.values() { + // Oracle short-circuit: if the parent is watcher-fresh, use + // the cached listing instead of touching the device. The + // freshness contract for MTP is volume-level: when this + // returns `Some`, the device is connected and would forward + // any change events it sends. + let cached = try_get_watched_listing(&self.volume_id, &group.original_parent); + + // List the parent directory once on cold cache (goes through + // the listing cache). The MTP listing is what dominates + // wall-clock on a cold cache (17 s for 1047 entries via USB), + // so forward `on_progress` to `list_directory_with_progress` + // (via the trait method) so the scan-preview dialog sees a + // climbing file count instead of a frozen 0/0/0 spinner. On + // an oracle hit there's no list, so no progress ticks fire + // for this parent's children — the final aggregate still + // includes them. + let entries = match cached { + Some(entries) => { + debug!( + "MtpVolume::scan_for_copy_batch: oracle hit for parent {} ({} cached entries, {} selected children)", + group.original_parent.display(), + entries.len(), + group.children.len() + ); + entries + } + None => self.list_directory(Path::new(&group.mtp_parent), on_progress).await?, + }; // Index entries by name so each child lookup is O(1). A naive // `entries.iter().find(...)` per child is O(n) and the outer @@ -530,7 +604,7 @@ impl Volume for MtpVolume { let entries_by_name: std::collections::HashMap<&str, &FileEntry> = entries.iter().map(|e| (e.name.as_str(), e)).collect(); - for child_path in children { + for child_path in &group.children { let mtp_path = self.to_mtp_path(child_path); let name = Path::new(&mtp_path).file_name().and_then(|n| n.to_str()).unwrap_or(""); @@ -768,6 +842,30 @@ fn map_mtp_error(e: MtpConnectionError) -> VolumeError { } } +/// Test-only call counter for `MtpVolume::list_directory`. The +/// `scan_for_copy_batch_with_progress` integration tests assert "exactly 2 +/// `list_directory` calls for 2 unique parents" without having to wrap the +/// volume (the override calls `self.list_directory` via static dispatch on +/// `MtpVolume`, so a wrapper Volume can't intercept it). +#[cfg(test)] +pub(super) mod test_hooks { + use std::sync::atomic::{AtomicUsize, Ordering}; + + static LIST_DIRECTORY_CALL_COUNT: AtomicUsize = AtomicUsize::new(0); + + pub(super) fn bump_list_directory_call_count() { + LIST_DIRECTORY_CALL_COUNT.fetch_add(1, Ordering::Relaxed); + } + + pub fn reset_list_directory_call_count() { + LIST_DIRECTORY_CALL_COUNT.store(0, Ordering::Relaxed); + } + + pub fn list_directory_call_count() -> usize { + LIST_DIRECTORY_CALL_COUNT.load(Ordering::Relaxed) + } +} + #[cfg(test)] mod tests { use super::*; @@ -842,4 +940,49 @@ mod tests { let vol = MtpVolume::new("mtp-20-5", 65537, "Test"); assert!(vol.supports_streaming()); } + + #[test] + fn test_listing_is_watched_false_when_device_not_connected() { + // Without `virtual-mtp`, we can still assert the negative case: a freshly + // created `MtpVolume` whose device_id was never connected returns false. + let vol = MtpVolume::new("mtp-never-connected-9999", 65537, "Test"); + assert!(!vol.listing_is_watched(Path::new("/DCIM"))); + } + + /// Connects to a virtual MTP device, asserts the oracle gate flips true, then + /// disconnects and asserts it flips false. Requires the `virtual-mtp` feature. + #[cfg(feature = "virtual-mtp")] + #[tokio::test(flavor = "multi_thread", worker_threads = 2)] + async fn test_listing_is_watched_flips_with_connection() { + use crate::mtp::virtual_device::setup_virtual_mtp_device; + + // Register a virtual device backed by a tmp dir. + let location_id = setup_virtual_mtp_device(); + let device_id = format!("mtp-{}", location_id); + + // Before connect: false. + let vol = MtpVolume::new(&device_id, 65537, "Test"); + assert!(!vol.listing_is_watched(Path::new("/")), "expected false before connect"); + + // Connect, then assert true. + let info = connection_manager() + .connect(&device_id, None) + .await + .expect("virtual-mtp connect should succeed"); + // Use whatever storage_id the virtual device reported (we don't care + // which storage; the gate is volume-level). + let storage_id = info.storages.first().expect("virtual device should have storages").id; + let vol = MtpVolume::new(&device_id, storage_id, "Test"); + assert!(vol.listing_is_watched(Path::new("/")), "expected true once connected"); + + // Disconnect, then assert false again. + connection_manager() + .disconnect(&device_id, None, crate::mtp::connection::MtpDisconnectReason::User) + .await + .expect("virtual-mtp disconnect should succeed"); + assert!( + !vol.listing_is_watched(Path::new("/")), + "expected false after disconnect" + ); + } } diff --git a/apps/desktop/src-tauri/src/file_system/volume/mtp_scan_oracle_tests.rs b/apps/desktop/src-tauri/src/file_system/volume/mtp_scan_oracle_tests.rs new file mode 100644 index 00000000..60ad1ca1 --- /dev/null +++ b/apps/desktop/src-tauri/src/file_system/volume/mtp_scan_oracle_tests.rs @@ -0,0 +1,249 @@ +//! Integration tests for the fresh-listing oracle layered on top of +//! `MtpVolume::scan_for_copy_batch_with_progress`. +//! +//! Two scenarios pinned: +//! +//! 1. **Oracle hit**: when the parent listing is watcher-backed (the device is +//! connected and `LISTING_CACHE` holds the entries), the MTP batch scan +//! reads child sizes from the cache and doesn't hit the device. We pin +//! this with a test-only call counter on `MtpVolume::list_directory` +//! (`super::mtp::test_hooks`): zero calls after the scan. +//! 2. **Cold cache, parent-grouped**: when there's no cached listing, the +//! existing parent-grouping optimization still runs. 4 children sharing +//! parent `A` + 2 children sharing parent `B` collapse to exactly 2 +//! `list_directory` calls, not 6. This is the load-bearing perf for the +//! selected-many-photos-in-one-folder workflow. +//! +//! Both live behind the `virtual-mtp` feature so a real `MtpVolume` (with its +//! own override) runs end-to-end against a backing-dir-shaped virtual device. + +#![cfg(feature = "virtual-mtp")] + +use std::path::{Path, PathBuf}; +use std::sync::Arc; +use std::sync::atomic::{AtomicU64, Ordering}; + +use crate::file_system::get_volume_manager; +use crate::file_system::listing::caching::{CachedListing, LISTING_CACHE}; +use crate::file_system::listing::metadata::FileEntry; +use crate::file_system::listing::sorting::{DirectorySortMode, SortColumn, SortOrder}; +use crate::file_system::volume::{MtpVolume, Volume}; +use crate::mtp::connection::{MtpDisconnectReason, connection_manager}; +use crate::mtp::virtual_device::setup_virtual_mtp_device; + +use super::mtp::test_hooks; + +// `setup_virtual_mtp_device` wipes and recreates a shared backing-dir fixture +// root at `/tmp/cmdr-mtp-e2e-fixtures`. Tests in this module + the existing +// `test_listing_is_watched_flips_with_connection` in `mtp.rs` would clobber +// each other if nextest scheduled them concurrently. They run inside the +// `virtual-mtp` test-group (max-threads = 1) per `.config/nextest.toml`. + +/// Unique-per-test counter so parallel tests don't collide in the listing cache. +fn unique(suffix: &str) -> String { + static N: AtomicU64 = AtomicU64::new(0); + format!( + "mtp_oracle_{}_{}_{}", + suffix, + std::process::id(), + N.fetch_add(1, Ordering::Relaxed) + ) +} + +fn make_file_entry(name: &str, parent: &str, size: u64) -> FileEntry { + FileEntry { + size: Some(size), + permissions: 0o644, + owner: "test".to_string(), + group: "staff".to_string(), + extended_metadata_loaded: true, + ..FileEntry::new( + name.to_string(), + format!("{}/{}", parent.trim_end_matches('/'), name), + false, + false, + ) + } +} + +fn insert_listing(id: &str, volume_id: &str, path: &str, entries: Vec) { + let mut cache = LISTING_CACHE.write().unwrap(); + cache.insert( + id.to_string(), + CachedListing { + volume_id: volume_id.to_string(), + path: PathBuf::from(path), + entries, + sort_by: SortColumn::Name, + sort_order: SortOrder::Ascending, + directory_sort_mode: DirectorySortMode::LikeFiles, + sequence: AtomicU64::new(1), + created_at: std::time::Instant::now(), + }, + ); +} + +fn remove_listing(id: &str) { + let mut cache = LISTING_CACHE.write().unwrap(); + cache.remove(id); +} + +/// Connects the virtual MTP device, builds an `MtpVolume` for its first +/// storage, and returns `(device_id, volume, volume_id)`. The volume_id format +/// matches what `MtpVolume::new` computes internally +/// (`"{device_id}:{storage_id}"`); see `mtp/CLAUDE.md` § Volume IDs. +async fn connect_virtual_device() -> (String, Arc, String) { + let location_id = setup_virtual_mtp_device(); + let device_id = format!("mtp-{}", location_id); + let info = connection_manager() + .connect(&device_id, None) + .await + .expect("virtual-mtp connect"); + let storage_id = info.storages.first().expect("at least one virtual storage").id; + let vol = Arc::new(MtpVolume::new(&device_id, storage_id, "Test")); + let volume_id = format!("{}:{}", device_id, storage_id); + (device_id, vol, volume_id) +} + +/// Test 1: on oracle hit, the MTP override skips its `list_directory` call +/// for the (sole) watcher-backed parent. The cached sizes flow into the +/// aggregate; no MTP I/O happens for those entries. +#[tokio::test(flavor = "multi_thread", worker_threads = 2)] +async fn mtp_scan_uses_oracle_on_hit_skips_list_directory() { + let (device_id, vol, vid) = connect_virtual_device().await; + // Register the volume so the oracle's `VolumeManager::get(vid)` finds it + // and the `listing_is_watched` gate returns true (device connected). + get_volume_manager().register(&vid, vol.clone() as Arc); + + // Pre-populate `LISTING_CACHE` for the parent with sizes that don't match + // any real file on the virtual device. If the oracle short-circuit fails + // (override calls `list_directory` and uses real sizes), `total_bytes` + // would be the device's real numbers, not these cached ones. + let lid = unique("hit"); + let cached = vec![ + make_file_entry("a.jpg", "/DCIM", 1000), + make_file_entry("b.jpg", "/DCIM", 2000), + make_file_entry("c.jpg", "/DCIM", 3000), + ]; + insert_listing(&lid, &vid, "/DCIM", cached); + + // Sanity-check the oracle gate. Without this, an unrelated regression in + // `listing_is_watched` would make the test claim the wrong cause. + assert!( + vol.listing_is_watched(Path::new("/DCIM")), + "virtual device must report connected (listing watched)" + ); + + let paths = vec![ + PathBuf::from("/DCIM/a.jpg"), + PathBuf::from("/DCIM/b.jpg"), + PathBuf::from("/DCIM/c.jpg"), + ]; + + test_hooks::reset_list_directory_call_count(); + let result = vol + .scan_for_copy_batch_with_progress(&paths, None) + .await + .expect("oracle-served batch scan"); + + assert_eq!( + test_hooks::list_directory_call_count(), + 0, + "expected zero MtpVolume::list_directory calls on oracle hit" + ); + // Cached sizes (not device sizes) win. + assert_eq!(result.aggregate.file_count, 3); + assert_eq!(result.aggregate.total_bytes, 6000); + assert_eq!(result.per_path.len(), 3); + + remove_listing(&lid); + get_volume_manager().unregister(&vid); + connection_manager() + .disconnect(&device_id, None, MtpDisconnectReason::User) + .await + .expect("virtual-mtp disconnect"); +} + +/// Test 2: no cached listing → the cold-cache parent-grouping optimization +/// runs. Two unique parents, multiple children under each → exactly 2 +/// `list_directory` calls. +#[tokio::test(flavor = "multi_thread", worker_threads = 2)] +async fn mtp_scan_cold_cache_still_uses_parent_grouping() { + let (device_id, vol, vid) = connect_virtual_device().await; + get_volume_manager().register(&vid, vol.clone() as Arc); + + // MTP needs the parent's path-handle cached before it can list any path + // (`resolve_path_to_handle` is cache-only; only `/` is auto-known). Walk + // root first so `/Documents` and `/DCIM` get into the path-handle cache. + // We don't care about the entries here, just the side effect on the cache. + let root = vol.list_directory(Path::new("/"), None).await.expect("listing /"); + assert!( + root.iter().any(|e| e.name == "Documents") && root.iter().any(|e| e.name == "DCIM"), + "expected Documents/ and DCIM/ at root of virtual device fixture" + ); + + // Sanity-verify the children exist before relying on them in the scan. + // These two listings also seed the listing cache (5 s TTL) for `/Documents` + // and `/DCIM`, so we clear that cache below before the actual scan to + // ensure the cold path runs. + let documents = vol + .list_directory(Path::new("/Documents"), None) + .await + .expect("listing /Documents"); + let dcim = vol + .list_directory(Path::new("/DCIM"), None) + .await + .expect("listing /DCIM"); + assert!( + documents.iter().any(|e| e.name == "report.txt") && documents.iter().any(|e| e.name == "notes.txt"), + "expected report.txt and notes.txt in /Documents fixture" + ); + assert!( + dcim.iter().any(|e| e.name == "photo-001.jpg"), + "expected photo-001.jpg in /DCIM fixture" + ); + + // Clear the mtp-rs listing cache so the override's `list_directory` + // calls actually hit USB (rather than the cache) — the override invokes + // `MtpVolume::list_directory` which counts via `test_hooks`, but the + // assertion is structural ("called exactly twice"), not "did real I/O". + // The path-handle cache stays primed; only the listing cache is dropped. + connection_manager().clear_all_listing_caches().await; + test_hooks::reset_list_directory_call_count(); + + // 4 children under /Documents (duplicates are intentional: even a + // 100-photo-pick should produce one parent listing, not 100), 2 under + // /DCIM. Total: 6 input paths, 2 unique parents. + let paths = vec![ + PathBuf::from("/Documents/report.txt"), + PathBuf::from("/Documents/notes.txt"), + PathBuf::from("/Documents/report.txt"), + PathBuf::from("/Documents/notes.txt"), + PathBuf::from("/DCIM/photo-001.jpg"), + PathBuf::from("/DCIM/photo-001.jpg"), + ]; + + let result = vol + .scan_for_copy_batch_with_progress(&paths, None) + .await + .expect("cold batch scan"); + + assert_eq!( + test_hooks::list_directory_call_count(), + 2, + "expected exactly 2 MtpVolume::list_directory calls (one per unique parent)" + ); + // Sanity: every unique input resolved. + let unique_inputs: std::collections::HashSet<&Path> = paths.iter().map(|p| p.as_path()).collect(); + assert_eq!( + result.per_path.len(), + unique_inputs.len(), + "per_path should have one entry per unique input path" + ); + + get_volume_manager().unregister(&vid); + connection_manager() + .disconnect(&device_id, None, MtpDisconnectReason::User) + .await + .expect("virtual-mtp disconnect"); +} diff --git a/apps/desktop/src-tauri/src/file_system/volume/smb.rs b/apps/desktop/src-tauri/src/file_system/volume/smb.rs index dd00e091..266c5902 100644 --- a/apps/desktop/src-tauri/src/file_system/volume/smb.rs +++ b/apps/desktop/src-tauri/src/file_system/volume/smb.rs @@ -10,6 +10,7 @@ use super::{ VolumeReadStream, path_to_id, }; use crate::file_system::listing::FileEntry; +use crate::file_system::listing::caching::try_get_watched_listing; use log::{debug, info, warn}; use smb2::client::tree::Tree; use smb2::{ClientConfig, SmbClient}; @@ -134,6 +135,15 @@ fn fs_info_to_space_info(info: &smb2::client::tree::FsInfo) -> SpaceInfo { /// Converts an `smb2::Error` to `VolumeError`. fn map_smb_error(err: smb2::Error) -> VolumeError { use smb2::ErrorKind; + use smb2::types::status::NtStatus; + + // `STATUS_DELETE_PENDING` currently classifies as `ErrorKind::Other` in + // smb2 (no typed variant yet), so we detect it via the raw NTSTATUS before + // falling through to the generic kind match. + if err.status() == Some(NtStatus::DELETE_PENDING) { + return VolumeError::DeletePending(err.to_string()); + } + match err.kind() { ErrorKind::NotFound => VolumeError::NotFound(err.to_string()), ErrorKind::AlreadyExists => VolumeError::AlreadyExists(err.to_string()), @@ -281,6 +291,18 @@ impl SmbVolume { &self.volume_id } + /// Test-only: drops the smb2 client session. After calling this, any code + /// path that tries to acquire the client mutex sees `None` and returns + /// `VolumeError::DeviceDisconnected`. Used by + /// `smb_scan_oracle_tests::smb_scan_uses_oracle_on_hit_skips_stat_pipeline` + /// to prove the oracle short-circuit doesn't touch the SMB session: if it + /// did, the scan would fail with DeviceDisconnected after this call. + #[cfg(test)] + pub(super) async fn detach_session_for_test(&self) { + let mut client_guard = self.client.lock().await; + *client_guard = None; + } + /// Returns the current connection state. pub fn connection_state(&self) -> ConnectionState { ConnectionState::from_u8(self.state.load(Ordering::Relaxed)) @@ -1131,6 +1153,22 @@ impl Volume for SmbVolume { false } + fn listing_is_watched(&self, _path: &Path) -> bool { + // SMB watching is volume-level: the smb_watcher monitors the whole share + // via CHANGE_NOTIFY. So once the watcher is alive and the session is + // Direct, every cached listing on this volume is oracle-eligible. + // `watcher_cancel` is a std `Mutex` (not async): use `try_lock` and treat + // contention as "not watched" to keep the oracle out of the lock-wait path. + // The oracle will simply fall through to a real read; that's the safe + // direction. Don't hold the lock across awaits (we never `.await` here + // anyway: this is a sync method). + let has_watcher = match self.watcher_cancel.try_lock() { + Ok(guard) => guard.is_some(), + Err(_) => return false, + }; + has_watcher && self.connection_state() == ConnectionState::Direct + } + fn notify_mutation<'a>( &'a self, _volume_id: &'a str, @@ -1460,14 +1498,110 @@ impl Volume for SmbVolume { }); } + // Oracle short-circuit: group inputs by parent and ask + // `try_get_watched_listing` for each unique parent. Any path whose + // parent is watcher-backed gets its size + is_directory from the + // cached `FileEntry` (no SMB stat). Remaining paths fall through + // to the pipelined-stat flow below. Decision is per-parent: one + // call can mix oracle-served paths with pipelined-stat paths. + // + // SMB stats are per-path (not per-parent listing), so the grouping + // here is purely about oracle eligibility; the fallthrough path + // doesn't need parent grouping itself. + let mut per_path_results: Vec> = (0..paths.len()).map(|_| None).collect(); + let mut leftover_indices: Vec = Vec::with_capacity(paths.len()); + { + use std::collections::HashMap; + // Cache oracle lookups so two paths sharing a parent only pay + // one cache scan + clone. Value: indexed-by-name view over the + // cached entries, or None if the oracle missed for this parent. + let mut parent_cache: HashMap>> = HashMap::new(); + for (idx, path) in paths.iter().enumerate() { + let original_parent = path.parent().unwrap_or(Path::new("")).to_path_buf(); + let entries = parent_cache + .entry(original_parent.clone()) + .or_insert_with(|| try_get_watched_listing(&self.volume_id, &original_parent)); + + let Some(cached_entries) = entries.as_ref() else { + leftover_indices.push(idx); + continue; + }; + + let Some(name) = path.file_name().and_then(|n| n.to_str()) else { + leftover_indices.push(idx); + continue; + }; + + let Some(entry) = cached_entries.iter().find(|e| e.name == name) else { + // Cache doesn't know this child (stale selection, + // encoding mismatch). Fall through to a real stat for + // safety rather than reporting it as missing. + leftover_indices.push(idx); + continue; + }; + + if entry.is_directory { + // Directories still need a recursive scan to count + // descendants. The oracle just told us "this is a + // dir without an SMB stat"; recurse to expand it. + let smb_path = self.to_smb_path(path); + let scan = self.scan_recursive(&smb_path).await?; + per_path_results[idx] = Some(scan); + } else { + per_path_results[idx] = Some(CopyScanResult { + file_count: 1, + dir_count: 0, + total_bytes: entry.size.unwrap_or(0), + top_level_is_directory: false, + }); + } + } + + if !leftover_indices.is_empty() { + debug!( + "SmbVolume::scan_for_copy_batch: share={}, oracle resolved {}/{} paths; pipelining stats for {}", + self.share_name, + paths.len() - leftover_indices.len(), + paths.len(), + leftover_indices.len() + ); + } + } + + // All paths resolved via oracle: assemble the result and skip the + // pipelined-stat machinery entirely. + if leftover_indices.is_empty() { + let mut aggregate = CopyScanResult { + file_count: 0, + dir_count: 0, + total_bytes: 0, + top_level_is_directory: false, + }; + let mut per_path = Vec::with_capacity(paths.len()); + for (i, slot) in per_path_results.into_iter().enumerate() { + let scan = slot.expect("oracle path must have populated every index"); + aggregate.file_count += scan.file_count; + aggregate.dir_count += scan.dir_count; + aggregate.total_bytes += scan.total_bytes; + per_path.push((paths[i].clone(), scan)); + } + return Ok(BatchScanResult { aggregate, per_path }); + } + // Pre-compute SMB paths so the pipelined stats can borrow strings - // that outlive the futures' lifetimes. - let smb_paths: Vec = paths.iter().map(|p| self.to_smb_path(p)).collect(); + // that outlive the futures' lifetimes. We compute them for the + // leftover indices only so an oracle-only path costs zero + // `to_smb_path` calls below. + let smb_paths: Vec<(usize, String)> = leftover_indices + .iter() + .map(|&idx| (idx, self.to_smb_path(&paths[idx]))) + .collect(); debug!( - "SmbVolume::scan_for_copy_batch: share={}, {} paths; pipelining stats", + "SmbVolume::scan_for_copy_batch: share={}, {} paths leftover for pipelined stats (oracle handled {})", self.share_name, - paths.len() + smb_paths.len(), + paths.len() - smb_paths.len() ); // Build N pipelined stats: one cloned `Connection` per path, no @@ -1489,7 +1623,8 @@ impl Volume for SmbVolume { type StatFuture = Pin)> + Send>>; let mut stat_futs: FuturesUnordered = FuturesUnordered::new(); - for (idx, smb_path) in smb_paths.iter().enumerate() { + for (idx, smb_path) in &smb_paths { + let idx = *idx; if smb_path.is_empty() { // Root: no stat needed. Inline a ready future so the // ordering logic below still sees a slot for this index. @@ -1514,9 +1649,9 @@ impl Volume for SmbVolume { })); } - // Stage per-path scan results + "recurse later" list while - // draining the pipelined stats as they complete. - let mut per_path_results: Vec> = (0..paths.len()).map(|_| None).collect(); + // `per_path_results` is already shaped to the input length and + // pre-populated with oracle-resolved entries; the pipelined-stat + // path below only fills the still-None slots. // Indices to recurse into after the stat batch finishes. let mut dirs_to_recurse: Vec = Vec::new(); @@ -1561,8 +1696,15 @@ impl Volume for SmbVolume { // future "Fix 5" (pipelined directory recursion). For the 100 × // tiny-file scenario all sources are files, so this loop is never // entered. + // `smb_paths` is `Vec<(idx, String)>` keyed by the leftover index; + // build a lookup so dir-recursion can find each path by its + // original input index. + let smb_path_by_idx: std::collections::HashMap = + smb_paths.iter().map(|(i, s)| (*i, s.as_str())).collect(); for idx in dirs_to_recurse { - let smb_path = &smb_paths[idx]; + let smb_path = smb_path_by_idx + .get(&idx) + .expect("dirs_to_recurse only carries indices from the leftover stat batch"); let scan = self.scan_recursive(smb_path).await?; per_path_results[idx] = Some(scan); } @@ -1989,6 +2131,26 @@ mod tests { assert!(matches!(ve, VolumeError::NotFound(_))); } + #[test] + fn map_smb_error_delete_pending() { + // STATUS_DELETE_PENDING surfaces when a delete has been requested but at + // least one open handle is keeping the file alive. smb2 currently classifies + // it as `ErrorKind::Other`, so `map_smb_error` must dispatch on the raw + // NTSTATUS to produce the typed `VolumeError::DeletePending` variant — + // otherwise the FE falls back to the generic "disk needs attention" copy + // instead of the transient "file is being removed" message. + let err = smb2::Error::Protocol { + status: smb2::types::status::NtStatus::DELETE_PENDING, + command: smb2::types::Command::Create, + }; + let ve = map_smb_error(err); + assert!( + matches!(ve, VolumeError::DeletePending(_)), + "STATUS_DELETE_PENDING should map to VolumeError::DeletePending, got: {:?}", + ve, + ); + } + #[test] fn map_smb_error_access_denied() { let err = smb2::Error::Protocol { @@ -2262,6 +2424,62 @@ mod tests { assert_eq!(vol.connection_state(), ConnectionState::Direct); } + #[test] + fn listing_is_watched_false_when_disconnected() { + // No watcher_cancel set and state Disconnected: false. + let vol = make_test_volume(); + assert!(!vol.listing_is_watched(Path::new("/"))); + } + + #[test] + fn listing_is_watched_false_when_direct_but_no_watcher() { + // State Direct but `watcher_cancel` empty: still false (we need both). + let vol = make_test_volume_direct(); + assert!(!vol.listing_is_watched(Path::new("/"))); + } + + #[test] + fn listing_is_watched_false_when_watcher_set_but_disconnected() { + // `watcher_cancel` populated but state Disconnected: false. + let vol = make_test_volume(); + let (tx, _rx) = tokio::sync::oneshot::channel::<()>(); + *vol.watcher_cancel.lock().unwrap() = Some(tx); + assert!(!vol.listing_is_watched(Path::new("/"))); + } + + #[test] + fn listing_is_watched_true_when_direct_and_watcher_set() { + // Both conditions met: true. + let vol = make_test_volume_direct(); + let (tx, _rx) = tokio::sync::oneshot::channel::<()>(); + *vol.watcher_cancel.lock().unwrap() = Some(tx); + assert!(vol.listing_is_watched(Path::new("/"))); + } + + #[tokio::test] + #[ignore = "Requires Docker SMB containers (./apps/desktop/test/smb-servers/start.sh)"] + async fn smb_integration_listing_is_watched_flips_with_connection() { + // End-to-end check against a live Docker SMB server: after + // `connect_smb_volume`, the watcher is spawned and state is Direct, + // so the oracle gate returns true. After flipping the state to + // Disconnected (simulating a ConnectionLost event), the gate flips + // false even though `watcher_cancel` is still set: the contract is + // "watcher present AND Direct," and a half-broken volume must not be + // treated as fresh. + let vol = make_docker_volume().await; + assert_eq!(vol.connection_state(), ConnectionState::Direct); + assert!( + vol.listing_is_watched(Path::new("/")), + "expected true on a freshly-connected Docker volume" + ); + + vol.transition_to_disconnected(); + assert!( + !vol.listing_is_watched(Path::new("/")), + "expected false after transitioning to Disconnected" + ); + } + #[test] fn on_unmount_marks_volume_dead() { // `on_unmount` is sync (called from FSEvents thread) and uses diff --git a/apps/desktop/src-tauri/src/file_system/volume/smb_scan_oracle_tests.rs b/apps/desktop/src-tauri/src/file_system/volume/smb_scan_oracle_tests.rs new file mode 100644 index 00000000..68514207 --- /dev/null +++ b/apps/desktop/src-tauri/src/file_system/volume/smb_scan_oracle_tests.rs @@ -0,0 +1,135 @@ +//! Integration test for the fresh-listing oracle layered on top of +//! `SmbVolume::scan_for_copy_batch`. +//! +//! Pinned scenario: when the parent listing is watcher-backed, the SMB batch +//! scan resolves child sizes from the cache and skips the pipelined-stat +//! path entirely. We prove this by **dropping the smb2 session before the +//! scan**: if the oracle handled every path, no `tree.stat` call happens and +//! the scan still succeeds. If the oracle misses, the pipelined-stat block +//! tries to acquire the client mutex, finds `None`, and returns +//! `VolumeError::DeviceDisconnected`. That's a strong signal that doesn't +//! require plumbing a call counter through the smb2 client. +//! +//! Gated like the other Docker SMB tests with `#[ignore]` so a regular +//! `cargo nextest run` won't fail when no containers are running. Run via +//! `cargo nextest run smb_integration_scan_oracle --run-ignored all` after +//! starting `apps/desktop/test/smb-servers/start.sh`. + +use std::path::PathBuf; +use std::sync::Arc; +use std::sync::atomic::{AtomicU64, Ordering}; + +use crate::file_system::get_volume_manager; +use crate::file_system::listing::caching::{CachedListing, LISTING_CACHE}; +use crate::file_system::listing::metadata::FileEntry; +use crate::file_system::listing::sorting::{DirectorySortMode, SortColumn, SortOrder}; +use crate::file_system::volume::Volume; +use crate::file_system::volume::smb::{SmbVolume, connect_smb_volume}; + +fn unique(suffix: &str) -> String { + static N: AtomicU64 = AtomicU64::new(0); + format!( + "smb_oracle_{}_{}_{}", + suffix, + std::process::id(), + N.fetch_add(1, Ordering::Relaxed) + ) +} + +fn make_file_entry(name: &str, parent: &str, size: u64) -> FileEntry { + FileEntry { + size: Some(size), + permissions: 0o644, + owner: "test".to_string(), + group: "staff".to_string(), + extended_metadata_loaded: true, + ..FileEntry::new( + name.to_string(), + format!("{}/{}", parent.trim_end_matches('/'), name), + false, + false, + ) + } +} + +fn insert_listing(id: &str, volume_id: &str, path: &str, entries: Vec) { + let mut cache = LISTING_CACHE.write().unwrap(); + cache.insert( + id.to_string(), + CachedListing { + volume_id: volume_id.to_string(), + path: PathBuf::from(path), + entries, + sort_by: SortColumn::Name, + sort_order: SortOrder::Ascending, + directory_sort_mode: DirectorySortMode::LikeFiles, + sequence: AtomicU64::new(1), + created_at: std::time::Instant::now(), + }, + ); +} + +fn remove_listing(id: &str) { + let mut cache = LISTING_CACHE.write().unwrap(); + cache.remove(id); +} + +async fn make_docker_volume() -> SmbVolume { + let port: u16 = std::env::var("SMB_CONSUMER_GUEST_PORT") + .ok() + .and_then(|v| v.parse().ok()) + .unwrap_or(10480); + connect_smb_volume("public", "/tmp/smb-test-mount", "127.0.0.1", "public", None, None, port) + .await + .unwrap_or_else(|e| { + panic!("Failed to connect to Docker SMB container at 127.0.0.1:{port}. Is it running? ({e:?})") + }) +} + +#[tokio::test] +#[ignore = "Requires Docker SMB containers (./apps/desktop/test/smb-servers/start.sh)"] +async fn smb_scan_uses_oracle_on_hit_skips_stat_pipeline() { + let vol = Arc::new(make_docker_volume().await); + let vid = vol.volume_id().to_string(); + + get_volume_manager().register(&vid, vol.clone() as Arc); + + // Cache a listing for a synthetic parent path. The actual share doesn't + // need to host these files: the oracle short-circuit reads from cache, + // and we'll drop the SMB session so any fallthrough would fail. + let parent = "/Volumes/TestShare/oracle-test"; + let lid = unique("hit"); + let cached = vec![ + make_file_entry("a.bin", parent, 4096), + make_file_entry("b.bin", parent, 8192), + ]; + insert_listing(&lid, &vid, parent, cached); + + // Now break the session. With no client, any pipelined stat fails with + // DeviceDisconnected. The oracle path doesn't acquire the client lock at + // all for served paths, so it survives. + vol.detach_session_for_test().await; + + let paths = vec![ + PathBuf::from(format!("{}/a.bin", parent)), + PathBuf::from(format!("{}/b.bin", parent)), + ]; + + let result = vol + .scan_for_copy_batch(&paths) + .await + .expect("oracle-served batch scan should succeed with no SMB session"); + + assert_eq!(result.aggregate.file_count, 2); + assert_eq!( + result.aggregate.total_bytes, 12288, + "size should come from cached entries, not real SMB stat" + ); + assert_eq!(result.per_path.len(), 2); + + remove_listing(&lid); + get_volume_manager().unregister(&vid); + // No need to clean up the share: nothing was written. + // Note: we don't transition the volume back to Direct here. The volume is + // about to be dropped (Arc), and `Drop` doesn't restart the session. +} diff --git a/apps/desktop/src-tauri/src/file_system/write_operations/CLAUDE.md b/apps/desktop/src-tauri/src/file_system/write_operations/CLAUDE.md index 73bf9e87..1d7448f7 100644 --- a/apps/desktop/src-tauri/src/file_system/write_operations/CLAUDE.md +++ b/apps/desktop/src-tauri/src/file_system/write_operations/CLAUDE.md @@ -8,6 +8,10 @@ Implements the four destructive file operations as background tasks that stream operation is cancellable, reports byte-level progress, and handles edge cases: symlink loops, same-inode overwrites, network mounts, cross-filesystem moves, and name/path length limits. +Pre-flight scans reuse cached listings when the source volume reports an active watcher, avoiding redundant +`list_directory` calls. The freshness contract and per-backend debounce windows are documented in +`../volume/CLAUDE.md` and `../listing/caching.rs::try_get_watched_listing`. + ## Files | File | Responsibility | @@ -20,7 +24,7 @@ network mounts, cross-filesystem moves, and name/path length limits. | `scan_preview.rs` | Scan preview subsystem for Copy dialog live stats: `start_scan_preview`, `cancel_scan_preview`, `is_scan_preview_complete`. Background scans (local and volume-based) with result caching. Emits `expected_files_total` / `expected_bytes_total` (sampled once at scan start from the drive index) on every `scan-preview-progress` event, alongside the running tallies and `current_dir`. | | `copy.rs` | `copy_files_with_progress`: scan → disk space check → per-file copy via `copy_single_item`. `CopyTransaction` for rollback. The per-source execute loop runs through `drive_transfer_serial_sync` (`transfer_driver.rs`); the closure captures `&mut transaction` / `&mut created_dirs` / `&mut tracker` / `&mut apply_to_all_resolution` and threads them into `copy_single_item`. Pre-flight scan / dry-run / disk-space / bulk-skip filtering stay outside the driver. Post-loop dispatch matches on `PostLoopIntent` (Completed / Cancelled / Failed) and reproduces the historic three-arm shape — including the post-completion `RollingBack` recheck for the click-during-the-last-millisecond race (commit `1de4255d`). | | `move_op.rs` | Same-fs: `fs::rename`. Cross-fs: copy to `.cmdr-staging-`, atomic rename, delete sources. | -| `delete.rs` | Scan, delete files first, then directories in reverse/deepest-first order. Not rollbackable. Also contains `delete_volume_files_with_progress` for non-local volumes (MTP): scans via `volume.list_directory(path, Some(&cb))` (per-entry throttled progress so the FE tally climbs mid-listing on slow MTP roundtrips), deletes via `volume.delete()` per item. Shared cumulative tally lives in an `Arc` (atomics for files/dirs/bytes + `Mutex` throttle) so the per-entry callback and the post-subtree snapshot agree across recursion levels. Both emit paths use `with_scan_meta(current_dir, dirs_done, None)` so the scanning UI shows the dir count and the directory the walker is currently in. | +| `delete.rs` | Scan, delete files first, then directories in reverse/deepest-first order. Not rollbackable. Also contains `delete_volume_files_with_progress` for non-local volumes (MTP): consumes the scan preview via `take_cached_scan_result(preview_id)` first (top-level files come straight from `CopyScanResult` with no `is_directory` probe, top-level dirs recurse via the oracle-aware walker); on no-preview paths (MCP, programmatic) the top-level `is_directory(source)` probe stays unless the source's parent is watcher-fresh in `LISTING_CACHE`, in which case the type comes from the cached entry. The walker (`scan_volume_recursive`) consults `try_get_watched_listing(volume_id, path)` before every `list_directory`, so any subtree open in another pane is cache-fed. Scans via `volume.list_directory(path, Some(&cb))` (per-entry throttled progress so the FE tally climbs mid-listing on slow MTP roundtrips), deletes via `volume.delete()` per item. Shared cumulative tally lives in an `Arc` (atomics for files/dirs/bytes + `Mutex` throttle) so the per-entry callback and the post-subtree snapshot agree across recursion levels. Both emit paths use `with_scan_meta(current_dir, dirs_done, None)` so the scanning UI shows the dir count and the directory the walker is currently in. | | `eta.rs` | `EtaEstimator`: time-weighted EWMA per axis (bytes, files), τ ≈ 3 s. Combines via `max(ETA_bytes, ETA_files)`. One per `WriteOperationState`, fed by `state.enrich_progress` at every `write-progress` emit site. See [ETA + throughput](#eta--throughput) below. | | `trash.rs` | `move_to_trash_sync()` (macOS: ObjC `trashItemAtURL`; Linux: `trash` crate; reused by `commands/rename.rs`) and `trash_files_with_progress()` (batch trash with per-item progress, cancellation, partial failure). Uses `symlink_metadata()` for existence checks (handles dangling symlinks). | | `copy_strategy.rs` | Strategy selection per file: network FS → chunked copy; overwrite → temp+rename; macOS → `copyfile(3)`; Linux → `copy_file_range(2)`. | @@ -34,6 +38,7 @@ network mounts, cross-filesystem moves, and name/path length limits. | `tests.rs` | Unit tests. | | `copy_integration_test.rs` | Copy operation integration tests (permissions, symlinks, xattrs, edge cases). | | `delete_integration_test.rs` | Delete operation integration tests. | +| `delete_volume_reuse_tests.rs` | Volume-delete tests for scan-preview reuse and oracle fast paths (M3 of `fresh-listing-reuse-plan.md`). | | `move_integration_test.rs` | Same-fs and cross-fs move integration tests. | | `transaction_integration_test.rs` | CopyTransaction record/rollback/commit tests. | | `validation_integration_test.rs` | Validation functions, safety checks, path length, disk space tests. | @@ -49,6 +54,10 @@ Frontend → tokio::task::spawn_blocking (local I/O) or direct async (volume ops) → validate (sources exist, dest writable, not same location, dest not inside source) → scan phase: walk_dir_recursive, emit scan-progress events + (delete on a volume also: `take_cached_scan_result(preview_id)` first; + on hit, build the entry list from `per_path` — top-level files come + straight from the cache, top-level dirs recurse via the oracle-aware + walker; on miss, fall through to `scan_volume_recursive`) → disk space check (statvfs) → execute phase: per-file copy/delete → throttled write-progress events (200ms default) @@ -230,9 +239,15 @@ exits, partial files or staging directories may remain on disk. These use the `. **Decision**: `copy_files_with_progress_inner` aligns `scan_result.files` to the driver's `&[PathBuf]` API via a paired `Vec<&FileInfo>` and a closure-captured `slice::Iter` advanced in lock-step with the driver iteration. **Why**: The sync driver iterates a generic `&[PathBuf]`, but the local-FS copy loop needs the full `FileInfo` (for `dest_path`, `is_symlink`, `size`, and the `SourceItemTracker` key). Three alternatives were rejected: (a) indexing into `scan_result.files` by `ctx.files_done_so_far` — wrong, the cumulative counter is bytes-affecting and includes bulk-skipped files, so the index would shift; (b) extending `TransferContext` with a generic associated payload — couples the driver to local-FS specifics; (c) cloning the `FileInfo` slice for `sources` — copies on the hot path. The iterator approach is O(0) memory beyond the path vec and matches the driver's iteration order exactly (`pre_skip_paths` is empty because we pre-filter `scan_result.files` ourselves, so the driver invokes the closure once per surviving file). The `.expect()` is justified inline; if the driver ever stopped calling the closure once per source the test suite would break. +**Decision**: Scan preview reuses watched listings (the "fresh-listing oracle"). +**Why**: Pre-flight scans for copy/move on MTP (and to a lesser degree SMB and big local trees) used to duplicate work the backend already had in `LISTING_CACHE`. Selecting 135 photos in a watched `/DCIM/Camera` (~15k entries) and pressing F5 would re-list the parent dir over USB just to look up size by name — ~17 s of "Verifying before copy…" while the listing was already fresh on the pane behind the dialog. `run_volume_scan_preview` now groups input sources by parent dir and consults `try_get_watched_listing(volume_id, parent)` first. On hit, sizes and `is_directory` flags come from the cached `FileEntry` for top-level files; top-level directories recurse via `scan_subtree_with_oracle`, which re-applies the oracle at every level (so a subfolder open in another pane also short-circuits). On miss, the call falls through to `volume.scan_for_copy_batch_with_progress(paths_in_group, ...)` — same code path as before — so MTP's parent-grouping and SMB's pipelined-stat optimizations still run for cold-cache parents. The local-FS walker (`walk_dir_recursive` in `scan.rs`) also takes an oracle check at the top of each recursive call, with `volume_id = "root"` plumbed through from `scan_sources_internal` and `run_scan_preview`. The freshness contract is bright-line at the watcher boundary: no "5 seconds is fresh enough" TTL, just "the volume's `listing_is_watched(path)` returned true." See `file_system/listing/caching.rs::try_get_watched_listing` for the per-backend debounce windows that contract tolerates. + **Decision**: `delete_files_start` routes to either `delete_files_with_progress` (local, uses `walkdir` + `fs::remove_file`) or `delete_volume_files_with_progress` (non-local, uses `Volume` trait) based on `volume_id`. **Why**: MTP volumes can't use `walkdir` or `fs::remove_*`. Rather than refactoring the existing local delete to go through the Volume trait (which would add overhead for local ops), we keep the fast local path and add a parallel volume-aware path. Both emit identical events so the frontend progress dialog works unchanged. +**Decision**: Volume delete reuses the scan preview and is oracle-aware on the no-preview path. +**Why**: Before this, `delete_volume_files_with_progress_inner` ignored `config.preview_id` entirely and ran `scan_volume_recursive` again. On MTP that meant a second 17 s parent listing for a 135-photo `/DCIM/Camera` delete after the user had just paid that cost in the pre-flight dialog — and the second scan emitted no per-top-level-file progress, so the UI looked frozen. The fix has three parts. (1) `delete_volume_files_with_progress_inner` calls `take_cached_scan_result(preview_id)` at the top; on hit, top-level files are recorded from `CopyScanResult::total_bytes` with no `is_directory` probe and no `list_directory` round-trip, and top-level dirs recurse via the oracle-aware `scan_volume_recursive` (passing `is_dir_hint = Some(true)` so the recursion never re-probes). (2) The walker's internal `volume.list_directory(path, ...)` is now preceded by `try_get_watched_listing(volume_id, path)`; on hit, the cached entries replace the volume call entirely at every recursion level. (3) On the no-preview path (MCP triggers, programmatic deletes), the top-level `volume.is_directory(source)` probe stays only when the parent oracle misses — when a pane has the source's parent open and watcher-fresh, the type comes from the cached `FileEntry` and the probe is skipped. The cache-hit path also emits a throttled scan-progress event per `progress_interval` while building the entry list, so the FE dialog shows movement during the fast path instead of waiting for the delete phase to start. Pinned by `delete_volume_reuse_tests.rs`. Data-safety contract: stale-by-one cached entries can either silently skip a now-gone file (acceptable: the user already moved it elsewhere) or attempt to delete a missing one (the volume's `delete` errors cleanly). Neither direction can delete the wrong file because we feed `volume.delete(&entry.path)` exact paths the cache observed; a cached entry that races with a concurrent rename ends up addressing the old path the next call won't find. + **Decision**: Keep `exacl` crate for ACL copy in chunked copies (not custom FFI bindings). **Why**: `exacl` adds zero new transitive dependencies (all of its deps, `bitflags`, `log`, `scopeguard`, `uuid`, are already in our tree). It provides cross-platform ACL support (macOS, Linux, FreeBSD) and full ACL parsing/manipulation for potential future UI features. The crate appears unmaintained (last release Feb 2024) but ACL APIs are stable and don't change. Our usage is best-effort with graceful fallback: if `exacl` ever breaks, files still copy, they just lose ACLs. MIT licensed (compatible with BSL). @@ -262,6 +277,12 @@ and `statfs.f_fstypename` for APFS. See `copy_strategy.rs` for the implementatio **Gotcha**: On macOS, never use `statvfs` alone for disk space checks; use `NSURLVolumeAvailableCapacityForImportantUsageKey` **Why**: `statvfs` reports only physically free blocks. On APFS, purgeable space (iCloud caches, APFS snapshots) can account for tens of GB that macOS will reclaim on demand. Using `statvfs` causes the "insufficient space" error to reject copies that would actually succeed, and shows a different available-space number than the status bar (which uses the NSURL API). `validate_disk_space` in `helpers.rs` calls `crate::volumes::get_volume_space()` on macOS and falls back to `statvfs` on Linux. +**Gotcha**: Hardlink dedup doesn't straddle the oracle/walk boundary. +**Why**: `walk_dir_recursive` dedupes hardlinks by inode for `total_bytes` (so a hardlink-heavy tree like cargo's `target/` reports correct bytes-to-free). `FileEntry` doesn't carry inode, so when the oracle supplies one half of a hardlink pair from the cached listing and a real walk supplies the other half, the dedup misses and bytes get over-counted. Direction is safe: over-counting → pessimistic ETA + conservative disk-space reject, never the other way. The walker's existing `walker_dedupes_hardlinks_by_inode` test still pins the same-side dedup. If true cross-boundary dedup ever becomes worth it, add `inode: Option` to `FileEntry`; not in this milestone. + +**Gotcha**: Volume disconnect mid-walk races with the oracle. +**Why**: The oracle returns `Some(entries)` when `listing_is_watched` is true at the moment of the check. Between that read and the recursive walk consuming the entries (and then issuing real `list_directory` calls for any sub-subfolders that aren't cached), the watcher can die (cable yanked, network drop). The synthesized totals for the cached level are correct — they reflect what the listing held — but recursion into now-disconnected sub-subfolders fails per-call, and the per-file copy/delete later then hits `DeviceDisconnected`-shaped errors instead of a single "device gone" message at the scan level. Same race that `scan_for_copy_batch` already had; the oracle doesn't widen it. Documented here so future investigation knows where to look. + **Gotcha**: Skip-All on volume copy/move with a top-level dir conflict still skips the entire dir subtree, even after the local-FS bulk-skip fix. **Why**: `build_pre_skip_set` now excludes top-level directories so non-conflicting children inside a conflicting dir get a chance to copy. For LOCAL copy this works because `copy_files_with_progress_inner` flattens dirs to per-file `FileInfo` entries pre-loop, and per-iter conflict resolution then evaluates each child individually. For VOLUME copy/move, the driver iterates top-level paths directly, and `resolve_volume_conflict` returns `Ok(None)` (= Skip) for ANY dir-vs-dir conflict under Skip mode without recursing — so the whole subtree is still dropped. Fixing this requires teaching `resolve_volume_conflict` (or the volume-side closure) to recurse-and-merge for dir conflicts under Skip, the same way `apply_volume_conflict_resolution` already does for Overwrite. Pinned by the Playwright spec `conflict-copy.spec.ts::Copy with Skip All preserves destination files` (local-FS path, currently green) — a volume-side equivalent test would catch the residual. diff --git a/apps/desktop/src-tauri/src/file_system/write_operations/delete.rs b/apps/desktop/src-tauri/src/file_system/write_operations/delete.rs index f013ff1b..2bfc5a41 100644 --- a/apps/desktop/src-tauri/src/file_system/write_operations/delete.rs +++ b/apps/desktop/src-tauri/src/file_system/write_operations/delete.rs @@ -6,7 +6,7 @@ use std::sync::{Arc, Mutex}; use std::time::{Duration, Instant}; use super::helpers::spawn_async_sync; -use super::scan::{SourceItemTracker, scan_sources}; +use super::scan::{SourceItemTracker, scan_sources, take_cached_scan_result}; use super::state::{WriteOperationState, update_operation_status}; use super::types::{ DryRunResult, IoResultExt, OperationEventSink, TauriEventSink, WriteCancelledEvent, WriteCompleteEvent, @@ -14,6 +14,7 @@ use super::types::{ WriteSourceItemDoneEvent, }; use super::volume_copy::map_volume_error; +use crate::file_system::listing::caching::try_get_watched_listing; use crate::file_system::volume::Volume; // ============================================================================ @@ -229,7 +230,9 @@ impl VolumeScanTracker { )] async fn scan_volume_recursive( volume: &dyn Volume, + volume_id: &str, path: &Path, + is_dir_hint: Option, entries: &mut Vec, total_bytes: &mut u64, state: &Arc, @@ -245,10 +248,18 @@ async fn scan_volume_recursive( }); } - let is_dir = volume - .is_directory(path) - .await - .map_err(|e| map_volume_error(&path.display().to_string(), e))?; + // Resolve whether `path` is a directory. Prefer the caller's hint (the + // top-level cache-hit path supplies it from `CopyScanResult`, the + // recursive call supplies it from the parent's `FileEntry`). Without a + // hint, fall back to a real `is_directory` probe — which on MTP lists + // the parent dir, so we avoid it whenever the oracle can answer. + let is_dir = match is_dir_hint { + Some(v) => v, + None => volume + .is_directory(path) + .await + .map_err(|e| map_volume_error(&path.display().to_string(), e))?, + }; if is_dir { // Snapshot the cumulative tallies BEFORE the (potentially slow) listing @@ -295,20 +306,37 @@ async fn scan_volume_recursive( ); }; - let children = volume - .list_directory(path, Some(&on_progress)) - .await - .map_err(|e| map_volume_error(&path.display().to_string(), e))?; + // Oracle-first: if this directory is watcher-backed in `LISTING_CACHE`, + // reuse the cached entries and skip the volume round-trip entirely. + // Falls through to `volume.list_directory` on miss, preserving the + // per-entry progress callback for slow MTP listings. + let children = match try_get_watched_listing(volume_id, path) { + Some(cached) => { + // The cached listing is already complete, so synthesize a + // single end-of-listing progress tick to keep the FE counter + // climbing during the cache-fed pass too. + on_progress(cached.len()); + cached + } + None => volume + .list_directory(path, Some(&on_progress)) + .await + .map_err(|e| map_volume_error(&path.display().to_string(), e))?, + }; // Recurse into children first. list_directory returns FileEntry with size, // so we use child.size directly instead of calling get_metadata (which - // returns NotSupported on MTP). + // returns NotSupported on MTP). Pass `is_dir_hint = Some(child.is_directory)` + // so the recursive call doesn't re-probe (avoids another parent listing + // on MTP). for child in &children { let child_path = PathBuf::from(&child.path); if child.is_directory { Box::pin(scan_volume_recursive( volume, + volume_id, &child_path, + Some(true), entries, total_bytes, state, @@ -393,6 +421,7 @@ async fn scan_volume_recursive( )] pub(super) async fn delete_volume_files_with_progress( volume: Arc, + volume_id: &str, app: &tauri::AppHandle, operation_id: &str, state: &Arc, @@ -400,7 +429,7 @@ pub(super) async fn delete_volume_files_with_progress( config: &WriteOperationConfig, ) -> Result<(), WriteOperationError> { let events = TauriEventSink::new(app.clone()); - delete_volume_files_with_progress_inner(volume, &events, operation_id, state, sources, config).await + delete_volume_files_with_progress_inner(volume, volume_id, &events, operation_id, state, sources, config).await } #[allow( @@ -409,43 +438,193 @@ pub(super) async fn delete_volume_files_with_progress( )] pub(super) async fn delete_volume_files_with_progress_inner( volume: Arc, + volume_id: &str, events: &dyn OperationEventSink, operation_id: &str, state: &Arc, sources: &[PathBuf], config: &WriteOperationConfig, ) -> Result<(), WriteOperationError> { + use std::sync::atomic::Ordering; + // Phase 1: Scan. Recursively enumerate the tree via volume.list_directory(). + // Fast path: if the scan preview already enumerated the top-level shape (file + // vs dir + size per source), reuse it. The per-source `top_level_is_directory` + // and `total_bytes` flags come from `CachedScanResult::per_path` populated by + // `run_volume_scan_preview`. We still need to recurse into top-level dirs to + // get the per-file paths delete needs — but the recursion's walker is now + // oracle-aware, so a watched subtree skips the `list_directory` round-trip. + let cached_scan = config.preview_id.as_deref().and_then(take_cached_scan_result); + let mut entries: Vec = Vec::new(); let mut total_bytes = 0u64; let tracker = Arc::new(VolumeScanTracker::new(state.progress_interval)); + let mut last_scan_emit = Instant::now(); - for source in sources { - // Check if the source itself is a file or directory - let is_dir = volume.is_directory(source).await.unwrap_or(false); - - if is_dir { - scan_volume_recursive( - &*volume, - source, - &mut entries, - &mut total_bytes, - state, - events, - operation_id, - &tracker, - ) - .await?; - } else { - // Top-level file: size unknown without listing the parent, use 0. - // Progress still tracks file count accurately, and individual file - // deletes are near-instant on MTP. - tracker.files_so_far.fetch_add(1, std::sync::atomic::Ordering::Relaxed); - entries.push(VolumeDeleteEntry { - path: source.to_path_buf(), - size: 0, - is_dir: false, + if let Some(scan) = cached_scan.as_ref() { + log::debug!( + "delete_volume_files_with_progress: reused cached scan for operation_id={}, files={}, bytes={}, per_path={}", + operation_id, + scan.file_count, + scan.total_bytes, + scan.per_path.len() + ); + + // Index cached per-path results by source path so we can look up + // `top_level_is_directory` and `total_bytes` for each input source. + let by_path: std::collections::HashMap = + scan.per_path.iter().map(|(p, r)| (p.clone(), r)).collect(); + + for source in sources { + if super::state::is_cancelled(&state.intent) { + events.emit_cancelled(WriteCancelledEvent { + operation_id: operation_id.to_string(), + operation_type: WriteOperationType::Delete, + files_processed: 0, + rolled_back: false, + }); + return Err(WriteOperationError::Cancelled { + message: "Operation cancelled by user".to_string(), + }); + } + + match by_path.get(source) { + Some(per_path_scan) if !per_path_scan.top_level_is_directory => { + // Top-level file: size comes straight from the cache. No + // `is_directory` probe and no `list_directory` round-trip. + let size = per_path_scan.total_bytes; + tracker.files_so_far.fetch_add(1, Ordering::Relaxed); + tracker.bytes_so_far.fetch_add(size, Ordering::Relaxed); + total_bytes += size; + entries.push(VolumeDeleteEntry { + path: source.to_path_buf(), + size, + is_dir: false, + }); + } + Some(_) => { + // Top-level dir: recurse via the oracle-aware walker, with + // `is_dir_hint = Some(true)` so the recursion never re-probes + // the top-level. Any subtree that's open in another pane and + // watcher-fresh gets cache-fed inside the walker. + scan_volume_recursive( + &*volume, + volume_id, + source, + Some(true), + &mut entries, + &mut total_bytes, + state, + events, + operation_id, + &tracker, + ) + .await?; + } + None => { + // Scan preview was non-volume (local-FS) or didn't include + // this source. Fall back to the no-preview shape for this + // path: oracle-aware walker resolves the type. + scan_volume_recursive( + &*volume, + volume_id, + source, + None, + &mut entries, + &mut total_bytes, + state, + events, + operation_id, + &tracker, + ) + .await?; + } + } + + // Even on the all-files fast path, throttle a single scan-progress + // emit per interval so the FE dialog shows movement during the + // entry-list build. Without this the user sees "Scanning…" frozen + // until the actual delete phase starts. + if last_scan_emit.elapsed() >= state.progress_interval { + let files_done = tracker.files_so_far.load(Ordering::Relaxed); + let dirs_done = tracker.dirs_so_far.load(Ordering::Relaxed); + let bytes_done = tracker.bytes_so_far.load(Ordering::Relaxed); + state.emit_progress_via_sink( + events, + WriteProgressEvent::new( + operation_id.to_string(), + WriteOperationType::Delete, + WriteOperationPhase::Scanning, + None, + files_done, + 0, + bytes_done, + 0, + ) + .with_scan_meta(Some(source.display().to_string()), dirs_done, None), + ); + update_operation_status( + operation_id, + WriteOperationPhase::Scanning, + None, + files_done, + 0, + bytes_done, + 0, + ); + last_scan_emit = Instant::now(); + } + } + } else { + for source in sources { + // Oracle-first parent lookup: if the parent's listing is watched + // (open in some pane), pull `is_directory` from there instead of + // calling `volume.is_directory`, which on MTP lists the parent. + // Falls through to the trait probe when the oracle can't answer. + let parent_hint = source.parent().and_then(|parent| { + let cached_entries = try_get_watched_listing(volume_id, parent)?; + let name = source.file_name()?.to_string_lossy().to_string(); + cached_entries + .iter() + .find(|e| { + PathBuf::from(&e.path) + .file_name() + .map(|n| n.to_string_lossy() == name) + .unwrap_or(false) + }) + .map(|e| e.is_directory) }); + + let is_dir = match parent_hint { + Some(v) => v, + None => volume.is_directory(source).await.unwrap_or(false), + }; + + if is_dir { + scan_volume_recursive( + &*volume, + volume_id, + source, + Some(true), + &mut entries, + &mut total_bytes, + state, + events, + operation_id, + &tracker, + ) + .await?; + } else { + // Top-level file: size unknown without listing the parent, use 0. + // Progress still tracks file count accurately, and individual file + // deletes are near-instant on MTP. + tracker.files_so_far.fetch_add(1, Ordering::Relaxed); + entries.push(VolumeDeleteEntry { + path: source.to_path_buf(), + size: 0, + is_dir: false, + }); + } } } diff --git a/apps/desktop/src-tauri/src/file_system/write_operations/delete_volume_reuse_tests.rs b/apps/desktop/src-tauri/src/file_system/write_operations/delete_volume_reuse_tests.rs new file mode 100644 index 00000000..2dc18ef8 --- /dev/null +++ b/apps/desktop/src-tauri/src/file_system/write_operations/delete_volume_reuse_tests.rs @@ -0,0 +1,438 @@ +//! Integration tests for `delete_volume_files_with_progress_inner`'s reuse of +//! the scan preview and the fresh-listing oracle (M3 of the fresh-listing-reuse +//! plan). +//! +//! These tests use the `OperationEventSink` test plumbing (no Tauri +//! `AppHandle`) and a counter-wrapping `InMemoryVolume` so we can assert call +//! counts directly. The patterns mirror `scan_preview_oracle_tests.rs` (oracle +//! wiring, listing cache seeding) and `volume_copy_tests.rs` (state + sink + +//! preview-result seeding). + +use std::future::Future; +use std::path::{Path, PathBuf}; +use std::pin::Pin; +use std::sync::Arc; +use std::sync::atomic::{AtomicBool, AtomicU64, AtomicUsize, Ordering}; +use std::time::Duration; + +use super::delete::delete_volume_files_with_progress_inner; +use super::state::{CachedScanResult, SCAN_PREVIEW_RESULTS, WriteOperationState}; +use super::types::{CollectorEventSink, WriteOperationConfig}; +use crate::file_system::get_volume_manager; +use crate::file_system::listing::caching::{CachedListing, LISTING_CACHE}; +use crate::file_system::listing::metadata::FileEntry; +use crate::file_system::listing::sorting::{DirectorySortMode, SortColumn, SortOrder}; +use crate::file_system::volume::{BatchScanResult, CopyScanResult, InMemoryVolume, Volume, VolumeError}; + +// ---------------------------------------------------------------------------- +// Counter-wrapping volume +// ---------------------------------------------------------------------------- + +/// Wraps an `InMemoryVolume` and counts `list_directory`, `is_directory`, and +/// `delete` calls. Lets `listing_is_watched` be flipped at runtime so tests can +/// pin both oracle-hit and oracle-miss behaviours. +struct CountingVolume { + inner: InMemoryVolume, + watched: AtomicBool, + list_dir_calls: AtomicUsize, + is_dir_calls: AtomicUsize, + delete_calls: AtomicUsize, +} + +impl CountingVolume { + fn new(name: &str, watched: bool) -> Self { + Self { + inner: InMemoryVolume::new(name), + watched: AtomicBool::new(watched), + list_dir_calls: AtomicUsize::new(0), + is_dir_calls: AtomicUsize::new(0), + delete_calls: AtomicUsize::new(0), + } + } + + fn list_dir_count(&self) -> usize { + self.list_dir_calls.load(Ordering::Relaxed) + } + + fn is_dir_count(&self) -> usize { + self.is_dir_calls.load(Ordering::Relaxed) + } + + fn delete_count(&self) -> usize { + self.delete_calls.load(Ordering::Relaxed) + } + + #[allow(dead_code, reason = "Kept for future tests that flip the watcher mid-scan.")] + fn set_watched(&self, v: bool) { + self.watched.store(v, Ordering::Relaxed); + } +} + +impl Volume for CountingVolume { + fn name(&self) -> &str { + self.inner.name() + } + fn root(&self) -> &Path { + self.inner.root() + } + + fn list_directory<'a>( + &'a self, + path: &'a Path, + on_progress: Option<&'a (dyn Fn(usize) + Sync)>, + ) -> Pin, VolumeError>> + Send + 'a>> { + self.list_dir_calls.fetch_add(1, Ordering::Relaxed); + self.inner.list_directory(path, on_progress) + } + + fn get_metadata<'a>( + &'a self, + path: &'a Path, + ) -> Pin> + Send + 'a>> { + self.inner.get_metadata(path) + } + + fn exists<'a>(&'a self, path: &'a Path) -> Pin + Send + 'a>> { + self.inner.exists(path) + } + + fn is_directory<'a>( + &'a self, + path: &'a Path, + ) -> Pin> + Send + 'a>> { + self.is_dir_calls.fetch_add(1, Ordering::Relaxed); + self.inner.is_directory(path) + } + + fn delete<'a>(&'a self, path: &'a Path) -> Pin> + Send + 'a>> { + self.delete_calls.fetch_add(1, Ordering::Relaxed); + self.inner.delete(path) + } + + fn listing_is_watched(&self, _path: &Path) -> bool { + self.watched.load(Ordering::Relaxed) + } + + fn scan_for_copy<'a>( + &'a self, + path: &'a Path, + ) -> Pin> + Send + 'a>> { + self.inner.scan_for_copy(path) + } + + fn scan_for_copy_batch_with_progress<'a>( + &'a self, + paths: &'a [PathBuf], + on_progress: Option<&'a (dyn Fn(usize) + Sync)>, + ) -> Pin> + Send + 'a>> { + let _ = on_progress; + self.inner.scan_for_copy_batch(paths) + } +} + +// ---------------------------------------------------------------------------- +// Test helpers +// ---------------------------------------------------------------------------- + +fn unique(suffix: &str) -> String { + static N: AtomicU64 = AtomicU64::new(0); + format!( + "delreuse_{}_{}_{}", + suffix, + std::process::id(), + N.fetch_add(1, Ordering::Relaxed) + ) +} + +fn make_file_entry(name: &str, parent: &str, size: u64, is_dir: bool) -> FileEntry { + FileEntry { + size: if is_dir { None } else { Some(size) }, + permissions: if is_dir { 0o755 } else { 0o644 }, + owner: "test".to_string(), + group: "staff".to_string(), + extended_metadata_loaded: true, + ..FileEntry::new( + name.to_string(), + format!("{}/{}", parent.trim_end_matches('/'), name), + is_dir, + false, + ) + } +} + +fn insert_listing(id: &str, volume_id: &str, path: &str, entries: Vec) -> String { + let mut cache = LISTING_CACHE.write().unwrap(); + cache.insert( + id.to_string(), + CachedListing { + volume_id: volume_id.to_string(), + path: PathBuf::from(path), + entries, + sort_by: SortColumn::Name, + sort_order: SortOrder::Ascending, + directory_sort_mode: DirectorySortMode::LikeFiles, + sequence: AtomicU64::new(1), + created_at: std::time::Instant::now(), + }, + ); + id.to_string() +} + +fn remove_listing(id: &str) { + let mut cache = LISTING_CACHE.write().unwrap(); + cache.remove(id); +} + +fn make_state() -> Arc { + Arc::new(WriteOperationState::new(Duration::from_millis(50))) +} + +// ---------------------------------------------------------------------------- +// Tests +// ---------------------------------------------------------------------------- + +/// Test 1: `delete_files_start` with a fresh `preview_id` consumes the cached +/// scan and skips the rescan. `list_directory` is called once total (the +/// preview's listing — simulated here by seeding `SCAN_PREVIEW_RESULTS` +/// directly, the same shape `run_volume_scan_preview` produces); during the +/// delete itself, the call count stays at zero. +#[tokio::test(flavor = "multi_thread", worker_threads = 2)] +async fn delete_consumes_preview_id_skips_rescan() { + let vid = unique("consumes_preview"); + let preview_id = unique("preview_consumes"); + + // Keep a typed Arc so we can read the counters after the trait call. + let vol = Arc::new(CountingVolume::new("preview-vol", false)); + vol.inner.create_file(Path::new("/a.jpg"), b"alpha").await.unwrap(); + vol.inner.create_file(Path::new("/b.jpg"), b"betabb").await.unwrap(); + get_volume_manager().register(&vid, vol.clone() as Arc); + + // Simulate a completed scan preview: per-path entries for two top-level + // files. The real `start_scan_preview` path produces this same structure + // (see `run_volume_scan_preview` → `CachedScanResult`). + SCAN_PREVIEW_RESULTS.write().unwrap().insert( + preview_id.clone(), + CachedScanResult { + files: Vec::new(), + dirs: Vec::new(), + file_count: 2, + total_bytes: 11, + per_path: vec![ + ( + PathBuf::from("/a.jpg"), + CopyScanResult { + file_count: 1, + dir_count: 0, + total_bytes: 5, + top_level_is_directory: false, + }, + ), + ( + PathBuf::from("/b.jpg"), + CopyScanResult { + file_count: 1, + dir_count: 0, + total_bytes: 6, + top_level_is_directory: false, + }, + ), + ], + }, + ); + + let events = Arc::new(CollectorEventSink::new()); + let state = make_state(); + let config = WriteOperationConfig { + preview_id: Some(preview_id.clone()), + ..WriteOperationConfig::default() + }; + + let result = delete_volume_files_with_progress_inner( + vol.clone() as Arc, + &vid, + events.as_ref(), + "test-op-delete-reuse", + &state, + &[PathBuf::from("/a.jpg"), PathBuf::from("/b.jpg")], + &config, + ) + .await; + + assert!(result.is_ok(), "delete should succeed: {:?}", result); + assert_eq!( + vol.list_dir_count(), + 0, + "cached preview path must NOT call list_directory during delete" + ); + assert_eq!( + vol.is_dir_count(), + 0, + "cached preview path must NOT probe is_directory for top-level files" + ); + assert_eq!(vol.delete_count(), 2, "both top-level files should be deleted"); + + get_volume_manager().unregister(&vid); +} + +/// Test 2: no `preview_id` (MCP path) still produces a correct delete. The +/// walker walks via the volume backend; we just confirm correctness end-to-end. +#[tokio::test(flavor = "multi_thread", worker_threads = 2)] +async fn delete_without_preview_id_still_walks() { + let vid = unique("no_preview"); + + let vol = Arc::new(CountingVolume::new("no-preview-vol", false)); + vol.inner.create_file(Path::new("/x.txt"), b"x").await.unwrap(); + vol.inner.create_file(Path::new("/y.txt"), b"yy").await.unwrap(); + get_volume_manager().register(&vid, vol.clone() as Arc); + + let events = Arc::new(CollectorEventSink::new()); + let state = make_state(); + let config = WriteOperationConfig::default(); // preview_id: None + + let result = delete_volume_files_with_progress_inner( + vol.clone() as Arc, + &vid, + events.as_ref(), + "test-op-no-preview", + &state, + &[PathBuf::from("/x.txt"), PathBuf::from("/y.txt")], + &config, + ) + .await; + + assert!(result.is_ok(), "delete should succeed: {:?}", result); + assert_eq!(vol.delete_count(), 2, "both top-level files should be deleted"); + // Watcher not flipped, oracle misses → fall through to `is_directory` per + // source. Confirms the no-preview path keeps its top-level probe. + assert_eq!( + vol.is_dir_count(), + 2, + "no-preview path must probe is_directory per top-level source on oracle miss" + ); + + get_volume_manager().unregister(&vid); +} + +/// Test 3: no `preview_id`, but the parent listing is watcher-fresh in +/// `LISTING_CACHE`. The walker should consult the oracle for the top-level +/// `is_directory` decision and skip the trait probe entirely. Asserts the +/// `is_directory` call count stays at zero. +#[tokio::test(flavor = "multi_thread", worker_threads = 2)] +async fn delete_top_level_files_no_is_directory_probes() { + let vid = unique("oracle_top_level"); + let parent_lid = unique("oracle_top_level_lid"); + + let vol = Arc::new(CountingVolume::new("oracle-top-level-vol", true)); + vol.inner.create_file(Path::new("/dcim/a.jpg"), b"alpha").await.unwrap(); + vol.inner + .create_file(Path::new("/dcim/b.jpg"), b"betabb") + .await + .unwrap(); + get_volume_manager().register(&vid, vol.clone() as Arc); + + // Pane has /dcim open with entries for a.jpg and b.jpg. Both are files. + let cached = vec![ + make_file_entry("a.jpg", "/dcim", 5, false), + make_file_entry("b.jpg", "/dcim", 6, false), + ]; + let parent_lid_inserted = insert_listing(&parent_lid, &vid, "/dcim", cached); + + let events = Arc::new(CollectorEventSink::new()); + let state = make_state(); + let config = WriteOperationConfig::default(); // preview_id: None + + let result = delete_volume_files_with_progress_inner( + vol.clone() as Arc, + &vid, + events.as_ref(), + "test-op-oracle-top-level", + &state, + &[PathBuf::from("/dcim/a.jpg"), PathBuf::from("/dcim/b.jpg")], + &config, + ) + .await; + + assert!(result.is_ok(), "delete should succeed: {:?}", result); + assert_eq!( + vol.is_dir_count(), + 0, + "watched parent listing must replace the per-source is_directory probe" + ); + assert_eq!( + vol.list_dir_count(), + 0, + "watched parent listing must replace the list_directory call" + ); + assert_eq!(vol.delete_count(), 2); + + remove_listing(&parent_lid_inserted); + get_volume_manager().unregister(&vid); +} + +/// Test 4: a subfolder of the delete source is open in pane B. We start the +/// delete with the source as a top-level directory, but pane B closes mid-walk +/// (we simulate this by removing the sub-listing while the watcher gate is on). +/// The walker must fall through to a real `list_directory` for the subfolder +/// and produce a correct delete. +#[tokio::test(flavor = "multi_thread", worker_threads = 2)] +async fn delete_mid_scan_listing_close() { + let vid = unique("mid_scan_close"); + let parent_lid = unique("mid_scan_parent_lid"); + let sub_lid = unique("mid_scan_sub_lid"); + + let vol = Arc::new(CountingVolume::new("mid-scan-vol", true)); + // Real backend content for the subfolder so the fallthrough produces a + // sensible delete (one real file). + vol.inner + .create_file(Path::new("/root/sub/real.bin"), b"abcdef") + .await + .unwrap(); + get_volume_manager().register(&vid, vol.clone() as Arc); + + // Pane A has /root open: one subfolder `sub`. + let parent_cached = vec![make_file_entry("sub", "/root", 0, true)]; + let parent_lid_inserted = insert_listing(&parent_lid, &vid, "/root", parent_cached); + + // Pane B has /root/sub open, but we close it BEFORE the delete starts — + // mirroring the "user closes pane mid-recursion" scenario. + let sub_entries = vec![make_file_entry("phantom.bin", "/root/sub", 12345, false)]; + let sub_lid_inserted = insert_listing(&sub_lid, &vid, "/root/sub", sub_entries); + remove_listing(&sub_lid_inserted); + + let events = Arc::new(CollectorEventSink::new()); + let state = make_state(); + let config = WriteOperationConfig::default(); + + let result = delete_volume_files_with_progress_inner( + vol.clone() as Arc, + &vid, + events.as_ref(), + "test-op-mid-scan-close", + &state, + &[PathBuf::from("/root/sub")], + &config, + ) + .await; + + assert!(result.is_ok(), "delete should succeed: {:?}", result); + // /root oracle-hit (top-level `is_directory` resolved without trait probe); + // /root/sub falls through to list_directory because pane B closed before + // recursion got there. So: zero is_directory probes, ≥1 list_directory. + assert_eq!( + vol.is_dir_count(), + 0, + "parent oracle should have answered top-level is_directory" + ); + assert!( + vol.list_dir_count() >= 1, + "fallthrough to list_directory required after pane B closed" + ); + // Delete the real backend file + the now-empty `/root/sub` directory. + assert!( + vol.delete_count() >= 1, + "at least the real file should be deleted; dir cleanup is best-effort" + ); + + remove_listing(&parent_lid_inserted); + get_volume_manager().unregister(&vid); +} diff --git a/apps/desktop/src-tauri/src/file_system/write_operations/mod.rs b/apps/desktop/src-tauri/src/file_system/write_operations/mod.rs index 6a02cf94..8014c943 100644 --- a/apps/desktop/src-tauri/src/file_system/write_operations/mod.rs +++ b/apps/desktop/src-tauri/src/file_system/write_operations/mod.rs @@ -296,9 +296,16 @@ pub async fn delete_files_start( } }; - let result = - delete_volume_files_with_progress(volume, &app, &operation_id_for_spawn, &state, &sources, &config) - .await; + let result = delete_volume_files_with_progress( + volume, + &volume_id_str, + &app, + &operation_id_for_spawn, + &state, + &sources, + &config, + ) + .await; if let Ok(mut cache) = WRITE_OPERATION_STATE.write() { cache.remove(&operation_id_for_cleanup); @@ -366,8 +373,12 @@ mod copy_integration_test; #[cfg(test)] mod delete_integration_test; #[cfg(test)] +mod delete_volume_reuse_tests; +#[cfg(test)] mod move_integration_test; #[cfg(test)] +mod scan_preview_oracle_tests; +#[cfg(test)] mod tests; #[cfg(test)] mod transaction_integration_test; diff --git a/apps/desktop/src-tauri/src/file_system/write_operations/scan.rs b/apps/desktop/src-tauri/src/file_system/write_operations/scan.rs index 1f7ba61b..b20e2c8f 100644 --- a/apps/desktop/src-tauri/src/file_system/write_operations/scan.rs +++ b/apps/desktop/src-tauri/src/file_system/write_operations/scan.rs @@ -16,7 +16,9 @@ use super::types::{ ConflictInfo, IoResultExt, OperationEventSink, ScanProgressEvent, WriteOperationError, WriteOperationPhase, WriteOperationType, WriteProgressEvent, }; -use crate::file_system::listing::{SortColumn, SortOrder}; +use crate::file_system::listing::caching::try_get_watched_listing; +use crate::file_system::listing::{FileEntry, SortColumn, SortOrder}; +use crate::file_system::volume::{CopyScanResult, Volume, VolumeError}; /// Callbacks for customizing `walk_dir_recursive` behavior per caller. /// @@ -38,6 +40,13 @@ pub(super) struct WalkContext<'a, E> { /// Shared walker used by both scan preview and write operation scanning. /// Behavior is customized via `WalkContext` callbacks for error handling and progress reporting. /// +/// **Oracle reuse**: when `volume_id` is provided and the listing cache holds a +/// watcher-backed listing for the directory currently being walked, the walker +/// hydrates that level's entries from the cache instead of touching the disk. +/// See `file_system::listing::caching::try_get_watched_listing` for the full +/// freshness contract. Pass `None` for `volume_id` to opt out (no listing +/// lookup is performed, behavior is identical to the pre-oracle walker). +/// /// **Hardlink dedup**: `total_bytes` counts each inode at most once. Cargo's /// `target/` (and similar trees: snapshots, sccache caches, deduplicated /// backups) hardlinks files heavily, so a naïve `metadata.len()` per direntry @@ -59,6 +68,7 @@ pub(super) fn walk_dir_recursive( last_progress_time: &mut Instant, visited: &mut HashSet, seen_inodes: &mut HashSet, + volume_id: Option<&str>, ctx: &WalkContext<'_, E>, ) -> Result<(), E> { if (ctx.is_cancelled)() { @@ -88,19 +98,42 @@ pub(super) fn walk_dir_recursive( dirs.push(path.to_path_buf()); - let entries = fs::read_dir(path).map_err(|e| (ctx.on_io_error)(path, e))?; - for entry in entries.flatten() { - walk_dir_recursive( - &entry.path(), + // Oracle short-circuit: if a watcher-backed listing exists for this dir, + // walk it from cache instead of hitting the disk. The recurse-into-files + // step below still goes through `walk_dir_recursive`, which re-applies + // the oracle at each level. + if let Some(vid) = volume_id + && let Some(cached_entries) = try_get_watched_listing(vid, path) + { + walk_cached_entries( + path, source_root, + cached_entries, files, dirs, total_bytes, last_progress_time, visited, seen_inodes, + volume_id, ctx, )?; + } else { + let entries = fs::read_dir(path).map_err(|e| (ctx.on_io_error)(path, e))?; + for entry in entries.flatten() { + walk_dir_recursive( + &entry.path(), + source_root, + files, + dirs, + total_bytes, + last_progress_time, + visited, + seen_inodes, + volume_id, + ctx, + )?; + } } } else { log::debug!("scan: skipping special file: {}", path.display()); @@ -116,6 +149,176 @@ pub(super) fn walk_dir_recursive( Ok(()) } +/// Walks a directory level using cached `FileEntry` entries instead of `fs::read_dir`. +/// +/// Used when the oracle reports a watcher-backed listing exists for the current +/// directory. For each cached entry: files are recorded with size from the +/// cache; directories recurse via `walk_dir_recursive`, which re-applies the +/// oracle (so subfolders open in another pane also short-circuit). Cached +/// symlinks (`is_symlink == true`) are recorded as files without recursing, +/// matching `walk_dir_recursive`'s symlink policy. +#[allow( + clippy::too_many_arguments, + reason = "Mirrors `walk_dir_recursive`'s parameter list to keep state threading consistent." +)] +fn walk_cached_entries( + parent_path: &Path, + source_root: &Path, + cached: Vec, + files: &mut Vec, + dirs: &mut Vec, + total_bytes: &mut u64, + last_progress_time: &mut Instant, + visited: &mut HashSet, + seen_inodes: &mut HashSet, + volume_id: Option<&str>, + ctx: &WalkContext<'_, E>, +) -> Result<(), E> { + for entry in cached { + if (ctx.is_cancelled)() { + return Err((ctx.on_cancelled)()); + } + let child_path = PathBuf::from(&entry.path); + if entry.is_directory && !entry.is_symlink { + // Recurse: the oracle re-applies inside `walk_dir_recursive`, so a + // grandchild dir open in another pane is also short-circuited. + walk_dir_recursive( + &child_path, + source_root, + files, + dirs, + total_bytes, + last_progress_time, + visited, + seen_inodes, + volume_id, + ctx, + )?; + } else { + // File or symlink: record from the cache, no I/O. We can't build a + // full `FileInfo` (no `std::fs::Metadata`), but we have everything + // the scan-preview caller actually consumes: path, size, and the + // symlink flag. + let size = entry.size.unwrap_or(0); + *total_bytes += size; + files.push(FileInfo { + path: child_path, + source_root: source_root.to_path_buf(), + size, + modified: entry.modified_at.unwrap_or(0), + created: entry.created_at.unwrap_or(0), + is_symlink: entry.is_symlink, + }); + } + } + + if last_progress_time.elapsed() >= ctx.progress_interval { + let current_dir = Some(parent_path.display().to_string()); + (ctx.on_progress)(files.len(), dirs.len(), *total_bytes, None, current_dir); + *last_progress_time = Instant::now(); + } + + Ok(()) +} + +/// Totals returned by `scan_subtree_with_oracle`. +/// +/// `per_path` carries one entry per direct child of the scanned `path`, sized +/// to feed into a parent `BatchScanResult` upstream. The vec is empty when +/// `path` itself is a file (the caller knows it's a file in that case). +#[derive(Debug, Clone, Default)] +pub(super) struct SubtreeTotals { + pub file_count: usize, + pub dir_count: usize, + pub total_bytes: u64, + /// Per-direct-child results so the scan-preview can populate the + /// `BatchScanResult::per_path` slot the copy engine reads later. + pub per_path: Vec<(PathBuf, CopyScanResult)>, +} + +/// Scans a subtree using the fresh-listing oracle at every recursion level, +/// falling back to `volume.list_directory` on cache miss. +/// +/// This is the oracle-aware analogue of the per-volume `scan_for_copy`. It's +/// designed for `run_volume_scan_preview` to call when the parent directory of +/// the selected sources is watcher-backed: top-level files come from the +/// cached listing directly, top-level directories recurse here. +/// +/// Cancellation: the future polls `is_cancelled` between entries. Symlinks +/// (cached `is_symlink == true`) are counted as one entry and not recursed, +/// matching the local-FS walker's policy. +pub(super) async fn scan_subtree_with_oracle( + volume: &dyn Volume, + volume_id: &str, + path: &Path, + is_cancelled: &(dyn Fn() -> bool + Sync), + on_progress: Option<&(dyn Fn(usize) + Sync)>, +) -> Result { + if is_cancelled() { + return Err(VolumeError::Cancelled("Operation cancelled by user".to_string())); + } + + // Load entries from oracle or the volume itself. + let entries: Vec = match try_get_watched_listing(volume_id, path) { + Some(e) => e, + None => volume.list_directory(path, on_progress).await?, + }; + + let mut totals = SubtreeTotals::default(); + + for entry in entries { + if is_cancelled() { + return Err(VolumeError::Cancelled("Operation cancelled by user".to_string())); + } + let child_path = PathBuf::from(&entry.path); + if entry.is_directory && !entry.is_symlink { + // Recurse — oracle re-applies inside this call. + let child_totals = Box::pin(scan_subtree_with_oracle( + volume, + volume_id, + &child_path, + is_cancelled, + on_progress, + )) + .await?; + totals.file_count += child_totals.file_count; + // The directory itself plus all its descendant dirs. + totals.dir_count += 1 + child_totals.dir_count; + totals.total_bytes += child_totals.total_bytes; + totals.per_path.push(( + child_path, + CopyScanResult { + file_count: child_totals.file_count, + dir_count: child_totals.dir_count, + total_bytes: child_totals.total_bytes, + top_level_is_directory: true, + }, + )); + if let Some(cb) = on_progress { + cb(totals.file_count); + } + } else { + let size = entry.size.unwrap_or(0); + totals.file_count += 1; + totals.total_bytes += size; + totals.per_path.push(( + child_path, + CopyScanResult { + file_count: 1, + dir_count: 0, + total_bytes: size, + top_level_is_directory: false, + }, + )); + if let Some(cb) = on_progress { + cb(totals.file_count); + } + } + } + + Ok(totals) +} + /// Returns `true` if this file's bytes should count toward the running scan /// total. On Unix, dedupes hardlinks via inode: a file with `nlink > 1` only /// contributes bytes the first time its inode is seen; subsequent occurrences @@ -352,6 +555,12 @@ fn scan_sources_internal( }, }; + // Local FS scan goes through `LocalPosixVolume`, which is always registered as + // the `"root"` volume. Passing it threads the oracle through: when the source + // (or any subdirectory we recurse into) is open in a pane with a live FSEvents + // watcher, the walker skips the disk read for that level. + let volume_id = Some(crate::file_system::volume::DEFAULT_VOLUME_ID); + for source in sources { let source_root = source.parent().unwrap_or(source); walk_dir_recursive( @@ -363,6 +572,7 @@ fn scan_sources_internal( &mut last_progress_time, &mut visited, &mut seen_inodes, + volume_id, &ctx, )?; } @@ -761,6 +971,7 @@ mod tests { &mut last_progress, &mut visited, &mut seen_inodes, + None, &ctx, ) .expect("walk should succeed"); diff --git a/apps/desktop/src-tauri/src/file_system/write_operations/scan_preview.rs b/apps/desktop/src-tauri/src/file_system/write_operations/scan_preview.rs index 2b8aa3a5..e01a53d9 100644 --- a/apps/desktop/src-tauri/src/file_system/write_operations/scan_preview.rs +++ b/apps/desktop/src-tauri/src/file_system/write_operations/scan_preview.rs @@ -11,24 +11,31 @@ use std::time::{Duration, Instant}; use uuid::Uuid; -use super::scan::{WalkContext, sort_files, walk_dir_recursive}; +use super::scan::{SubtreeTotals, WalkContext, scan_subtree_with_oracle, sort_files, walk_dir_recursive}; use super::state::{CachedScanResult, FileInfo, SCAN_PREVIEW_RESULTS, SCAN_PREVIEW_STATE, ScanPreviewState}; use super::types::{ ScanPreviewCancelledEvent, ScanPreviewCompleteEvent, ScanPreviewErrorEvent, ScanPreviewProgressEvent, ScanPreviewStartResult, }; +use crate::file_system::listing::caching::try_get_watched_listing; use crate::file_system::listing::{SortColumn, SortOrder}; -use crate::file_system::volume::Volume; +use crate::file_system::volume::{BatchScanResult, CopyScanResult, Volume}; /// Starts a scan preview for the Copy dialog. /// Returns a preview_id that can be used to cancel or to pass to copy_files. /// /// When `source_volume` is provided, uses `Volume::scan_for_copy()` instead of `std::fs`, /// enabling MTP and other non-local volumes to produce scan previews. +/// +/// `source_volume_id` identifies the volume the sources live on. It's used by the +/// fresh-listing oracle (`try_get_watched_listing`) to short-circuit re-reading +/// directories that an open pane is already keeping in sync. Pass `"root"` for +/// local-FS scans. pub fn start_scan_preview( app: tauri::AppHandle, sources: Vec, source_volume: Option>, + source_volume_id: String, sort_column: SortColumn, sort_order: SortOrder, progress_interval_ms: u64, @@ -52,7 +59,7 @@ pub fn start_scan_preview( // Local scans use std::thread directly (no runtime needed). if let Some(volume) = source_volume { tokio::spawn(async move { - run_volume_scan_preview(app, preview_id_clone, sources, volume, state).await; + run_volume_scan_preview(app, preview_id_clone, sources, volume, source_volume_id, state).await; }); } else { std::thread::spawn(move || { @@ -127,6 +134,9 @@ fn run_scan_preview( ); }, }; + // Local FS scan preview uses the "root" volume ID. The oracle short-circuits + // any subtree currently open in a pane with a live FSEvents watcher. + let volume_id = Some(crate::file_system::volume::DEFAULT_VOLUME_ID); for source in &sources { let source_root = source.parent().unwrap_or(source); walk_dir_recursive( @@ -138,6 +148,7 @@ fn run_scan_preview( &mut last_progress_time, &mut visited, &mut seen_inodes, + volume_id, &ctx, )?; } @@ -199,14 +210,25 @@ fn run_scan_preview( /// Runs a volume-based scan preview (for MTP and other non-local volumes). /// -/// Uses `Volume::scan_for_copy_batch()` to scan all sources in one call, allowing -/// volume implementations to batch I/O (for example, MTP groups by parent directory). -/// Emits the same events as `run_scan_preview` so the frontend can't tell the difference. +/// Decision flow per parent group (sources sharing a parent directory): +/// - Fresh-listing oracle hit: cached entries supply size + `is_directory` for +/// each selected child, so the per-group `BatchScanResult` slice is built +/// without any volume I/O for top-level files. Top-level directories among +/// the inputs recurse via `scan_subtree_with_oracle`, which re-applies the +/// oracle at every level (so a subfolder open in another pane is also +/// short-circuited). +/// - Oracle miss: falls through to `volume.scan_for_copy_batch_with_progress`, +/// preserving the cold-cache parent-grouping optimizations on MTP and the +/// pipelined stat optimization on SMB. +/// +/// Emits the same `scan-preview-progress` / `scan-preview-complete` events as +/// the pre-oracle code, so the FE dialog behavior is unchanged. async fn run_volume_scan_preview( app: tauri::AppHandle, preview_id: String, sources: Vec, volume: Arc, + source_volume_id: String, state: Arc, ) { use tauri::Emitter; @@ -250,15 +272,24 @@ async fn run_volume_scan_preview( ); }; - let result: Result = async { + // Cancellation predicate, captured by reference inside the async helpers. + let state_for_cancel = Arc::clone(&state); + let is_cancelled = move || state_for_cancel.cancelled.load(Ordering::Relaxed); + + let result: Result = async { if state.cancelled.load(Ordering::Relaxed) { return Err("Cancelled".to_string()); } - volume - .scan_for_copy_batch_with_progress(&sources, Some(&on_progress)) - .await - .map_err(|e| format!("Scan failed: {}", e)) + run_oracle_aware_batch_scan( + volume.as_ref(), + &source_volume_id, + &sources, + &is_cancelled, + &on_progress, + ) + .await + .map_err(|e| format!("Scan failed: {}", e)) } .await; @@ -319,3 +350,151 @@ async fn run_volume_scan_preview( } } } + +/// Oracle-aware batch scan: short-circuits parent directories that an open pane +/// is keeping watcher-fresh; falls through to the volume's own batch scan for +/// cold-cache parents. Builds a single merged `BatchScanResult` keyed back to +/// the caller's original `sources` slice (order matches input). +pub(super) async fn run_oracle_aware_batch_scan( + volume: &dyn Volume, + volume_id: &str, + sources: &[PathBuf], + is_cancelled: &(dyn Fn() -> bool + Sync), + on_progress: &(dyn Fn(usize) + Sync), +) -> Result { + use crate::file_system::listing::FileEntry; + use std::collections::HashMap; + + // Group sources by parent dir, preserving the input order of paths within + // each group. The merged result puts the per-path entries back in original + // input order (callers downstream don't currently depend on order, but it + // matches `BatchScanResult::per_path`'s documented contract). + let mut groups: HashMap> = HashMap::new(); + let mut group_order: Vec = Vec::new(); + for source in sources { + let parent = source + .parent() + .map(|p| p.to_path_buf()) + .unwrap_or_else(|| PathBuf::from("/")); + if !groups.contains_key(&parent) { + group_order.push(parent.clone()); + } + groups.entry(parent).or_default().push(source.clone()); + } + + let mut aggregate = CopyScanResult { + file_count: 0, + dir_count: 0, + total_bytes: 0, + // Aggregate across multiple paths — meaningless, per the BatchScanResult contract. + top_level_is_directory: false, + }; + let mut per_path_unordered: HashMap = HashMap::new(); + + for parent in &group_order { + if is_cancelled() { + return Err(crate::file_system::volume::VolumeError::Cancelled( + "Operation cancelled by user".to_string(), + )); + } + let paths_in_group = groups + .get(parent) + .expect("group_order tracks every parent inserted into groups"); + + if let Some(cached_entries) = try_get_watched_listing(volume_id, parent) { + log::debug!( + "scan-preview: oracle hit for parent {} ({} cached entries, {} selected children)", + parent.display(), + cached_entries.len(), + paths_in_group.len() + ); + // Index cached entries by their last path component so we can resolve + // selected paths without a per-path linear search. `path` on a cached + // FileEntry is the absolute path string, so the file name component + // is the disambiguator within this parent. + let by_name: HashMap = cached_entries + .iter() + .filter_map(|e| { + PathBuf::from(&e.path) + .file_name() + .map(|n| (n.to_string_lossy().to_string(), e)) + }) + .collect(); + + for source in paths_in_group { + let name = source + .file_name() + .map(|n| n.to_string_lossy().to_string()) + .unwrap_or_default(); + let Some(entry) = by_name.get(&name) else { + // Cache doesn't know this child. Could be a stale selection + // (entry deleted out-of-band) or a name encoding mismatch. + // Either way, fall through to a real stat for safety. + let scan = volume.scan_for_copy(source).await?; + aggregate.file_count += scan.file_count; + aggregate.dir_count += scan.dir_count; + aggregate.total_bytes += scan.total_bytes; + per_path_unordered.insert(source.clone(), scan); + on_progress(aggregate.file_count); + continue; + }; + + if entry.is_directory && !entry.is_symlink { + let subtree: SubtreeTotals = + scan_subtree_with_oracle(volume, volume_id, source, is_cancelled, Some(on_progress)).await?; + aggregate.file_count += subtree.file_count; + // `scan_for_copy_batch`'s aggregate.dir_count counts descendants + // only, not the top-level path itself. Match that convention + // so the FE's "X dirs" number is consistent across paths. + aggregate.dir_count += subtree.dir_count; + aggregate.total_bytes += subtree.total_bytes; + per_path_unordered.insert( + source.clone(), + CopyScanResult { + file_count: subtree.file_count, + dir_count: subtree.dir_count, + total_bytes: subtree.total_bytes, + top_level_is_directory: true, + }, + ); + } else { + let size = entry.size.unwrap_or(0); + aggregate.file_count += 1; + aggregate.total_bytes += size; + per_path_unordered.insert( + source.clone(), + CopyScanResult { + file_count: 1, + dir_count: 0, + total_bytes: size, + top_level_is_directory: false, + }, + ); + on_progress(aggregate.file_count); + } + } + } else { + // Cold cache for this parent. Delegate to the volume's own batch + // scan: it preserves the MTP parent-grouping and SMB pipelined-stat + // optimizations for cold paths. + let group_result = volume + .scan_for_copy_batch_with_progress(paths_in_group, Some(on_progress)) + .await?; + aggregate.file_count += group_result.aggregate.file_count; + aggregate.dir_count += group_result.aggregate.dir_count; + aggregate.total_bytes += group_result.aggregate.total_bytes; + for (path, scan) in group_result.per_path { + per_path_unordered.insert(path, scan); + } + } + } + + // Rebuild per_path in caller's original source order. Missing entries + // (shouldn't happen, but be defensive) are skipped silently. + let per_path: Vec<(PathBuf, CopyScanResult)> = sources + .iter() + .filter_map(|src| per_path_unordered.remove(src).map(|scan| (src.clone(), scan))) + .collect(); + + Ok(BatchScanResult { aggregate, per_path }) +} diff --git a/apps/desktop/src-tauri/src/file_system/write_operations/scan_preview_oracle_tests.rs b/apps/desktop/src-tauri/src/file_system/write_operations/scan_preview_oracle_tests.rs new file mode 100644 index 00000000..e531cb6d --- /dev/null +++ b/apps/desktop/src-tauri/src/file_system/write_operations/scan_preview_oracle_tests.rs @@ -0,0 +1,432 @@ +//! Integration tests for the fresh-listing oracle wired into `run_volume_scan_preview`. +//! +//! These exercise `run_oracle_aware_batch_scan` directly (the function called by +//! `run_volume_scan_preview` once it's inside its async block). Skipping the +//! Tauri `AppHandle` plumbing keeps the tests focused on the +//! oracle-hit-vs-miss-vs-mid-walk decisions. + +use std::future::Future; +use std::path::{Path, PathBuf}; +use std::pin::Pin; +use std::sync::Arc; +use std::sync::atomic::{AtomicBool, AtomicU64, AtomicUsize, Ordering}; + +use super::scan_preview::run_oracle_aware_batch_scan; +use crate::file_system::get_volume_manager; +use crate::file_system::listing::caching::{CachedListing, LISTING_CACHE}; +use crate::file_system::listing::metadata::FileEntry; +use crate::file_system::listing::sorting::{DirectorySortMode, SortColumn, SortOrder}; +use crate::file_system::volume::{BatchScanResult, CopyScanResult, InMemoryVolume, Volume, VolumeError}; + +/// Wraps an `InMemoryVolume` and counts `list_directory` calls so tests can +/// assert that the oracle short-circuited or fell through. +/// +/// `watched` is the test-only `listing_is_watched` override. Each test flips +/// it independently — the oracle picker reads it via the `Volume` trait +/// dispatch through `get_volume_manager()`, so we need the wrapper registered +/// in the manager. +struct CountingWatchedVolume { + inner: InMemoryVolume, + watched: AtomicBool, + list_dir_calls: AtomicUsize, +} + +impl CountingWatchedVolume { + fn new(name: &str, watched: bool) -> Self { + Self { + inner: InMemoryVolume::new(name), + watched: AtomicBool::new(watched), + list_dir_calls: AtomicUsize::new(0), + } + } + + #[allow(dead_code, reason = "Kept for future tests that flip the watcher mid-scan.")] + fn set_watched(&self, v: bool) { + self.watched.store(v, Ordering::Relaxed); + } + + fn list_dir_count(&self) -> usize { + self.list_dir_calls.load(Ordering::Relaxed) + } +} + +impl Volume for CountingWatchedVolume { + fn name(&self) -> &str { + self.inner.name() + } + fn root(&self) -> &Path { + self.inner.root() + } + + fn list_directory<'a>( + &'a self, + path: &'a Path, + on_progress: Option<&'a (dyn Fn(usize) + Sync)>, + ) -> Pin, VolumeError>> + Send + 'a>> { + self.list_dir_calls.fetch_add(1, Ordering::Relaxed); + self.inner.list_directory(path, on_progress) + } + + fn get_metadata<'a>( + &'a self, + path: &'a Path, + ) -> Pin> + Send + 'a>> { + self.inner.get_metadata(path) + } + + fn exists<'a>(&'a self, path: &'a Path) -> Pin + Send + 'a>> { + self.inner.exists(path) + } + + fn is_directory<'a>( + &'a self, + path: &'a Path, + ) -> Pin> + Send + 'a>> { + self.inner.is_directory(path) + } + + fn listing_is_watched(&self, _path: &Path) -> bool { + self.watched.load(Ordering::Relaxed) + } + + fn scan_for_copy<'a>( + &'a self, + path: &'a Path, + ) -> Pin> + Send + 'a>> { + self.inner.scan_for_copy(path) + } + + fn scan_for_copy_batch_with_progress<'a>( + &'a self, + paths: &'a [PathBuf], + on_progress: Option<&'a (dyn Fn(usize) + Sync)>, + ) -> Pin> + Send + 'a>> { + // Reuse the default trait implementation by calling through self.inner. + // The inner volume's `scan_for_copy` does the actual list_directory work + // and is counted via this wrapper's `list_directory` only if the inner + // backend actually calls it. InMemory's `scan_for_copy` walks the entries + // map directly, so calls from this path don't bump `list_dir_calls` — + // which is exactly what we want for the cold-cache assertion (a real + // list_directory call only happens on the oracle path when we miss). + let _ = on_progress; + self.inner.scan_for_copy_batch(paths) + } +} + +/// Unique-per-test id so tests can run in parallel without colliding in +/// `LISTING_CACHE` / `VolumeManager`. +fn unique(suffix: &str) -> String { + static N: AtomicU64 = AtomicU64::new(0); + format!( + "scanprev_{}_{}_{}", + suffix, + std::process::id(), + N.fetch_add(1, Ordering::Relaxed) + ) +} + +fn make_file_entry(name: &str, parent: &str, size: u64) -> FileEntry { + FileEntry { + size: Some(size), + permissions: 0o644, + owner: "test".to_string(), + group: "staff".to_string(), + extended_metadata_loaded: true, + ..FileEntry::new( + name.to_string(), + format!("{}/{}", parent.trim_end_matches('/'), name), + false, + false, + ) + } +} + +fn make_dir_entry(name: &str, parent: &str) -> FileEntry { + FileEntry { + permissions: 0o755, + owner: "test".to_string(), + group: "staff".to_string(), + extended_metadata_loaded: true, + ..FileEntry::new( + name.to_string(), + format!("{}/{}", parent.trim_end_matches('/'), name), + true, + false, + ) + } +} + +fn make_symlinked_dir_entry(name: &str, parent: &str) -> FileEntry { + FileEntry { + permissions: 0o755, + owner: "test".to_string(), + group: "staff".to_string(), + extended_metadata_loaded: true, + ..FileEntry::new( + name.to_string(), + format!("{}/{}", parent.trim_end_matches('/'), name), + true, + true, + ) + } +} + +/// Inserts a CachedListing directly into LISTING_CACHE. Returns the listing_id. +fn insert_listing(id: &str, volume_id: &str, path: &str, entries: Vec) -> String { + let mut cache = LISTING_CACHE.write().unwrap(); + cache.insert( + id.to_string(), + CachedListing { + volume_id: volume_id.to_string(), + path: PathBuf::from(path), + entries, + sort_by: SortColumn::Name, + sort_order: SortOrder::Ascending, + directory_sort_mode: DirectorySortMode::LikeFiles, + sequence: AtomicU64::new(1), + created_at: std::time::Instant::now(), + }, + ); + id.to_string() +} + +fn remove_listing(id: &str) { + let mut cache = LISTING_CACHE.write().unwrap(); + cache.remove(id); +} + +/// Test 1: when the parent listing is watcher-backed, scan-preview reads sizes +/// from the cache and never calls `list_directory` on the volume. +#[tokio::test(flavor = "multi_thread", worker_threads = 2)] +async fn scan_preview_uses_watched_listing_for_top_level_files() { + let vid = unique("uses_watched"); + let lid = unique("uses_watched_lid"); + + // Pre-populate the InMemoryVolume so the wrapper's `list_directory` would + // actually return useful data IF called. We assert it ISN'T called. + let vol = Arc::new(CountingWatchedVolume::new("watched-vol", true)); + // Note: we do NOT also seed entries via `create_file`. The whole point is + // that the oracle reads from the cached listing rather than the backend. + get_volume_manager().register(&vid, vol.clone() as Arc); + + let cached = vec![ + make_file_entry("a.jpg", "/dcim", 1000), + make_file_entry("b.jpg", "/dcim", 2000), + make_file_entry("c.jpg", "/dcim", 3000), + ]; + let lid_inserted = insert_listing(&lid, &vid, "/dcim", cached); + + let sources = vec![ + PathBuf::from("/dcim/a.jpg"), + PathBuf::from("/dcim/b.jpg"), + PathBuf::from("/dcim/c.jpg"), + ]; + let is_cancelled = || false; + let on_progress = |_: usize| {}; + let result = run_oracle_aware_batch_scan(vol.as_ref(), &vid, &sources, &is_cancelled, &on_progress) + .await + .expect("oracle-aware batch scan should succeed"); + + assert_eq!( + vol.list_dir_count(), + 0, + "expected zero list_directory calls on oracle hit" + ); + assert_eq!(result.aggregate.file_count, 3); + assert_eq!(result.aggregate.total_bytes, 6000); + assert_eq!(result.per_path.len(), 3); + // Order matches the caller's `sources` order, per BatchScanResult contract. + assert_eq!(result.per_path[0].0, PathBuf::from("/dcim/a.jpg")); + assert_eq!(result.per_path[1].0, PathBuf::from("/dcim/b.jpg")); + assert_eq!(result.per_path[2].0, PathBuf::from("/dcim/c.jpg")); + + remove_listing(&lid_inserted); + get_volume_manager().unregister(&vid); +} + +/// Test 2: when `listing_is_watched` returns false (watcher dead), the oracle +/// returns None and we fall through to the volume's `scan_for_copy_batch` +/// path, which on InMemoryVolume calls scan_for_copy per path (no list_directory +/// call). What we really want to assert: the oracle did NOT short-circuit +/// (so the per-path data still comes from the volume backend, not the cache). +#[tokio::test(flavor = "multi_thread", worker_threads = 2)] +async fn scan_preview_falls_through_when_watcher_dead() { + let vid = unique("dead_watcher"); + let lid = unique("dead_watcher_lid"); + + let vol = Arc::new(CountingWatchedVolume::new("dead-watcher-vol", false)); + // Seed the backend so the cold-path scan can answer correctly. + vol.inner + .create_file(Path::new("/cold/a.jpg"), b"hello-12byte") + .await + .unwrap(); + get_volume_manager().register(&vid, vol.clone() as Arc); + + // Cached entries with a clearly-bogus size: if the oracle were used despite + // the watcher being dead, the result would carry this size instead of the + // backend's real 12 bytes. + let cached = vec![make_file_entry("a.jpg", "/cold", 99999)]; + let lid_inserted = insert_listing(&lid, &vid, "/cold", cached); + + let sources = vec![PathBuf::from("/cold/a.jpg")]; + let is_cancelled = || false; + let on_progress = |_: usize| {}; + let result = run_oracle_aware_batch_scan(vol.as_ref(), &vid, &sources, &is_cancelled, &on_progress) + .await + .expect("scan should succeed via fallthrough"); + + assert_eq!(result.aggregate.file_count, 1); + // Real backend size (12 bytes "hello-12byte"), NOT the cached 99999. + assert_eq!( + result.aggregate.total_bytes, 12, + "watcher-dead path must NOT consume the cached size" + ); + + remove_listing(&lid_inserted); + get_volume_manager().unregister(&vid); +} + +/// Test 3: when a subfolder of the scanned dir is open in another pane (a +/// second watcher-backed listing in `LISTING_CACHE`), the walker reuses that +/// subfolder's listing and never lists it via the volume. +#[tokio::test(flavor = "multi_thread", worker_threads = 2)] +async fn scan_preview_uses_cached_subfolder_listing_when_other_pane_has_it() { + let vid = unique("subfolder_pane"); + let parent_lid = unique("subfolder_parent_lid"); + let sub_lid = unique("subfolder_child_lid"); + + let vol = Arc::new(CountingWatchedVolume::new("subfolder-vol", true)); + get_volume_manager().register(&vid, vol.clone() as Arc); + + // Parent pane lists `/a` with one entry: subfolder `sub` (a directory). + let parent_cached = vec![make_dir_entry("sub", "/a")]; + let parent_lid_inserted = insert_listing(&parent_lid, &vid, "/a", parent_cached); + + // Other pane lists `/a/sub` with two cached files. Both panes share the + // same volume; `listing_is_watched` returns true volume-wide so the oracle + // hits both. + let sub_cached = vec![ + make_file_entry("x.txt", "/a/sub", 100), + make_file_entry("y.txt", "/a/sub", 200), + ]; + let sub_lid_inserted = insert_listing(&sub_lid, &vid, "/a/sub", sub_cached); + + // Scanning a copy of `/a` (selecting the subfolder). + let sources = vec![PathBuf::from("/a/sub")]; + let is_cancelled = || false; + let on_progress = |_: usize| {}; + let result = run_oracle_aware_batch_scan(vol.as_ref(), &vid, &sources, &is_cancelled, &on_progress) + .await + .expect("scan should succeed"); + + assert_eq!( + vol.list_dir_count(), + 0, + "expected no list_directory call when the subfolder is open in another pane" + ); + assert_eq!(result.aggregate.file_count, 2); + // `BatchScanResult::aggregate.dir_count` counts descendant directories only, + // not the top-level selected source itself. `/a/sub` has no subdirectories + // among its cached entries, so 0 is correct. (Same convention as + // `Volume::scan_for_copy_batch`'s default impl.) + assert_eq!(result.aggregate.dir_count, 0); + assert_eq!(result.aggregate.total_bytes, 300); + + remove_listing(&parent_lid_inserted); + remove_listing(&sub_lid_inserted); + get_volume_manager().unregister(&vid); +} + +/// Test 4: a cached entry with `is_symlink == true` and `is_directory == true` +/// is counted as one file-shaped entry. The walker must NOT recurse into it. +#[tokio::test(flavor = "multi_thread", worker_threads = 2)] +async fn scan_preview_preserves_symlink_semantics() { + let vid = unique("symlink"); + let parent_lid = unique("symlink_parent_lid"); + + let vol = Arc::new(CountingWatchedVolume::new("symlink-vol", true)); + get_volume_manager().register(&vid, vol.clone() as Arc); + + // /a is open, has one entry: `link-to-elsewhere`, a symlinked directory. + let parent_cached = vec![make_symlinked_dir_entry("link-to-elsewhere", "/a")]; + let parent_lid_inserted = insert_listing(&parent_lid, &vid, "/a", parent_cached); + + // We do NOT cache anything for /a/link-to-elsewhere. If the walker were + // to recurse into a symlinked directory, the oracle would miss for that + // child and `list_directory` would be called. We assert that doesn't happen. + + let sources = vec![PathBuf::from("/a/link-to-elsewhere")]; + let is_cancelled = || false; + let on_progress = |_: usize| {}; + let result = run_oracle_aware_batch_scan(vol.as_ref(), &vid, &sources, &is_cancelled, &on_progress) + .await + .expect("symlink scan should succeed"); + + assert_eq!( + vol.list_dir_count(), + 0, + "expected no list_directory call: symlinks must not be recursed into" + ); + // Symlink counts as one file-shaped entry (size 0 from the default + // FileEntry; the walker just records what the cache holds). + assert_eq!(result.aggregate.file_count, 1); + assert_eq!(result.aggregate.dir_count, 0); + + remove_listing(&parent_lid_inserted); + get_volume_manager().unregister(&vid); +} + +/// Test 5: if the user closes pane B (where `/a/sub` was open) right before +/// the walker recurses into `/a/sub`, the oracle returns None for `/a/sub` +/// and the walker falls through to `volume.list_directory`. The end-to-end +/// result is still correct. +#[tokio::test(flavor = "multi_thread", worker_threads = 2)] +async fn scan_preview_handles_listing_closed_mid_walk() { + let vid = unique("mid_walk_close"); + let parent_lid = unique("mid_walk_parent_lid"); + let sub_lid = unique("mid_walk_sub_lid"); + + let vol = Arc::new(CountingWatchedVolume::new("mid-walk-vol", true)); + // Seed the backing volume with what /a/sub really contains, so the + // fallthrough path produces sensible results. + vol.inner + .create_file(Path::new("/a/sub/real.bin"), b"abcdef") + .await + .unwrap(); + get_volume_manager().register(&vid, vol.clone() as Arc); + + // Parent pane lists `/a` with one entry: subfolder `sub`. + let parent_cached = vec![make_dir_entry("sub", "/a")]; + let parent_lid_inserted = insert_listing(&parent_lid, &vid, "/a", parent_cached); + + // Pane B has `/a/sub` cached too, but we close it BEFORE running the scan. + let sub_entries = vec![make_file_entry("phantom.bin", "/a/sub", 12345)]; + let sub_lid_inserted = insert_listing(&sub_lid, &vid, "/a/sub", sub_entries); + // Simulate pane B closing: remove the listing right before the scan. + remove_listing(&sub_lid_inserted); + + let sources = vec![PathBuf::from("/a/sub")]; + let is_cancelled = || false; + let on_progress = |_: usize| {}; + let result = run_oracle_aware_batch_scan(vol.as_ref(), &vid, &sources, &is_cancelled, &on_progress) + .await + .expect("mid-walk close scan should succeed"); + + // /a is still oracle-hit (it's the parent of the input source). The + // recursion into /a/sub goes through `scan_subtree_with_oracle` which + // misses the closed sub-listing and calls `volume.list_directory`. + assert!( + vol.list_dir_count() >= 1, + "expected fallthrough to list_directory for the closed /a/sub listing" + ); + // Result should reflect the real backend content, not the stale cached + // phantom file. + assert_eq!(result.aggregate.file_count, 1); + assert_eq!( + result.aggregate.total_bytes, 6, + "should reflect real file size, not the stale cache" + ); + + remove_listing(&parent_lid_inserted); + get_volume_manager().unregister(&vid); +} diff --git a/apps/desktop/src-tauri/src/file_system/write_operations/types.rs b/apps/desktop/src-tauri/src/file_system/write_operations/types.rs index 82c9d5eb..3fa453d7 100644 --- a/apps/desktop/src-tauri/src/file_system/write_operations/types.rs +++ b/apps/desktop/src-tauri/src/file_system/write_operations/types.rs @@ -455,6 +455,12 @@ pub enum WriteOperationError { path: String, message: String, }, + /// The file is in `STATUS_DELETE_PENDING` on the server: a delete was requested + /// but at least one open handle is keeping it alive. Transient — clears when the + /// last handle closes. SMB-only today. + DeletePending { + path: String, + }, /// Catch-all for genuinely unexpected IO errors. IoError { path: String, diff --git a/apps/desktop/src-tauri/src/file_system/write_operations/volume_copy.rs b/apps/desktop/src-tauri/src/file_system/write_operations/volume_copy.rs index a15f59d9..e942980d 100644 --- a/apps/desktop/src-tauri/src/file_system/write_operations/volume_copy.rs +++ b/apps/desktop/src-tauri/src/file_system/write_operations/volume_copy.rs @@ -1619,6 +1619,9 @@ pub(super) fn map_volume_error(context_path: &str, e: VolumeError) -> WriteOpera path, message: "Is a directory".to_string(), }, + VolumeError::DeletePending(_) => WriteOperationError::DeletePending { + path: context_path.to_string(), + }, } } diff --git a/apps/desktop/src-tauri/src/file_system/write_operations/volume_copy_tests.rs b/apps/desktop/src-tauri/src/file_system/write_operations/volume_copy_tests.rs index ad1af1ec..b18efc0d 100644 --- a/apps/desktop/src-tauri/src/file_system/write_operations/volume_copy_tests.rs +++ b/apps/desktop/src-tauri/src/file_system/write_operations/volume_copy_tests.rs @@ -183,6 +183,17 @@ fn test_map_volume_error_not_supported() { ); } +#[test] +fn test_map_volume_error_delete_pending() { + // STATUS_DELETE_PENDING surfaces when a delete was requested but an open + // handle is keeping the file alive on the server. It MUST become a typed + // `WriteOperationError::DeletePending` so the write-error event carries + // the transient "file is being removed" friendly copy — not the generic + // IoError fallback. + let err = map_volume_error("/ctx", VolumeError::DeletePending("STATUS_DELETE_PENDING".to_string())); + assert!(matches!(err, WriteOperationError::DeletePending { path } if path == "/ctx")); +} + // ======================================== // LocalPosixVolume integration tests // ======================================== diff --git a/apps/desktop/src-tauri/src/mtp/connection/mod.rs b/apps/desktop/src-tauri/src/mtp/connection/mod.rs index 4230aecc..9353766c 100644 --- a/apps/desktop/src-tauri/src/mtp/connection/mod.rs +++ b/apps/desktop/src-tauri/src/mtp/connection/mod.rs @@ -460,6 +460,22 @@ impl MtpConnectionManager { }) } + /// Returns `true` when `device_id` is currently connected, without awaiting. + /// + /// Used by `MtpVolume::listing_is_watched` (a sync trait method) to gate the + /// fresh-listing oracle. Uses `try_lock` on the devices map so a hot-path + /// MTP op holding the mutex doesn't stall pre-flight scan decisions: if the + /// lock is contended, treat it as "not watched" and fall through to a real + /// read, which is the safe direction. The oracle re-checks on the next + /// pre-flight; the lock is held only for short bookkeeping operations on + /// the device map, so we won't see sustained false negatives. + pub fn is_connected(&self, device_id: &str) -> bool { + match self.devices.try_lock() { + Ok(guard) => guard.contains_key(device_id), + Err(_) => false, + } + } + /// Gets information about all connected devices. pub async fn get_all_connected_devices(&self) -> Vec { let devices = self.devices.lock().await; diff --git a/apps/desktop/src-tauri/src/volumes_linux/mod.rs b/apps/desktop/src-tauri/src/volumes_linux/mod.rs index 14e51ccd..481296d4 100644 --- a/apps/desktop/src-tauri/src/volumes_linux/mod.rs +++ b/apps/desktop/src-tauri/src/volumes_linux/mod.rs @@ -338,7 +338,7 @@ pub fn get_mounted_volumes(mounts: &[MountEntry]) -> Vec { }); } - volumes.sort_by(|a, b| a.name.to_lowercase().cmp(&b.name.to_lowercase())); + volumes.sort_by_key(|v| v.name.to_lowercase()); volumes } @@ -375,7 +375,7 @@ fn get_cloud_drives(mounts: &[MountEntry]) -> Vec { } } - drives.sort_by(|a, b| a.name.to_lowercase().cmp(&b.name.to_lowercase())); + drives.sort_by_key(|d| d.name.to_lowercase()); drives } @@ -443,7 +443,7 @@ fn get_network_mounts() -> Vec { } } - mounts.sort_by(|a, b| a.name.to_lowercase().cmp(&b.name.to_lowercase())); + mounts.sort_by_key(|m| m.name.to_lowercase()); mounts } diff --git a/apps/desktop/src/lib/file-viewer/open-viewer.ts b/apps/desktop/src/lib/file-viewer/open-viewer.ts index a39c031d..eaa93831 100644 --- a/apps/desktop/src/lib/file-viewer/open-viewer.ts +++ b/apps/desktop/src/lib/file-viewer/open-viewer.ts @@ -1,12 +1,17 @@ /** Opens a file viewer window for the given file path. Multiple viewers can be open at once. */ export async function openFileViewer(filePath: string): Promise { const { WebviewWindow } = await import('@tauri-apps/api/webviewWindow') - const { decorateChildWindowTitle } = await import('$lib/app-mode') + const { decorateChildWindowTitle, getAppMode } = await import('$lib/app-mode') // Use a unique label per viewer instance (timestamp-based) const label = `viewer-${String(Date.now())}` const encodedPath = encodeURIComponent(filePath) + // E2E suites open viewer windows repeatedly; stealing OS focus each time + // makes the host machine unusable while tests run. The plugin reaches the + // webview over a Unix socket, so it doesn't need OS focus to drive the DOM. + const isE2e = getAppMode() === 'e2e' + new WebviewWindow(label, { url: `/viewer?path=${encodedPath}`, title: decorateChildWindowTitle(filePath.split('/').pop() ?? 'Viewer'), @@ -18,6 +23,6 @@ export async function openFileViewer(filePath: string): Promise { minimizable: true, maximizable: true, closable: true, - focus: true, + focus: !isE2e, }) } diff --git a/apps/desktop/src/lib/ipc/bindings.ts b/apps/desktop/src/lib/ipc/bindings.ts index 409454e5..71a9f51d 100644 --- a/apps/desktop/src/lib/ipc/bindings.ts +++ b/apps/desktop/src/lib/ipc/bindings.ts @@ -3192,6 +3192,12 @@ export type WriteOperationError = | { type: 'name_too_long'; path: string } // File name contains characters not allowed at the destination. | { type: 'invalid_name'; path: string; message: string } + /** + * The file is in `STATUS_DELETE_PENDING` on the server: a delete was requested + * but at least one open handle is keeping it alive. Transient — clears when the + * last handle closes. SMB-only today. + */ + | { type: 'delete_pending'; path: string } // Catch-all for genuinely unexpected IO errors. | { type: 'io_error'; path: string; message: string } diff --git a/apps/desktop/src/lib/settings/settings-window.ts b/apps/desktop/src/lib/settings/settings-window.ts index 8b66d1b8..95ade8f6 100644 --- a/apps/desktop/src/lib/settings/settings-window.ts +++ b/apps/desktop/src/lib/settings/settings-window.ts @@ -21,7 +21,7 @@ import { WebviewWindow } from '@tauri-apps/api/webviewWindow' import { emitTo } from '@tauri-apps/api/event' import { getAppLogger } from '$lib/logging/logger' import { getEffectiveScale } from '$lib/text-size.svelte' -import { decorateChildWindowTitle } from '$lib/app-mode' +import { decorateChildWindowTitle, getAppMode } from '$lib/app-mode' const log = getAppLogger('settings') @@ -54,11 +54,18 @@ export const settingsMaxWidth = (scale: number): number => * the matching section path (e.g., `['File systems', 'SMB/Network shares']`). */ export async function openSettingsWindow(section?: string[]): Promise { + // E2E suites re-open Settings many times; stealing OS focus each time + // makes the host machine unusable while tests run. The plugin reaches the + // webview over a Unix socket, so it doesn't need OS focus to drive the DOM. + const isE2e = getAppMode() === 'e2e' + const existing = await WebviewWindow.getByLabel('settings') if (existing) { // Emit to the settings window so it can self-focus. Cross-window setFocus() // doesn't reliably bring a window to front on macOS. - await emitTo('settings', 'focus-self') + if (!isE2e) { + await emitTo('settings', 'focus-self') + } if (section) { await emitTo('settings', 'navigate-to-section', { section }) } @@ -84,6 +91,6 @@ export async function openSettingsWindow(section?: string[]): Promise { center: true, resizable: true, decorations: true, - focus: true, + focus: !isE2e, }) } diff --git a/apps/desktop/test/e2e-playwright/mtp-copy-preflight-uses-cache.spec.ts b/apps/desktop/test/e2e-playwright/mtp-copy-preflight-uses-cache.spec.ts new file mode 100644 index 00000000..7eae4631 --- /dev/null +++ b/apps/desktop/test/e2e-playwright/mtp-copy-preflight-uses-cache.spec.ts @@ -0,0 +1,211 @@ +/** + * E2E test pinning the fresh-listing-reuse contract for MTP copy pre-flight (M4). + * + * Setup: connect to the virtual MTP device, navigate the left pane into a + * subfolder of `/DCIM` so the FE has the listing cached and the device's + * watcher is marked "alive" by `MtpVolume::listing_is_watched`. Select a + * handful of files, press F5, and assert that the "Verifying before copy…" + * pre-flight is satisfied from the cache: + * + * 1. The transfer-confirmation dialog's scan completes WELL under the + * cold-cache MTP cost (USB roundtrip + parent-listing). The bound is + * deliberately wider than the cache hit's actual cost (~5 ms) but tight + * enough that a regression which falls back to a real `list_directory` + * walk would blow past it. + * 2. `filesFound` matches the selection count. + * + * The spec then cancels the dialog: the value here is the pre-flight + * behavior, not the copy itself. + * + * Requires the app to be built with `--features playwright-e2e,virtual-mtp`. + */ + +import fs from 'fs' +import path from 'path' +import { test, expect } from './fixtures.js' +import { recreateFixtures } from '../e2e-shared/fixtures.js' +import { recreateMtpFixtures, MTP_FIXTURE_ROOT } from '../e2e-shared/mtp-fixtures.js' +import { + initMcpClient, + mcpCall, + mcpReadResource, + mcpSelectVolume, + mcpNavToPath, + mcpAwaitItem, +} from '../e2e-shared/mcp-client.js' +import { + dispatchMenuCommand, + ensureAppReady, + getFixtureRoot, + pollUntil, + moveCursorToFile, + pressKey, + isStateClean, + LOCAL_VOLUME_NAME, + TRANSFER_DIALOG, +} from './helpers.js' + +const INTERNAL_STORAGE = 'Virtual Pixel 9 - Internal Storage' + +/** + * Upper bound on the time from F5-press to the transfer-confirmation dialog + * showing `scanComplete` (the inline "✓" next to the scan stats). The MTP + * cold-cache cost for a small folder is hundreds of ms; the cache-hit cost + * should be ~5 ms plus dialog mount + one scan-preview round-trip. 1500 ms + * is a ~3× safety margin and well clear of the cold-cache regime. + */ +const SCAN_COMPLETE_BOUND_MS = 1500 + +// MTP operations go through the virtual device and add protocol overhead. +test.setTimeout(60_000) + +/** Reads cmdr://state and returns true when both panes show the local volume. */ +async function bothPanesOnLocalVolume(): Promise { + const state = await mcpReadResource('cmdr://state') + const volumeLines = (state.match(/\n {2}volume: ([^\n]+)/g) ?? []).map((line) => line.replace(/^\n {2}volume: /, '')) + return volumeLines.length >= 2 && volumeLines[0] === LOCAL_VOLUME_NAME && volumeLines[1] === LOCAL_VOLUME_NAME +} + +/** Discovers the mtp:// path prefix for a named MTP storage from cmdr://state. */ +async function getMtpVolumePath(storageName: string): Promise { + const state = await mcpReadResource('cmdr://state') + const lines = state.split('\n') + for (let i = 0; i < lines.length; i++) { + if (lines[i].includes(`name: ${storageName}`) && lines[i + 1]?.includes('id:')) { + const id = lines[i + 1].trim().replace('id: ', '') + const [deviceId, storageId] = id.split(':') + return `mtp://${deviceId}/${storageId}` + } + } + throw new Error(`MTP volume "${storageName}" not found in cmdr://state`) +} + +/** + * Seeds extra files in `/DCIM` so the test can select a non-trivial subset. + * The base fixture only ships one direct file in `internal/DCIM/`; this brings + * the top-level file count up to 5 so a 3-file selection is meaningful. + */ +function seedDcimWithExtras(): string[] { + const dcim = path.join(MTP_FIXTURE_ROOT, 'internal', 'DCIM') + const names = ['cache-a.jpg', 'cache-b.jpg', 'cache-c.jpg', 'cache-d.jpg'] + for (const name of names) { + fs.writeFileSync(path.join(dcim, name), Buffer.from([0xff, 0xd8, 0xff, 0xe0, ...Buffer.from('cache-test-' + name)])) + } + return names +} + +test.beforeEach(async ({ tauriPage }) => { + recreateFixtures(getFixtureRoot()) + await initMcpClient(tauriPage) + + // Pause the virtual MTP watcher across the disk swap, then resync. Mirrors + // mtp.spec.ts so the rescan can't race with stale FSEvents. + await tauriPage.evaluate(`window.__TAURI_INTERNALS__.invoke('pause_virtual_mtp_watcher')`) + recreateMtpFixtures() + seedDcimWithExtras() + await tauriPage.evaluate(`window.__TAURI_INTERNALS__.invoke('resync_virtual_mtp_after_disk_change')`) + + // Reset both panes to the local volume so ensureAppReady's mcp-nav-to-path + // events aren't rejected by a leftover MTP pane. + if (!(await isStateClean(tauriPage, LOCAL_VOLUME_NAME))) { + await tauriPage.evaluate(`(function() { + var invoke = window.__TAURI_INTERNALS__.invoke; + invoke('plugin:event|emit', { event: 'mcp-volume-select', payload: { pane: 'left', name: ${JSON.stringify(LOCAL_VOLUME_NAME)} } }); + invoke('plugin:event|emit', { event: 'mcp-volume-select', payload: { pane: 'right', name: ${JSON.stringify(LOCAL_VOLUME_NAME)} } }); + })()`) + await pollUntil(tauriPage, async () => bothPanesOnLocalVolume(), 5000) + await tauriPage.keyboard.press('Escape') + await tauriPage.keyboard.press('Escape') + await pollUntil(tauriPage, async () => !(await tauriPage.isVisible('.modal-overlay')), 2000) + } +}) + +test.describe('MTP copy pre-flight reuses watcher-backed listing', () => { + test('F5 pre-flight scan completes from cache and reports the right file count', async ({ tauriPage }) => { + await ensureAppReady(tauriPage) + const mtpPath = await getMtpVolumePath(INTERNAL_STORAGE) + + // Navigate the left pane into the MTP /DCIM folder. This populates + // `LISTING_CACHE` for the parent and marks the MTP volume as watched + // (the virtual device is connected). The oracle's + // `try_get_watched_listing(volume_id, path)` should hit on every entry. + await mcpSelectVolume('left', INTERNAL_STORAGE) + await mcpAwaitItem('left', 'DCIM') + await mcpNavToPath('left', `${mtpPath}/DCIM`) + await mcpAwaitItem('left', 'cache-a.jpg', 30) + + // Right pane should already be on local right/ from ensureAppReady. + // Pick three top-level files. The watcher-backed cache holds their + // size + is_directory, so neither the parent re-listing nor a per-file + // metadata probe is needed. + const selection = ['cache-a.jpg', 'cache-b.jpg', 'cache-c.jpg'] + for (const name of selection) { + await moveCursorToFile(tauriPage, name) + await pressKey(tauriPage, 'Space') + await pollUntil( + tauriPage, + async () => + tauriPage.evaluate( + `!!document.querySelector('.file-pane.is-focused .file-entry[data-filename=' + ${JSON.stringify(JSON.stringify(name))} + '].is-selected')`, + ), + 2000, + ) + } + + // Open the transfer-confirmation dialog via the same command path F5 hits + // in production. `dispatchMenuCommand` is unaffected by DOM focus drift. + const startedAt = Date.now() + await dispatchMenuCommand(tauriPage, 'file.copy') + await tauriPage.waitForSelector(TRANSFER_DIALOG, 5000) + + // Wait for the inline scan-preview to finish. `.scan-checkmark` renders + // when `scanComplete === true` in `TransferDialog.svelte`. With the + // fresh-listing oracle the scan is fed from the watcher-backed cache + // and should converge in low ms; without it, the MTP backend would + // re-list the parent dir. + const completedQuickly = await pollUntil( + tauriPage, + async () => tauriPage.evaluate(`!!document.querySelector('${TRANSFER_DIALOG} .scan-checkmark')`), + SCAN_COMPLETE_BOUND_MS, + 25, + ) + const elapsed = Date.now() - startedAt + expect( + completedQuickly, + `scan-preview did not complete within ${String(SCAN_COMPLETE_BOUND_MS)} ms (took ${String(elapsed)} ms)`, + ).toBe(true) + + // `filesFound` is rendered inside `.scan-stats .scan-stat:nth-child(3) .scan-value` + // ([size, files, dirs] order, separated by `.scan-divider`). Read it + // directly: the cache supplied 3 file entries; if the oracle missed and + // the walker fell back to a real parent listing, this would either match + // (correct but slow) or wildly overshoot if the bug came back where MTP + // returned every sibling in the parent dir as part of `filesFound`. + const filesFoundText = await tauriPage.evaluate(`(function() { + var stats = document.querySelectorAll('${TRANSFER_DIALOG} .scan-stats .scan-stat .scan-value'); + // Index 1 is files (size is index 0, dirs is index 2). + return stats[1] ? stats[1].textContent.trim() : ''; + })()`) + expect(filesFoundText).toBe(String(selection.length)) + + // Cancel the dialog. Don't run the copy: this spec is about the + // pre-flight contract, and skipping the actual transfer keeps the test + // independent of MTP write throughput. + await tauriPage.keyboard.press('Escape') + await pollUntil(tauriPage, async () => !(await tauriPage.isVisible('.modal-overlay')), 5000) + + // Source files must still be on the device (no copy happened). + for (const name of selection) { + expect(fs.existsSync(path.join(MTP_FIXTURE_ROOT, 'internal', 'DCIM', name))).toBe(true) + } + + // Destination must not have any of them. + const fixtureRoot = getFixtureRoot() + for (const name of selection) { + expect(fs.existsSync(path.join(fixtureRoot, 'right', name))).toBe(false) + } + + // Avoid leaking state for the next test in the spec. + await mcpCall('refresh', {}) + }) +}) diff --git a/apps/desktop/test/e2e-playwright/mtp-delete-no-double-scan.spec.ts b/apps/desktop/test/e2e-playwright/mtp-delete-no-double-scan.spec.ts new file mode 100644 index 00000000..e387d7e7 --- /dev/null +++ b/apps/desktop/test/e2e-playwright/mtp-delete-no-double-scan.spec.ts @@ -0,0 +1,279 @@ +/** + * E2E test pinning the "delete reuses scan preview, no double scan" contract + * for MTP delete (M4). + * + * The bug this guards against: pre-fix, `delete_volume_files_with_progress` + * ignored `preview_id` and re-walked the source tree. On MTP that meant a + * second silent parent listing after the user already paid that cost in the + * delete confirmation dialog — and because the re-walk emitted no top-level + * progress, the UI looked frozen. Now the backend `take_cached_scan_result`s + * the preview and goes straight from Scanning to Deleting. + * + * The spec subscribes to `write-progress` events from the webview, captures + * the `phase` sequence, and asserts: + * + * 1. Scanning -> Deleting transitions exactly once (no second Scanning + * after Deleting starts). + * 2. `filesDone` never decreases within a phase (the second-scan symptom + * was a counter that reset to zero when the silent re-walk began). + * 3. The files are gone from the device after the operation completes. + * + * Requires the app to be built with `--features playwright-e2e,virtual-mtp`. + */ + +import fs from 'fs' +import path from 'path' +import { test, expect } from './fixtures.js' +import { recreateFixtures } from '../e2e-shared/fixtures.js' +import { recreateMtpFixtures, MTP_FIXTURE_ROOT } from '../e2e-shared/mtp-fixtures.js' +import { + initMcpClient, + mcpCall, + mcpReadResource, + mcpSelectVolume, + mcpNavToPath, + mcpAwaitItem, +} from '../e2e-shared/mcp-client.js' +import { + ensureAppReady, + getFixtureRoot, + pollUntil, + moveCursorToFile, + pressKey, + isStateClean, + LOCAL_VOLUME_NAME, +} from './helpers.js' + +const INTERNAL_STORAGE = 'Virtual Pixel 9 - Internal Storage' + +interface CapturedProgress { + phase: string + filesDone: number + filesTotal: number + bytesDone: number +} + +test.setTimeout(60_000) + +async function bothPanesOnLocalVolume(): Promise { + const state = await mcpReadResource('cmdr://state') + const volumeLines = (state.match(/\n {2}volume: ([^\n]+)/g) ?? []).map((line) => line.replace(/^\n {2}volume: /, '')) + return volumeLines.length >= 2 && volumeLines[0] === LOCAL_VOLUME_NAME && volumeLines[1] === LOCAL_VOLUME_NAME +} + +async function getMtpVolumePath(storageName: string): Promise { + const state = await mcpReadResource('cmdr://state') + const lines = state.split('\n') + for (let i = 0; i < lines.length; i++) { + if (lines[i].includes(`name: ${storageName}`) && lines[i + 1]?.includes('id:')) { + const id = lines[i + 1].trim().replace('id: ', '') + const [deviceId, storageId] = id.split(':') + return `mtp://${deviceId}/${storageId}` + } + } + throw new Error(`MTP volume "${storageName}" not found in cmdr://state`) +} + +/** Seeds extra files in /DCIM so the delete has multiple top-level entries to track. */ +function seedDcimWithExtras(): string[] { + const dcim = path.join(MTP_FIXTURE_ROOT, 'internal', 'DCIM') + const names = ['delete-a.jpg', 'delete-b.jpg', 'delete-c.jpg'] + for (const name of names) { + fs.writeFileSync(path.join(dcim, name), Buffer.from([0xff, 0xd8, 0xff, 0xe0, ...Buffer.from('del-test-' + name)])) + } + return names +} + +test.beforeEach(async ({ tauriPage }) => { + recreateFixtures(getFixtureRoot()) + await initMcpClient(tauriPage) + + await tauriPage.evaluate(`window.__TAURI_INTERNALS__.invoke('pause_virtual_mtp_watcher')`) + recreateMtpFixtures() + seedDcimWithExtras() + await tauriPage.evaluate(`window.__TAURI_INTERNALS__.invoke('resync_virtual_mtp_after_disk_change')`) + + if (!(await isStateClean(tauriPage, LOCAL_VOLUME_NAME))) { + await tauriPage.evaluate(`(function() { + var invoke = window.__TAURI_INTERNALS__.invoke; + invoke('plugin:event|emit', { event: 'mcp-volume-select', payload: { pane: 'left', name: ${JSON.stringify(LOCAL_VOLUME_NAME)} } }); + invoke('plugin:event|emit', { event: 'mcp-volume-select', payload: { pane: 'right', name: ${JSON.stringify(LOCAL_VOLUME_NAME)} } }); + })()`) + await pollUntil(tauriPage, async () => bothPanesOnLocalVolume(), 5000) + await tauriPage.keyboard.press('Escape') + await tauriPage.keyboard.press('Escape') + await pollUntil(tauriPage, async () => !(await tauriPage.isVisible('.modal-overlay')), 2000) + } +}) + +test.describe('MTP delete reuses scan preview (no double scan)', () => { + test('F8 progresses Scanning -> Deleting exactly once and counts never go backwards', async ({ tauriPage }) => { + await ensureAppReady(tauriPage) + const mtpPath = await getMtpVolumePath(INTERNAL_STORAGE) + + // Land in MTP /DCIM so the parent listing is in the watcher-backed cache. + await mcpSelectVolume('left', INTERNAL_STORAGE) + await mcpAwaitItem('left', 'DCIM') + await mcpNavToPath('left', `${mtpPath}/DCIM`) + await mcpAwaitItem('left', 'delete-a.jpg', 30) + + const selection = ['delete-a.jpg', 'delete-b.jpg', 'delete-c.jpg'] + + // Subscribe to `write-progress` BEFORE the delete starts. The handler + // pushes every progress payload into a window-scoped buffer the test + // reads via `evaluate`. + await tauriPage.evaluate(`(async function() { + window.__deleteProgressEvents = []; + const handler = (event) => { window.__deleteProgressEvents.push(event.payload); }; + const handlerId = window.__TAURI_INTERNALS__.transformCallback(handler); + window.__deleteProgressEventId = await window.__TAURI_INTERNALS__.invoke('plugin:event|listen', { + event: 'write-progress', + target: { kind: 'Any' }, + handler: handlerId, + }); + window.__deleteCompleteEvents = []; + const completeHandler = (event) => { window.__deleteCompleteEvents.push(event.payload); }; + const completeHandlerId = window.__TAURI_INTERNALS__.transformCallback(completeHandler); + window.__deleteCompleteEventId = await window.__TAURI_INTERNALS__.invoke('plugin:event|listen', { + event: 'write-complete', + target: { kind: 'Any' }, + handler: completeHandlerId, + }); + })()`) + + try { + // Multi-select via Space (same path as mtp.spec.ts's "deletes multiple + // selected files on MTP"). Poll for `.is-selected` after each Space so + // we don't race the next cursor move. + for (const name of selection) { + await moveCursorToFile(tauriPage, name) + await pressKey(tauriPage, 'Space') + await pollUntil( + tauriPage, + async () => + tauriPage.evaluate( + `!!document.querySelector('.file-pane.is-focused .file-entry[data-filename=' + ${JSON.stringify(JSON.stringify(name))} + '].is-selected')`, + ), + 2000, + ) + } + + // Press F8 to open the delete confirmation dialog (MTP volumes show + // "Delete permanently" because they don't support trash). + await pressKey(tauriPage, 'F8') + await tauriPage.waitForSelector('[data-dialog-id="delete-confirmation"]', 10000) + + // Confirm the delete by clicking the danger button. + await tauriPage.evaluate(`(function() { + var dialog = document.querySelector('[data-dialog-id="delete-confirmation"]'); + if (!dialog) return; + var btn = dialog.querySelector('.btn-danger'); + if (btn) btn.click(); + })()`) + + // Wait for the operation to finish: filesystem-side files gone, then + // refresh the pane so the FE catches up. + await pollUntil( + tauriPage, + () => { + for (const name of selection) { + if (fs.existsSync(path.join(MTP_FIXTURE_ROOT, 'internal', 'DCIM', name))) return Promise.resolve(false) + } + return Promise.resolve(true) + }, + 30000, + ) + await mcpCall('refresh', {}) + + // Wait for write-complete (or for the progress dialog to go away). + await pollUntil( + tauriPage, + async () => tauriPage.evaluate(`(window.__deleteCompleteEvents || []).length > 0`), + 10000, + ) + + // Pull the captured progress sequence and run the M3 assertions. + const events = await tauriPage.evaluate( + `(window.__deleteProgressEvents || []).map(p => ({ phase: p.phase, filesDone: p.filesDone, filesTotal: p.filesTotal, bytesDone: p.bytesDone }))`, + ) + + // 1. Phase sequence must monotonically advance through Scanning -> + // Deleting. Collapse consecutive-same-phase runs and assert that the + // transition never reverses, regardless of whether `deleting` got a + // progress event at all. + // + // Why "or doesn't appear" is fine: the per-file delete on a small + // selection often finishes inside one 200 ms progress-throttle + // window, so the BE fires `scanning` then jumps to write-complete. + // That's the M3 fast path doing its job — the bug we guard against + // is a SECOND scanning phase showing up after deleting started, not + // "delete must emit a deleting event." + const phaseSequence: string[] = [] + for (const ev of events) { + if (phaseSequence[phaseSequence.length - 1] !== ev.phase) phaseSequence.push(ev.phase) + } + // Map phases to monotonically-increasing ranks. Anything not in the + // map (rolling_back, copying, etc.) is intentionally ignored — those + // shouldn't fire during a plain delete, and if they do they're not the + // double-scan bug. + const phaseOrder = new Map([ + ['scanning', 0], + ['deleting', 1], + ['trashing', 1], + ]) + let lastRank = -1 + for (const phase of phaseSequence) { + const rank = phaseOrder.get(phase) + if (rank === undefined) continue + expect( + rank, + `phase regressed: saw ${phase} after a later phase (full sequence: ${JSON.stringify(phaseSequence)})`, + ).toBeGreaterThanOrEqual(lastRank) + lastRank = rank + } + // The compacted phase sequence (consecutive duplicates removed) must + // contain at most one 'scanning' run. A second one would mean the + // pre-fix re-scan came back. + const scanningCount = phaseSequence.filter((p) => p === 'scanning').length + expect( + scanningCount, + `expected at most one scanning phase run, saw ${String(scanningCount)} (sequence: ${JSON.stringify(phaseSequence)})`, + ).toBeLessThanOrEqual(1) + + // 2. `filesDone` must be monotonic within each phase. The pre-fix + // symptom was a counter that reset to zero when the silent re-walk + // began. + const lastFilesByPhase: Record = {} + for (const ev of events) { + const last = lastFilesByPhase[ev.phase] ?? 0 + expect( + ev.filesDone, + `filesDone went backwards in phase=${ev.phase}: ${String(last)} -> ${String(ev.filesDone)} (full sequence: ${JSON.stringify(events.map((e) => `${e.phase}:${String(e.filesDone)}`))})`, + ).toBeGreaterThanOrEqual(last) + lastFilesByPhase[ev.phase] = ev.filesDone + } + + // 3. Files are gone from the device. + for (const name of selection) { + expect(fs.existsSync(path.join(MTP_FIXTURE_ROOT, 'internal', 'DCIM', name))).toBe(false) + } + } finally { + // Tear down listeners and clear test state, in that order so a partial + // failure can still clean up. + await tauriPage.evaluate(`(async function() { + const progressId = window.__deleteProgressEventId; + if (progressId !== undefined) { + await window.__TAURI_INTERNALS__.invoke('plugin:event|unlisten', { event: 'write-progress', eventId: progressId }); + } + const completeId = window.__deleteCompleteEventId; + if (completeId !== undefined) { + await window.__TAURI_INTERNALS__.invoke('plugin:event|unlisten', { event: 'write-complete', eventId: completeId }); + } + delete window.__deleteProgressEvents; + delete window.__deleteProgressEventId; + delete window.__deleteCompleteEvents; + delete window.__deleteCompleteEventId; + })()`) + } + }) +}) diff --git a/apps/desktop/test/e2e-playwright/mtp.spec.ts b/apps/desktop/test/e2e-playwright/mtp.spec.ts index dc34e8f0..03ffb9db 100644 --- a/apps/desktop/test/e2e-playwright/mtp.spec.ts +++ b/apps/desktop/test/e2e-playwright/mtp.spec.ts @@ -346,11 +346,21 @@ test.describe('MTP file operations', () => { await mcpCall('move_cursor', { pane: 'left', filename: 'report.txt' }) await mcpCall('copy', { autoConfirm: true }) - // Wait for file to appear in right pane - await mcpAwaitItem('right', 'report.txt') - - // Verify on disk - expect(fs.existsSync(path.join(fixtureRoot, 'right', 'report.txt'))).toBe(true) + // Poll the destination on disk first: that's the authoritative truth. + // Under heavy concurrent load (full slow-suite run with rust-tests-linux + + // Docker SMB containers eating disk + CPU), both safety nets for refreshing + // the destination pane can race: the FilePane's notify-rs watcher (200 ms + // debounce) can miss the FSEvents add, and the `refreshPanesAfterTransfer` + // IPC fired from `handleTransferComplete` can queue up behind a saturated + // Tauri event loop. PaneStateStore then stays stale even though the BE + // copy already succeeded and the file is on disk. Tests 11 (local→MTP) and + // 27 (50 MB MTP→local) already use this pattern; this brings test 10 inline. + const destPath = path.join(fixtureRoot, 'right', 'report.txt') + await pollFs(tauriPage, () => fs.existsSync(destPath), 30000) + + // Force the pane to re-list so the await reads a fresh PaneStateStore. + await mcpCall('refresh', {}) + await mcpAwaitItem('right', 'report.txt', 30) // Verify source still exists (copy, not move) await mcpSwitchPane() diff --git a/apps/desktop/test/e2e-playwright/playwright.config.ts b/apps/desktop/test/e2e-playwright/playwright.config.ts index a5a5f39c..38a80418 100644 --- a/apps/desktop/test/e2e-playwright/playwright.config.ts +++ b/apps/desktop/test/e2e-playwright/playwright.config.ts @@ -7,7 +7,12 @@ import { defineConfig } from '@playwright/test' // - unset / "all": every spec (single-instance / legacy run). const shardKind = process.env.CMDR_E2E_SHARD_KIND ?? 'all' -const mtpSpecMatch = /mtp(-conflicts)?\.spec\.ts$/ +// Match every MTP spec: `mtp.spec.ts`, `mtp-conflicts.spec.ts`, and the +// M4 fresh-listing-reuse specs (`mtp-copy-preflight-uses-cache.spec.ts`, +// `mtp-delete-no-double-scan.spec.ts`). Every spec that touches the virtual +// MTP backing dir must live on the dedicated MTP shard — running two MTP +// specs in parallel corrupts the shared fixture root. +const mtpSpecMatch = /mtp(-[a-z-]+)?\.spec\.ts$/ const testMatch = shardKind === 'mtp' ? mtpSpecMatch : '*.spec.ts' const testIgnore = shardKind === 'non-mtp' ? mtpSpecMatch : undefined diff --git a/docs/specs/fresh-listing-reuse-plan.md b/docs/specs/fresh-listing-reuse-plan.md new file mode 100644 index 00000000..d7f34986 --- /dev/null +++ b/docs/specs/fresh-listing-reuse-plan.md @@ -0,0 +1,413 @@ +# Fresh-listing reuse for write-op scans + +## Why this exists + +Pre-flight scans for copy, move, and delete on MTP (and to a lesser degree SMB, network mounts, big local trees) +duplicate work the backend has already done. Three concrete failures the user hit on a connected Android device with 15k +photos in `/DCIM/Camera`: + +1. **Copy/move pre-flight re-reads the current folder.** The user selects 135 photos and presses F5. The "Verifying + before copy…" dialog shows a `5,080 files` counter climbing — that's the MTP device re-listing the parent directory + just to look up size of the 135 selected entries by name. The listing is already in the pane's `LISTING_CACHE` and + the MTP event loop is keeping it fresh. +2. **Delete pre-flight does the same re-list**, then on confirm runs _a second_ re-scan inside + `delete_volume_files_with_progress` (it ignores `config.preview_id`). The second scan emits no progress for top-level + files, so the UI looks frozen. +3. **Per-file `is_directory` probes during the operation.** For each top-level selected path, the volume scan calls + `is_directory(path)` which on MTP lists the parent dir again. Copy/move already work around this via `source_hints` + seeded from the cached scan result. Delete does not. + +Beyond the perf wins, the change tightens a contract we want in writing: **write operations trust a cached listing only +when it's watcher-backed.** No "fresh-enough" age heuristics. No "the index says…" shortcuts. A listing is either being +actively kept in sync by a live watcher, or it gets re-read. This matches the user's risk model: write ops can lose +data, so the freshness contract is bright-line. + +## Scope + +- Backend-only change. No FE protocol change. (`startScanPreview` keeps the same signature; the cache layer below + decides whether to do real I/O.) +- Applies to **copy, move, delete** on all volume types. Local FS benefits modestly; MTP and SMB benefit a lot. +- Does NOT trust the drive index for write-op decisions. The index continues to feed UX (the "X% of estimated" + progress-bar denominator, recursive size in the file list). It just doesn't replace a real walk. +- Does NOT remove the existing volume-level caches (MTP's 5 s TTL, SCAN_PREVIEW_RESULTS). Those still play their roles. + The new oracle is additive. + +## Design + +### The oracle + +A single backend helper, in `file_system/listing/caching.rs`: + +```rust +/// Returns cached entries for `(volume_id, path)` if the volume reports that +/// this listing is being kept in sync by an active watcher. Otherwise `None`. +/// +/// **Freshness contract (read carefully)**: a `Some(_)` result means the +/// volume has an active change-notification channel and the cache reflects +/// the volume's most recently observed state. It does NOT mean the cache is +/// byte-perfect with the device right now — every backend has a debounce or +/// settling window between a real change and the cache reflecting it: +/// - MTP: 500 ms event debouncer, plus per-device event-loop polling; some +/// MTP devices (cameras especially) don't emit per-object events at all, +/// in which case "watched" means only "the device is reachable and would +/// report changes if it sent any". +/// - SMB: 200 ms watcher debounce; >50 events per dir triggers a +/// FullRefresh request that arrives via a real re-read. +/// - Local FS: FSEvents coalesce window (~10 ms). +/// +/// Callers must treat the result as "fresh as our most recent observation," +/// which is the same guarantee a `list_directory` call gives — it sees the +/// device's state at the moment the call returned, not at the moment the +/// caller reads its result. The contract intentionally accepts this window: +/// a tighter one would force us to re-validate every walk, which defeats the +/// whole point. +pub fn try_get_watched_listing(volume_id: &str, path: &Path) -> Option> +``` + +Implementation flow: + +1. `find_listings_for_path_on_volume(Some(volume_id), path)` — already exists, returns matching listing IDs. +2. Pick the **most-recently-updated** matching listing: highest `sequence`, ties broken by latest `created_at`. Don't + take the first iteration result — `HashMap` iteration order is non-deterministic and a stale duplicate listing could + poison the result. Worth noting: for path-level watchers (local FS) two panes on the same path both register + watchers, so both listings are equally fresh; the tiebreaker is just for determinism. +3. Look up the volume in `VolumeManager`. Call `listing_is_watched(path)`. +4. If true, clone the entries and return. + +Why on the `Volume` trait, not a free function: each backend has its own "watcher alive" signal (`WATCHER_MANAGER` for +local, `connection_manager().is_connected(device_id)` for MTP, the `smb_watcher` cancel sender for SMB). Putting the +check on the trait keeps the freshness contract colocated with the backend that owns it. Default `false` means a new +backend without a real watcher won't accidentally claim freshness. + +The oracle clones the `Vec` rather than borrowing. The listing cache's `RwLock` would otherwise be held +across awaits in the scan loop, blocking pane navigation. Entries are flat structs; for 15k entries the clone is < 5 ms. + +### How `volume_id` flows + +The plan does NOT add a `volume_id()` method to the `Volume` trait. `LocalPosixVolume` doesn't store one (it's +constructed with just name + root; the manager assigns the ID at registration time), and adding it would ripple through +every constructor and call site. Instead, write-op callers already have `volume_id` in +`WriteOperationConfig.source_volume_id` / `dest_volume_id`. The oracle and the new scan walker take `volume_id: &str` as +an explicit parameter, propagated down through the recursion. Cheaper, no trait change. + +### Recursive scan rule + +One rule, applied at every level of the scan walker (copy, move, delete): + +``` +fn scan(volume, volume_id, path): + entries = oracle.try_get_watched_listing(volume_id, path) + .unwrap_or_else(|| volume.list_directory(path)) + for entry in entries: + if entry.is_directory and not entry.is_symlink: + scan(volume, volume_id, entry.path) // recurse, same rule re-applies + else: + count_as_file(entry) +``` + +What this gives you, working through the user's `[a]`, `[b]`, `d` scenario: + +1. Top level: oracle hits the current folder's cached listing. `d`'s size, `[a]`'s `is_directory=true`, `[b]`'s + `is_directory=true` all come from cache. No I/O. +2. `d`: leaf, counted from cache. +3. `[a]`: recurse. Oracle misses (`[a]` isn't open in any pane). Real `list_directory(/current/[a])`. Walk continues. +4. `[b]/subfolder/`: recurse. Oracle hits (the other pane has it open). Use cached entries, skip the I/O for that + subtree's top level. Walk continues below normally for any _non-cached_ sub-subfolders. +5. Anywhere a watcher is dead or the listing isn't cached: real walk. + +Symlinks: cached `FileEntry.is_symlink == true` means the symlink is treated as a single entry (no recursion), matching +the existing `walk_dir_recursive` policy (`symlink_metadata`, no dereference). The recurse condition already excludes +symlinks. + +### Volume trait addition + +```rust +trait Volume { + /// Returns true when the listing at `path` is currently being kept in sync by + /// a live watcher on this volume. Used by `try_get_watched_listing` to decide + /// whether a cached listing can replace a real read in write-op pre-flight. + /// + /// "Live watcher" is intentionally coarse for non-local backends — see the + /// freshness contract on `try_get_watched_listing` for the per-backend + /// debounce / settling windows callers must tolerate. + /// + /// Default `false`: new backends without an active watcher opt in explicitly. + fn listing_is_watched(&self, _path: &Path) -> bool { false } +} +``` + +Per-backend implementations: + +| Backend | `listing_is_watched` returns true when | +| ------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `LocalPosixVolume` | `WATCHER_MANAGER` has a `WatchedDirectory` entry whose path matches `path` for some active listing | +| `MtpVolume` | the device is connected (`connection_manager().is_connected(device_id)`). The MTP event loop is per-device, not per-folder; connected = "the device will report any changes it sends." Documented caveat: many MTP devices don't send events at all | +| `SmbVolume` | `watcher_cancel.lock().is_some()` AND `connection_state() == Direct`. Volume-level, not path-level: the SMB watcher monitors the whole share via `CHANGE_NOTIFY` | +| `InMemoryVolume` | never — no real watcher; default `false` is correct | + +The MTP and SMB checks are volume-level, not path-level. This is intentional and a known limit: when the gate flips +true, every path on that volume becomes oracle-eligible, including paths whose cache entries are old enough that the +debounce window may not have fully settled around them. The trade-off: tightening this to "path-level freshness with +last-event-timestamp" requires per-path observation state we don't currently track, and the user already accepted the +volume-level grain ("if the watcher is alive we trust the cache"). + +### Scan-preview cache, current folder + +The pre-flight UI flow (TransferDialog, DeleteDialog) calls `start_scan_preview` which today always invokes +`scan_for_copy_batch_with_progress`. With the oracle in place, `run_volume_scan_preview` checks each input path's parent +against the oracle first; on hit, it builds the `BatchScanResult` from cached entries (no volume call) for any path +whose parent is watcher-backed. + +The result is still written into `SCAN_PREVIEW_RESULTS` keyed by `previewId`, so the downstream copy/delete consumes it +the same way. The cached entries' `size` and `is_directory` flow into `per_path` exactly as if a real scan had produced +them. + +For paths whose top-level item is a directory, the scan walker recurses into the directory with the same oracle-first +rule. So `[a]` either gets walked from real I/O (no other pane has it open) or from cached sub-listings (it or its +subfolders are open elsewhere). + +### Delete: stop ignoring `preview_id` + +`delete_volume_files_with_progress_inner` runs its own `scan_volume_recursive` regardless of whether the preview already +cached the same data. Fix: + +1. At the top of the function, `take_cached_scan_result(preview_id)` like `copy_volumes_with_progress` does. +2. If present, derive the delete entry list from the cached per-path results: + - For top-level files: one `VolumeDeleteEntry { path, size, is_dir: false }` per entry, size from cache. + - For top-level dirs: still need the recursive walk (delete needs per-file paths). But we know it's a dir without an + `is_directory` probe, and the walker uses the oracle-aware rule from above so the first level under each dir is + cache-fed when possible. +3. If absent (no preview, MCP path, etc.), fall through to today's `scan_volume_recursive` — but with the walker now + oracle-aware. On the no-preview path, the top-level `is_directory(source)` probe at `delete.rs:425` STAYS for paths + whose parent isn't open in any pane (the oracle misses, and we genuinely don't know what type the path is). It only + goes away when the parent oracle hits. This is fine because the no-preview path is rare (MCP-triggered delete, + programmatic invocation) and one `get_metadata` per top-level path on a cold cache is acceptable. + +### MTP-specific cleanup + +`MtpVolume::scan_for_copy_batch_with_progress` (mtp.rs:469-569) currently groups paths by parent dir to amortize +listings — that's a real cold-cache optimization (one parent listing for 135 selected siblings, not 135 separate +`get_metadata` calls). We **keep** this override and layer the oracle on top of it: before the parent-grouping logic +runs, check the oracle for each parent. On hit, skip the `list_directory` call entirely for that parent and look up +children from the cached entries. On miss, fall through to the existing parent-grouping path. + +The generic `Volume` trait default for `scan_for_copy_batch_with_progress` (which loops `scan_for_copy` per path, fine +for local FS where stat is cheap) also gets oracle support via the shared walker from M2. Local FS doesn't benefit much +in absolute terms, but the uniform code path is worth it. + +### What stays the same + +- `SCAN_PREVIEW_RESULTS` cache: unchanged. Still keyed by `previewId`. Still TTL-free (lives until consumed). +- MTP's 5 s `ListingCache` in `mtp/connection/cache.rs`: unchanged. It's a different layer (mtp-rs ↔ MTP volume), hit by + `list_directory_with_progress` when the oracle misses. +- `recursive_size`, `recursive_file_count` on cached entries: still populated by the indexer, still displayed in the + file list and used by `expected_totals_for_sources` for the scanning progress denominator. Not consumed by the + scan-decision path. +- ETA, throughput, conflict detection, rollback: untouched. + +## Risks and edge cases + +- **Hardlink dedup straddling the cache/walk boundary.** `walk_dir_recursive` dedupes hardlinks by inode for byte + counts; `FileEntry` doesn't carry inode. When the cache supplies one path and a real walk supplies another path to the + same inode, dedup misses. Direction is safe (overcount → pessimistic ETA, conservative disk-space reject). Existing + test `walk_dir_recursive_hardlinks_dedup_inodes` tests the walker, not the oracle path, so it stays valid. If we later + want true dedup across the boundary, add `inode: Option` to `FileEntry`. Not in this plan. +- **Watcher races.** The oracle returns entries; between that read and consumption, a watcher event could fire. This is + exactly the race a real walk has too (something changes between `list_directory` returning and the copy starting). No + new exposure. +- **Listing closed mid-scan.** User closes the source tab between scan-preview and copy/delete. The cached listing is + gone. Today `take_cached_scan_result` handles this for copy. The oracle inherits it: if the listing is gone, oracle + returns `None`, walker falls through to real I/O. +- **Volume disconnected mid-pre-flight (cable yanked).** Oracle returned `Some(entries)` a moment ago; now + `listing_is_watched` would flip `false`. The walker is already iterating those entries and recursing. Recursive calls + into the now-disconnected volume's `list_directory` fail fast, but the top-level synthesized totals (file count, byte + total) may reflect a now-gone state. The actual copy/delete then fails per-file with a confusing-ish error rather than + "device disconnected." This isn't new — `scan_for_copy_batch` had the same race — but documenting as a Gotcha in + `write_operations/CLAUDE.md`. +- **Listing exists but watcher not yet attached.** Per `listing/CLAUDE.md`: "File watcher starts AFTER listing + complete." Tiny window between `list_directory_start_streaming` finishing the read and `WATCHER_MANAGER.insert` + landing. During this window the cache has entries but `listing_is_watched` returns `false` for local. Oracle correctly + returns `None`, falls through. Tested explicitly in M1. +- **MTP devices that never send events.** Many cameras claim PTP/MTP but don't emit `ObjectAdded`/`Changed`. The cache + doesn't get updated reactively, so it ages out via the 5 s mtp-rs TTL (which is a separate layer the oracle doesn't + see). `is_connected = true` and the oracle will happily serve entries from the open listing in `LISTING_CACHE` + forever. This is acceptable per the documented freshness contract (we promise "device's most recent observed state", + which on an event-silent device is "the last time we listed it"). Workaround for users who need strict freshness: + navigate away and back in the pane to force a re-list. +- **Snapshot semantics.** The cached entries are a snapshot. They're correct as of the watcher's last update. Worst case + for a 15k-photo folder: one entry being stale-by-debounce-window — same exposure as reading the listing 200 ms ago. + Documented as a Gotcha. + +## Milestones + +### M1 — Plumbing: oracle + trait method + +Sequential within the milestone (each step depends on the previous compiling cleanly). + +1. Add `fn listing_is_watched(&self, _path: &Path) -> bool { false }` to `Volume` in `file_system/volume/mod.rs`. + Default false. No `volume_id()` method — callers pass the ID explicitly (see "How `volume_id` flows" above). +2. Wire up each backend: + - `LocalPosixVolume::listing_is_watched`: query `WATCHER_MANAGER` for any `WatchedDirectory` whose path matches. + - `MtpVolume::listing_is_watched`: `connection_manager().is_connected(&self.device_id)`. + - `SmbVolume::listing_is_watched`: `self.watcher_cancel.lock().is_some() && self.connection_state() == Direct`. Note: + `watcher_cancel` is a `Mutex` not `Mutex` — use the std `try_lock` or sync lock variant; avoid holding it + across awaits. If `try_lock` returns `WouldBlock` (some other task holds the mutex), treat that as "not watched" + (return `false`); the oracle falls through to a real read, which is the safe direction. Don't retry — the lock is + brief and another sweep can pick it up on the next pre-flight. + - `InMemoryVolume::listing_is_watched`: leave default `false`. +3. Add `try_get_watched_listing(volume_id: &str, path: &Path) -> Option>` in + `file_system/listing/caching.rs`. Picks the most-recently-updated matching listing (highest sequence, ties broken by + `created_at`), looks up the volume in `VolumeManager`, asks `listing_is_watched`, returns cloned entries on hit. +4. Unit tests in `caching_test.rs`: + - Hit when listing exists and watcher reports true. + - Miss when listing exists but watcher reports false. + - Miss when no listing exists. + - Miss when volume isn't registered. + - **Determinism**: with two listings on the same path differing in `sequence`, the higher-sequence one is picked. + - **Race window**: listing exists in cache but `WATCHER_MANAGER` has no entry yet (simulates the start-streaming → + register-watcher gap) — assert miss. +5. Per-backend tests for `listing_is_watched`: + - Local: opens a listing, asserts true; closes it, asserts false. + - MTP: connect/disconnect a virtual device, assert flip. + - SMB: needs a Docker SMB container test; `#[ignore]`-gated like existing soak tests. + +**Docs to touch**: + +- `file_system/volume/CLAUDE.md`: add `listing_is_watched` to the capability matrix and a paragraph on the freshness + contract, explicitly stating the per-backend debounce windows. +- `file_system/listing/CLAUDE.md`: document `try_get_watched_listing` next to `find_listings_for_path_on_volume`, + including the sequence/created_at tiebreaker rule. + +**Checks**: `./scripts/check.sh --rust` after step 2. Full `./scripts/check.sh` end of milestone. + +### M2a — Oracle-aware walker (local + scan-preview hookup, MTP/SMB cold-cache override unchanged) + +Goal: ship the shared walker abstraction and wire it into the scan-preview path. The MTP and SMB +`scan_for_copy_batch_with_progress` overrides stay in place in this milestone, so the cold-cache parent-grouping +optimization still runs when the oracle misses. M2a only adds an oracle short-circuit _before_ the volume call for +watched parents — the override itself is unchanged, just sometimes skipped. + +1. Extract `scan_subtree_with_oracle(volume, volume_id, path) -> SubtreeTotals` into `scan.rs`. Returns file count, byte + total, per-path facts in the shape `BatchScanResult` expects. At every recursion level, checks the oracle first; + falls through to `volume.list_directory` on miss. Skips recursion into symlinks. +2. In `run_volume_scan_preview` (`scan_preview.rs`), group input paths by parent. For each parent: oracle check first; + if hit, build the per-path slice of `BatchScanResult` from cached entries (top-level files done; top-level dirs go + through `scan_subtree_with_oracle`). If miss, call the existing + `volume.scan_for_copy_batch_with_progress(paths_for_this_parent, on_progress)` — same path as today, just over a + subset. +3. Local FS `walk_dir_recursive` in `scan.rs`: add an oracle check at the top of each recursive call. Same rule. +4. `SCAN_PREVIEW_RESULTS` wiring unchanged. The synthesized `BatchScanResult` is written to the cache the same way a + real scan's result is. +5. Integration tests in `volume_copy_tests.rs`: + - `scan_preview_uses_watched_listing_for_top_level_files`: InMemoryVolume with test-only override of + `listing_is_watched`. Counter-wrapping volume asserts `list_directory` call count is 0. + - `scan_preview_falls_through_when_watcher_dead`: same setup, override returns false, assert `list_directory` is + called. + - `scan_preview_uses_cached_subfolder_listing_when_other_pane_has_it`: open `[a]/sub` listing in a second listing ID, + scan a copy of `[a]`, assert no `list_directory` call for `[a]/sub`. + - `scan_preview_preserves_symlink_semantics`: cached entry with `is_symlink=true, is_directory=true` — walker counts + it as one entry, no recursion. + - `scan_preview_handles_listing_closed_mid_walk`: open `[a]/sub` in pane B, start scan of `[a]`, close pane B before + `[a]/sub` is reached during recursion. Assert correctness — recursion falls through to real walk for `[a]/sub` + mid-way. + +**Docs**: + +- `write_operations/CLAUDE.md`: Decision entry "Scan preview reuses watched listings", Gotcha for hardlink dedup + straddling the cache boundary, Gotcha for volume-disconnect race. +- `volume/CLAUDE.md`: in "Building a new volume" tier 3, mention `listing_is_watched` and what it gates. + +**Checks**: `./scripts/check.sh` end-of-milestone. + +### M2b — Port MTP and SMB scan overrides to the shared walker + +Goal: layer the oracle on top of the existing parent-grouping optimizations without losing cold-cache perf. + +1. In `MtpVolume::scan_for_copy_batch_with_progress`: before the existing parent-grouping logic, add an oracle check per + parent dir. On hit, populate that parent's child results from cached entries (skip the `list_directory` call). On + miss, fall through to today's logic unchanged. +2. Same shape for `SmbVolume::scan_for_copy_batch` (if it has an analogous parent-grouping override; otherwise just the + oracle check before the pipelined stats). +3. Integration tests: + - `mtp_scan_uses_oracle_on_hit_skips_list_directory`: virtual MTP device, listing in cache, oracle hit, assert no + `list_directory` call. + - `mtp_scan_cold_cache_still_uses_parent_grouping`: virtual MTP device, no cached listing, assert one + `list_directory` per unique parent (preserves today's optimization). + - `smb_scan_uses_oracle_on_hit_skips_stat_pipeline`: ditto for SMB, behind a Docker-gated test. + +**Docs**: update `volume/mtp.rs` doc comment on `scan_for_copy_batch_with_progress` to note the oracle layering. + +**Checks**: `./scripts/check.sh` end-of-milestone. + +### M3 — Delete reuses the scan preview and the oracle + +1. In `delete_volume_files_with_progress_inner` (`delete.rs:410`), call `take_cached_scan_result(preview_id)` at the + top. On hit, derive the entry list from `per_path`: + - Top-level files → `VolumeDeleteEntry { path, size, is_dir: false }`. + - Top-level dirs → recursive walk via `scan_subtree_with_oracle` from M2a. +2. On miss (no preview_id, MCP path, etc.), fall through to today's `scan_volume_recursive` with the walker now + oracle-aware. Important: on this no-preview path, the top-level `volume.is_directory(source)` probe stays for sources + whose parent the oracle can't answer. It only goes away when the parent oracle hits (or when we have a cached scan + result via path 1). This keeps the change scoped — replacing `is_directory` everywhere is its own investigation. +3. Integration tests in a new `delete_volume_reuse_tests.rs`: + - `delete_consumes_preview_id_skips_rescan`: counter-wrapping InMemoryVolume, run scan_preview then + delete_files_start with the same preview_id, assert `list_directory` called once total. + - `delete_without_preview_id_still_walks`: assert correctness when MCP triggers delete with no scan. + - `delete_top_level_files_no_is_directory_probes`: assert `is_directory` call count is zero when the parent listing + is in `LISTING_CACHE` and watched. + - `delete_mid_scan_listing_close`: open the source dir in pane A and a subfolder in pane B, start delete, close pane + B mid-recursion. Assert correctness (mid-walk fallback). + +**Docs**: `write_operations/CLAUDE.md`: update the data-flow diagram for the delete path to note the preview-reuse step. + +**Checks**: `./scripts/check.sh` end-of-milestone. + +### M4 — End-to-end verification + +E2E tests catch the user-visible regression (frozen-looking dialogs, wrong counts) that unit tests can miss. + +1. Playwright spec `mtp-copy-preflight-uses-cache.spec.ts`: connect virtual MTP device, list `/DCIM`, select N files, + press F5, assert the "Verifying" dialog completes within a tight bound (e.g. 300 ms) with the right `filesFound` + count. +2. Playwright spec `mtp-delete-no-double-scan.spec.ts`: same setup, press F8, confirm, assert the operation dialog + transitions Scanning → Deleting once (not twice), and the count never goes backwards. +3. Manual QA pass on a real Android device: copy + delete workflows on `/DCIM/Camera` with 5k+ photos. Verify the + climbing-counter behavior is gone. +4. Run `./scripts/check.sh --include-slow` to catch anything CI's main lane doesn't. + +**Docs**: refresh `apps/desktop/src-tauri/src/file_system/write_operations/CLAUDE.md`'s top-of-file purpose paragraph to +mention the watcher-backed reuse contract. + +## What can be parallelized + +- **Within M1**: backend implementations of `listing_is_watched` (step 2's four sub-tasks) are independent once the + trait method exists. Could go in any order. Not worth a worktree. +- **Within M2a and M2b**: tests can be written alongside implementation. + +Milestone-to-milestone dependency is strict: M2a depends on M1's oracle, M2b depends on M2a's walker, M3 depends on +M2a's walker (and benefits from M2b but doesn't require it), M4 depends on all three. Run sequentially. + +## Out of scope + +- Adding inode tracking to `FileEntry` for true hardlink dedup on the oracle path. Filed as a follow-up. +- Pre-walking subtrees to populate the listing cache before a copy. The oracle is opportunistic; it doesn't warm the + cache. +- Changing the FE `startScanPreview` signature or the SCAN_PREVIEW_RESULTS schema. Stays the same on purpose; this whole + change is invisible to the FE. +- Trusting `recursive_size` from the index in any write-op decision. Index data stays UX-only. +- Path-level freshness tracking (last-event-timestamp per listing). The volume-level grain is good enough for now. +- Replacing top-level `is_directory(source)` probes on the no-preview delete path. Scoped to a follow-up. + +## Design principle alignment + +- **Elegance over hacks**: one oracle, one rule, applied uniformly at every level. No backend-specific short-circuits + scattered across the scan paths. +- **The app should feel rock solid**: the freshness contract is bright-line at the watcher boundary. No "5 seconds is + fresh enough" TTL judgment. The per-backend debounce windows are documented honestly. +- **Protect the user's data**: doesn't change any write semantics. Same atomic ops, same rollback, same conflict + resolution. Just makes the pre-flight honest about what it already knows. +- **Be respectful to the user's resources**: the win. F5 on 135 photos in a watched MTP folder goes from re-listing 15k + entries (~17 s USB) to ~5 ms (cache clone + walker over 135 entries). +- **Smart backend, thin frontend**: FE is untouched. All the new logic lives in Rust where it belongs. +- **Invest in testability**: every milestone has named test cases targeting specific races and edge cases the reviewer's + analysis surfaced. Counter-wrapping `InMemoryVolume` is the right test shape for asserting call counts.