fix(unixfs): iterative dir finalization #1638

Draft
fforbeck wants to merge 1 commit into main from fix/unixfs-finalize-dir

fforbeck commented Apr 14, 2025

Optimization of Directory Processing in UnixFS

Problem

The finalize method in UnixFSDirectoryBuilder is the most likely source of stack overflow errors when processing directories with a large number of files (100K+). We could not reproduce the specific error condition directly, but analysis of the code indicates that the recursive directory traversal is prone to hitting call stack limits on large directory structures.

Potential Solution

Refactored the finalize method to use an iterative approach instead of recursion:

  1. Implemented a discovery phase that collects all directories without processing them
  2. Added a separate phase to collect and process files in batches
  3. Introduced a depth-based sorting mechanism to process directories from deepest to shallowest
  4. Added batch processing for large file collections to improve performance
  5. Optimized memory usage by maintaining clear mappings between paths and their corresponding links
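
The discovery and depth-sorted phases above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `FileEntry`, `DirEntry`, and `Link` types and the `finalizeIteratively` function are hypothetical, a cumulative byte size stands in for the real encoded link, and file batching is omitted for brevity.

```typescript
// Hypothetical types for illustration only; not the UnixFSDirectoryBuilder API.
interface FileEntry { kind: 'file'; name: string; size: number }
interface DirEntry { kind: 'dir'; name: string; entries: Map<string, Entry> }
type Entry = FileEntry | DirEntry

// Stand-in for an encoded directory link; the real one would carry a CID.
interface Link { name: string; size: number }

function finalizeIteratively(root: DirEntry): Link {
  // Phase 1: discover every directory using an explicit stack instead of recursion.
  const discovered: Array<{ path: string; depth: number; dir: DirEntry }> = []
  const stack = [{ path: '', depth: 0, dir: root }]
  while (stack.length > 0) {
    const current = stack.pop()
    if (!current) break
    discovered.push(current)
    for (const entry of current.dir.entries.values()) {
      if (entry.kind === 'dir') {
        stack.push({
          path: `${current.path}/${entry.name}`,
          depth: current.depth + 1,
          dir: entry,
        })
      }
    }
  }

  // Phase 2: sort deepest first so every child link exists before its parent needs it.
  discovered.sort((a, b) => b.depth - a.depth)

  // Phase 3: finalize each directory, keeping a path -> link mapping.
  const links = new Map<string, Link>()
  for (const { path, dir } of discovered) {
    let size = 0
    for (const entry of dir.entries.values()) {
      if (entry.kind === 'file') {
        size += entry.size
      } else {
        const childPath = `${path}/${entry.name}`
        const child = links.get(childPath)
        if (!child) throw new Error(`missing link for ${childPath}`)
        size += child.size
        // The child link is no longer needed once the parent has consumed it.
        links.delete(childPath)
      }
    }
    links.set(path, { name: dir.name, size })
  }

  const rootLink = links.get('')
  if (!rootLink) throw new Error('root directory was not finalized')
  return rootLink
}
```

Because the traversal state lives in an array rather than on the call stack, the depth of the directory tree no longer bounds how far the traversal can go. Keying the map by full path mirrors the path-to-link mapping described above, though keying by object identity would use less memory on very deep trees.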

Resolves: storacha/w3cli#215

fforbeck self-assigned this Apr 14, 2025

Successfully merging this pull request may close these issues.

Error uploading folder with 100K files
