alanshaw reviewed May 1, 2025
PR Summary: Fix Stack Overflow in Large Dataset Processing
Context
This PR addresses a series of stack overflow issues encountered when processing large datasets (100K+ files). Due to the difficulty in simulating the exact conditions that caused the "Maximum call stack size exceeded" error, I took a trial-and-error approach, working closely with the user to validate each fix. I focused on improving the code paths that had the highest probability of causing a stack overflow, based on the error patterns and user feedback.
The fixes were implemented and tested incrementally:
1. UnixFS Directory Processing
Problem
UnixFSDirectoryBuilder in unixfs.js was using recursive directory building
finalize operation
Solution
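The details of the change are not reproduced here. As a rough illustration only, one common way to remove deep recursion from a directory finalize step is a post-order walk driven by an explicit stack. The sketch below assumes a hypothetical in-memory tree (makeDir, files, dirs, and finalizeIteratively are illustrative names, not the API of unixfs.js) and simply sums file sizes bottom-up.

```javascript
// Hypothetical in-memory directory node; not the structure used in unixfs.js.
function makeDir(name) {
  return { name, files: [], dirs: [] }
}

// Post-order traversal with an explicit stack: a directory is finalized
// only after all of its subdirectories have been finalized.
function finalizeIteratively(root) {
  const sizes = new Map()
  const stack = [{ dir: root, visited: false }]
  while (stack.length > 0) {
    const frame = stack.pop()
    if (!frame.visited) {
      // First visit: schedule a revisit, then schedule the children.
      stack.push({ dir: frame.dir, visited: true })
      for (const child of frame.dir.dirs) {
        stack.push({ dir: child, visited: false })
      }
    } else {
      // Second visit: every child already has an entry in `sizes`.
      let total = frame.dir.files.reduce((sum, f) => sum + f.size, 0)
      for (const child of frame.dir.dirs) total += sizes.get(child)
      sizes.set(frame.dir, total)
    }
  }
  return sizes.get(root)
}

// Build a chain nested far deeper than a recursive walk could handle.
const root = makeDir('root')
let current = root
for (let i = 0; i < 200_000; i++) {
  const next = makeDir(`d${i}`)
  next.files.push({ name: 'f', size: 1 })
  current.dirs.push(next)
  current = next
}
const totalSize = finalizeIteratively(root)
console.log(totalSize) // 200000
```

The pending work lives on the heap rather than the call stack, so nesting depth is bounded by available memory instead of the engine's stack limit.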
Tests
2. ShardedDAGIndex Archive Process
Problem
The archive function in sharded-dag-index.js was processing all shards and slices at once
Solution
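The change itself is not shown here. As a hedged sketch of the general idea, the work can be split into fixed-size chunks so that no single call handles the whole dataset. One known hazard with "all at once" code in JavaScript is spreading a very large array into a function call, which can throw a RangeError in some engines. Whether that was the trigger in this case is an assumption. The names chunks and collectSlices, and the shard shape, are hypothetical.

```javascript
// Yield consecutive slices of `items`, each at most `size` elements long.
function* chunks(items, size) {
  for (let i = 0; i < items.length; i += size) {
    yield items.slice(i, i + size)
  }
}

// Hypothetical stand-in for the per-shard work done while archiving.
function collectSlices(shards, chunkSize = 1000) {
  const out = []
  for (const shard of shards) {
    for (const group of chunks(shard.slices, chunkSize)) {
      // Spreading a bounded group keeps the argument count small;
      // out.push(...shard.slices) may exceed engine limits on huge arrays.
      out.push(...group)
    }
  }
  return out
}

// Example with a single shard holding 250,000 slices.
const shards = [{ slices: Array.from({ length: 250_000 }, (_, i) => i) }]
const all = collectSlices(shards)
console.log(all.length) // 250000
```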
Tests
3. Index Add Process
Problem
The add function in upload-api/src/index/add.js was processing all shard allocations concurrently
Solution
Promise.all within each batch for concurrent processing
Tests
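The batching described in the Solution for the index add process (Promise.all within each batch) can be sketched roughly as follows. This is a minimal illustration, not the code in add.js: processInBatches, allocateShard, and the batch size are hypothetical.

```javascript
// Run `worker` over `items` in sequential batches. Items inside a batch
// run concurrently via Promise.all; the next batch starts only after the
// current one has settled, so at most `batchSize` calls are in flight.
async function processInBatches(items, batchSize, worker) {
  const results = []
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize)
    const batchResults = await Promise.all(batch.map((item) => worker(item)))
    for (const r of batchResults) results.push(r)
  }
  return results
}

// Hypothetical worker standing in for a single shard allocation.
async function allocateShard(shard) {
  return { shard, ok: true }
}

// Usage: allocate five shards, two at a time.
processInBatches(['a', 'b', 'c', 'd', 'e'], 2, allocateShard).then((res) => {
  console.log(res.length) // 5
})
```

Because Promise.all rejects as soon as any promise in the batch rejects, a single failed allocation stops later batches from starting. A caller that needs partial results could use Promise.allSettled instead.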