
Feat blossom stream file from disk#341

Merged
1-leo merged 23 commits into master from feat-blossom-stream-file-from-disk on Feb 23, 2026

Conversation

@1-leo
Contributor

@1-leo 1-leo commented Dec 21, 2025

fixes: #211
depends on: #332 (merge first)

This PR introduces file streaming so large files are not loaded entirely into memory.
It's non-breaking: old method signatures have not been altered.
Both the files use case and the blossom use case gain the new streaming methods.

  • file streaming
  • progress report (stream of progress values)
  • mirrorToServers()
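The streaming API might be used along these lines. A minimal sketch: the method name `uploadFromFile` and the `BlobUploadProgress` shape are inferred from this PR's description and changed-files list, and a simulated stream stands in for the real upload.

```dart
// Sketch only: a stub stands in for the real blossom use case, and the
// progress entity is reduced to the single field this example needs.
class BlobUploadProgress {
  final double progress; // 0.0 - 1.0
  const BlobUploadProgress(this.progress);
}

// Hypothetical stand-in for the PR's streaming upload: the real method
// reads the file from disk in chunks instead of loading it into memory,
// emitting a progress value after each chunk is sent.
Stream<BlobUploadProgress> uploadFromFile(String path) async* {
  for (var chunk = 0; chunk <= 4; chunk++) {
    yield BlobUploadProgress(chunk / 4);
  }
}

Future<void> main() async {
  await for (final p in uploadFromFile('big.bin')) {
    print('upload: ${(p.progress * 100).round()}%'); // 0% .. 100%
  }
}
```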

@1-leo 1-leo self-assigned this Dec 21, 2025
@1-leo 1-leo added the enhancement New feature or request label Dec 21, 2025
@1-leo 1-leo added this to ndk-dev Dec 21, 2025
@1-leo 1-leo moved this to In Progress in ndk-dev Dec 21, 2025
@codecov

codecov bot commented Jan 23, 2026

Codecov Report

❌ Patch coverage is 84.59215% with 51 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.90%. Comparing base (9c82ef7) to head (3ed54c3).
⚠️ Report is 24 commits behind head on master.

Files with missing lines Patch % Lines
.../data_layer/repositories/blossom/blossom_impl.dart 85.93% 18 Missing ⚠️
.../ndk/lib/data_layer/data_sources/http_request.dart 72.50% 11 Missing ⚠️
packages/ndk/lib/data_layer/io/file_io_native.dart 76.74% 10 Missing ⚠️
...s/ndk/lib/domain_layer/usecases/files/blossom.dart 87.30% 8 Missing ⚠️
...ib/domain_layer/entities/blob_upload_progress.dart 82.35% 3 Missing ⚠️
.../lib/domain_layer/entities/file_hash_progress.dart 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #341      +/-   ##
==========================================
+ Coverage   75.37%   75.90%   +0.53%     
==========================================
  Files         148      152       +4     
  Lines        5908     6201     +293     
==========================================
+ Hits         4453     4707     +254     
- Misses       1455     1494      +39     

☔ View full report in Codecov by Sentry.

@1-leo 1-leo requested review from frnandu and nogringo January 24, 2026 10:51
@1-leo 1-leo marked this pull request as ready for review January 24, 2026 11:35
@nogringo
Collaborator

Is it possible to cancel an upload?

@1-leo
Contributor Author

1-leo commented Jan 24, 2026

Is it possible to cancel an upload?

For uploadFromFile, yes: you can cancel the subscription to the returned stream with .cancel().
For the in-memory method, no.
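The cancellation path, sketched under the assumption that listening to the progress stream yields a StreamSubscription and cancelling it aborts the remaining work; a periodic stream stands in for the real upload:

```dart
import 'dart:async';

// Stand-in for the progress stream returned by uploadFromFile.
Stream<int> fakeUpload() =>
    Stream<int>.periodic(const Duration(milliseconds: 5), (i) => i).take(100);

Future<void> main() async {
  var received = 0;
  final StreamSubscription<int> sub = fakeUpload().listen((_) => received++);

  await Future<void>.delayed(const Duration(milliseconds: 30));
  await sub.cancel(); // aborts the upload mid-stream

  final seen = received;
  await Future<void>.delayed(const Duration(milliseconds: 30));
  assert(received == seen); // no further events arrive after cancel
  print('cancelled after $seen events');
}
```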

@1-leo 1-leo requested a review from nogringo January 28, 2026 09:27
@1-leo
Contributor Author

1-leo commented Jan 28, 2026

  • add to demo app

Collaborator

@frnandu frnandu left a comment


nice!

Collaborator

@nogringo nogringo left a comment


Currently, the file is read twice. We could optimize it to read the file only once when the server does not require auth.

UploadFromFile flow

verifyHash = false

  1. Send bytes to the server
    • If server requires auth:
      1. Calculate hash locally
      2. Create the auth event
      3. Resend bytes
  2. Server sends the blob descriptor

verifyHash = true

  1. Start reading file (bytes are used for both hash + upload in parallel)
    • If server asks for auth:
      1. Continue calculating hash, stop uploading
      2. Once hash is calculated, create the auth event
      3. Resend bytes to the server
  2. Both hash and blob descriptor are done
  3. Verify blob descriptor hash matches local hash

@nogringo
Collaborator

nogringo commented Feb 5, 2026

I had this idea for optimizing the flow in some cases, but IMO it adds too much complexity, so the idea is here but we don't need to implement it.

@nogringo
Collaborator

nogringo commented Feb 5, 2026

The implementation that we need is BUD-06
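For context, BUD-06 is the Blossom "upload requirements" endpoint: the client sends HEAD /upload with headers describing the blob so the server can reject it (auth needed, too large, wrong type) before any bytes are streamed. A sketch of building those preflight headers; the header names follow the BUD-06 draft, and the actual request wiring is omitted:

```dart
// Builds the BUD-06 preflight headers for HEAD /upload.
// Illustrative only; nothing here is the merged ndk API.
Map<String, String> bud06Headers({
  required String sha256,
  required int contentLength,
  required String contentType,
}) =>
    {
      'X-SHA-256': sha256,
      'X-Content-Length': '$contentLength',
      'X-Content-Type': contentType,
    };

void main() {
  final headers = bud06Headers(
    sha256: '<sha256-of-file>', // computed before the preflight
    contentLength: 1048576,
    contentType: 'application/octet-stream',
  );
  headers.forEach((k, v) => print('$k: $v'));
}
```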

@nogringo
Collaborator

nogringo commented Feb 5, 2026

Currently, the progress only reflects the upload phase. Hash calculation and mirroring phases have no progress reporting.

Report progress for each phase separately:

class BlobUploadProgress {
  final UploadPhase phase;    // hashing | uploading | mirroring
  final double progress;      // 0.0 - 1.0 for current phase

  // For mirroring phase
  final int mirrorsTotal;
  final int mirrorsCompleted;
}

Example flow

Phase: hashing    → progress: 0% ... 50% ... 100%                            
Phase: uploading  → progress: 0% ... 50% ... 100%                            
Phase: mirroring  → progress: 0/3 ... 2/3 ... 3/3                            

This lets developers show accurate progress for each phase instead of the current behavior where progress stays at 0% during hashing.
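A consumer of the proposed shape could render each phase like this. Sketch only: the enum values and field names are taken from the proposal above, not from the merged API.

```dart
enum UploadPhase { hashing, uploading, mirroring }

// Mirror of the proposed BlobUploadProgress shape, with a constructor added.
class BlobUploadProgress {
  final UploadPhase phase;
  final double progress;      // 0.0 - 1.0 for the current phase
  final int mirrorsTotal;     // mirroring phase only
  final int mirrorsCompleted; // mirroring phase only
  const BlobUploadProgress(this.phase, this.progress,
      {this.mirrorsTotal = 0, this.mirrorsCompleted = 0});
}

// Renders hashing/uploading as a percentage and mirroring as a count.
String describe(BlobUploadProgress p) {
  if (p.phase == UploadPhase.mirroring) {
    return 'mirroring: ${p.mirrorsCompleted}/${p.mirrorsTotal}';
  }
  return '${p.phase.name}: ${(p.progress * 100).round()}%';
}

void main() {
  print(describe(const BlobUploadProgress(UploadPhase.hashing, 0.5)));
  // → hashing: 50%
  print(describe(const BlobUploadProgress(UploadPhase.mirroring, 0.0,
      mirrorsTotal: 3, mirrorsCompleted: 2)));
  // → mirroring: 2/3
}
```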

@1-leo
Contributor Author

1-leo commented Feb 7, 2026

Start reading file (bytes are used for both hash + upload in parallel)

Currently, the file is read twice. We can optimize it to read it once if the server does not require auth.


The described verifyHash = true flow won't work in parallel because we need the hash before uploading, and reading the file once would require storing it in memory.
It could work if the developer provides the hash, but when is this the case?

@1-leo 1-leo requested a review from nogringo February 18, 2026 09:54
Collaborator

@nogringo nogringo left a comment


BlobUploadProgress and UploadPhase are not exported from the public API.

@1-leo 1-leo requested a review from nogringo February 23, 2026 09:59
@1-leo 1-leo merged commit 76981e7 into master Feb 23, 2026
7 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in ndk-dev Feb 23, 2026

Labels

enhancement New feature or request

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

blossom support large file uploads

3 participants