Skip to content

Conversation

@akronim26
Copy link

Fixes #378

@akronim26 akronim26 changed the title repot remaining regress quotas chore: report remaining regress quotas Jan 24, 2026
@bajtos bajtos changed the title chore: report remaining regress quotas feat: report remaining regress quotas Jan 27, 2026
Copy link
Contributor

@bajtos bajtos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution; this is a good start!

Can you please revert the changes made in all worker-configuration.d.ts files? They are not relevant to this pull request and make the diff difficult to review. (Feel free to open a different pull request to update those .d.ts files.)

Please add some tests to verify your implementation.

We need to discuss how to handle the following use case:

  • The client requests (walletAddress,pieceCid) that's in our system, but where no dataset has sufficient egress quota.
  • FilBeam should respond with 429 Too Many Requests, and response headers should indicate the remaining CDN & cache-miss egress quotas.

If we think that's out of the scope of this pull request, then that's fine, and let's open a follow-up pull request.

Comment on lines 206 to 214
const contentLengthHeader =
retrievalResult.response.headers.get('content-length')
const estimatedEgress = contentLengthHeader
? Number.parseInt(contentLengthHeader, 10) || 0
: 0
const remainingCdn =
retrievalCandidate.cdnEgressQuota - BigInt(estimatedEgress)
const remainingCacheMiss =
retrievalCandidate.cacheMissEgressQuota - BigInt(estimatedEgress)
Copy link
Contributor

@bajtos bajtos Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about relying on content-length. It's not guaranteed that the response will have this header. I am concerned that the remaining egress reported by FilBeam will be unpredictable.

Also, there may be other requests in progress that will further reduce the remaining egress quotas - I think it's impossible to give the client a precise value.

I propose simplifying the report and reporting the remaining egress quota as it was at the start of the response, at the time sent back the headers.

@pyropy @juliangruber thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about relying on content-length. It's not guaranteed that the response will have this header. I am concerned that the remaining egress reported by FilBeam will be unpredictable.

@bajtos You once mentioned that content length is encoded in CID V2, would it be possible to use that instead of the content-length headers?

Also, there may be other requests in progress that will further reduce the remaining egress quotas - I think it's impossible to give the client a precise value.

That's true, we should document that this may report an imprecise amount back to the user.

I propose simplifying the report and reporting the remaining egress quota as it was at the start of the response, at the time sent back the headers.

That might be a good start. I think these headers might be used to raise alerts and trigger top-ups (either manual or automated) and not for precise accounting so that could work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about relying on content-length. It's not guaranteed that the response will have this header. I am concerned that the remaining egress reported by FilBeam will be unpredictable.

@bajtos You once mentioned that content length is encoded in CID V2, would it be possible to use that instead of the content-length headers?

Yes, we can extract the piece size from the CID, but this will not work for range requests (e.g. when downloading a 1MB chunk of a video file, as browsers typically do).

Copy link
Member

@juliangruber juliangruber Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bajtos +1 to simplifying the report, first get it simple and correct, we can improve it later

@akronim26
Copy link
Author

Please add some tests to verify your implementation.

I am following the thread regarding content-length vs. the CID-based approach. I haven't finalized this part of the code or the tests yet because I wanted to make sure that I am aligned with the preferred approach. imo, the proposal to report the quota as it was at the start of the response seems like a good approach, especially given the range request issues @bajtos mentioned. Should I go ahead and implement that simplified version, or would you prefer I wait until you have reached a final decision?

@juliangruber
Copy link
Member

@akronim26 please go ahead with the simplified version 👍

@akronim26 akronim26 requested review from bajtos and pyropy January 29, 2026 14:31
Copy link
Contributor

@pyropy pyropy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove remaining egress validation as that duplicates logic in getRetrievalCandidatesAndValidatePayer.

@akronim26 akronim26 requested a review from pyropy January 30, 2026 07:44
)
response.headers.set(
'FB-Cdn-Egress-Remaining',
String(retrievalCandidate.cdnEgressQuota),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we confident it's good enough to respond with the quota for this one retrieval candidate? This means if you have stored with multiple SPs, you will get different responses randomly.

What about instead summing up the quotas of all retrieval candidates, since that's effectively how much you have left?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, got it. There can be multiple retrieval candidates, so we should sum up the quotas from all of them. I'll make the change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose returning two sets of headers: one with the remaining quotas for the selected data-set (as indicated in the X-Data-Set-ID header), and another for the total remaining quota.

If we feel the per-data-set quotas are not needed, then I am okay to not implement such response headers.

Either way, let's rename the headers returning the sum of all quotas to clearly indicate that the value is a sum across all datasets.

@juliangruber juliangruber changed the title feat: report remaining regress quotas feat: report remaining egress quotas Feb 2, 2026
Copy link
Contributor

@bajtos bajtos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are making great progress! 👏🏻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Report remaining egress quotas in response headers

4 participants