Skip to content
This repository was archived by the owner on Mar 12, 2026. It is now read-only.

Add RFC for time series gem downloads#48

Open
segiddins wants to merge 3 commits into
rubygems:masterfrom
segiddins:segiddins/time-series-gem-downloads
Open

Add RFC for time series gem downloads#48
segiddins wants to merge 3 commits into
rubygems:masterfrom
segiddins:segiddins/time-series-gem-downloads

Conversation

@segiddins
Copy link
Copy Markdown
Contributor

@indirect would appreciate your help rounding this off

@segiddins segiddins requested a review from indirect March 27, 2023 14:25
@indirect
Copy link
Copy Markdown

indirect commented Apr 4, 2023

@segiddins this looks good to me, I think. my only remaining question is how we will ensure the main app doesn't break if e.g. timescale disappears. for the work outlined in this RFC, I think the only thing that might break is the background job that writes to timescale? if so, that means it's mainly a question for the future work that will display download counts and graphs.

@segiddins
Copy link
Copy Markdown
Contributor Author

So my thought there is, we can have a cron job that copies all time download numbers to the main postgres database on some cadence?

@segiddins segiddins marked this pull request as ready for review April 4, 2023 05:27
@indirect
Copy link
Copy Markdown

indirect commented Apr 4, 2023

Oh yeah, that seems good. That plus a circuit-breaker type thing on the requests to the timescaledb so puma can continue to serve other rails requests sounds like it should work.

@segiddins
Copy link
Copy Markdown
Contributor Author

That plus a circuit-breaker type thing on the requests to the timescaledb so puma can continue to serve other rails requests sounds like it should work.

I'm not quite sure what you mean by that? My thinking was we'd use stimulus to show download counts on pages, so as to be able to cache download numbers separately, and that would give us a natural place to keep reads from timescale off the critical path

@indirect
Copy link
Copy Markdown

indirect commented Apr 5, 2023

That's pretty much what I was thinking--something out of the critical path to show the numbers, something out of the critical path to write into timescale. Thanks for writing this up!

- Adding another datastore complicates contributing to the application
- Another datastore means we can't natively write joins against the download data
- Timescale cloud costs money
- There is no terraform provider for timescale cloud (yet) requiring us to write one or manually provision
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- Another datastore means we can't natively write joins against the download data
- Timescale cloud costs money
- There is no terraform provider for timescale cloud (yet) requiring us to write one or manually provision
- This is still high cardinality data, and there's no knowing for sure how well timescale will compress it
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can help testing and benchmarking it. Is there any data available to seed the database? Or if anyone can share any numbers to give me some insights about how big is the cardinality.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just found the weekly backup ❤️

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants