feat(store): store popularity backend (GitHub stars) #936
📝 Walkthrough
A new …
Changes: Store Popularity Feature
Sequence Diagram(s)
sequenceDiagram
participant AppLifespan as App Lifespan
participant Warmer as warm_popularity_cache
participant Cache as _star_cache
participant GitHub as GitHub API
AppLifespan->>Warmer: warm repos from agent homepages (every 10 min)
Warmer->>Cache: check _has_fresh_entry(repo)
alt stale or missing
Warmer->>GitHub: GET /repos/{owner}/{repo} via fetch_stars
GitHub-->>Warmer: stargazers_count / error
Warmer->>Cache: write stars + TTL expiry
Warmer->>Cache: _persist_cache() atomic write
end
participant Client as HTTP Client
participant CatalogRoute as list_catalog / list_popularity
Client->>CatalogRoute: GET /api/store/catalog
CatalogRoute->>Cache: _popularity_by_app_id(apps) — cache-only reads
Cache-->>CatalogRoute: {app_id: {repo, github_stars, score}}
CatalogRoute-->>Client: items with repo, stars, popularity fields
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
| _star_cache[repo] = (time.time() + ttl, stars) | ||
| _persist_cache() |
💡 Performance: Full cache file rewritten + blocking I/O on every fetch_stars
fetch_stars calls _persist_cache() after every single repo lookup (store_popularity.py:213). _persist_cache serializes the entire _star_cache dict and does a synchronous write_text (store_popularity.py:303-310). During a warm pass over the ~137 GitHub homepages this rewrites the whole JSON file ~137 times, and because write_text is blocking I/O executed on the asyncio event loop, each write briefly stalls the loop (and thus any in-flight request handlers). Impact is small at current catalog size but grows with the catalog and is wasteful.
Suggested fix: persist once at the end of a warm pass instead of per fetch (e.g. call _persist_cache() from warm_popularity_cache after asyncio.gather), and/or offload the write via asyncio.to_thread so it does not block the loop.
Prevents popularity-cache corruption if the process crashes mid-write. Writes to a sibling .tmp then atomically replaces the target.
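The write-then-replace pattern described here can be sketched as follows. The function name `persist_atomically` and the JSON payload are assumptions for illustration; only the sibling `.tmp` file and the atomic replace come from the commit message.

```python
import json
import os
import tempfile
from pathlib import Path


def persist_atomically(target: Path, payload: dict) -> None:
    """Write payload to a sibling .tmp file, then swap it into place."""
    tmp = target.with_name(target.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        fh.write(json.dumps(payload))
        fh.flush()
        os.fsync(fh.fileno())  # make sure the bytes reach disk before the swap
    # os.replace overwrites the target in one step; on POSIX this is atomic
    # when both paths are on the same filesystem.
    os.replace(tmp, target)
```

A reader of the target file then sees either the old contents or the new contents, never a half-written file.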
Code Review 👍 Approved with suggestions (5 resolved / 6 findings)
Implements a non-blocking, cache-aware GitHub popularity backend for the store catalog. One minor issue remains: the cache file is rewritten with blocking I/O on every individual star fetch.
💡 Performance: Full cache file rewritten + blocking I/O on every fetch_stars
📄 tinyagentos/store_popularity.py:212-213, 303-312, 237-251
Suggested fix: persist once at the end of a warm pass instead of per fetch.
✅ 5 resolved
✅ Bug: Rate-limit (403/429) failures cached for full 6h TTL
✅ Performance: Unbounded concurrent GitHub fetches exhaust rate limit on cold cache
✅ Edge Case: parse_repo false-positives on github.com subdomains/paths
✅ Performance: 8s GitHub timeout can stall catalog list endpoint
✅ Edge Case: Cache persist is non-atomic; crash mid-write corrupts file
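For the `parse_repo` edge case listed above, a strict parser might look like the sketch below. This is not the PR's implementation: the accepted hosts, the rejection of deeper paths, and the `.git` suffix handling are all assumptions chosen to illustrate how subdomain and path false positives can be avoided.

```python
from typing import Optional
from urllib.parse import urlparse


def parse_repo(homepage: str) -> Optional[str]:
    """Return "owner/repo" for a plain github.com repo URL, else None."""
    parsed = urlparse(homepage.strip())
    if parsed.scheme not in ("http", "https"):
        return None
    host = (parsed.hostname or "").lower()
    # Exact host match: gist.github.com or github.com.evil.test do not qualify.
    if host not in ("github.com", "www.github.com"):
        return None
    parts = [p for p in parsed.path.split("/") if p]
    # Assumption: only a bare owner/repo path counts; /tree/main etc. is rejected.
    if len(parts) != 2:
        return None
    owner, repo = parts
    if repo.endswith(".git"):
        repo = repo[:-4]
    return f"{owner}/{repo}" if owner and repo else None
```

This sketch does not filter GitHub's reserved first segments (for example `orgs` or `settings`), so a real implementation would likely need an extra check there.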
| if (r := store_popularity.parse_repo(getattr(a, "homepage", "") or "")) | ||
| }) | ||
| if repos: | ||
| await store_popularity.warm_popularity_cache(repos) |
WARNING: First popularity warm pass can block app startup
warm_popularity_cache awaits all uncached repos with an 8s per-request timeout. On a cold cache this runs before _startup_complete is set, so slow or unreachable GitHub can keep the server in startup/503 for minutes instead of warming in the background.
Suggested change:
- await store_popularity.warm_popularity_cache(repos)
+ await _asyncio.wait_for(store_popularity.warm_popularity_cache(repos), timeout=30)
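Two ways to keep a slow warm pass from holding up startup, sketched with a stand-in warmer. The `delay` parameter and the helper names `bounded_warm` and `start_without_waiting` are hypothetical; only `warm_popularity_cache` and the `asyncio.wait_for` idea come from the review.

```python
import asyncio
from typing import List


async def warm_popularity_cache(repos: List[str], delay: float = 0.0) -> None:
    """Stand-in warmer; delay simulates a slow or unreachable GitHub."""
    await asyncio.sleep(delay)


async def bounded_warm(repos: List[str], delay: float, timeout: float) -> bool:
    """Option 1: give the warm pass a hard time budget, as the review suggests."""
    try:
        await asyncio.wait_for(warm_popularity_cache(repos, delay), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        # The warm pass is cancelled; startup continues with a cold cache.
        return False


async def start_without_waiting(repos: List[str], delay: float) -> "asyncio.Task[None]":
    """Option 2: schedule the warm pass and return immediately."""
    # The caller should keep a reference to the task so it is not
    # garbage-collected before it finishes.
    return asyncio.create_task(warm_popularity_cache(repos, delay))
```

Option 1 caps the startup delay at the timeout. Option 2 removes the delay entirely, at the cost of serving the first requests from whatever the cache already holds.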
| if time.time() < _rate_limited_until: | ||
| return # a sibling fetch hit the limit; stop spending budget | ||
| async with sem: | ||
| await fetch_stars(repo, client=client) |
SUGGESTION: Re-check the rate-limit gate after acquiring the semaphore
A task can pass the pre-semaphore _rate_limited_until check, wait for the semaphore, then call fetch_stars after sibling tasks have already armed the rate-limit back-off gate. Re-check inside the semaphore before fetching so the warmer stops spending GitHub budget promptly.
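A small sketch of the double check, assuming the module-level `_rate_limited_until` gate and a semaphore as in the diff above. The stand-in `fetch_stars`, the `calls` list, and the helper names `_warm_one` and `warm` are hypothetical and only make the behavior observable.

```python
import asyncio
import time

_rate_limited_until = 0.0
calls = []  # records which repos were actually fetched (sketch only)


async def fetch_stars(repo: str) -> None:
    """Stand-in fetch: every call 'hits' the limit and arms the gate."""
    global _rate_limited_until
    calls.append(repo)
    await asyncio.sleep(0.01)  # placeholder for the HTTP round trip
    _rate_limited_until = time.time() + 60


async def _warm_one(repo: str, sem: asyncio.Semaphore) -> None:
    if time.time() < _rate_limited_until:
        return  # gate already armed before queueing
    async with sem:
        # Re-check: a sibling may have armed the gate while this task
        # was waiting for the semaphore.
        if time.time() < _rate_limited_until:
            return
        await fetch_stars(repo)


async def warm(repos, concurrency: int = 1) -> None:
    sem = asyncio.Semaphore(concurrency)
    await asyncio.gather(*(_warm_one(r, sem) for r in repos))
```

Without the inner check, every task that was already queued on the semaphore would still issue its request after the first rate-limit response.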
Code Review Summary
Status: 2 Issues Found | Recommendation: Address before merge
Other Observations (not in diff): None.
Files Reviewed: 5 files
Reviewed by nex-n2-pro:free · 425,463 tokens
Adds a store popularity backend sourced from GitHub stars today and structured to accept real install telemetry later (#15).
What it does:
Telemetry-ready shape (forward-compatible, no breaking change when #15 lands):
installs is null today; score is derived from whatever signals exist (just stars now, stars plus weighted installs later).
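One possible reading of that shape, as a sketch. The field names `repo`, `github_stars`, `installs`, and `score` appear in this PR; the dataclass, the `INSTALL_WEIGHT` value, the additive formula, and the choice of `0.0` when no signal exists are assumptions, not the PR's actual scoring.

```python
from dataclasses import asdict, dataclass
from typing import Optional

INSTALL_WEIGHT = 10.0  # hypothetical weight; the PR does not state one


@dataclass
class Popularity:
    repo: Optional[str]
    github_stars: Optional[int]
    installs: Optional[int] = None  # null until install telemetry exists
    score: float = 0.0


def compute_score(stars: Optional[int], installs: Optional[int]) -> float:
    """Combine whichever signals are present; missing ones contribute nothing."""
    score = float(stars or 0)
    if installs is not None:
        score += INSTALL_WEIGHT * installs
    return score


def build_entry(repo: Optional[str], stars: Optional[int],
                installs: Optional[int] = None) -> dict:
    return asdict(Popularity(repo, stars, installs, compute_score(stars, installs)))
```

Because `installs` is already a key in the response, populating it later changes values but not the response schema.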
Graceful degradation: any GitHub failure (rate-limit 403/429, 404, network error, bad JSON) degrades that entry to github_stars=null and never raises. Negative results are cached so a known-bad repo is not retried on every request. Entries whose homepage is not a github.com/owner/repo URL get a popularity shape with github_stars=null; no stars are fabricated. 137 of 266 catalog manifests already carry a github.com homepage, so they get real star counts; the rest need a repo homepage added to their manifest to surface stars.
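The degrade-and-cache behavior can be sketched with an injectable fetcher so it runs without network access. The helper `get_stars`, the `fetch` callback, and the shorter `NEGATIVE_TTL` are assumptions; the 6h figure is the TTL mentioned in the review thread.

```python
import time
from typing import Callable, Dict, Optional, Tuple

POSITIVE_TTL = 6 * 3600  # 6h, the TTL mentioned in the review thread
NEGATIVE_TTL = 15 * 60   # hypothetical shorter TTL for failures
_cache: Dict[str, Tuple[float, Optional[int]]] = {}


def get_stars(repo: str, fetch: Callable[[str], int]) -> Optional[int]:
    """Return cached stars, or fetch once; any failure becomes a cached None."""
    now = time.time()
    hit = _cache.get(repo)
    if hit is not None and now < hit[0]:
        return hit[1]  # fresh entry, including a cached negative result
    try:
        stars: Optional[int] = fetch(repo)
        ttl = POSITIVE_TTL
    except Exception:
        # Rate limit, 404, network error, bad JSON: degrade to None, never raise.
        stars, ttl = None, NEGATIVE_TTL
    _cache[repo] = (now + ttl, stars)
    return stars
```

Caching the `None` is what keeps a known-bad repo from being retried on every request; a shorter negative TTL lets transient failures recover sooner.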
Frontend: the Store frontend already reads repo and stars off the catalog response (and tolerates their absence), so no frontend change was needed.
Tests: tests/test_store_popularity.py (unit, mocked httpx) and tests/routes/test_store_popularity_route.py (endpoint) assert a github homepage gets stars plus a score, a 404 and a rate-limit both yield null without raising, caching avoids a second call within the TTL, and a non-github homepage gets null without a call. All green; existing store route tests unaffected; create_app() ok.
Summary by CodeRabbit
/api/store/popularity endpoint to query popularity information by app ID with optional type filtering