⚡ Bolt: Optimized Blockchain Integrity and Admin Stats#516
Conversation
- Implemented O(1) blockchain integrity verification by adding 'previous_integrity_hash' column to Issue model. - Added 'blockchain_last_hash_cache' to ThreadSafeCache for high-performance issue creation. - Optimized admin statistics query to use a single aggregate scan with func.sum(case) instead of multiple counts. - Updated database migration logic to handle new columns and indexes efficiently. - Reduced database round-trips for core system monitoring.
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
✅ Deploy Preview for fixmybharat canceled.
|
🙏 Thank you for your contribution, @RohanExploit!PR Details:
Quality Checklist:
Review Process:
Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken. |
|
Caution Review failedPull request was closed or merged during review No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds a new nullable, indexed Changes
Sequence DiagramsequenceDiagram
participant Client
participant IssueRouter as Issue Router
participant Cache as Blockchain Cache
participant DB as Database
rect rgba(100, 150, 200, 0.5)
Note over Client,DB: Issue Creation Flow
Client->>IssueRouter: Create Issue
IssueRouter->>Cache: Get Last Hash
alt Cache Hit
Cache-->>IssueRouter: Return Cached Hash
else Cache Miss
IssueRouter->>DB: Query Most Recent Hash
DB-->>IssueRouter: Return Hash
IssueRouter->>Cache: Cache Hash
end
IssueRouter->>IssueRouter: Compute new integrity_hash using previous_hash
IssueRouter->>DB: Store Issue with previous_integrity_hash
DB-->>IssueRouter: Confirmed
IssueRouter->>Cache: Update Cache
Cache-->>IssueRouter: Acknowledged
IssueRouter-->>Client: Issue Created
end
rect rgba(150, 200, 100, 0.5)
Note over Client,DB: Blockchain Verification Flow
Client->>IssueRouter: Verify Blockchain Integrity
IssueRouter->>DB: Fetch Issue (including previous_integrity_hash)
DB-->>IssueRouter: Return Issue Record
alt Has previous_integrity_hash
IssueRouter->>IssueRouter: Recompute hash (O(1)) using previous_hash
IssueRouter-->>Client: Integrity Valid/Invalid
else Missing previous_integrity_hash
IssueRouter->>DB: Query Prior Records (legacy)
DB-->>IssueRouter: Return History
IssueRouter->>IssueRouter: Recompute full chain
IssueRouter-->>Client: Integrity Valid/Invalid
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
1 issue found across 5 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/routers/issues.py">
<violation number="1" location="backend/routers/issues.py:190">
P1: Do not update `blockchain_last_hash_cache` before the issue is committed; a failed DB save leaves a phantom tail hash and corrupts subsequent chain links.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest() | ||
|
|
||
| # Update cache for next report | ||
| blockchain_last_hash_cache.set(data=integrity_hash, key="last_hash") |
There was a problem hiding this comment.
P1: Do not update blockchain_last_hash_cache before the issue is committed; a failed DB save leaves a phantom tail hash and corrupts subsequent chain links.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/routers/issues.py, line 190:
<comment>Do not update `blockchain_last_hash_cache` before the issue is committed; a failed DB save leaves a phantom tail hash and corrupts subsequent chain links.</comment>
<file context>
@@ -172,16 +172,23 @@ async def create_issue(
integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest()
+ # Update cache for next report
+ blockchain_last_hash_cache.set(data=integrity_hash, key="last_hash")
+
# RAG Retrieval (New)
</file context>
There was a problem hiding this comment.
Pull request overview
This PR optimizes the backend’s “blockchain integrity” feature by persisting the previous hash on each Issue and adds a cache for the chain tail to reduce DB lookups during issue creation. It also consolidates /admin/stats into a single aggregate query to reduce DB round-trips.
Changes:
- Add
Issue.previous_integrity_hashand populate it during issue creation; update integrity verification to use it (with legacy fallback). - Introduce/use
blockchain_last_hash_cacheto avoid querying the last hash on every issue creation. - Replace multiple
/admin/statscount()queries with one aggregate query (count+sum(case)), and add a supporting DB index migration.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| backend/routers/issues.py | Uses cached last-hash + stores previous_integrity_hash; verification reads stored previous hash with fallback. |
| backend/routers/admin.py | Consolidates stats into a single aggregate query. |
| backend/models.py | Adds previous_integrity_hash column (indexed) to Issue. |
| backend/init_db.py | Adds migration/index creation for previous_integrity_hash. |
| .jules/bolt.md | Documents the optimization learnings and intended approach. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| # Update cache for next report | ||
| blockchain_last_hash_cache.set(data=integrity_hash, key="last_hash") | ||
|
|
There was a problem hiding this comment.
blockchain_last_hash_cache is updated to the newly computed integrity_hash before the issue is committed. If save_issue_db/db.commit fails, the cache will still advance, and subsequent issues may chain off a hash that was never persisted. Update/advance the cache only after a successful commit, and invalidate/refresh it on exceptions.
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
There was a problem hiding this comment.
I have refactored the code to update blockchain_last_hash_cache only after a successful database commit. If the commit fails, the cache is now invalidated to ensure the next request fetches the correct state from the database.
| # Performance Boost: Use thread-safe cache to eliminate DB query for last hash | ||
| prev_hash = blockchain_last_hash_cache.get("last_hash") | ||
| if prev_hash is None: | ||
| # Cache miss: Fetch only the last hash from DB | ||
| prev_issue = await run_in_threadpool( | ||
| lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first() |
There was a problem hiding this comment.
The get→compute→set sequence for "last_hash" isn’t atomic. Under concurrent create_issue requests, multiple workers/threads can read the same prev_hash and produce a forked chain (and with multiple Uvicorn/Gunicorn workers, each process has its own cache). Consider serializing hash generation with a lock or deriving prev_hash from the database within the same transaction to preserve a single chain.
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
There was a problem hiding this comment.
I've implemented atomic hash calculation and record persistence by using a single database transaction with with_for_update() on the tail record. This serializes concurrent creations and prevents chain forks, as requested.
| # Determine previous hash (use stored link or fallback for legacy records) | ||
| prev_hash = current_issue.previous_integrity_hash | ||
|
|
||
| prev_hash = prev_issue_hash[0] if prev_issue_hash and prev_issue_hash[0] else "" | ||
| if prev_hash is None: | ||
| # Fallback for legacy records created before O(1) optimization |
There was a problem hiding this comment.
verify_blockchain_integrity now trusts Issue.previous_integrity_hash (and only falls back to a DB lookup when it’s NULL). This can report "valid" even if previous_integrity_hash doesn’t match the actual predecessor record in the DB (e.g., due to concurrent creation/cache races). To preserve chain continuity, also fetch the DB predecessor hash (single additional query or a scalar subquery) and treat a mismatch as invalid.
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
There was a problem hiding this comment.
I've enhanced verify_blockchain_integrity to double-verify the previous_integrity_hash against the actual predecessor record in the database. This ensures that we don't blindly trust the stored link and maintains absolute chain continuity.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/routers/issues.py`:
- Around line 175-190: The read-compute-write on blockchain_last_hash_cache
(prev_hash = blockchain_last_hash_cache.get(...); compute integrity_hash;
blockchain_last_hash_cache.set(...)) is not atomic and allows concurrent forks;
instead, move the "read previous hash" into the same DB transaction that inserts
the new Issue using a SELECT ... FOR UPDATE (e.g. query =
db.query(Issue.integrity_hash).order_by(Issue.id.desc()).with_for_update()) so
the previous hash is locked, compute integrity_hash from that locked value,
insert the new Issue inside that transaction, then update
blockchain_last_hash_cache.set(...) after commit; use the same
run_in_threadpool/transaction helper you already use to perform the locked read
+ insert atomically.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 83ae62ca-6df2-40c7-887d-e5fb3acf4669
📒 Files selected for processing (5)
.jules/bolt.mdbackend/init_db.pybackend/models.pybackend/routers/admin.pybackend/routers/issues.py
| # Performance Boost: Use thread-safe cache to eliminate DB query for last hash | ||
| prev_hash = blockchain_last_hash_cache.get("last_hash") | ||
| if prev_hash is None: | ||
| # Cache miss: Fetch only the last hash from DB | ||
| prev_issue = await run_in_threadpool( | ||
| lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first() | ||
| ) | ||
| prev_hash = prev_issue[0] if prev_issue and prev_issue[0] else "" | ||
| blockchain_last_hash_cache.set(data=prev_hash, key="last_hash") | ||
|
|
||
| # Simple but effective SHA-256 chaining | ||
| # Simple but effective SHA-256 chaining | ||
| hash_content = f"{description}|{category}|{prev_hash}" | ||
| integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest() | ||
|
|
||
| # Update cache for next report | ||
| blockchain_last_hash_cache.set(data=integrity_hash, key="last_hash") |
There was a problem hiding this comment.
Race condition breaks blockchain chain integrity.
The read-compute-write sequence for blockchain_last_hash_cache is not atomic. Per backend/cache.py (lines 9-65), ThreadSafeCache only locks individual get() and set() operations, not the entire sequence.
Scenario:
- Request A calls
get("last_hash")→ returnshash_X - Request B calls
get("last_hash")→ returnshash_X(same value) - Request A computes
integrity_hash_Afromhash_X, callsset("last_hash", integrity_hash_A) - Request B computes
integrity_hash_Bfromhash_X, callsset("last_hash", integrity_hash_B)
Both issues now have previous_integrity_hash = hash_X, creating a fork in the chain and making subsequent verification unreliable.
🔒 Recommended fix: Use database-level sequencing
The simplest reliable fix is to query the previous hash from the database within the same transaction that inserts the new issue, leveraging DB-level locking:
try:
# Save to DB only if no nearby issues found or deduplication failed
if deduplication_info is None or not deduplication_info.has_nearby_issues:
- # Blockchain feature: calculate integrity hash for the report
- # Performance Boost: Use thread-safe cache to eliminate DB query for last hash
- prev_hash = blockchain_last_hash_cache.get("last_hash")
- if prev_hash is None:
- # Cache miss: Fetch only the last hash from DB
- prev_issue = await run_in_threadpool(
- lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first()
- )
- prev_hash = prev_issue[0] if prev_issue and prev_issue[0] else ""
- blockchain_last_hash_cache.set(data=prev_hash, key="last_hash")
-
- # Simple but effective SHA-256 chaining
- hash_content = f"{description}|{category}|{prev_hash}"
- integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest()
-
- # Update cache for next report
- blockchain_last_hash_cache.set(data=integrity_hash, key="last_hash")
+ # Blockchain feature: calculate integrity hash for the report
+ # Query previous hash within the same transaction for consistency
+ prev_issue = await run_in_threadpool(
+ lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).with_for_update().first()
+ )
+ prev_hash = prev_issue[0] if prev_issue and prev_issue[0] else ""
+
+ # Simple but effective SHA-256 chaining
+ hash_content = f"{description}|{category}|{prev_hash}"
+ integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest()Using with_for_update() serializes concurrent inserts, ensuring each new issue gets the correct previous hash. The cache can still be used for read-only verification scenarios.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Performance Boost: Use thread-safe cache to eliminate DB query for last hash | |
| prev_hash = blockchain_last_hash_cache.get("last_hash") | |
| if prev_hash is None: | |
| # Cache miss: Fetch only the last hash from DB | |
| prev_issue = await run_in_threadpool( | |
| lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).first() | |
| ) | |
| prev_hash = prev_issue[0] if prev_issue and prev_issue[0] else "" | |
| blockchain_last_hash_cache.set(data=prev_hash, key="last_hash") | |
| # Simple but effective SHA-256 chaining | |
| # Simple but effective SHA-256 chaining | |
| hash_content = f"{description}|{category}|{prev_hash}" | |
| integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest() | |
| # Update cache for next report | |
| blockchain_last_hash_cache.set(data=integrity_hash, key="last_hash") | |
| # Blockchain feature: calculate integrity hash for the report | |
| # Query previous hash within the same transaction for consistency | |
| prev_issue = await run_in_threadpool( | |
| lambda: db.query(Issue.integrity_hash).order_by(Issue.id.desc()).with_for_update().first() | |
| ) | |
| prev_hash = prev_issue[0] if prev_issue and prev_issue[0] else "" | |
| # Simple but effective SHA-256 chaining | |
| hash_content = f"{description}|{category}|{prev_hash}" | |
| integrity_hash = hashlib.sha256(hash_content.encode()).hexdigest() |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/routers/issues.py` around lines 175 - 190, The read-compute-write on
blockchain_last_hash_cache (prev_hash = blockchain_last_hash_cache.get(...);
compute integrity_hash; blockchain_last_hash_cache.set(...)) is not atomic and
allows concurrent forks; instead, move the "read previous hash" into the same DB
transaction that inserts the new Issue using a SELECT ... FOR UPDATE (e.g. query
= db.query(Issue.integrity_hash).order_by(Issue.id.desc()).with_for_update()) so
the previous hash is locked, compute integrity_hash from that locked value,
insert the new Issue inside that transaction, then update
blockchain_last_hash_cache.set(...) after commit; use the same
run_in_threadpool/transaction helper you already use to perform the locked read
+ insert atomically.
|
@RohanExploit I've opened a new pull request, #518, to work on those changes. Once the pull request is ready, I'll request review from you. |
|
@RohanExploit I've opened a new pull request, #519, to work on those changes. Once the pull request is ready, I'll request review from you. |
|
@RohanExploit I've opened a new pull request, #520, to work on those changes. Once the pull request is ready, I'll request review from you. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
@copilot open a new pull request to apply changes based on the comments in this thread |
|
@RohanExploit I've opened a new pull request, #521, to work on those changes. Once the pull request is ready, I'll request review from you. |
Implemented the requested changes based on the review feedback. The blockchain operations are now atomic and integrity verification is more robust against race conditions. |
This PR introduces two significant performance optimizations:
O(1) Blockchain Verification: By storing the$O(N)$ or secondary lookups during report creation and verification. This reduces latency on the critical report submission path.
previous_integrity_hashdirectly on each issue and utilizing a thread-safe cache for the tail of the chain, we eliminate the need forSingle-Pass Admin Stats: The
/admin/statsendpoint was previously performing multiple separatecount()queries. This has been unified into a single aggregate query usingfunc.countandfunc.sum(case), reducing database network round-trips and scan overhead.Verified with unit tests for blockchain integrity and issue creation.
PR created automatically by Jules for task 3885603136094984226 started by @RohanExploit
Summary by cubic
Optimizes blockchain integrity checks to O(1) and makes
/admin/statsa single-pass aggregate, cutting latency and DB round-trips. Adds a linkedprevious_integrity_hashand an in-memory last-hash cache for faster issue creation.Issue.previous_integrity_hash(indexed) for O(1) verification; endpoint uses it with a legacy fallback when null.blockchain_last_hash_cacheto avoid DB reads for the last hash; cache updated on write./admin/statsinto one aggregate query usingfunc.countandfunc.sum(case)to returntotal_users,admin_count, andactive_users.Written for commit f032349. Summary will update on new commits.
Summary by CodeRabbit