Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions .jules/bolt.md
Original file line number Diff line number Diff line change
Expand Up @@ -77,3 +77,7 @@
## 2026-04-20 - Async File I/O in Voice Submission
**Learning:** Saving audio recordings (up to 10MB) synchronously in a FastAPI async endpoint blocks the main event loop, significantly increasing tail latency for all concurrent users during high-traffic periods.
**Action:** Wrap blocking synchronous File I/O operations like `f.write()` in `run_in_threadpool` to offload them to a separate thread, keeping the event loop responsive for other requests.

## 2026-05-15 - Serialization Caching Bypass
**Learning:** Caching raw Python objects (like SQLAlchemy models or Pydantic instances) in a high-traffic API still incurs significant overhead because FastAPI/Pydantic must re-serialize the data on every request.
**Action:** Serialize data to a JSON string using `json.dumps()` BEFORE caching. On cache hits, return a raw `fastapi.Response(content=..., media_type="application/json")`. This bypasses the validation and serialization layer, resulting in significant performance gains (up to 50x in benchmarks).
3 changes: 3 additions & 0 deletions backend/cache.py
Original file line number Diff line number Diff line change
Expand Up @@ -185,3 +185,6 @@ def invalidate(self):
evidence_audit_last_hash_cache = ThreadSafeCache(ttl=3600, max_size=1)
closure_last_hash_cache = ThreadSafeCache(ttl=3600, max_size=1)
user_issues_cache = ThreadSafeCache(ttl=300, max_size=50) # 5 minutes TTL
grievance_list_cache = ThreadSafeCache(ttl=300, max_size=50)
escalation_stats_cache = ThreadSafeCache(ttl=300, max_size=10)
visit_stats_cache = ThreadSafeCache(ttl=300, max_size=10)
Comment on lines +188 to +190
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description claims correctness/performance were verified with backend/tests/test_cache_unit.py and backend/tests/test_cache_perf.py, but those test files don’t appear to exist in the repository. Either add the referenced tests (preferred) or update the PR description to point to the actual test coverage/benchmark artifacts used.

Copilot uses AI. Check for mistakes.
6 changes: 5 additions & 1 deletion backend/escalation_engine.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@
from backend.models import Grievance, Jurisdiction, EscalationAudit, GrievanceStatus, JurisdictionLevel, EscalationReason, SeverityLevel
from backend.database import SessionLocal
from backend.config import get_auth_config
from backend.cache import audit_last_hash_cache
from backend.cache import audit_last_hash_cache, grievance_list_cache, escalation_stats_cache
from backend.routing_service import RoutingService
from backend.sla_config_service import SLAConfigService

Expand Down Expand Up @@ -289,6 +289,10 @@ def _escalate_grievance(self, grievance: Grievance, reason: EscalationReason,
# Update cache for next audit AFTER successful DB commit
audit_last_hash_cache.set(data=integrity_hash, key="last_hash")

# Invalidate grievance caches
grievance_list_cache.clear()
escalation_stats_cache.clear()

return True

except Exception as e:
Expand Down
11 changes: 10 additions & 1 deletion backend/grievance_service.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@
from backend.routing_service import RoutingService
from backend.sla_config_service import SLAConfigService
from backend.escalation_engine import EscalationEngine
from backend.cache import grievance_last_hash_cache
from backend.cache import grievance_last_hash_cache, grievance_list_cache, escalation_stats_cache

class GrievanceService:
"""
Expand Down Expand Up @@ -133,6 +133,10 @@ def create_grievance(self, grievance_data: Dict[str, Any], db: Session = None) -
# Update cache after successful commit
grievance_last_hash_cache.set(data=integrity_hash, key="last_hash")

# Invalidate grievance caches
grievance_list_cache.clear()
escalation_stats_cache.clear()
Comment on lines +136 to +138
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cache invalidation for grievance_list_cache/escalation_stats_cache is added here, but there are other write paths that mutate grievance status (e.g., ClosureService resolves grievances and commits) that don’t clear these caches. That can leave /api/grievances and /api/escalation-stats stale for up to the TTL after closure actions. Consider invalidating these caches in those other commit paths as well (or centralizing invalidation in a single service layer).

Copilot uses AI. Check for mistakes.
Comment on lines 133 to +138
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟑 Minor

Cache-invalidation race: stale data can linger up to the full TTL (5 min).

The ordering here is correct (invalidate after commit), but there's a classic read-then-write race with the readers in backend/routers/grievances.py:

  1. Reader A enters get_grievances, sees a cache miss, runs the DB query and gets a pre-commit snapshot.
  2. Writer (this code) commits the new/updated grievance and calls grievance_list_cache.clear().
  3. Reader A now calls grievance_list_cache.set(...) with the stale payload.

The stale entry then survives for up to ttl=300s. Given this is a mutation path and the whole point of clearing is freshness, consider one of:

  • Record a monotonic "invalidation epoch" per cache and have readers CAS their set against the epoch captured at query start (discard set if epoch changed).
  • Or shorten the list-cache TTL to something like 30–60s so the worst-case staleness is bounded, given writes are relatively infrequent compared to reads.

Same pattern applies in backend/escalation_engine.py::_escalate_grievance and backend/routers/field_officer.py (officer_check_in, officer_check_out, verify_visit); fixing it in the cache layer would address all call sites at once.

πŸ€– Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/grievance_service.py` around lines 133 - 138, The cache invalidation
has a read-then-write race: readers in backend/routers/grievances.py
(get_grievances) can produce stale grievance_list_cache entries after a writer
clears the cache in backend/grievance_service.py (grievance_list_cache.clear())
because the reader sets the old result; fix by adding an invalidation epoch to
the cache layer: when you clear (or update) in grievance_service (and similarly
in escalation_engine::_escalate_grievance and field_officer handlers
officer_check_in/officer_check_out/verify_visit) increment a monotonic epoch
counter (e.g., grievance_list_epoch), have readers capture the epoch at query
start and only perform grievance_list_cache.set(...) if the current epoch still
equals the captured epoch (discard the set if it changed), which prevents stale
writes; alternatively, if you prefer a quicker mitigation, reduce the list-cache
TTL to 30–60s to bound staleness.

Comment on lines +137 to +138
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Post-commit cache clear errors can incorrectly return False after the status update has already been committed.

Prompt for AI agents
Check if this issue is valid β€” if so, understand the root cause and fix it. At backend/grievance_service.py, line 137:

<comment>Post-commit cache clear errors can incorrectly return `False` after the status update has already been committed.</comment>

<file context>
@@ -133,6 +133,10 @@ def create_grievance(self, grievance_data: Dict[str, Any], db: Session = None) -
             grievance_last_hash_cache.set(data=integrity_hash, key="last_hash")
 
+            # Invalidate grievance caches
+            grievance_list_cache.clear()
+            escalation_stats_cache.clear()
+
</file context>
Suggested change
grievance_list_cache.clear()
escalation_stats_cache.clear()
try:
grievance_list_cache.clear()
escalation_stats_cache.clear()
except Exception as cache_error:
print(f"Warning: failed to invalidate grievance caches: {cache_error}")
Fix with Cubic


return grievance

except Exception as e:
Expand Down Expand Up @@ -222,6 +226,11 @@ def update_grievance_status(self, grievance_id: int, status: GrievanceStatus,
issue.assigned_to = grievance.assigned_authority

db.commit()

# Invalidate grievance caches
grievance_list_cache.clear()
escalation_stats_cache.clear()

return True

except Exception as e:
Expand Down
46 changes: 33 additions & 13 deletions backend/routers/field_officer.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,12 +4,13 @@
Issue #288: Field Officer Check-In System With Location Verification
"""

from fastapi import APIRouter, Depends, HTTPException, UploadFile, File, Form
from fastapi import APIRouter, Depends, HTTPException, UploadFile, File, Form, Response
from sqlalchemy.orm import Session
from sqlalchemy import func, case
from typing import List, Optional
import logging
import os
import json
from datetime import datetime, timezone

from backend.database import get_db
Expand All @@ -31,7 +32,7 @@
calculate_visit_metrics,
get_geofencing_service
)
from backend.cache import visit_last_hash_cache
from backend.cache import visit_last_hash_cache, visit_stats_cache
from backend.schemas import BlockchainVerificationResponse

logger = logging.getLogger(__name__)
Expand Down Expand Up @@ -148,6 +149,9 @@ def officer_check_in(request: OfficerCheckInRequest, db: Session = Depends(get_d
# Update cache for next visit AFTER successful DB commit
visit_last_hash_cache.set(data=visit_hash, key="last_hash")

# Invalidate visit stats cache
visit_stats_cache.clear()

logger.info(
f"Officer {request.officer_name} checked in at issue {request.issue_id}. "
f"Distance: {distance:.2f}m, Within fence: {within_fence}"
Expand Down Expand Up @@ -226,6 +230,9 @@ def officer_check_out(request: OfficerCheckOutRequest, db: Session = Depends(get
db.commit()
db.refresh(visit)

# Invalidate visit stats cache
visit_stats_cache.clear()

logger.info(f"Officer checked out from visit {request.visit_id}")

return FieldOfficerVisitResponse(
Expand Down Expand Up @@ -421,11 +428,15 @@ def get_issue_visit_history(
@router.get("/field-officer/visit-stats", response_model=VisitStatsResponse)
def get_visit_statistics(db: Session = Depends(get_db)):
"""
Get aggregate statistics for all field officer visits using optimized SQL queries

Returns metrics like total visits, verification status, geo-fence compliance, etc.
Get aggregate statistics for all field officer visits using optimized SQL queries.
Optimized: Uses serialization caching to bypass Pydantic overhead.
"""
try:
cache_key = "global_visit_stats"
cached_json = visit_stats_cache.get(cache_key)
if cached_json:
return Response(content=cached_json, media_type="application/json")

# Optimized: Use a single aggregate query to fetch multiple statistics in one database roundtrip
stats = db.query(
func.count(FieldOfficerVisit.id).label('total'),
Expand All @@ -449,14 +460,20 @@ def get_visit_statistics(db: Session = Depends(get_db)):
else:
average_distance = 0.0

return VisitStatsResponse(
total_visits=total_visits,
verified_visits=verified_visits,
within_geofence_count=within_geofence_count,
outside_geofence_count=outside_geofence_count,
unique_officers=unique_officers,
average_distance_from_site=average_distance
)
result_data = {
"total_visits": total_visits,
"verified_visits": verified_visits,
"within_geofence_count": within_geofence_count,
"outside_geofence_count": outside_geofence_count,
"unique_officers": unique_officers,
"average_distance_from_site": average_distance
}

# Cache serialized JSON
json_data = json.dumps(result_data)
visit_stats_cache.set(data=json_data, key=cache_key)

return Response(content=json_data, media_type="application/json")

except Exception as e:
logger.error(f"Error calculating visit statistics: {e}", exc_info=True)
Expand Down Expand Up @@ -493,6 +510,9 @@ def verify_visit(

db.commit()

# Invalidate visit stats cache
visit_stats_cache.clear()

logger.info(f"Visit {visit_id} verified by {verifier_email}")

return {"message": "Visit verified successfully", "visit_id": visit_id}
Expand Down
99 changes: 60 additions & 39 deletions backend/routers/grievances.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
from fastapi import APIRouter, Depends, HTTPException, Query, Request
from fastapi import APIRouter, Depends, HTTPException, Query, Request, Response
from sqlalchemy.orm import Session, joinedload, selectinload
from sqlalchemy import func, case
from typing import List, Optional
Expand All @@ -23,6 +23,7 @@
)
from backend.grievance_service import GrievanceService
from backend.closure_service import ClosureService
from backend.cache import grievance_list_cache, escalation_stats_cache

logger = logging.getLogger(__name__)

Expand All @@ -38,9 +39,14 @@ def get_grievances(
):
"""
Get list of grievances with escalation history.
Optimized: Uses selectinload for audit_logs to avoid Cartesian product and improve O(N) fetching.
Optimized: Uses serialization caching to bypass Pydantic overhead.
"""
try:
cache_key = f"grievances_{status}_{category}_{limit}_{offset}"
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Cache key collision: Python None and the literal string "None" produce identical cache keys, enabling cache poisoning. A request like GET /grievances?status=None returns zero results and caches them under the same key as the unfiltered GET /grievances, serving stale empty results for up to 5 minutes.

Use a delimiter/sentinel that cannot appear in normal parameter values, or explicitly serialize None differently.

Prompt for AI agents
Check if this issue is valid β€” if so, understand the root cause and fix it. At backend/routers/grievances.py, line 45:

<comment>Cache key collision: Python `None` and the literal string `"None"` produce identical cache keys, enabling cache poisoning. A request like `GET /grievances?status=None` returns zero results and caches them under the same key as the unfiltered `GET /grievances`, serving stale empty results for up to 5 minutes.

Use a delimiter/sentinel that cannot appear in normal parameter values, or explicitly serialize `None` differently.</comment>

<file context>
@@ -38,9 +39,14 @@ def get_grievances(
+    Optimized: Uses serialization caching to bypass Pydantic overhead.
     """
     try:
+        cache_key = f"grievances_{status}_{category}_{limit}_{offset}"
+        cached_json = grievance_list_cache.get(cache_key)
+        if cached_json:
</file context>
Fix with Cubic

cached_json = grievance_list_cache.get(cache_key)
if cached_json:
return Response(content=cached_json, media_type="application/json")
Comment on lines +45 to +48
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟑 Minor

Cache-key collision risk with free-form category / status values.

f"grievances_{status}_{category}_{limit}_{offset}" uses _ as both separator and as a plausible character inside user-supplied query values. Two different filter combinations can map to the same cache key, e.g.:

  • status="a_b", category="c" β†’ grievances_a_b_c_50_0
  • status="a", category="b_c" β†’ grievances_a_b_c_50_0

status is usually bounded (open / in_progress / resolved / escalated) so this is low severity today, but category is free-form at the Query level. A collision would serve one filter's results to another filter's caller until the entry expires or is invalidated.

Simple fixes: use a delimiter that can't appear in the inputs (e.g., "|" with a sanitize/escape), use repr((status, category, limit, offset)), or hash a canonical tuple:

πŸ”§ Suggested
-        cache_key = f"grievances_{status}_{category}_{limit}_{offset}"
+        cache_key = repr(("grievances", status, category, limit, offset))
πŸ“ Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
cache_key = f"grievances_{status}_{category}_{limit}_{offset}"
cached_json = grievance_list_cache.get(cache_key)
if cached_json:
return Response(content=cached_json, media_type="application/json")
cache_key = repr(("grievances", status, category, limit, offset))
cached_json = grievance_list_cache.get(cache_key)
if cached_json:
return Response(content=cached_json, media_type="application/json")
πŸ€– Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routers/grievances.py` around lines 45 - 48, The cache key creation
using f"grievances_{status}_{category}_{limit}_{offset}" can collide because
status/category may contain underscores; update the logic that creates cache_key
(the variable named cache_key used with grievance_list_cache.get / .set) to
build a canonical, collision-free key β€” e.g., create a tuple (status, category,
limit, offset), serialize it deterministically (json.dumps or repr) and then
either use that string directly prefixed with "grievances:" or compute a short
hash (sha256) of the serialized tuple and use "grievances:{hash}"; ensure the
same construction is used for both cache get and set so lookups remain
consistent.


query = db.query(Grievance).options(
selectinload(Grievance.audit_logs),
joinedload(Grievance.jurisdiction)
Expand All @@ -54,40 +60,44 @@ def get_grievances(
grievances = query.offset(offset).limit(limit).all()

# Convert to response format
result = []
result_data = []
for grievance in grievances:
escalation_history = [
EscalationAuditResponse(
id=audit.id,
grievance_id=audit.grievance_id,
previous_authority=audit.previous_authority,
new_authority=audit.new_authority,
timestamp=audit.timestamp,
reason=audit.reason.value
)
{
"id": audit.id,
"grievance_id": audit.grievance_id,
"previous_authority": audit.previous_authority,
"new_authority": audit.new_authority,
"timestamp": audit.timestamp.isoformat() if audit.timestamp else None,
"reason": audit.reason.value
}
for audit in grievance.audit_logs
]

result.append(GrievanceSummaryResponse(
id=grievance.id,
unique_id=grievance.unique_id,
category=grievance.category,
severity=grievance.severity.value,
pincode=grievance.pincode,
city=grievance.city,
district=grievance.district,
state=grievance.state,
current_jurisdiction_id=grievance.current_jurisdiction_id,
assigned_authority=grievance.assigned_authority,
sla_deadline=grievance.sla_deadline,
status=grievance.status.value,
created_at=grievance.created_at,
updated_at=grievance.updated_at,
resolved_at=grievance.resolved_at,
escalation_history=escalation_history
))

return result
result_data.append({
"id": grievance.id,
"unique_id": grievance.unique_id,
"category": grievance.category,
"severity": grievance.severity.value,
"pincode": grievance.pincode,
"city": grievance.city,
"district": grievance.district,
"state": grievance.state,
"current_jurisdiction_id": grievance.current_jurisdiction_id,
"assigned_authority": grievance.assigned_authority,
"sla_deadline": grievance.sla_deadline.isoformat() if grievance.sla_deadline else None,
"status": grievance.status.value,
"created_at": grievance.created_at.isoformat() if grievance.created_at else None,
"updated_at": grievance.updated_at.isoformat() if grievance.updated_at else None,
"resolved_at": grievance.resolved_at.isoformat() if grievance.resolved_at else None,
"escalation_history": escalation_history
})

# Cache serialized JSON to bypass Pydantic validation/serialization on hits
json_data = json.dumps(result_data)
grievance_list_cache.set(data=json_data, key=cache_key)

return Response(content=json_data, media_type="application/json")

except Exception as e:
logger.error(f"Error getting grievances: {e}", exc_info=True)
Expand Down Expand Up @@ -149,9 +159,14 @@ def get_grievance(grievance_id: int, db: Session = Depends(get_db)):
def get_escalation_stats(db: Session = Depends(get_db)):
"""
Get escalation statistics.
Optimized: Uses a single GROUP BY query instead of 4 separate count queries.
Optimized: Uses serialization caching and a single GROUP BY query.
"""
try:
cache_key = "global_stats"
cached_json = escalation_stats_cache.get(cache_key)
if cached_json:
return Response(content=cached_json, media_type="application/json")

# Perform aggregation in a single query for performance
status_counts = db.query(
Grievance.status,
Expand All @@ -168,13 +183,19 @@ def get_escalation_stats(db: Session = Depends(get_db)):

escalation_rate = (escalated_grievances / total_grievances * 100) if total_grievances > 0 else 0

return EscalationStatsResponse(
total_grievances=total_grievances,
escalated_grievances=escalated_grievances,
active_grievances=active_grievances,
resolved_grievances=resolved_grievances,
escalation_rate=escalation_rate
)
result_data = {
"total_grievances": total_grievances,
"escalated_grievances": escalated_grievances,
"active_grievances": active_grievances,
"resolved_grievances": resolved_grievances,
"escalation_rate": escalation_rate
}

# Cache serialized JSON
json_data = json.dumps(result_data)
escalation_stats_cache.set(data=json_data, key=cache_key)

return Response(content=json_data, media_type="application/json")

except Exception as e:
logger.error(f"Error getting escalation stats: {e}", exc_info=True)
Expand Down
Loading