8 changes: 8 additions & 0 deletions .jules/bolt.md
@@ -61,3 +61,11 @@
## 2025-02-13 - API Route Prefix Consistency
**Learning:** Inconsistent application of `/api` prefixes between `main.py` router mounting and test suite request paths can lead to 404 errors during testing, even if the logic is correct. This is especially prevalent when multiple agents work on the same codebase with different assumptions about global prefixes.
**Action:** Always verify that `app.include_router` in `backend/main.py` uses `prefix="/api"` if the test suite (e.g., `tests/test_blockchain.py`) expects it. If a router is mounted without a prefix, ensure tests are updated or the prefix is added to `main.py` to maintain repository-wide consistency.

## 2026-02-14 - Cache Consistency in Blockchain Chaining
**Learning:** When using in-memory caches (like `ThreadSafeCache`) to store the "last hash" for blockchain chaining, updating the cache *before* a successful database commit can lead to cache poisoning if the transaction fails. Subsequent records would then chain to a hash that doesn't exist in the database.
**Action:** Always update the "last hash" cache ONLY after a successful `db.commit()`. Additionally, when retrieving the previous hash, perform a quick check against the database to ensure the cached hash matches the actual last record, providing a fail-safe against cache inconsistency in multi-worker environments.

## 2026-02-14 - Optimized Evidence Verification
**Learning:** Materializing all evidence records for a grievance using `.all()` just to get the count or the latest record is inefficient, especially as the system scales.
**Action:** Use `.count()` for existence checks and `.order_by(Model.id.desc()).first()` to fetch only the latest record. This reduces memory pressure and database transfer overhead.
1 change: 1 addition & 0 deletions backend/cache.py
@@ -179,5 +179,6 @@ def invalidate(self):
user_upload_cache = ThreadSafeCache(ttl=3600, max_size=1000) # 1 hour TTL for upload limits
blockchain_last_hash_cache = ThreadSafeCache(ttl=3600, max_size=1)
grievance_last_hash_cache = ThreadSafeCache(ttl=3600, max_size=1)
resolution_last_hash_cache = ThreadSafeCache(ttl=3600, max_size=1)
visit_last_hash_cache = ThreadSafeCache(ttl=3600, max_size=2)
user_issues_cache = ThreadSafeCache(ttl=300, max_size=50) # 5 minutes TTL
28 changes: 28 additions & 0 deletions backend/init_db.py
@@ -206,6 +206,34 @@ def index_exists(table, index_name):
if not index_exists("field_officer_visits", "ix_field_officer_visits_previous_visit_hash"):
conn.execute(text("CREATE INDEX IF NOT EXISTS ix_field_officer_visits_previous_visit_hash ON field_officer_visits (previous_visit_hash)"))

# Resolution Evidence Table Migrations
if inspector.has_table("resolution_evidence"):
if not column_exists("resolution_evidence", "integrity_hash"):
conn.execute(text("ALTER TABLE resolution_evidence ADD COLUMN integrity_hash VARCHAR"))
logger.info("Added integrity_hash column to resolution_evidence")

if not column_exists("resolution_evidence", "previous_integrity_hash"):
conn.execute(text("ALTER TABLE resolution_evidence ADD COLUMN previous_integrity_hash VARCHAR"))
logger.info("Added previous_integrity_hash column to resolution_evidence")

if not index_exists("resolution_evidence", "ix_resolution_evidence_previous_integrity_hash"):
conn.execute(text("CREATE INDEX IF NOT EXISTS ix_resolution_evidence_previous_integrity_hash ON resolution_evidence (previous_integrity_hash)"))
logger.info("Created index ix_resolution_evidence_previous_integrity_hash")

# Resolution Proof Tokens Table Migrations
if inspector.has_table("resolution_proof_tokens"):
if not column_exists("resolution_proof_tokens", "valid_from"):
conn.execute(text("ALTER TABLE resolution_proof_tokens ADD COLUMN valid_from DATETIME"))
logger.info("Added valid_from column to resolution_proof_tokens")

if not column_exists("resolution_proof_tokens", "valid_until"):
conn.execute(text("ALTER TABLE resolution_proof_tokens ADD COLUMN valid_until DATETIME"))
logger.info("Added valid_until column to resolution_proof_tokens")

if not column_exists("resolution_proof_tokens", "nonce"):
conn.execute(text("ALTER TABLE resolution_proof_tokens ADD COLUMN nonce VARCHAR"))
logger.info("Added nonce column to resolution_proof_tokens")
Comment on lines +223 to +235

⚠️ Potential issue | 🟠 Major

Handle legacy resolution_proof_tokens rows during this migration.

Lines 225-234 only add valid_from, valid_until, and nonce as nullable columns. ResolutionProofService.validate_token() now treats those fields as required when checking expiry and rebuilding the HMAC, so any still-live token created before this migration can start failing validation or blow up on valid_until.tzinfo. Add a data migration that backfills valid_from/valid_until from the legacy timestamps and explicitly expires or regenerates rows that have no nonce.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/init_db.py` around lines 223 - 235, After adding the three nullable
columns to resolution_proof_tokens, add a data migration that (1) backfills
valid_from/valid_until from the table's existing legacy timestamp column(s)
(e.g., created_at or issued_at) so validate_token() has timezone-aware
datetimes, and (2) handles rows missing nonce by either setting valid_until to
now (explicitly expiring them) or generating a new nonce and rebuilding the HMAC
using ResolutionProofService (so future calls to
ResolutionProofService.validate_token() won't fail); implement this immediately
after the ALTER TABLEs using the same conn.execute/inspector flow (use
column_exists, conn.execute(text(...)) and ResolutionProofService methods to
recompute tokens as needed).


Comment on lines +225 to +236
Copilot AI Apr 3, 2026

Migration adds valid_from, valid_until, and nonce to resolution_proof_tokens but doesn’t backfill existing rows. Any still-valid pre-migration token rows will have NULL in these columns, which can break validate_token() (it reads token.valid_until / token.valid_from / token.nonce).

Suggestion: add a one-time backfill step after adding the columns (e.g., set valid_from=generated_at, valid_until=expires_at where null). If nonce is required for signature verification, either backfill from existing token payload data (if stored) or adjust signature verification to handle legacy tokens explicitly.

Suggested change
- if not column_exists("resolution_proof_tokens", "valid_from"):
-     conn.execute(text("ALTER TABLE resolution_proof_tokens ADD COLUMN valid_from DATETIME"))
-     logger.info("Added valid_from column to resolution_proof_tokens")
- if not column_exists("resolution_proof_tokens", "valid_until"):
-     conn.execute(text("ALTER TABLE resolution_proof_tokens ADD COLUMN valid_until DATETIME"))
-     logger.info("Added valid_until column to resolution_proof_tokens")
- if not column_exists("resolution_proof_tokens", "nonce"):
-     conn.execute(text("ALTER TABLE resolution_proof_tokens ADD COLUMN nonce VARCHAR"))
-     logger.info("Added nonce column to resolution_proof_tokens")
+ has_valid_from = column_exists("resolution_proof_tokens", "valid_from")
+ if not has_valid_from:
+     conn.execute(text("ALTER TABLE resolution_proof_tokens ADD COLUMN valid_from DATETIME"))
+     logger.info("Added valid_from column to resolution_proof_tokens")
+     has_valid_from = True
+ has_valid_until = column_exists("resolution_proof_tokens", "valid_until")
+ if not has_valid_until:
+     conn.execute(text("ALTER TABLE resolution_proof_tokens ADD COLUMN valid_until DATETIME"))
+     logger.info("Added valid_until column to resolution_proof_tokens")
+     has_valid_until = True
+ has_nonce = column_exists("resolution_proof_tokens", "nonce")
+ if not has_nonce:
+     conn.execute(text("ALTER TABLE resolution_proof_tokens ADD COLUMN nonce VARCHAR"))
+     logger.info("Added nonce column to resolution_proof_tokens")
+     has_nonce = True
+ has_generated_at = column_exists("resolution_proof_tokens", "generated_at")
+ has_expires_at = column_exists("resolution_proof_tokens", "expires_at")
+ if has_valid_from and has_generated_at:
+     conn.execute(text(
+         "UPDATE resolution_proof_tokens "
+         "SET valid_from = generated_at "
+         "WHERE valid_from IS NULL AND generated_at IS NOT NULL"
+     ))
+     logger.info("Backfilled valid_from from generated_at for legacy resolution_proof_tokens rows")
+ if has_valid_until and has_expires_at:
+     conn.execute(text(
+         "UPDATE resolution_proof_tokens "
+         "SET valid_until = expires_at "
+         "WHERE valid_until IS NULL AND expires_at IS NOT NULL"
+     ))
+     logger.info("Backfilled valid_until from expires_at for legacy resolution_proof_tokens rows")
+ if has_nonce:
+     if column_exists("resolution_proof_tokens", "token"):
+         conn.execute(text(
+             "UPDATE resolution_proof_tokens "
+             "SET nonce = token "
+             "WHERE nonce IS NULL AND token IS NOT NULL"
+         ))
+         logger.info("Backfilled nonce from token for legacy resolution_proof_tokens rows")
+     else:
+         logger.warning(
+             "resolution_proof_tokens.nonce exists but no legacy source column was found for backfill; "
+             "legacy rows may require explicit compatibility handling during validation."
+         )

logger.info("Database migration check completed successfully.")

except Exception as e:
7 changes: 7 additions & 0 deletions backend/models.py
@@ -287,6 +287,10 @@ class ResolutionEvidence(Base):
server_signature = Column(String, nullable=True)
verification_status = Column(Enum(VerificationStatus), default=VerificationStatus.PENDING)

# Blockchain integrity chaining
integrity_hash = Column(String, nullable=True)
previous_integrity_hash = Column(String, nullable=True, index=True)

# Relationships
grievance = relationship("Grievance", back_populates="resolution_evidence")
audit_logs = relationship("EvidenceAuditLog", back_populates="evidence")
@@ -299,6 +303,9 @@ class ResolutionProofToken(Base):
token = Column(String, unique=True, index=True, nullable=True)
token_id = Column(String, unique=True, index=True, nullable=True) # UUID string
authority_email = Column(String, nullable=True)
valid_from = Column(DateTime, default=lambda: datetime.datetime.now(datetime.timezone.utc))
valid_until = Column(DateTime, nullable=True)
nonce = Column(String, nullable=True)
generated_at = Column(DateTime, default=lambda: datetime.datetime.now(datetime.timezone.utc))
expires_at = Column(DateTime, nullable=False)
Comment on lines +306 to 310
Copilot AI Apr 3, 2026

The new valid_from/valid_until/nonce columns are defined as nullable, but the service logic (validate_token() and timestamp window checks) assumes these fields are always present and will raise if they are NULL.

Suggestion: make these columns non-nullable (and ensure generate_proof_token() always sets them), or update service logic to fall back to legacy fields (generated_at/expires_at) when the new columns are missing so existing rows don’t break runtime validation.
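
The fallback option can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: `TokenRow`, `as_utc`, `effective_window`, and `is_within_window` are hypothetical names, and the sketch assumes that naive datetimes read back from the database are already in UTC.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass
class TokenRow:
    """Hypothetical stand-in for a ResolutionProofToken row (not the real model)."""
    generated_at: Optional[datetime] = None
    expires_at: Optional[datetime] = None
    valid_from: Optional[datetime] = None
    valid_until: Optional[datetime] = None


def as_utc(value: Optional[datetime]) -> Optional[datetime]:
    """Return an aware UTC datetime; naive values are ASSUMED to already be UTC."""
    if value is None:
        return None
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


def effective_window(token: TokenRow) -> Tuple[Optional[datetime], Optional[datetime]]:
    """Prefer the new columns; fall back to the legacy ones when they are NULL."""
    start = token.valid_from if token.valid_from is not None else token.generated_at
    end = token.valid_until if token.valid_until is not None else token.expires_at
    return as_utc(start), as_utc(end)


def is_within_window(token: TokenRow, now: Optional[datetime] = None) -> bool:
    """Reject a token whose window cannot be established instead of raising."""
    current = as_utc(now) if now is not None else datetime.now(timezone.utc)
    start, end = effective_window(token)
    if end is None:
        return False  # no expiry known: treat as invalid rather than crash
    if start is not None and current < start:
        return False
    return current <= end
```

With a helper of this shape, a legacy row that only has `generated_at`/`expires_at` is checked against those values, and a row with no usable expiry is rejected rather than triggering an attribute error on `None`.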

is_used = Column(Boolean, default=False)
35 changes: 29 additions & 6 deletions backend/resolution_proof_service.py
@@ -25,6 +25,7 @@
EvidenceAuditLog, VerificationStatus, GrievanceStatus
)
from backend.config import get_config
from backend.cache import resolution_last_hash_cache

logger = logging.getLogger(__name__)

@@ -197,6 +198,7 @@ def generate_proof_token(
geofence_radius_meters=geofence_radius,
valid_from=now,
valid_until=valid_until,
expires_at=valid_until, # Explicitly set for DB constraint/legacy compatibility
nonce=nonce,
token_signature=signature,
is_used=False,
@@ -368,6 +370,19 @@ def submit_evidence(
bundle_str = json.dumps(metadata_bundle, sort_keys=True)
server_signature = ResolutionProofService._sign_payload(bundle_str)

# 5b. Implement cryptographic chaining (Issue #BLOCKCHAIN-003)
# Performance Boost: Use thread-safe cache for O(1) last hash retrieval
prev_hash = resolution_last_hash_cache.get("last_hash")
@cubic-dev-ai cubic-dev-ai Bot Apr 3, 2026

P1: The cache-based prev_hash lookup is not atomic with the subsequent insert+commit, so concurrent submit_evidence calls will read the same prev_hash and produce two records chaining off the same predecessor — forking the integrity chain. A SELECT ... FOR UPDATE or database-level advisory lock is needed to serialize chain extension and ensure each new record links to the true latest hash.
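
A rough, self-contained sketch of the serialization idea follows. It is not the project's code: it uses the standard-library `sqlite3` module and SQLite's `BEGIN IMMEDIATE` write lock as a stand-in for `SELECT ... FOR UPDATE` or an advisory lock, the `evidence` table and the functions `init_db`, `append_evidence`, and `chain_is_linear` are hypothetical, plain SHA-256 replaces `_sign_payload`, and the token ID is left out of the chain payload for brevity. With a SQLAlchemy session on a database that supports row locks, `Query.with_for_update()` on the tail query would play the role that `BEGIN IMMEDIATE` plays here.

```python
import hashlib
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS evidence (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    evidence_hash TEXT NOT NULL,
    integrity_hash TEXT NOT NULL,
    previous_integrity_hash TEXT NOT NULL
)
"""


def init_db(db_path: str) -> None:
    """Create the hypothetical evidence table if it does not exist yet."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(SCHEMA)
        conn.commit()
    finally:
        conn.close()


def append_evidence(db_path: str, evidence_hash: str) -> str:
    """Append one record; the tail read and the insert share one write lock."""
    # isolation_level=None disables implicit transactions, so BEGIN is explicit.
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        # Take the write lock BEFORE reading the tail, so no other writer can
        # read the same predecessor until this transaction commits.
        conn.execute("BEGIN IMMEDIATE")
        row = conn.execute(
            "SELECT integrity_hash FROM evidence ORDER BY id DESC LIMIT 1"
        ).fetchone()
        prev_hash = row[0] if row else ""
        # Plain SHA-256 stands in for the service's _sign_payload (assumption).
        integrity_hash = hashlib.sha256(
            f"{evidence_hash}|{prev_hash}".encode("utf-8")
        ).hexdigest()
        conn.execute(
            "INSERT INTO evidence "
            "(evidence_hash, integrity_hash, previous_integrity_hash) "
            "VALUES (?, ?, ?)",
            (evidence_hash, integrity_hash, prev_hash),
        )
        conn.execute("COMMIT")
        return integrity_hash
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
    finally:
        conn.close()


def chain_is_linear(db_path: str) -> bool:
    """Check that every record links to the record inserted just before it."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT integrity_hash, previous_integrity_hash FROM evidence ORDER BY id"
        ).fetchall()
    finally:
        conn.close()
    expected_prev = ""
    for integrity_hash, prev_hash in rows:
        if prev_hash != expected_prev:
            return False
        expected_prev = integrity_hash
    return True
```

Because the lock is taken before the tail is read, two concurrent writers cannot both observe the same predecessor; the second one waits until the first has committed and then reads the new tail. An in-memory cache could still be refreshed after the commit, but only as an optimization for readers, not as the source of `prev_hash`.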

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/resolution_proof_service.py, line 375:

<comment>The cache-based `prev_hash` lookup is not atomic with the subsequent insert+commit, so concurrent `submit_evidence` calls will read the same `prev_hash` and produce two records chaining off the same predecessor — forking the integrity chain. A `SELECT ... FOR UPDATE` or database-level advisory lock is needed to serialize chain extension and ensure each new record links to the true latest hash.</comment>

<file context>
@@ -368,6 +370,19 @@ def submit_evidence(
 
+        # 5b. Implement cryptographic chaining (Issue #BLOCKCHAIN-003)
+        # Performance Boost: Use thread-safe cache for O(1) last hash retrieval
+        prev_hash = resolution_last_hash_cache.get("last_hash")
+        if prev_hash is None:
+            # Cache miss: fetch ONLY the last hash from DB
</file context>

if prev_hash is None:
# Cache miss: fetch ONLY the last hash from DB
last_record = db.query(ResolutionEvidence.integrity_hash).order_by(ResolutionEvidence.id.desc()).first()
prev_hash = last_record[0] if last_record and last_record[0] else ""
resolution_last_hash_cache.set(data=prev_hash, key="last_hash")

Comment on lines +374 to +381
Copilot AI Apr 3, 2026

submit_evidence() derives prev_hash from an in-memory cache without validating it against the current DB tail. In multi-worker deployments (or after another process writes newer evidence), this can cause previous_integrity_hash to point to a hash that is no longer the latest persisted record, weakening the intended append-only chain semantics.

Suggestion: treat the cache as a hint only—fetch (last_id, last_hash) from DB and compare against cached values (or store a single cached tuple) before computing the new integrity_hash; refresh the cache on mismatch. If strict linear chaining is required under concurrency, compute prev_hash inside a serialized/locked transaction rather than relying on the cache.

Suggested change
- # Performance Boost: Use thread-safe cache for O(1) last hash retrieval
- prev_hash = resolution_last_hash_cache.get("last_hash")
- if prev_hash is None:
-     # Cache miss: fetch ONLY the last hash from DB
-     last_record = db.query(ResolutionEvidence.integrity_hash).order_by(ResolutionEvidence.id.desc()).first()
-     prev_hash = last_record[0] if last_record and last_record[0] else ""
-     resolution_last_hash_cache.set(data=prev_hash, key="last_hash")
+ # Treat the cache as a hint only: validate against the current DB tail
+ cached_tail = resolution_last_hash_cache.get("last_evidence_tail")
+ last_record = (
+     db.query(ResolutionEvidence.id, ResolutionEvidence.integrity_hash)
+     .order_by(ResolutionEvidence.id.desc())
+     .first()
+ )
+ db_tail = (
+     (last_record[0], last_record[1] or "")
+     if last_record
+     else (None, "")
+ )
+ if cached_tail != db_tail:
+     resolution_last_hash_cache.set(data=db_tail, key="last_evidence_tail")
+ prev_hash = db_tail[1]

# Chaining logic: hash(evidence_hash|token_id|prev_hash)
chain_payload = f"{evidence_hash}|{token.token_id}|{prev_hash}"
integrity_hash = ResolutionProofService._sign_payload(chain_payload)

# 6. Create evidence record
evidence = ResolutionEvidence(
grievance_id=token.grievance_id,
@@ -380,6 +395,8 @@ def submit_evidence(
metadata_bundle=metadata_bundle,
server_signature=server_signature,
verification_status=VerificationStatus.VERIFIED,
integrity_hash=integrity_hash,
previous_integrity_hash=prev_hash
)

db.add(evidence)
@@ -391,6 +408,9 @@ def submit_evidence(
db.commit()
db.refresh(evidence)

# Update cache AFTER successful commit to prevent poisoning
resolution_last_hash_cache.set(data=integrity_hash, key="last_hash")

# 8. Create audit log
ResolutionProofService._create_audit_log(
evidence_id=evidence.id,
@@ -435,11 +455,12 @@ def verify_evidence(grievance_id: int, db: Session) -> Dict[str, Any]:
Returns:
Verification result dictionary
"""
- evidence_records = db.query(ResolutionEvidence).filter(
+ # Performance Boost: Use .count() for existence check instead of materializing all records
+ evidence_count = db.query(ResolutionEvidence).filter(
  ResolutionEvidence.grievance_id == grievance_id
- ).all()
+ ).count()

- if not evidence_records:
+ if evidence_count == 0:
return {
"grievance_id": grievance_id,
"is_verified": False,
@@ -452,8 +473,10 @@ def verify_evidence(grievance_id: int, db: Session) -> Dict[str, Any]:
"message": "No resolution evidence found for this grievance"
}

- # Use the most recent evidence
- evidence = evidence_records[-1]
+ # Performance Boost: Fetch only the most recent evidence record
+ evidence = db.query(ResolutionEvidence).filter(
+ ResolutionEvidence.grievance_id == grievance_id
+ ).order_by(ResolutionEvidence.id.desc()).first()

# Re-verify the server signature
bundle_str = json.dumps(evidence.metadata_bundle, sort_keys=True)
@@ -494,7 +517,7 @@ def verify_evidence(grievance_id: int, db: Session) -> Dict[str, Any]:
  "location_match": location_match,
  "evidence_integrity": signature_valid,
  "evidence_hash": evidence.evidence_hash,
- "evidence_count": len(evidence_records),
+ "evidence_count": evidence_count,
"message": (
"Resolution verified with cryptographic proof"
if is_verified
60 changes: 59 additions & 1 deletion backend/routers/resolution_proof.py
@@ -14,12 +14,13 @@
from sqlalchemy.orm import Session

from backend.database import get_db
from backend.models import ResolutionEvidence
from backend.resolution_proof_service import ResolutionProofService
from backend.schemas import (
GenerateRPTRequest, RPTResponse,
SubmitEvidenceRequest, EvidenceResponse,
VerificationResponse, AuditTrailResponse,
- DuplicateCheckResponse,
+ DuplicateCheckResponse, BlockchainVerificationResponse
)

logger = logging.getLogger(__name__)
@@ -217,3 +218,60 @@ def flag_duplicate_evidence(
except Exception as e:
logger.error(f"Error checking duplicates: {e}", exc_info=True)
raise HTTPException(status_code=500, detail="Failed to check for duplicates")


@router.get("/{evidence_id}/blockchain-verify", response_model=BlockchainVerificationResponse)
def verify_evidence_blockchain(
evidence_id: int,
db: Session = Depends(get_db)
):
"""
Verify the cryptographic integrity of a resolution evidence record using blockchain-style chaining.
Optimized: Uses previous_integrity_hash column for O(1) verification.
"""
try:
evidence = db.query(
ResolutionEvidence.evidence_hash,
ResolutionEvidence.token_id,
ResolutionEvidence.integrity_hash,
ResolutionEvidence.previous_integrity_hash
).filter(ResolutionEvidence.id == evidence_id).first()

if not evidence:
raise HTTPException(status_code=404, detail="Evidence not found")

# Determine previous hash (O(1) from stored column)
prev_hash = evidence.previous_integrity_hash or ""

# Fetch token_id string for chaining logic consistency
from backend.models import ResolutionProofToken
token = db.query(ResolutionProofToken.token_id).filter(ResolutionProofToken.id == evidence.token_id).first()
token_id_str = token[0] if token else ""

Comment on lines +233 to +250
Copilot AI Apr 3, 2026

/{evidence_id}/blockchain-verify currently does an extra query to fetch the token UUID (ResolutionProofToken.token_id) and silently falls back to "" when the token row is missing. That can both (a) undercut the stated O(1) optimization (2 DB round-trips) and (b) report a misleading “integrity check failed” when the real problem is a missing/invalid token reference.

Suggestion: fetch ResolutionProofToken.token_id via a join in the initial query (single round-trip), and if the token is missing, return an explicit error/invalid state indicating the evidence’s token reference can’t be resolved.
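
As a hedged illustration of that suggestion, the sketch below resolves the token UUID in the same query through a `LEFT JOIN` and distinguishes "evidence not found" from "token reference cannot be resolved". It uses the standard-library `sqlite3` module rather than the project's SQLAlchemy session so that it stays self-contained; `ChainInputs`, `UnresolvedTokenError`, and `load_chain_inputs` are hypothetical names, and the table and column names are taken from the diff above.

```python
import sqlite3
from typing import NamedTuple, Optional


class ChainInputs(NamedTuple):
    """Values needed to recompute hash(evidence_hash|token_id|prev_hash)."""
    evidence_hash: str
    token_uuid: str
    integrity_hash: Optional[str]
    previous_integrity_hash: str


class UnresolvedTokenError(Exception):
    """The evidence row exists but its token reference matches no token row."""


def load_chain_inputs(conn: sqlite3.Connection, evidence_id: int) -> Optional[ChainInputs]:
    """Fetch evidence fields and the token UUID in a single round-trip."""
    row = conn.execute(
        """
        SELECT e.evidence_hash, t.token_id, e.integrity_hash, e.previous_integrity_hash
        FROM resolution_evidence AS e
        LEFT JOIN resolution_proof_tokens AS t ON t.id = e.token_id
        WHERE e.id = ?
        """,
        (evidence_id,),
    ).fetchone()
    if row is None:
        return None  # no such evidence; a router would map this to a 404
    evidence_hash, token_uuid, integrity_hash, prev_hash = row
    if token_uuid is None:
        # Surface the broken reference instead of hashing with an empty string.
        raise UnresolvedTokenError(
            f"evidence {evidence_id} references a token that cannot be resolved"
        )
    return ChainInputs(evidence_hash, token_uuid, integrity_hash, prev_hash or "")
```

The `LEFT JOIN` keeps the evidence row in the result even when the token is missing, which is what lets the caller report the unresolved reference explicitly rather than a generic integrity failure.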

# Chaining logic: hash(evidence_hash|token_id|prev_hash)
chain_payload = f"{evidence.evidence_hash}|{token_id_str}|{prev_hash}"
computed_hash = ResolutionProofService._sign_payload(chain_payload)

if evidence.integrity_hash is None:
is_valid = False
message = "No integrity hash present for this record; cryptographic integrity cannot be verified."
else:
is_valid = (computed_hash == evidence.integrity_hash)
message = (
"Integrity verified. This resolution evidence record is cryptographically sealed."
if is_valid
else "Integrity check failed! The evidence data does not match its cryptographic seal."
)

return BlockchainVerificationResponse(
is_valid=is_valid,
current_hash=evidence.integrity_hash,
computed_hash=computed_hash,
message=message
)

except HTTPException:
raise
except Exception as e:
logger.error(f"Error verifying evidence blockchain for {evidence_id}: {e}", exc_info=True)
raise HTTPException(status_code=500, detail="Failed to verify evidence integrity")