- Log every API call (endpoint, tokens, cost, user_id)
- Aggregate daily usage per user
- Check quota before LLM call: generated_count < limit?
- Increment counter after successful generation
- Expose GET /api/quota endpoint
- Warning when 80% quota reached
- Graceful handling when over quota (429 error)
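
The quota flow listed above could be sketched roughly as follows. This is a minimal in-memory illustration, not a definitive implementation: `QuotaTracker`, `CallRecord`, `QuotaExceeded`, `generate`, the `/api/generate` endpoint name, and the per-day limit are all hypothetical names and assumptions introduced here. A real service would persist the log and counters in a database and map `QuotaExceeded` to an HTTP 429 response in its web framework.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date

WARN_THRESHOLD = 0.8  # warn once 80% of the quota is used


class QuotaExceeded(Exception):
    """Raised when a user is over quota; a web layer would return HTTP 429."""
    status_code = 429


@dataclass
class CallRecord:
    endpoint: str
    tokens: int
    cost: float
    user_id: str
    day: date


@dataclass
class QuotaTracker:
    daily_limit: int  # assumption: quota is a number of generations per user per day
    log: list = field(default_factory=list)
    counts: dict = field(default_factory=lambda: defaultdict(int))

    def check(self, user_id, day):
        # Check quota before the LLM call: generated_count < limit?
        if self.counts[(user_id, day)] >= self.daily_limit:
            raise QuotaExceeded(f"daily quota of {self.daily_limit} reached")

    def record(self, endpoint, tokens, cost, user_id, day):
        # Log the call and increment the counter after a successful generation.
        self.log.append(CallRecord(endpoint, tokens, cost, user_id, day))
        self.counts[(user_id, day)] += 1

    def daily_usage(self, user_id, day):
        # Aggregate the logged calls for one user on one day.
        records = [r for r in self.log if r.user_id == user_id and r.day == day]
        return {
            "calls": len(records),
            "tokens": sum(r.tokens for r in records),
            "cost": sum(r.cost for r in records),
        }

    def status(self, user_id, day):
        # Payload that a GET /api/quota handler could return.
        used = self.counts[(user_id, day)]
        return {
            "used": used,
            "limit": self.daily_limit,
            "remaining": max(self.daily_limit - used, 0),
            "warning": used >= WARN_THRESHOLD * self.daily_limit,
        }


def generate(tracker, user_id, prompt, llm, day=None):
    """Run one quota-guarded generation.

    `llm` is a hypothetical callable returning (text, tokens, cost).
    """
    day = day or date.today()
    tracker.check(user_id, day)       # raises QuotaExceeded when over quota
    text, tokens, cost = llm(prompt)  # counter is untouched if this call fails
    tracker.record("/api/generate", tokens, cost, user_id, day)
    return text
```

Because the counter is incremented only after the LLM call returns, a failed generation does not consume quota. The trade-off is that concurrent requests can both pass the check before either increments, so a production version would likely need an atomic check-and-increment in the datastore.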