Merged
5 changes: 4 additions & 1 deletion .gitignore
@@ -30,4 +30,7 @@ tmp/
 .venv

 todo.md
-development/caddy/data
+development/caddy/data
+
+readme.ai
+.scratch/*
19 changes: 19 additions & 0 deletions AGENTS.md
@@ -0,0 +1,19 @@
# Agent Instructions

## Repository Overview
This monorepo contains tools for collecting and managing municipal official data for CivicPatch.

## Projects

| Directory | Purpose |
|-----------|---------|
| `civicpatch/` | Core Python package for municipal data collection |
| `api.civicpatch.org/` | FastAPI backend service |

## Global Conventions
- OCD-IDs (Open Civic Data Identifiers) are used for jurisdiction identification
- YAML for data files or files meant to be modified by humans
- JSON for generated files
- Functional style preferred

See individual project `AGENTS.md` files for project-specific guidance.
41 changes: 41 additions & 0 deletions api.civicpatch.org/AGENTS.md
@@ -0,0 +1,41 @@
# api.civicpatch.org — Agent Instructions

## Overview
FastAPI backend service that handles job artifacts, storage, and GitHub workflow triggers.

## Tech Stack
- Python 3.x
- FastAPI
- Google Sheets (for cost tracking)
- AWS S3 (storage)
- GitHub Actions (workflow triggers)

## Key Paths
- `src/job_service/people_collector.py` — Handles artifact submission and processing
- `src/services/` — External service integrations (storage, GitHub, Google Sheets)
- `src/schemas/` — Pydantic request/response models
- `src/utils/file_utils.py` — File handling utilities

## Artifact Processing Flow
1. Receive ZIP upload with collected data
2. Extract and validate against expected patterns
3. Upload debug files to storage
4. Process images and update data with CDN URLs
5. Trigger GitHub data intake workflow
6. Track costs in Google Sheets

## File Patterns
**Artifact files** (zipped for GitHub):
- `data/*/local/*.yml`
- `data_source/*/local/*/workflow_context.json`

**Debug files** (uploaded to storage):
- `data_source/*/local/*/cache/*`
- `data_source/*/local/*/images/*`
- `data_source/*/local/*/costs.json`
- `data_source/*/local/*/workflow.log`
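
The two pattern groups above can be checked with a small classifier. This is a minimal sketch, not the service's actual code: the names `ARTIFACT_PATTERNS`, `DEBUG_PATTERNS`, `_glob_to_regex`, and `classify` are hypothetical, and it assumes each `*` matches within a single path segment only (it never crosses `/`), which the original does not state.

```python
import re

# Patterns copied from the "File Patterns" section above.
ARTIFACT_PATTERNS = [
    "data/*/local/*.yml",
    "data_source/*/local/*/workflow_context.json",
]
DEBUG_PATTERNS = [
    "data_source/*/local/*/cache/*",
    "data_source/*/local/*/images/*",
    "data_source/*/local/*/costs.json",
    "data_source/*/local/*/workflow.log",
]


def _glob_to_regex(pattern: str) -> "re.Pattern[str]":
    # Assumption: "*" stands for any run of characters except "/".
    literal_parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile("[^/]*".join(literal_parts))


def classify(path: str) -> str:
    """Return "artifact", "debug", or "unexpected" for a ZIP member path."""
    if any(_glob_to_regex(p).fullmatch(path) for p in ARTIFACT_PATTERNS):
        return "artifact"
    if any(_glob_to_regex(p).fullmatch(path) for p in DEBUG_PATTERNS):
        return "debug"
    return "unexpected"
```

If the jurisdiction segment can itself contain slashes, the single-segment assumption is too strict and the `[^/]*` fragment would need to be loosened.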

## Conventions
- Use `background_tasks` for non-blocking operations (e.g., cost tracking)
- Presigned URLs for S3 uploads
- Request IDs used for file organization in storage
20 changes: 16 additions & 4 deletions api.civicpatch.org/poetry.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions api.civicpatch.org/pyproject.toml
@@ -21,6 +21,7 @@ dependencies = [
     "asyncio (>=4.0.0,<5.0.0)",
     "aiofiles (>=25.1.0,<26.0.0)",
     "google-api-python-client (>=2.189.0,<3.0.0)",
+    "nameparser (>=1.1.3,<2.0.0)",
 ]

 [tool.poetry]
52 changes: 26 additions & 26 deletions api.civicpatch.org/src/database.py
@@ -27,36 +27,36 @@ def to_iso(dt):
         return dt.isoformat()
     return None

-async def maybe_insert_user(provider, provider_user_id, email):
-    async with pool.connection() as conn:
-        # Try to insert the user; check if it was newly created
-        result = await conn.execute(
+async def create_update_user(provider, provider_user_id, email, teams: List[str]):
+    async with pool.connection() as conn, conn.cursor() as cur:
+        # Upsert user
+        await cur.execute(
             """
             INSERT INTO users (provider, provider_user_id, email)
             VALUES (%s, %s, %s)
-            ON CONFLICT (provider, provider_user_id) DO NOTHING
+            ON CONFLICT (provider, provider_user_id)
+            DO UPDATE SET email = EXCLUDED.email
             """,
             (provider, provider_user_id, email),
         )
-        # If result.rowcount > 0, the user was newly inserted
-        if result.rowcount > 0:
-            # Insert 'unverified' role for new user
-            await conn.execute(
-                """
-                INSERT INTO user_roles (provider, provider_user_id, role)
-                VALUES (%s, %s, %s)
-                """,
-                (provider, provider_user_id, "unverified"),
-            )
-        else:
-            # User already exists, just update email if needed
-            await conn.execute(
-                """
-                UPDATE users SET email = %s
-                WHERE provider = %s AND provider_user_id = %s
-                """,
-                (email, provider, provider_user_id),
-            )
+        # Remove existing teams for user
+        await cur.execute(
+            """
+            DELETE FROM user_roles
+            WHERE provider = %s AND provider_user_id = %s
+            """,
+            (provider, provider_user_id),
+        )
+        # Insert new teams for user (always present)
+        await cur.executemany(
+            """
+            INSERT INTO user_roles (provider, provider_user_id, role)
+            VALUES (%s, %s, %s)
+            ON CONFLICT (provider, provider_user_id, role)
+            DO NOTHING
+            """,
+            [(provider, provider_user_id, team) for team in teams],
+        )


 async def create_api_key(provider, provider_user_id):
@@ -134,7 +134,7 @@ async def get_user(provider, provider_user_id):
         "email": row[0],
         "server_url": row[1],
         "created_at": row[2],
-        "roles": row[3],
+        "teams": row[3],
     }

 async def get_user_by_api_key(api_key):
@@ -167,7 +167,7 @@ async def get_user_by_api_key(api_key):
         "email": row[2],
         "server_url": row[3],
         "created_at": row[4],
-        "roles": row[5],
+        "teams": row[5],
     }

 async def get_user_details(provider, provider_user_id):
4 changes: 2 additions & 2 deletions api.civicpatch.org/src/frontend/templates/index.html
@@ -11,7 +11,7 @@
 <div style="display: flex; align-items: center; justify-content: space-between;">
 <h1>civicpatch - api key management</h1>
 {% if user %}
-<a href="/auth/logout" class="secondary" style="margin-left: auto; font-size: 1rem; padding: 0.4em 1.2em;">Logout</a>
+<a href="/api/v1/auth/logout" class="secondary" style="margin-left: auto; font-size: 1rem; padding: 0.4em 1.2em;">Logout</a>
 {% endif %}
 </div>
 <p>Manage your API keys for your civicpatch server here.</p>
@@ -32,7 +32,7 @@ <h1>civicpatch - api key management</h1>

 <p>If you are setting this up for the first time, contact the <a href="mailto:{{ maintainer_email }}">maintainer</a>
 to approve your account. You will want to provide your <strong>provider_user_id</strong> after logging in.</p>
-<a href="/auth/github/login" class="contrast" role="button">Login with GitHub</a>
+<a href="/api/v1/auth/github/login" class="contrast" role="button">Login with GitHub</a>
 {% endif %}
 </main>
 </body>
13 changes: 7 additions & 6 deletions api.civicpatch.org/src/frontend/templates/partials/api_keys.html
@@ -21,29 +21,30 @@ <h2>API Usage</h2>
 </footer>
 </article>

-<h2>Your Roles</h2>
+<h2>Your Teams</h2>
 <hr />
-{% if user.roles and user.roles|length > 0 %}
+{% if user.teams and user.teams|length > 0 %}
 <p>Different endpoints require different roles. Refer to the
 <a href="/docs">API documentation</a> for details.
 </p>
 <p>Contact support if you need additional roles.</p>
+<p>Any API keys you create here will inherit the permissions of your teams.</p>
 <table>
 <thead>
 <tr>
-<th>Role</th>
+<th>Team</th>
 </tr>
 </thead>
 <tbody>
-{% for role in user.roles %}
+{% for team in user.teams %}
 <tr>
-<td>{{ role }}</td>
+<td>{{ team }}</td>
 </tr>
 {% endfor %}
 </tbody>
 </table>
 {% else %}
-<p>You have no roles assigned.</p>
+<p>You have no teams assigned.</p>
 {% endif %}


Original file line number Diff line number Diff line change
@@ -16,6 +16,9 @@
 SEARCH_ENGINES_SHEET_NAME = "Cost Search Engines"
 STORAGE_SHEET_NAME = "Cost Storage"

+STORAGE_ENDPOINT = os.getenv("STORAGE_ENDPOINT")
+INSTANCE_DOMAIN = "civicpatch.org" # Just hardcode it for now...
+
 logger = logging.getLogger(__name__)

 async def handle_submit_job_artifacts(
@@ -112,7 +115,12 @@ async def _process_images(debug_file_dir: str, filenames_to_urls: dict, data: Li
     for person in data:
         if person.get("image"):
             if person["image"] in image_map and image_map[person["image"]] in filenames_to_urls:
-                person["cdn_image"] = filenames_to_urls[image_map[person["image"]]]
+                storage_url = filenames_to_urls[image_map[person["image"]]]
+                # convert storage URL to the civicpatch-artifacts bucket
+                # temp TODO move
+                cdn_image_url = storage_url.replace(f"{STORAGE_ENDPOINT}/civicpatch-artifacts", f"https://civicpatch-artifacts.{INSTANCE_DOMAIN}")
+
+                person["cdn_image"] = cdn_image_url
     return data

 async def _upload_debug_files(debug_file_dir: str, file_suffix_without_ext: str) -> dict:
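
A note on the URL rewrite in the hunk above: `os.getenv("STORAGE_ENDPOINT")` returns `None` when the variable is unset, so the f-string would search for the literal text `None/civicpatch-artifacts` and the `replace` call would silently leave the storage URL unchanged. One possible shape for the helper that the `temp TODO move` comment hints at is sketched below. The function name `to_cdn_url` and its parameters are hypothetical, and it matches on a URL prefix rather than calling `str.replace` as the PR does.

```python
from typing import Optional


def to_cdn_url(
    storage_url: str,
    storage_endpoint: Optional[str],
    instance_domain: str = "civicpatch.org",
    bucket: str = "civicpatch-artifacts",
) -> str:
    """Rewrite a storage URL to the bucket's CDN host, or return it unchanged."""
    if not storage_endpoint:
        # Endpoint not configured: leave the URL as it is.
        return storage_url
    prefix = f"{storage_endpoint.rstrip('/')}/{bucket}"
    if not storage_url.startswith(prefix):
        # URL does not point at the expected bucket: leave it as it is.
        return storage_url
    return f"https://{bucket}.{instance_domain}{storage_url[len(prefix):]}"
```

Keeping the rewrite in one function makes the unset-endpoint case explicit and easy to test in isolation.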
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
from shared.utils import review_utils

def extract_review_data(workflow_context: dict, people: list) -> dict:
    data = workflow_context.get("data", {})
    review = review_utils.generate_review(
        research_people=data.get("research_municipality_step", {}).get("elected_officials", []),
        people_by_llm=data.get("merge_records_within_llm_step", {}).get("people_by_llm", {}),
        people=people,
        # TODO: pass in identities from the database
    )

    return review