45 changes: 45 additions & 0 deletions .github/workflows/post_on_merge.yml
@@ -0,0 +1,45 @@
name: POST New Documents on Merge

on:
  push:
    branches:
      - main
  workflow_dispatch:

jobs:
  post:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Get added files in latest commit
        id: added
        run: |
          files=$(git diff --diff-filter=A HEAD~1 HEAD --name-only | grep -E '^(sources|claims|proofs)/' || true)
          echo "Added files:"
          echo "$files"
          echo "ADDED_FILES<<EOF" >> "$GITHUB_ENV"
          echo "$files" >> "$GITHUB_ENV"
          echo "EOF" >> "$GITHUB_ENV"

      - name: POST new documents
        if: env.ADDED_FILES != ''
        run: python scripts/post_requests.py
        env:
          API_BASE_URL: ${{ secrets.API_BASE_URL }}
          API_KEY: ${{ secrets.API_KEY }}

      - name: Nothing to POST
        if: env.ADDED_FILES == ''
        run: echo "No new request documents to POST."
6 changes: 4 additions & 2 deletions .github/workflows/validate.yml
@@ -1,12 +1,14 @@
name: Validate Request Documents

on:
workflow_run:
pull_request:
branches:
- main
paths:
- 'grant_requests/**'
- 'admission_requests/**'
- 'sources/**'
- 'claims/**'
- 'proofs/**'

jobs:
validate:
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
**/__pycache__/**
123 changes: 94 additions & 29 deletions AGENTS.md
@@ -2,29 +2,35 @@

## Repository Overview

This repository stores structured request documents organized by type. Each folder corresponds to a schema defined in `openapi.yaml`. All documents must conform to their folder's schema.
This repository stores structured request documents organized by type. Each folder corresponds to a schema defined in `oapi.yaml`. All documents must conform to their folder's schema.

After merge to `main`, new documents are automatically POSTed to their respective API endpoints.

### Directory Layout

```text
.
├── openapi.yaml # Single source of truth for all schemas
├── grant_requests/ # Grant application documents
├── admission_requests/ # Admission application documents
├── oapi.yaml # Schemas + API endpoint paths
├── sources/ # Source documents
├── claims/ # Claims documents
├── proofs/ # Proofs documents
├── scripts/
│ └── validate.py # Validates documents against schemas
│ ├── validate.py # Validates documents against schemas
│ └── post_requests.py # POSTs newly added documents to APIs
└── .github/workflows/
└── validate.yml # Runs validate.py on every PR
├── validate.yml # Runs validate.py on PRs to main
└── post_on_merge.yml # Runs post_requests.py on push to main
```

## What the Validation Script Does

`scripts/validate.py` performs the following:

1. Loads `openapi.yaml` and extracts schemas from `components.schemas`.
1. Loads `oapi.yaml` and extracts schemas from `components.schemas`.
2. Maps folders to schema names:
- `grant_requests/` → `GrantRequest`
- `admission_requests/` → `AdmissionRequest`
- `sources/` → `SourceInput`
- `claims/` → `ClaimInput`
- `proofs/` → `ProofInput`
3. Scans each tracked folder for `.yaml`, `.yml`, and `.json` files.
4. Validates every found document against its corresponding schema using JSON Schema Draft 2020-12.
5. Reports pass/fail per file with specific field-level errors.
@@ -36,51 +42,110 @@ This repository stores structured request documents organized by type. Each fold
python scripts/validate.py

# Validate specific files only
python scripts/validate.py grant_requests/my-project.yaml
python scripts/validate.py sources/source1.yaml
```
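
The folder-to-schema mapping and file scan described in the validation steps can be sketched with the standard library alone. This is an illustration, not the code in `validate.py`: the function name `discover` is hypothetical, the sketch assumes the scan recurses into subfolders (the source does not say), and it stops before the JSON Schema validation step.

```python
import tempfile
from pathlib import Path

# Folder -> schema name, as listed in the validation steps.
SCHEMA_MAP = {"sources": "SourceInput", "claims": "ClaimInput", "proofs": "ProofInput"}
EXTENSIONS = {".yaml", ".yml", ".json"}


def discover(root: Path) -> list[tuple[Path, str]]:
    """Return (relative path, schema name) for every document in a tracked folder."""
    found = []
    for folder, schema in SCHEMA_MAP.items():
        base = root / folder
        if not base.is_dir():
            continue  # a tracked folder may not exist yet
        for p in sorted(base.rglob("*")):  # assumption: recursive scan
            if p.is_file() and p.suffix in EXTENSIONS:
                found.append((p.relative_to(root), schema))
    return found


# Demonstrate on a throwaway directory tree (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sources").mkdir()
    (root / "sources" / "source1.yaml").write_text("name: example\n")
    (root / "sources" / "notes.txt").write_text("ignored\n")
    result = [(p.as_posix(), schema) for p, schema in discover(root)]

print(result)  # [('sources/source1.yaml', 'SourceInput')]
```

Each discovered pair would then be validated against its schema; files with other extensions, such as `notes.txt` here, are skipped.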

### Dependencies
## What the POST Script Does

`scripts/post_requests.py` performs the following:

1. Loads `oapi.yaml` and extracts API paths from the `paths` section.
2. Maps schema names to POST endpoints by matching `$ref` in `requestBody.content.application/json.schema`.
3. Reads the list of newly added files from the `ADDED_FILES` environment variable.
4. For each file:
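- Loads the document (parsed as YAML, falling back to JSON if YAML parsing fails).
- Constructs the full URL as `API_BASE_URL + path`, with any trailing slash stripped from `API_BASE_URL` (e.g., `https://api.example.com` + `/api/v1/source` gives `https://api.example.com/api/v1/source`).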
- POSTs the document as JSON with `X-API-Key: API_KEY`.
5. Reports pass/fail per file with HTTP status and response body.
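
The URL construction and headers in step 4 can be illustrated with a small standard-library sketch. The helper name `build_request`, the base URL, the document body, and the key are all assumptions made for the example; the request is only built, never sent.

```python
import json
import urllib.request


def build_request(base_url: str, path: str, doc: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one document."""
    # Strip a trailing slash so base + path never produces a double slash.
    url = base_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode(),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


req = build_request(
    "https://api.example.com/",   # hypothetical base URL with a trailing slash
    "/api/v1/source",             # path as it would appear under `paths` in oapi.yaml
    {"name": "Example source"},   # hypothetical document body
    "dummy-key",                  # hypothetical API key
)
print(req.full_url)      # https://api.example.com/api/v1/source
print(req.get_method())  # POST
```

Note that the path already carries the `/api/v1` prefix in this example, so the base URL should not repeat it.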

### Environment Variables

| Variable | Required | Source |
|----------|----------|--------|
| `API_BASE_URL` | Yes | GitHub Secret `secrets.API_BASE_URL` |
| `API_KEY` | Yes | GitHub Secret `secrets.API_KEY` |
| `ADDED_FILES` | No (if empty, the script exits 0 with nothing to POST) | Newline-separated paths; the CI workflow computes this from `git diff` |

The script **exits with code 1 immediately** if `API_BASE_URL` or `API_KEY` is unset or empty.

### Running Locally

```bash
pip install -r requirements.txt
# Set required env vars
export API_BASE_URL="https://api.example.com"
export API_KEY="sk_live_abc123"
export ADDED_FILES="sources/source1.yaml"

# POST the document
python scripts/post_requests.py
```

## Workflows

### PR Validation (`validate.yml`)

| Trigger | Target branch | Paths |
|---------|---------------|-------|
| `pull_request` | `main` | `sources/**`, `claims/**`, `proofs/**` |

Runs on the PR source branch and validates all tracked files using `validate.py`.

### POST on Merge (`post_on_merge.yml`)

| Trigger | When it runs |
|---------|--------------|
| `push` to `main` | After PR merge |
| `workflow_dispatch` | Manual trigger from Actions tab |

For both triggers, the workflow:
1. Checks out the repo with full history (`fetch-depth: 0`).
2. Computes `git diff --diff-filter=A HEAD~1 HEAD --name-only` to find files **added in the latest commit**.
3. Passes those files to `post_requests.py` via the `ADDED_FILES` env var.
4. Only runs the POST step if at least one tracked file was added.
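
The filtering in steps 2 and 3 can be sketched in Python. This is an illustrative equivalent of the workflow's `grep -E` filter combined with the newline splitting that `post_requests.py` applies to `ADDED_FILES`; the function name `tracked_added_files` is hypothetical and the sample string stands in for real `git diff --name-only` output.

```python
import re

# Same pattern the workflow passes to `grep -E`.
TRACKED = re.compile(r"^(sources|claims|proofs)/")


def tracked_added_files(diff_output: str) -> list[str]:
    """Keep only non-empty lines that point into a tracked folder."""
    lines = (line.strip() for line in diff_output.splitlines())
    return [line for line in lines if TRACKED.match(line)]


# Stand-in for `git diff --diff-filter=A HEAD~1 HEAD --name-only` output.
sample = "sources/source1.yaml\nREADME.md\nclaims/claim1.yaml\n\n"
print(tracked_added_files(sample))  # ['sources/source1.yaml', 'claims/claim1.yaml']
```

An empty result corresponds to the case where the workflow skips the POST step.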

## Workflow Rules & Assumptions

- **New documents must be valid before merge.** The GitHub Action blocks merge if validation fails.
- **Documents must live in the correct folder.** Files placed in the wrong folder are ignored by the validator but may still trigger the CI workflow.
- **Schema changes are rare and separate.** If `openapi.yaml` ever changes, all existing documents must be re-validated and updated in the same or an earlier PR.
- **PRs only add new files.** They never modify existing documents or the `oapi.yaml` schema.
- **New documents must be valid before merge.** The PR validation workflow blocks merge if validation fails.
- **Documents must live in the correct folder.** Files placed in the wrong folder are ignored by both scripts but may still trigger CI workflows.
- **Schema changes are rare and separate.** If `oapi.yaml` ever changes, all existing documents must be re-validated and updated in the same or an earlier PR.
- **POST workflow requires secrets.** `API_BASE_URL` and `API_KEY` must be configured in repository settings for the merge workflow to succeed.

## Instructions for AI Agents

### When generating a new document

1. Read `openapi.yaml` to identify the correct schema for the target folder.
1. Read `oapi.yaml` to identify the correct schema for the target folder.
2. Produce a document that satisfies **all** `required` fields and respects type constraints (`minimum`, `maximum`, `minLength`, `enum`, etc.).
3. Use YAML unless JSON is explicitly requested.
4. Save the file directly into the appropriate folder (e.g., `grant_requests/`).
4. Save the file directly into the appropriate folder (e.g., `sources/`).
5. Run `python scripts/validate.py <file>` locally before suggesting the change.

### When reviewing a PR

1. Confirm the new file(s) are in the correct tracked folder.
2. Check that no existing files were modified (per repository policy).
3. If the PR changes `openapi.yaml`, flag it — schema changes must be handled separately.
3. If the PR changes `oapi.yaml`, flag it — schema changes must be handled separately.
4. Verify the document satisfies required fields and constraint bounds from the schema.
5. Suggest running `python scripts/validate.py` if validation results are not visible in CI.

### When modifying the script or CI
### When modifying scripts or CI

- Keep `validate.py` dependency-free except for `pyyaml`, `jsonschema`, and `referencing`.
- `validate.py` should default to scanning all tracked files when called without arguments.
- `post_requests.py` must exit with a non-zero status and a clear error message if `API_BASE_URL` or `API_KEY` is unset.
- Never add `git diff` logic into the Python scripts; the CI checkout already provides the correct working tree.
- Maintain the folder-to-schema mapping in a single dictionary at the top of both scripts.
- Keep the `oapi.yaml` `paths` section in sync with the schema `$ref` mappings used by the scripts.

- Keep the script dependency-free except for `pyyaml`, `jsonschema`, and `referencing`.
- The script should default to scanning all tracked files when called without arguments.
- Never add `git diff` logic into the Python script; the CI checkout already provides the correct working tree.
- Maintain the folder-to-schema mapping in a single dictionary at the top of `validate.py`.
## Schema & API Reference

## Schema Reference
| Folder | Schema Name | POST Path | Key Constraints |
|--------|-------------|-----------|-----------------|
| `sources/` | `SourceInput` | `/api/v1/source` | `name`: required non-empty; `summary`: required non-empty; `tags`: required (comma-separated, no spaces); `uri`: required HTTPS URL |
| `claims/` | `ClaimInput` | `/api/v1/claim` | `sourceUriDigest`: required (SHA-256) non-empty; `title`: required non-empty; `summary`: required non-empty; `uri`: required HTTPS URL |
| `proofs/` | `ProofInput` | `/api/v1/proof` | `claimUriDigest`: required non-empty (no spaces); `reviewedBy`: required non-empty (no spaces); `uri`: required HTTPS URL; `supportsClaim`: required boolean |

| Folder | Schema Name | Key Constraints |
|--------|-------------|-----------------|
| `grant_requests/` | `GrantRequest` | `applicant` ≥ 2 chars, `amount` ≥ 1000, `purpose` ≥ 20 chars, `timeline_months` 1–36 |
| `admission_requests/` | `AdmissionRequest` | `name` ≥ 2 chars, `program` ∈ {undergraduate, graduate, phd}, `gpa` 0.0–4.0, `statement` ≥ 50 chars |
The `oapi.yaml` `paths` section must contain a `post` operation for each schema with `requestBody.content.application/json.schema.$ref` pointing to the corresponding schema. The POST script uses this `$ref` to map schemas to their endpoint paths.

For full schema details, inspect `openapi.yaml` directly.
For full schema and API details, inspect `oapi.yaml` directly.
26 changes: 26 additions & 0 deletions scripts/common.py
@@ -0,0 +1,26 @@
#!/usr/bin/env python3
"""Shared utilities for scripts in this folder.

Exports:
- load_oapi(path) -> dict
- load_doc(path) -> dict (YAML with JSON fallback)
"""

from pathlib import Path
import json
import yaml
from typing import Any, Dict


def load_oapi(path: str) -> Dict[str, Any]:
    with open(path) as f:
        return yaml.safe_load(f)


def load_doc(path: str) -> Dict[str, Any]:
    with open(path) as f:
        content = f.read()
    try:
        return yaml.safe_load(content)
    except yaml.YAMLError:
        return json.loads(content)
122 changes: 122 additions & 0 deletions scripts/post_requests.py
@@ -0,0 +1,122 @@
#!/usr/bin/env python3
"""POST newly added request documents to their respective APIs.

Requires environment variables:
API_BASE_URL API server base URL
API_KEY API token for authentication
"""

import json
import os
import sys
import urllib.error
import urllib.request
from pathlib import Path

# Shared utilities (try package import first, fallback to local module)
try:
    from scripts.common import load_oapi, load_doc
except ImportError:
    sys.path.insert(0, os.path.dirname(__file__))
    from common import load_oapi, load_doc

# folder -> schema name
SCHEMA_MAP = {
    "sources": "SourceInput",
    "claims": "ClaimInput",
    "proofs": "ProofInput",
}

def extract_post_paths(spec: dict) -> dict[str, str]:
    """Map schema names to path suffixes from the OpenAPI spec."""
    paths = {}
    for path, methods in spec.get("paths", {}).items():
        post = methods.get("post")
        if not post:
            continue

        content = post.get("requestBody", {}).get("content", {})
        json_schema = content.get("application/json", {}).get("schema", {})
        ref = json_schema.get("$ref", "")

        if ref.startswith("#/components/schemas/"):
            schema_name = ref.split("/")[-1]
            paths[schema_name] = path

    return paths

def post(url: str, data: dict, api_key: str) -> tuple[int, str]:
    payload = json.dumps(data).encode()
    headers = {
        "Content-Type": "application/json",
        "X-API-Key": f"{api_key}",
    }

    req = urllib.request.Request(url, data=payload, headers=headers, method="POST")
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as e:
        return e.code, e.read().decode()


def main() -> int:
    base_url = os.environ.get("API_BASE_URL", "").rstrip("/")
    api_key = os.environ.get("API_KEY", "")

    if not base_url:
        print("API_BASE_URL environment variable is not set", file=sys.stderr)
        return 1
    if not api_key:
        print("API_KEY environment variable is not set", file=sys.stderr)
        return 1

    files = [f for f in os.environ.get("ADDED_FILES", "").splitlines() if f.strip()]
    if not files:
        print("No added files to process.")
        return 0

    spec = load_oapi("oapi.yaml")
    schema_paths = extract_post_paths(spec)

    failed = False
    for f in files:
        f = f.strip()
        parts = Path(f).parts
        if not parts or parts[0] not in SCHEMA_MAP:
            continue

        folder = parts[0]
        schema_name = SCHEMA_MAP[folder]
        path = schema_paths.get(schema_name)
        if not path:
            print(f"No POST path found for schema {schema_name}, skipping {f}")
            failed = True
            continue

        url = f"{base_url}{path}"

        if not Path(f).exists():
            print(f"{f}: File not found")
            failed = True
            continue

        try:
            data = load_doc(f)
        except Exception as e:
            print(f"{f}: Failed to parse: {e}")
            failed = True
            continue

        status, body = post(url, data, api_key)
        if 200 <= status < 300:
            print(f"{f} → {url} ({status})")
        else:
            print(f"{f} → {url} ({status}): {body[:200]}")
            failed = True

    return 1 if failed else 0


if __name__ == "__main__":
    sys.exit(main())