feat(backend): add ControlVerificationTemplate table and CRUD endpoints by AaronAlijani · Pull Request #229 · Hardhat-Enterprises/AutoAudit

AaronAlijani · 2026-05-08T10:20:06Z

Summary

Phase 1 of the semi-automated manual verification system. Adds a per-control template table that stores auditor instructions, keywords, severity, and expected evidence type for the 14 manual CIS controls that AutoAudit cannot automate via the M365 collectors.

This is the foundation that later phases build on:

Phase 2 (subsequent commits to this branch), confidence scoring algorithm, seed the 14 templates, link evidence uploads to scan_result, auditor workflow.
Phase 3 (subsequent commits), status propagation from verdict to scan_result.status, scan finalisation for manual controls.

Companion to Ash Alijani's GRC documentation work on classifying manual controls and authoring the per-control templates JSON.

Type of Change

New feature

Affected Components

/backend-api

Motivation

Khan asked us to connect the evidence scanner with manual verification to reduce auditor workload. The idea: each manual control gets a verification template with specific auditor instructions and per-control keywords, so when an auditor uploads evidence, the existing OCR and keyword matching can validate it automatically and suggest a verdict with a confidence score. The auditor confirms or overrides.

This PR delivers the foundation, the table that holds those templates and the CRUD endpoints to manage them.

What's Included

New table control_verification_template (Alembic revision 8a7b91ea95d9, autogenerated)
- control_id (unique, indexed)
- title, instructions, keywords (JSONB), severity, evidence_type
- created_at / updated_at with now() defaults
Pydantic schemas: Create, Update, Read
CRUD endpoints under /v1/verification-templates:
- POST / create (admin only)
- GET / list all
- GET /{control_id} fetch one
- PATCH /{control_id} partial update (admin only)
- DELETE /{control_id} delete (admin only)
Write operations gated via the existing require_admin permission
Read operations available to any authenticated user (auditors/viewers need to see instructions in the UI)
Unique constraint on control_id returns 409 Conflict on duplicate creates

Testing Done

Tested manually — describe how:

Smoke tested locally end-to-end via curl:

alembic upgrade head applied the migration cleanly with no errors
POST /v1/verification-templates/ with admin token created a template for control 1.1.2 and returned 201 Created with the full record including id, created_at, updated_at
GET /v1/verification-templates/1.1.2 returned 200 OK with the same record, JSONB keywords array intact
Duplicate POST for the same control_id returned 409 Conflict with the expected error message, confirms unique constraint and conflict-handling logic
Confirmed the new endpoints appear in /docs (Swagger) under the "Verification Templates" tag

Security Considerations

All write endpoints (POST / PATCH / DELETE) gated to admin role via require_admin from app/core/permissions.py (existing pattern).
All read endpoints require an authenticated user via get_current_user.
No secrets, tokens, or sensitive data introduced.
keywords field is stored as JSONB, same pattern as EvidenceValidation.matches_json and ScanResult.evidence, so no new attack surface.

Breaking Changes

No breaking changes

The new table is additive. No existing schemas, contracts, or workflows are modified.

Rollback Plan

Revert commit is sufficient

If a rollback is required:

docker compose exec backend-api uv run alembic downgrade j1k2l3m4n567

The migration's downgrade() drops the new index and the new table cleanly.

Checklist

Code follows project conventions
No secrets, credentials, or tokens committed
Relevant documentation updated (if applicable), docs are Ash's parallel work
CI/CD workflows pass on this branch
PR is focused on one thing

Screenshots

All 5 endpoints visible in Swagger UI under the "Verification Templates" tag:

GET request returns the created template with all fields and timestamps:

Phase 1 of the semi-automated manual verification system. Adds a per-control template that stores auditor instructions, keywords, severity, and expected evidence type for the 14 manual CIS controls. - New table control_verification_template (Alembic 8a7b91ea95d9) - Pydantic schemas: Create, Update, Read - CRUD endpoints under /v1/verification-templates - Write operations (POST/PATCH/DELETE) gated to admin role - Read operations available to any authenticated user - Unique constraint on control_id with 409 Conflict on duplicates Smoke-tested locally: POST creates row, GET returns it, duplicate POST returns 409. Refs: Khan's product brief on manual control verification.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 358f6754df

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-08T10:21:53Z

+    update_fields = data.model_dump(exclude_unset=True)
+    for field, value in update_fields.items():
+        setattr(template, field, value)


Reject nulls before applying template updates

When a PATCH request explicitly sends null for non-nullable fields such as title, instructions, keywords, or severity, the update schema accepts it and exclude_unset=True still includes that field, so this loop writes None to columns declared nullable=False. In that scenario the subsequent commit raises a database integrity error and returns a 500 instead of a client error; either disallow None in the update schema or reject provided nulls before assigning.

Useful? React with 👍 / 👎.

du-dhartley

@AaronAlijani Great work on this one so far. The concept makes sense and has been well thought through.
There are couple of comments on specific lines in files to address, however there is one major issue that needs to be considered before this can be merged.

The short summary is that control_id alone is not a suitable unique identifier and we're completely missing the concept of a benchmark/framework version when saving this info.

When a user runs a scan they're selecting a framework, a benchmark and a version, and the policies directory already carries v3.1.0, v4.0.0 and v6.0.0 of the M365 benchmark side by side. CIS renumbers and rewrites controls between versions, so "1.1.2" in v3 isn't the same control as "1.1.2" in v6, and once the Azure benchmarks come online we'll have a second framework sharing the same numeric space. The new table keys uniquely on control_id alone, which means it can only hold one set of instructions per control across the whole platform — there's no way to say "these are the v6.0.0 instructions for 1.1.2" distinctly from the v3.1.0 ones. That'll bite us in Phase 2 the moment we try to seed: either we pick one version and quietly mismatch every other scan, or the second seed run trips the unique constraint. Could you fold framework, benchmark and version onto the table now, and make uniqueness composite across those plus control_id? The lookup pattern in scans.py already passes that tuple through end to end, so it'll plug in cleanly and keep the Phase 2 seeding honest about which version each template belongs to.

du-dhartley · 2026-05-10T06:46:24Z

 from app.models.evidence_validation import EvidenceValidation
 from app.models.contact import ContactSubmission, SubmissionNote, SubmissionHistory
 from app.models.user_settings import UserSettings
+from app.models.user_settings import UserSettings


This line is a duplicate of the one above and should be removed

du-dhartley · 2026-05-10T06:48:30Z

+    keywords: Mapped[list] = mapped_column(JSONB, nullable=False)
+
+    # Risk severity: "high" | "medium" | "low".
+    severity: Mapped[str] = mapped_column(String(20), nullable=False)


The comment above shows what the values should be, but there's no actual enforcement that what's coming in here matches those values. If this is what we expect to see, we should enforce this - Pydantic supports the concept of a Literal that does this, such as

from typing import Literal Severity = Literal["high", "medium", "low"]

AaronAlijani · 2026-05-14T03:17:24Z

Hi David, thanks for the thorough review. All four points addressed in the latest commit:

Composite key : control_id alone is not enough. Templates are now keyed on the full (framework, benchmark, version, control_id) tuple, mirroring the same fields Scan already carries (String(50)/(100)/(20) respectively). Column types and lookup shape match scans.py end-to-end, so Phase 2 seeding can pass the tuple straight through. The unique constraint is composite (uq_template_framework_benchmark_version_control), and control_id is still indexed for lookups, but no longer uniquely so. Endpoints now use the path pattern /{framework}/{benchmark}/{version}/{control_id}, and the list endpoint accepts optional framework/benchmark/version query filters for scoped seeding.
Severity enforcement. Added Severity = Literal["high", "medium", "low"] (and EvidenceType = Literal["screenshot", "pdf_export", "comment_only"] for the same reason) on Create and Update. Invalid values now return 422 at the API layer instead of being silently accepted as strings. DB column kept as String(20) for forward-compat.
PATCH null handling (Codex review). Added a @model_validator on Update that rejects explicit nulls on title, instructions, keywords, and severity (the non-nullable columns). Clients now get a clean 422 instead of triggering a 500 through the integrity error.
Duplicate UserSettings import. Removed.

Migration chain note: PR #166 and this PR both branch from j1k2l3m4n567 as their down_revision. That's fine while they're both unmerged, but whichever merges second will need its down_revision updated to point at the first one's revision, so Alembic has a single linear head.

du-dhartley

@AaronAlijani Thanks for the changes, this PR looks good now.
I've approved this, but agree with your comment about the migration - if you merge this one, after you address the last remaining comment in your other PR, push up the migration change and that can be approved also.

github-actions · 2026-06-04T07:09:21Z

CI: Backend API

Job	Result
Security analysis (CodeQL + Bandit)	❌ `failure`
Lint	✅ `success`

One or more checks failed. View logs

AaronAlijani requested review from 6igby, du-dhartley and liyunze-coding as code owners May 8, 2026 10:20

chatgpt-codex-connector Bot reviewed May 8, 2026

View reviewed changes

du-dhartley requested changes May 10, 2026

View reviewed changes

du-dhartley mentioned this pull request May 10, 2026

docs(compliance): add manual control verification templates for CIS M… #227

Open

24 tasks

feat: key on (framework, benchmark, version, control_id) tuple

24c1216

du-dhartley self-requested a review May 22, 2026 11:59

du-dhartley approved these changes May 22, 2026

View reviewed changes

docs: add manual control verification user guide

d6a7b53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(backend): add ControlVerificationTemplate table and CRUD endpoints#229

feat(backend): add ControlVerificationTemplate table and CRUD endpoints#229
AaronAlijani wants to merge 3 commits into
mainfrom
feature/26T1-BAC-AA-verification-templates

AaronAlijani commented May 8, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 8, 2026

Uh oh!

du-dhartley left a comment

Uh oh!

du-dhartley May 10, 2026

Uh oh!

du-dhartley May 10, 2026

Uh oh!

AaronAlijani commented May 14, 2026

Uh oh!

du-dhartley left a comment

Uh oh!

github-actions Bot commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

AaronAlijani commented May 8, 2026

Summary

Type of Change

Affected Components

Motivation

What's Included

Testing Done

Security Considerations

Breaking Changes

Rollback Plan

Checklist

Screenshots

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

du-dhartley left a comment

Choose a reason for hiding this comment

Uh oh!

du-dhartley May 10, 2026

Choose a reason for hiding this comment

Uh oh!

du-dhartley May 10, 2026

Choose a reason for hiding this comment

Uh oh!

AaronAlijani commented May 14, 2026

Uh oh!

du-dhartley left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Jun 4, 2026

CI: Backend API

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants