
feat: Add LLM Confidence Validation and Human-In-The-Loop review (Fixes #222)#223

Open
RITVIKKAMASETTY wants to merge 1 commit into fireform-core:main from RITVIKKAMASETTY:feat/llm-confidence-validation

Conversation

@RITVIKKAMASETTY RITVIKKAMASETTY commented Mar 11, 2026

What does this PR do?

This PR addresses a critical liability risk: the LLM silently hallucinating values for missing or ambiguous fields (such as names, badge numbers, or incident codes) directly onto official PDF documents.

It implements a human-in-the-loop validation pipeline: the LLM extraction step now outputs structured JSON with a per-field confidence score, and low-confidence fields are routed to a reviewer instead of being written to the document.

Changes Made

  • Structured LLM Output: src/llm.py now uses prompt engineering so that Mistral returns JSON ({"value": "...", "confidence": 0.95}). Since prompting alone cannot guarantee valid JSON, parse failures are treated as low confidence.
  • Confidence Thresholding: Fields with confidence < 0.85 are intercepted instead of blindly trusted.
  • Fail-Safe PDF Generation: src/filler.py now maps values by explicit semantic field names, and writes [REVIEW REQUIRED] into the PDF for any low-confidence fields so responders can spot them instantly.
  • API Handoff: Added a needs_review JSON column to api/db/models.py and the FastAPI response schema so the frontend can highlight flagged fields in the UI.
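The thresholding and fail-safe behavior described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the function name `resolve_field` and the exact exception handling are assumptions, but the 0.85 threshold, the `{"value": ..., "confidence": ...}` JSON shape, and the `[REVIEW REQUIRED]` placeholder come from the PR description.

```python
import json

CONFIDENCE_THRESHOLD = 0.85        # fields below this are intercepted (per PR)
REVIEW_PLACEHOLDER = "[REVIEW REQUIRED]"

def resolve_field(raw_llm_output: str) -> tuple[str, bool]:
    """Parse the LLM's JSON and decide whether the field needs review.

    Returns (value_to_write_into_pdf, needs_review).
    """
    try:
        parsed = json.loads(raw_llm_output)
        value = str(parsed["value"])
        confidence = float(parsed["confidence"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        # Unparseable or malformed output is never trusted silently.
        return REVIEW_PLACEHOLDER, True

    if confidence < CONFIDENCE_THRESHOLD:
        # Low confidence: write the visible placeholder, flag for the UI.
        return REVIEW_PLACEHOLDER, True
    return value, False
```

A high-confidence field passes through unchanged (`resolve_field('{"value": "John Doe", "confidence": 0.95}')` yields `("John Doe", False)`), while a low-confidence or unparseable one is replaced by the placeholder and flagged, which is what populates the `needs_review` column for the frontend.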

Testing Performed

  • Added a full unit testing suite tests/test_llm_confidence.py covering high/low confidence branching, edge cases, and JSON parse failures (6 passing tests).
  • Manually verified end-to-end extraction against ambiguous transcripts using the Ollama Mistral model.
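The test cases listed above (high/low confidence branching and JSON parse failures) might look roughly like this. This is a hypothetical sketch, not the contents of tests/test_llm_confidence.py; the stand-in `needs_review` helper is defined inline so the example is self-contained.

```python
import json
import pytest

THRESHOLD = 0.85

def needs_review(raw: str) -> bool:
    """Minimal stand-in for the extraction logic under test."""
    try:
        data = json.loads(raw)
        return float(data["confidence"]) < THRESHOLD
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return True  # parse failures fail safe

@pytest.mark.parametrize("raw,expected", [
    ('{"value": "Badge 4421", "confidence": 0.97}', False),  # high confidence passes
    ('{"value": "Badge 44?", "confidence": 0.42}', True),    # low confidence flagged
    ('not valid json at all', True),                         # parse failure flagged
])
def test_confidence_branching(raw, expected):
    assert needs_review(raw) is expected
```

The key property these cases pin down is that there is no code path where an unverified value reaches the PDF: every branch either passes a high-confidence value or flags the field.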

Fixes #222



Successfully merging this pull request may close these issues.

[BUG]: Mistral AI Outputs are Written Directly to Legal PDFs Without Confidence Checks