Summary
Add AI refinement as the next correction layer after US-009. AI must not re-grade the whole exercise or create a new diff from scratch. It only receives unresolved Pro comparison ranges after static comparison and deterministic cleanup, then removes, shrinks, or splits those ranges so the UI highlights the actual correction more precisely.
Free and anonymous users never reach this AI step.
Key Behavior
Correction pipeline:
- Load server-owned originalText from propositionId.
- Run current static comparison.
- Run US-009 deterministic cleanup for Pro users.
- If no unresolved comparisons remain, return without AI.
- If unresolved comparisons remain and user is Pro, call AI refinement.
- Validate AI output enough to protect the UI.
- Return ai_refined on valid output or fallback on invalid/error output.
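The pipeline steps above can be sketched as a single orchestration function. This is a minimal illustration in TypeScript, not the service's actual code: PipelineSteps, runCorrection, and the Comparison shape are hypothetical names, the AI step is modelled as a synchronous callback to keep the sketch small (a real call would be awaited), and reporting "normalized" when cleanup resolves everything is an assumption about US-009. It takes originalText as already loaded from propositionId.

```typescript
type CorrectionMode = "static" | "normalized" | "ai_refined" | "fallback";

interface Comparison {
  originalText: string;
  userText: string;
}

interface CorrectionResult {
  comparisons: Comparison[];
  correctionMode: CorrectionMode;
  aiAttempted: boolean;
}

// Hypothetical stand-ins for the real services.
interface PipelineSteps {
  staticCompare(originalText: string, userText: string): Comparison[];
  deterministicCleanup(comparisons: Comparison[]): Comparison[];
  // Expected to throw on provider errors and on output that fails validation.
  refineWithAi(originalText: string, userText: string, unresolved: Comparison[]): Comparison[];
}

function runCorrection(
  originalText: string,
  userText: string,
  isPro: boolean,
  steps: PipelineSteps,
): CorrectionResult {
  const staticResult = steps.staticCompare(originalText, userText);
  if (!isPro) {
    // Free and anonymous users stop after static comparison.
    return { comparisons: staticResult, correctionMode: "static", aiAttempted: false };
  }

  const unresolved = steps.deterministicCleanup(staticResult);
  if (unresolved.length === 0) {
    // Assumption: "normalized" is the mode US-009 reports after cleanup.
    return { comparisons: unresolved, correctionMode: "normalized", aiAttempted: false };
  }

  try {
    const refined = steps.refineWithAi(originalText, userText, unresolved);
    return { comparisons: refined, correctionMode: "ai_refined", aiAttempted: true };
  } catch {
    // Any failure returns the pre-AI result and records that AI was attempted.
    return { comparisons: unresolved, correctionMode: "fallback", aiAttempted: true };
  }
}
```

Keeping the gating (Pro check, zero unresolved comparisons) outside the AI refiner means the refiner itself never needs to know about user tiers.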
AI may:
- remove unresolved comparisons that are actually equivalent
- shrink broad ranges
- split one broad comparison into multiple smaller comparisons
AI must not:
- rewrite originalText or userText
- create corrections outside pre-AI unresolved comparison spans
- invent new mistakes unrelated to static output
- handle billing limits or daily usage caps in this story
API And Types
Keep the submit request from US-006/009:
- propositionId
- userText
Keep existing response fields:
- originalText
- userText
- comparisons
- accuracyPercentage
Use the US-009 metadata:
- correctionMode: static, normalized, ai_refined, or fallback
- aiAttempted: true only when AI was called
Add an internal structured AI response model, not exposed directly to the frontend:
- refined comparisons with original/user start and end indexes
- optional reason/debug field for logs only
Use Microsoft.Extensions.AI structured output through IChatClient.GetResponseAsync<T>().
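To make the internal response model concrete, here is one possible shape with a defensive parser, written in TypeScript purely as an illustration. The field names (refinedComparisons, originalInitialIndex, and so on) are assumptions, not the story's final contract. In the .NET service, the type passed to GetResponseAsync<T>() would play this role and the library would perform the deserialization.

```typescript
// Hypothetical internal shape; never returned to the frontend as-is.
interface AiRefinedRange {
  originalInitialIndex: number;
  originalFinalIndex: number;
  userInitialIndex: number;
  userFinalIndex: number;
  reason?: string; // optional, for logs only
}

interface AiRefinementResponse {
  refinedComparisons: AiRefinedRange[];
}

function isIndex(value: unknown): value is number {
  return typeof value === "number" && Number.isInteger(value) && value >= 0;
}

function isAiRefinedRange(value: unknown): value is AiRefinedRange {
  if (typeof value !== "object" || value === null) return false;
  const r = value as Record<string, unknown>;
  return (
    isIndex(r.originalInitialIndex) &&
    isIndex(r.originalFinalIndex) &&
    isIndex(r.userInitialIndex) &&
    isIndex(r.userFinalIndex) &&
    (r.reason === undefined || typeof r.reason === "string")
  );
}

// Returns null when the payload is not valid JSON or has the wrong shape,
// so the caller can treat it as a fallback case.
function parseAiRefinementResponse(json: string): AiRefinementResponse | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const items = (parsed as Record<string, unknown>).refinedComparisons;
  if (!Array.isArray(items)) return null;
  if (!items.every((item) => isAiRefinedRange(item))) return null;
  return { refinedComparisons: items };
}
```

The model carries indexes only. Snippet text is deliberately absent so the server, not the model, decides what text each range contains.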
Implementation Changes
- Add an AI refinement service, for example ITextComparisonAiRefiner.
- The AI refiner input is full original text, full user text, and unresolved TextComparison ranges after US-009 cleanup.
- The AI refiner output is refined ranges only.
- Reuse the existing IChatClient registered in the Web API.
- Update the correction orchestration service from US-009.
- Pro users with unresolved comparisons call the AI refiner.
- Free/anonymous users never call the AI refiner.
- Pro users with zero unresolved comparisons skip AI and keep aiAttempted = false.
Prompt contract:
- Tell the model it is refining highlight ranges for a listen-and-write correction UI.
- Provide full texts for index context.
- Provide only unresolved comparison ranges/snippets.
- Instruct it to return only corrections that should still be shown.
- Instruct it to omit equivalent ranges.
- Instruct it to shrink or split broad ranges when only part of the range is wrong.
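The prompt contract above could be assembled roughly as follows. This TypeScript sketch is illustrative only: buildRefinementPrompt and UnresolvedRange are hypothetical names, the instruction wording is a placeholder rather than a tuned prompt, and treating the final index as exclusive is an assumption about the existing TextComparison indexes.

```typescript
interface UnresolvedRange {
  originalInitialIndex: number;
  originalFinalIndex: number; // assumed exclusive
  userInitialIndex: number;
  userFinalIndex: number; // assumed exclusive
}

function buildRefinementPrompt(
  originalText: string,
  userText: string,
  unresolved: UnresolvedRange[],
): string {
  // Attach snippets so the model sees exactly what each range covers.
  const ranges = unresolved.map((r, id) => ({
    id,
    ...r,
    originalSnippet: originalText.slice(r.originalInitialIndex, r.originalFinalIndex),
    userSnippet: userText.slice(r.userInitialIndex, r.userFinalIndex),
  }));

  return [
    "You refine highlight ranges for a listen-and-write correction UI.",
    "Indexes are zero-based character offsets into the full texts; final indexes are exclusive.",
    "Return only the corrections that should still be shown.",
    "Omit ranges whose original and user text are equivalent.",
    "Shrink or split a range when only part of it is wrong.",
    "Never return a range outside the unresolved ranges listed below.",
    "",
    "Original text: " + JSON.stringify(originalText),
    "User text: " + JSON.stringify(userText),
    "Unresolved ranges: " + JSON.stringify(ranges),
  ].join("\n");
}
```

Serializing the texts with JSON.stringify keeps quotes and line breaks in user input from blending into the instructions.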
Build final TextComparison objects server-side:
- Trust AI ranges only after validation.
- Slice OriginalText and UserText from the full texts on the server.
- Recalculate accuracyPercentage.
- Preserve full originalText and userText.
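A minimal sketch of that server-side build step, in TypeScript for illustration. The names buildResult and ValidatedRange are hypothetical, final indexes are assumed to be exclusive, and the accuracy calculation is passed in as a callback because this story does not define the formula; the existing calculation would be reused.

```typescript
interface ValidatedRange {
  originalInitialIndex: number;
  originalFinalIndex: number; // assumed exclusive
  userInitialIndex: number;
  userFinalIndex: number; // assumed exclusive
}

interface TextComparison extends ValidatedRange {
  originalText: string;
  userText: string;
}

interface SubmitResult {
  originalText: string;
  userText: string;
  comparisons: TextComparison[];
  accuracyPercentage: number;
}

// Stand-in for the existing accuracy calculation (not defined in this story).
type AccuracyFn = (originalText: string, comparisons: TextComparison[]) => number;

function buildResult(
  fullOriginal: string,
  fullUser: string,
  ranges: ValidatedRange[],
  computeAccuracy: AccuracyFn,
): SubmitResult {
  // Snippets always come from the server-owned texts, never from the model.
  const comparisons = ranges.map((r) => ({
    ...r,
    originalText: fullOriginal.slice(r.originalInitialIndex, r.originalFinalIndex),
    userText: fullUser.slice(r.userInitialIndex, r.userFinalIndex),
  }));

  return {
    originalText: fullOriginal, // full texts are preserved unchanged
    userText: fullUser,
    comparisons,
    accuracyPercentage: computeAccuracy(fullOriginal, comparisons),
  };
}
```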
Minimum Validation And Fallback
US-010 must include minimum validation because invalid AI output must not corrupt highlights.
Accept AI output only if:
- structured output deserializes successfully
- ranges are inside original/user text bounds
- initialIndex <= finalIndex
- ranges are not empty unless explicitly supported by the current UI
- every returned range is contained inside one of the unresolved pre-AI comparison spans
- snippets can be sliced safely from the full texts
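The acceptance rules above might translate into a check like the following. This is a hedged TypeScript sketch, not the final validator: IndexRange and validateRefinedRanges are hypothetical names, final indexes are assumed to be exclusive, and a range is treated as empty when either side has zero length, which should be adjusted to whatever the current UI supports. Structured-output deserialization is assumed to have succeeded before this runs.

```typescript
interface IndexRange {
  originalInitialIndex: number;
  originalFinalIndex: number; // assumed exclusive
  userInitialIndex: number;
  userFinalIndex: number; // assumed exclusive
}

function isInsideText(initial: number, final: number, textLength: number): boolean {
  return (
    Number.isInteger(initial) &&
    Number.isInteger(final) &&
    initial >= 0 &&
    initial <= final &&
    final <= textLength
  );
}

function isContainedIn(inner: IndexRange, outer: IndexRange): boolean {
  return (
    inner.originalInitialIndex >= outer.originalInitialIndex &&
    inner.originalFinalIndex <= outer.originalFinalIndex &&
    inner.userInitialIndex >= outer.userInitialIndex &&
    inner.userFinalIndex <= outer.userFinalIndex
  );
}

// Returns true only when every refined range passes all minimum checks.
// An empty refined list is valid: the AI judged every range equivalent.
function validateRefinedRanges(
  originalText: string,
  userText: string,
  preAiRanges: IndexRange[],
  refined: IndexRange[],
  allowEmpty = false,
): boolean {
  return refined.every((r) => {
    const inBounds =
      isInsideText(r.originalInitialIndex, r.originalFinalIndex, originalText.length) &&
      isInsideText(r.userInitialIndex, r.userFinalIndex, userText.length);
    if (!inBounds) return false;

    const hasEmptySide =
      r.originalInitialIndex === r.originalFinalIndex ||
      r.userInitialIndex === r.userFinalIndex;
    if (hasEmptySide && !allowEmpty) return false;

    // Each refined range must sit inside one unresolved pre-AI span.
    return preAiRanges.some((span) => isContainedIn(r, span));
  });
}
```

Once the bounds check passes, slicing snippets from the full texts cannot go out of range, which covers the last rule in the list.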
If AI throws, returns invalid data, or validation fails:
- return the pre-AI comparison result from static + US-009 cleanup
- set correctionMode = fallback
- set aiAttempted = true
Full edge-case hardening, overlap rules, fuzzing, timeout policy, and detailed invalid-response telemetry remain for US-011/012/015.
Acceptance Criteria
- Static comparison still runs first.
- US-009 deterministic cleanup runs before AI.
- AI is called only for Pro users with unresolved comparisons.
- AI receives only full texts plus unresolved comparison ranges/snippets.
- AI uses Microsoft.Extensions.AI structured output.
- AI can remove, shrink, or split unresolved correction ranges.
- Invalid AI output falls back to the pre-AI result.
- Free and anonymous users never call AI.
- Existing result UI remains compatible.
Test Plan
Pro AI path:
- Pro user with unresolved comparisons calls AI.
- Pro user with no unresolved comparisons skips AI.
- AI can remove an equivalent unresolved comparison.
- AI can shrink a broad range such as "as some think. A" vs "as something. A".
- AI can split a broad range such as "a cruise ship. There" vs "a cruseship, there".
- AI keeps real mistakes such as "says" vs "saids".
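As an illustration of how the split case could be captured as test data, the fixture below spells out one plausible expected outcome for "a cruise ship. There" vs "a cruseship, there". It is a TypeScript sketch with hypothetical names (RefinementCase, snippetsFor), final indexes are assumed to be exclusive, and the chosen split points are an assumption about what a good refinement looks like, not a required result.

```typescript
interface RangePair {
  originalInitialIndex: number;
  originalFinalIndex: number; // assumed exclusive
  userInitialIndex: number;
  userFinalIndex: number; // assumed exclusive
}

interface RefinementCase {
  label: string;
  originalText: string;
  userText: string;
  preAi: RangePair; // the broad unresolved range sent to the AI
  expected: RangePair[]; // ranges a fake AI returns in the test
}

const splitCase: RefinementCase = {
  label: "split a broad range into two corrections",
  originalText: "a cruise ship. There",
  userText: "a cruseship, there",
  preAi: { originalInitialIndex: 2, originalFinalIndex: 20, userInitialIndex: 2, userFinalIndex: 18 },
  expected: [
    // "cruise ship" vs "cruseship"
    { originalInitialIndex: 2, originalFinalIndex: 13, userInitialIndex: 2, userFinalIndex: 11 },
    // ". There" vs ", there"
    { originalInitialIndex: 13, originalFinalIndex: 20, userInitialIndex: 11, userFinalIndex: 18 },
  ],
};

// Slices the snippet pair a range points at, for readable assertions.
function snippetsFor(testCase: RefinementCase, range: RangePair): [string, string] {
  return [
    testCase.originalText.slice(range.originalInitialIndex, range.originalFinalIndex),
    testCase.userText.slice(range.userInitialIndex, range.userFinalIndex),
  ];
}
```

A fake AI refiner that returns splitCase.expected lets the test assert on the final comparisons without calling a real model.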
Safety/fallback:
- Invalid structured output returns fallback.
- Out-of-bounds ranges return fallback.
- Ranges outside pre-AI spans return fallback.
- Final result preserves full originalText and userText.
- Accuracy is recalculated from accepted final comparisons.
Non-Pro behavior:
- Free users never call AI.
- Anonymous users never call AI.
- Existing static result rendering remains compatible.
Assumptions
- US-006 and US-009 are already implemented before US-010.
- “AI runs on every Pro submission” means every Pro submission that still has unresolved comparisons after deterministic cleanup.
- US-010 includes minimum validation required for safe fallback, while US-011 owns exhaustive validation edge cases.
- US-012 owns strict timeout policy.
- US-014 owns daily usage limits.
- US-015 owns detailed AI cost and quality telemetry.