feat: drop ge/le schema bounds on LLM finding confidence and start_line by wernerkasselman-au · Pull Request #133 · NVIDIA/SkillSpector

wernerkasselman-au · 2026-06-21T23:44:52Z

Closes #132. Split out of #19 at @rng1995's request, since it changes the LLM structured-output contract and deserves its own rationale.

Problem

LLMFinding and MetaAnalyzerFinding express their numeric ranges with Pydantic ge/le (start_line >= 1, confidence in [0.0, 1.0]). Pydantic emits those as JSON-schema minimum / maximum in the response schema. Some OpenAI-compatible structured-output and tool-calling endpoints validate that schema and reject minimum / maximum, so the LLM call fails before a single finding comes back.

Change

Drop the ge/le Field bounds on LLMFinding.start_line, LLMFinding.confidence, and MetaAnalyzerFinding.confidence, and convey the ranges in the field descriptions.
Enforce the ranges with runtime field_validators that clamp into range: confidence to [0.0, 1.0], start_line to >= 1.

Clamping rather than raising is deliberate: an LLM occasionally returns a value just outside the range (confidence 1.01, or start_line 0 for a whole-file finding), and normalising it is better than a ValidationError that fails the whole structured-output parse and drops the batch.

Result

The emitted JSON schema no longer carries minimum / maximum, so it stays portable across structured-output backends, and the in-code [0, 1] / >= 1 guarantee is unchanged. Tests assert both: that neither model's model_json_schema() contains minimum or maximum, and that the validators clamp.

Notes

625 passed, 12 skipped, ruff clean. The two existing tests that asserted a ValueError on out-of-range confidence now assert the clamp.
meta_analyzer already imported field_validator (for the overall_assessment parser), so this reuses it.

Pydantic ge/le constraints emit JSON-schema minimum/maximum. Some OpenAI-compatible structured-output and tool-calling endpoints reject a response schema that carries those keywords, so the LLM call fails before any finding is returned. Express the ranges in the field descriptions and enforce them with runtime field_validators that clamp into range instead. Clamping rather than raising means a slightly out-of-range model value normalises rather than failing the whole structured-output parse and dropping the batch: - LLMFinding.confidence and MetaAnalyzerFinding.confidence clamp to [0.0, 1.0] - LLMFinding.start_line clamps to >= 1 Only the numeric bound is removed, not the requiredness: start_line stays a required field (no default), so a finding with no location is still rejected rather than materialised at line 1. The emitted JSON schema no longer contains minimum/maximum, so it stays portable across structured-output backends, while the in-code guarantee is unchanged. Adds tests asserting the schema carries no numeric bounds, that start_line stays required, and that the validators clamp. Signed-off-by: Werner Kasselman <145896621+wernerkasselman-au@users.noreply.github.com>

rng1995

Review: APPROVE

Thanks for splitting this out of #19 with its own rationale. The change is focused, the intent is well-documented in both the code comments and the description, and the tests lock in the guarantees that matter.

Verification against the stated contract

(a) ge/le removed — Confirmed on LLMFinding.start_line, LLMFinding.confidence, and MetaAnalyzerFinding.confidence; each is now a bare Field(description=...) with the range conveyed in the description.
(b) Validators genuinely clamp — _clamp_confidence returns min(1.0, max(0.0, v)) ([0.0, 1.0]) and _clamp_start_line returns v if v >= 1 else 1 (>= 1). Both are @field_validator in the default ("after") mode, so they run on the already-coerced numeric value, which is the right choice here — a "before" validator would see raw/uncoerced input. Non-numeric input still fails type coercion before reaching the clamp, which is the desired behavior.
(c) start_line stays required — LLMFinding.start_line has no default, so a finding with no location is still rejected rather than silently materialized at line 1. test_llm_finding_start_line_is_required pins this down. Good, explicit separation between "drop the numeric bound" and "drop requiredness".
(d) Emitted schema is clean — The recursive _numeric_keywords walk asserts neither model's model_json_schema() contains minimum/maximum, which is exactly the portability goal.

Design trade-off (clamp vs. raise)

Reasonable for this contract. A single value that lands a hair outside the range shouldn't fail the whole structured-output parse and drop the rest of the batch. Confidence is advisory, and snapping a whole-file finding's start_line 0 -> 1 keeps the finding instead of losing it over an off-by-one.

Minor / optional (non-blocking): the clamp is silent, so a genuinely misbehaving model (e.g. confidence=5.0 -> 1.0, or start_line=-50 -> 1) is normalized with no trace. For a security tool, emitting a logger.debug/warning when a value is actually out of range (vs. within range) would preserve that signal without changing the resilience behavior. Not required to merge.

Tests

Good coverage: schema-has-no-bounds for both models, clamp high/low for both, start_line clamp, and the required-field guarantee. Converting the two prior "raises on confidence 1.5" tests into clamp assertions is a deliberate, documented contract change, and the updates are consistent with it.

Code quality / compatibility

Clear and minimal. It reuses the field_validator import already present in the meta-analyzer module (used by the existing assessment parser), the comments explain the "why" rather than the "what", and there's no dead code. The only behavior change for callers is the raise -> clamp contract on out-of-range numeric input, which is intentional and covered by the updated tests.

LGTM.

rng1995 approved these changes Jun 22, 2026

View reviewed changes

rng1995 merged commit 3df201b into NVIDIA:main Jun 22, 2026

wernerkasselman-au mentioned this pull request Jun 22, 2026

Bound build_context file reads and fail closed on oversized inputs #19

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: drop ge/le schema bounds on LLM finding confidence and start_line#133

feat: drop ge/le schema bounds on LLM finding confidence and start_line#133
rng1995 merged 1 commit into
NVIDIA:mainfrom
wernerkasselman-au:fix/structured-output-schema

wernerkasselman-au commented Jun 21, 2026

Uh oh!

rng1995 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

wernerkasselman-au commented Jun 21, 2026

Problem

Change

Result

Notes

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Review: APPROVE

Verification against the stated contract

Design trade-off (clamp vs. raise)

Tests

Code quality / compatibility

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants