fix(analyzer): parse structured output from text so OpenAI-compatible endpoints don't crash LLM analysis by bmd1905 · Pull Request #71 · NVIDIA/SkillSpector

bmd1905 · 2026-06-15T09:40:24Z

What this fixes

Closes #69.

When you point SKILLSPECTOR_PROVIDER=openai at an OpenAI-compatible endpoint through OPENAI_BASE_URL, the semantic stage can crash the whole scan with a JSON decode error. Static analysis still runs, but every LLM finding is silently lost.

The cause is in how the analyzers build their structured-output client:

self._structured_llm = self._llm.with_structured_output(self.response_schema)

ChatOpenAI.with_structured_output defaults to method="json_schema", i.e. strict response_format. An endpoint can implement the OpenAI API faithfully and still not honor that, and plenty don't. The ones I hit return valid JSON, just wrapped in a markdown fence:

```json
{"findings": []}


or with a sentence of prose on either side. The strict path hands that straight to `json.loads`, which raises, and the scan exits with `Error: Expecting value: line 1 column 1 (char 0)` or `Invalid JSON ... for LLMAnalysisResult`. Every analyzer that inherits `LLMAnalyzerBase` is affected, including the meta-analyzer.

## The change

I dropped the strict client in favor of the canonical langchain chain:

```python
self._parser = PydanticOutputParser(pydantic_object=self.response_schema)
self._structured_llm = self._llm | self._parser
self._format_instructions = self._parser.get_format_instructions()

PydanticOutputParser already uses langchain's fence-tolerant parse_json_markdown, and it validates into the same Pydantic schema, so parse_response does not change for either the per-file analyzers or the meta-analyzer. Strict endpoints keep working exactly as before; lenient ones now work too.

I also moved the format guidance into one place. A small _with_format helper appends get_format_instructions() at the run-loop chokepoint, so the default prompt and the meta-analyzer's own prompt both get it without each template having to embed JSON instructions by hand. It is a no-op in raw-string mode.

I deliberately did not hand-roll a fence stripper or a brace slicer. langchain's parser already covers the fenced, bare, and prose-wrapped cases, and reusing it keeps the surface small.

Tests

Added a regression suite that pushes bare, fenced, and prose-wrapped JSON through the real PydanticOutputParser via a tiny fake Runnable, plus an empty-findings case and a check that the format instructions land on the prompt.
Updated the three existing tests that asserted the now-removed with_structured_output seam. They use the same direct _structured_llm mock pattern that the other run-loop tests already rely on.
make test: 605 passed, 11 skipped. make lint and make format are clean.

How I verified it end to end

Against a non-strict OpenAI-compatible endpoint, main aborts in the semantic stage with the JSON error above. With this change the same scan completes and returns LLM-backed findings. Repeated runs show no more decode crashes; the only non-zero exits are the intended risk_score > 50 gate (exit 1), not parser errors (exit 2).

…e endpoints The LLM analyzers built their structured-output client with `ChatOpenAI.with_structured_output(schema)`, which defaults to `method="json_schema"` (strict `response_format`). Endpoints that speak the OpenAI API but do not enforce that contract are free to ignore it, and several return the JSON wrapped in a markdown code fence or surrounded by prose. The strict parser then receives non-JSON and raises, so every semantic analyzer fails and the whole scan exits with a JSON decode error. Static analysis still runs, but all LLM findings are lost. Swap the strict client for the canonical langchain chain `llm | PydanticOutputParser`. The parser uses langchain's fence-tolerant `parse_json_markdown` under the hood and validates into the same Pydantic schema, so `parse_response` is unchanged for both the per-file analyzers and the meta-analyzer. Strict endpoints keep working; lenient ones now work too. Inject the parser's `get_format_instructions()` through a single `_with_format` helper at the run-loop chokepoint so the default prompt and the meta-analyzer's overridden prompt both get JSON guidance without duplicating it. Add regression tests that drive fenced and prose-wrapped JSON through the real parser, and update the three tests that asserted the removed `with_structured_output` seam to the direct-mock pattern the other tests use. Closes NVIDIA#69 Signed-off-by: Minh-Duc Bui <minhduc2k31905@gmail.com>

rng1995

LGTM. This is a well-diagnosed fix. Swapping the strict with_structured_output(method="json_schema") client for self._llm | PydanticOutputParser(...) plus appended get_format_instructions() is the right call: langchain's parse_json_markdown tolerates fenced / prose-wrapped JSON, so OpenAI-compatible endpoints that don't honor strict response_format no longer crash the whole semantic stage, while the output still validates into the same Pydantic schema. Centralizing the instruction-append in _with_format so subclasses (the meta-analyzer) inherit it is clean.

Tests are solid — _FakeChatModel | parser exercises the real parser end-to-end across bare JSON, fenced JSON, and prose-wrapped JSON, plus the empty-findings and format-instructions cases, and the meta-analyzer tests were updated consistently.

Minor / optional (non-blocking):

A genuinely malformed (non-JSON) response will now raise OutputParserException from the chain. Since the stated goal is to stop endpoint quirks from crashing the scan, it's worth confirming there's a try/except around the invoke/ainvoke so a single bad batch degrades gracefully (skip/log) rather than aborting the run.
Heads-up: the branch is currently conflicted (not mergeable) and will need a rebase.

rng1995 · 2026-06-22T19:42:07Z

@bmd1905 - Please resolve the conflicts, minor issues and merge the PR.

TroyOMTV mentioned this pull request Jun 18, 2026

LLM stage crashes on OpenAI-compatible/local endpoints (Ollama): models return confidence 0-100 but output schema validates 0-1 #89

Closed

rng1995 approved these changes Jun 21, 2026

View reviewed changes

This was referenced Jun 23, 2026

[Feature] Add Ollama provider for free local/offline LLM scanning #173

Open

[Feature] Add generic OpenAI-compatible provider for Groq, Together AI, Mistral, and other endpoints #175

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(analyzer): parse structured output from text so OpenAI-compatible endpoints don't crash LLM analysis#71

fix(analyzer): parse structured output from text so OpenAI-compatible endpoints don't crash LLM analysis#71
bmd1905 wants to merge 1 commit into
NVIDIA:mainfrom
bmd1905:fix/openai-compatible-structured-output

bmd1905 commented Jun 15, 2026

Uh oh!

rng1995 left a comment

Uh oh!

rng1995 commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

bmd1905 commented Jun 15, 2026

What this fixes

Tests

How I verified it end to end

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Uh oh!

rng1995 commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants