Skip to content

prompt-rag JSONL parser fails on Claude/non-OpenAI providers (document-rag times out) #900

@beca-oc

Description

@beca-oc

Version: trustgraph-flow 2.3.21 (Docker image docker.io/trustgraph/trustgraph-flow:2.3.21)

Repro

  1. Configure text-completion + text-completion-rag processors with trustgraph.model.text_completion.claude.Processor instead of openai.Processor.
  2. Set CLAUDE_KEY env, restart workers, restart flow with --param llm-model=claude-haiku-4-5-20251001 --param llm-rag-model=claude-haiku-4-5-20251001.
  3. Verify bare LLM works:
    tg-invoke-llm -f sizzl-ontology "" "Say PONG"
    # -> "PONG" in ~2s, HTTP 200 from api.anthropic.com
    
  4. Try tg-invoke-document-rag:
    tg-invoke-document-rag -u http://api-gateway:8088/ -f sizzl-ontology -C sizzl-market -d 2 -q "What does X sell?"
    # -> Exception (empty body) at ~33s
    

Observed logs (docker logs deploy-rag-1)

prompt-rag - WARNING - JSONL parse error on line 1: Expecting value: line 1 column 1 (char 0)
prompt-rag - WARNING - JSONL parse error on line 3: Expecting value: line 1 column 1 (char 0)
prompt-rag - WARNING - JSONL parse error on line 5: Expecting value: line 1 column 1 (char 0)
prompt-rag - WARNING - JSONL parse returned no valid objects
... (repeats)

document-rag - ERROR - Exception processing response:
asyncio.exceptions.CancelledError ...
TimeoutError

text-completion-rag DID get HTTP 200 from Anthropic with token counts logged. The response is plain prose. prompt-rag's parser expects JSONL, gets nothing parseable, returns empty, document-rag times out at 30s.

Root cause

The kg-synthesis / prompt-rag pipeline assumes JSONL output. OpenAI models follow that instruction; Claude (and likely Gemini, Mistral) return prose by default. There's no provider-specific output coercion.

Suggested fix paths

  1. Per-provider prompt overlays — register the kg-synthesis template with an Anthropic-friendly variant that uses pre-fill (messages=[..., {"role":"assistant","content":"{"}]) to force JSON.
  2. Output coercion in prompt-rag worker — if the response doesn't parse as JSONL, attempt to extract the JSON block from a fenced code block (e.g. json ... ), then re-parse.
  3. Document this limitation in the provider docs so operators know Anthropic/Gemini are not drop-in replacements for OpenAI in RAG flows.

I can prepare a PR for (2) if helpful.

Workaround we're using

We bypass the bus on the read path entirely (direct Bolt to Memgraph + direct provider call from the app tier). Works fine; loses TG-mediated tracing.

Metadata

Metadata

Assignees

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions