Summary
The inference layer attempts a Responses API fallback against Ollama, but Ollama has no /v1/responses endpoint. Every attempt returns 404 and fires a Sentry error event: 1,598 events so far and escalating.
Problem
What happened: When a chat completions request to Ollama fails, the code falls back to the Responses API (/v1/responses). Ollama does not expose this endpoint, so every fallback returns "404 page not found", and each 404 fires a Sentry error.
Expected: Either:
- Don't attempt Responses API fallback for ollama (it will never work)
- Or suppress the 404 from Sentry since it's an expected fallback failure
Impact: 1,598 events in Sentry (TAURI-RUST-59Y), escalating. Tags: provider=ollama, operation=responses_api, model=gemma3:1b-it-qat.
Version / Platform: openhuman@0.56.0, macOS 26.5.0
Solution (optional)
Check the compatible provider's fallback logic in src/openhuman/inference/provider/compatible.rs:
- Is there a flag to disable Responses API for specific providers?
- Can ollama be marked as "no Responses API support"?
- Or add "404 page not found" from the Responses API fallback to the config-rejection/noise suppression list
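A rough sketch of how the two options above could look, not a description of what compatible.rs currently does. Every name here (ProviderCapabilities, capabilities_for, should_try_responses_fallback, is_expected_fallback_noise) is hypothetical, and defaulting unknown providers to "supported" is an assumption chosen to keep existing behavior for everything except Ollama.

```rust
/// Hypothetical per-provider capability record. The type and field names
/// are illustrative and are not taken from compatible.rs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ProviderCapabilities {
    pub supports_responses_api: bool,
}

/// Assumed lookup keyed on the provider tag seen in Sentry (provider=ollama).
/// Unknown providers default to "supported" so their behavior is unchanged.
pub fn capabilities_for(provider: &str) -> ProviderCapabilities {
    match provider.to_ascii_lowercase().as_str() {
        "ollama" => ProviderCapabilities { supports_responses_api: false },
        _ => ProviderCapabilities { supports_responses_api: true },
    }
}

/// Option 1: skip the Responses API fallback for providers that lack it.
pub fn should_try_responses_fallback(provider: &str) -> bool {
    capabilities_for(provider).supports_responses_api
}

/// Option 2: treat a 404 from the Responses API fallback as expected noise
/// so the caller can skip reporting it to Sentry.
pub fn is_expected_fallback_noise(operation: &str, status: u16, body: &str) -> bool {
    operation == "responses_api"
        && status == 404
        && body.to_ascii_lowercase().contains("404 page not found")
}
```

With something like this, the fallback path would check should_try_responses_fallback("ollama") before issuing the request, and the error-reporting path would check is_expected_fallback_noise(...) before capturing the event. Option 1 also avoids the wasted HTTP round trip, so it is probably the better first step.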
Acceptance criteria
Related