fix: honour vision=false by stripping image_url blocks before API call #1717

Open
Veyal wants to merge 1 commit into agent0ai:main from Veyal:fix/vision-flag-strips-images-from-messages

@Veyal Veyal commented Jun 22, 2026

Problem

When a user sets vision: false on their chat model, Agent Zero still forwards image_url content blocks to the LLM provider. Providers whose models do not accept image input (for example, most text-only models on OpenRouter) then reject the request with:

{"error":{"message":"No endpoints found that support image input","code":404}}

The vision field on ModelConfig was defined but never consulted when building the message payload in LiteLLMChatWrapper._convert_messages. All message content, including images, was passed through images.prepare_content() unconditionally.

Fix

helpers/images.py — adds strip_images(content) which removes image_url blocks from a content list:

  • Image-only content (no text) → returns "" instead of [] to avoid empty-content API errors
  • Single remaining text block → collapsed from [{"type":"text","text":"..."}] back to a plain string
  • Multiple remaining text blocks → returned as-is
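
The three cases above can be sketched as follows. This is a minimal illustration written from the description only, not the code in the PR: the type alias, the isinstance guards, and the assumption that blocks are dicts keyed by "type" (the OpenAI-style content format) are all assumptions.

```python
from typing import Any, Dict, List, Union

# Assumed shape: either a plain string or a list of OpenAI-style content blocks.
Content = Union[str, List[Dict[str, Any]]]


def strip_images(content: Content) -> Content:
    """Remove image_url blocks from a content list (sketch, not the PR's code)."""
    # Plain strings carry no image blocks, so they pass through untouched.
    if isinstance(content, str):
        return content

    # Keep every block that is not an image_url block.
    remaining = [
        block
        for block in content
        if not (isinstance(block, dict) and block.get("type") == "image_url")
    ]

    # Image-only content: return "" rather than [] to avoid empty-content errors.
    if not remaining:
        return ""

    # A single leftover text block collapses back to a plain string.
    only = remaining[0]
    if len(remaining) == 1 and isinstance(only, dict) and only.get("type") == "text":
        return only.get("text", "")

    # Several remaining blocks are returned as a list, images removed.
    return remaining
```

Returning "" for image-only content keeps the message in the conversation, so role alternation is preserved even though the attachment itself is dropped.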

models.py: _convert_messages now reads self.a0_model_conf.vision and calls strip_images() when vision is disabled.
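
A rough sketch of how that gate might look is shown here. Everything in it is hypothetical apart from the vision field: the real _convert_messages is a method on LiteLLMChatWrapper whose signature is not shown in this PR, so a free function, a stand-in ModelConfig, and a reduced stand-in for strip_images() are used to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ModelConfig:
    """Hypothetical stand-in; only the vision field comes from the PR."""
    vision: bool = True


def _strip_images_stub(content: Any) -> Any:
    """Reduced stand-in for strip_images(): drops image_url blocks and
    returns "" when nothing is left. It does not collapse single text blocks."""
    if isinstance(content, str):
        return content
    kept = [block for block in content if block.get("type") != "image_url"]
    return kept or ""


def convert_messages(
    messages: List[Dict[str, Any]], conf: ModelConfig
) -> List[Dict[str, Any]]:
    """Hypothetical free-function version of the vision gate."""
    converted = []
    for message in messages:
        content = message.get("content", "")
        # Consult the flag only here, so the vision-enabled path is unchanged.
        if not conf.vision:
            content = _strip_images_stub(content)
        converted.append({**message, "content": content})
    return converted
```

Filtering at payload-build time, rather than when the attachment is stored, means images already in the history are also dropped, and re-enabling vision later restores them.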

Tests

7 new unit tests added to tests/test_vision_load_image_refs.py covering:

  • String passthrough (no-op)
  • Image-only list → ""
  • Multi-image-only list → ""
  • Mixed text+image → collapsed to plain string
  • Multiple text blocks with image → list with images removed
  • Text-only list → collapsed to plain string

All 7 tests pass.

Reproduction

  1. Set chat model to any text-only model (e.g. deepseek/deepseek-r2) on OpenRouter
  2. Disable vision in settings
  3. Send a message. If any prior message in context contains an image attachment, the provider returns the error No endpoints found that support image input

The vision flag on ModelConfig was defined but never consulted when
building the message payload sent to LiteLLM. Any conversation with
image attachments forwarded image_url content to the model even when
vision was disabled, causing OpenRouter (and other providers) to return:
  {"error":{"message":"No endpoints found that support image input"}}

- Add strip_images() to helpers/images.py that removes image_url blocks
  from a content list and collapses a single leftover text block back to
  a plain string.
- In LiteLLMChatWrapper._convert_messages, read vision from
  a0_model_conf and call strip_images() when vision is False.