feat(fetch-url): support fetching images as multimodal content by bj456736 · Pull Request #1100 · MoonshotAI/kimi-code

bj456736 · 2026-06-25T10:18:44Z

Summary

This PR enables the FetchURL tool to download images and return them as multimodal content, allowing vision-capable models to analyze images fetched from URLs.

Changes

UrlFetchKind: Added image kind to distinguish image responses from text passthrough/extracted content
UrlFetchResult: Added optional image field with { mimeType, base64 } for image data
LocalFetchURLProvider:
- Detects image content types from HTTP responses
- Downloads binary data via response.arrayBuffer()
- Converts to base64 and returns kind: image
- Respects the same 10 MiB size limit
FetchURLTool:
- When image field is present, returns ContentPart[] with image_url type
- Falls back to existing text behavior for non-image content
Tool description: Updated fetch-url.md to mention image support
Tests: Added test for image content kind detection in local-fetch-url.test.ts
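
The provider-side steps listed above (detect an image content type, download the bytes, enforce the size limit, encode as base64) could look roughly like the sketch below. It is not the PR's actual code: the kind names other than `image`, the constant `MAX_BYTES`, and the helpers `isImageContentType`, `toImageResult`, and `fetchImage` are hypothetical, and the sketch assumes a Node.js runtime with a fetch-style `Response`.

```typescript
// Hypothetical shapes inferred from the PR description; the real types may differ.
type UrlFetchKind = 'passthrough' | 'extracted' | 'image';

interface UrlFetchResult {
  content: string;
  kind: UrlFetchKind;
  image?: { mimeType: string; base64: string };
}

// 10 MiB, the size limit mentioned in the PR description.
const MAX_BYTES = 10 * 1024 * 1024;

// True when the Content-Type header names an image/* media type.
function isImageContentType(contentType: string | null): contentType is string {
  return contentType !== null && contentType.trim().toLowerCase().startsWith('image/');
}

// Builds an image result from already-downloaded bytes; throws if they exceed the limit.
function toImageResult(contentType: string, bytes: Uint8Array): UrlFetchResult {
  if (bytes.byteLength > MAX_BYTES) {
    throw new Error(`Image is larger than ${MAX_BYTES} bytes`);
  }
  return {
    content: '',
    kind: 'image',
    // Buffer is Node.js-specific; other runtimes would need a different encoder.
    image: { mimeType: contentType, base64: Buffer.from(bytes).toString('base64') },
  };
}

// Returns null for non-image responses so the caller can continue with the text path.
async function fetchImage(response: Response): Promise<UrlFetchResult | null> {
  const contentType = response.headers.get('content-type');
  if (!isImageContentType(contentType)) return null;
  const bytes = new Uint8Array(await response.arrayBuffer());
  return toImageResult(contentType, bytes);
}
```

Keeping the size check and encoding in a synchronous helper makes that part testable without a network request.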

Why this approach

Leverages existing multimodal infrastructure (ContentPart[] and image_url type)
Minimal changes to the tool interface — backward compatible for text URLs
Consistent with how ReadMediaFile handles image input
No changes needed to MoonshotFetchURLProvider — it falls back to local fetcher on failure
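
As an illustration of the backward-compatible tool interface described above, the sketch below maps a fetch result to either a `ContentPart[]` with an `image_url` part or the existing text output. The `ContentPart` union, `FetchResultLike`, and `toToolOutput` are hypothetical stand-ins; only the `image_url` type and the data URL format come from the diff quoted in the review.

```typescript
// Hypothetical shapes for illustration; the project's real ContentPart may differ.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; imageUrl: { url: string } };

interface FetchResultLike {
  content: string;
  image?: { mimeType: string; base64: string };
}

// Returns an image part when image data is present, otherwise the plain text content.
function toToolOutput(result: FetchResultLike): string | ContentPart[] {
  if (result.image) {
    const { mimeType, base64 } = result.image;
    return [{ type: 'image_url', imageUrl: { url: `data:${mimeType};base64,${base64}` } }];
  }
  return result.content;
}
```

Because the `image` field is optional, callers that only ever fetch text URLs see the same string output as before.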

Testing

All 3172 existing tests pass + 1 new test for image detection
Verified image fetch returns correct base64 and mime type

- Extend UrlFetchKind with 'image' to distinguish image responses
- Add image field to UrlFetchResult for base64-encoded image data
- Update LocalFetchURLProvider to detect image/* content types and download binary data, converting to base64
- Update FetchURLTool to return ContentPart[] with image_url when fetching images, enabling multimodal models to 'see' the image
- Update tool description to mention image support
- Add test for image content kind detection

Fixes: Fetch tool supports pulling images (P1 Backlog)

changeset-bot · 2026-06-25T10:18:51Z

⚠️ No Changeset found

Latest commit: a97fad4

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ed6d2a1a4d


chatgpt-codex-connector · 2026-06-25T10:22:35Z

+            type: 'image_url',
+            imageUrl: { url: `data:${image.mimeType};base64,${image.base64}` },


Gate fetched images on image-capable models

When the active model lacks image_in, FetchURL is still registered whenever a URL fetcher exists (checked packages/agent-core/src/agent/tool/index.ts, where only ReadMediaFile is capability-gated). Fetching any image/* URL now returns an image_url part, which providers serialize as image input on the next request, so text-only aliases can fail after a successful fetch instead of receiving an actionable tool error. Pass model capabilities into this tool or degrade/error before emitting the image_url part.
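
One way to read the "degrade/error before emitting the image_url part" suggestion is sketched below. The `image_in` capability name comes from the review comment; representing capabilities as a set of strings, and the names `ToolOutcome` and `gateImageOutput`, are assumptions made only for this example.

```typescript
// Hypothetical types for illustration; the real tool interfaces may differ.
type ImagePart = { type: 'image_url'; imageUrl: { url: string } };

type ToolOutcome =
  | { isError: false; content: ImagePart[] }
  | { isError: true; message: string };

// Emits the image part only when the active model accepts image input;
// otherwise returns an actionable tool error instead of failing the next request.
function gateImageOutput(
  capabilities: ReadonlySet<string>,
  sourceUrl: string,
  part: ImagePart,
): ToolOutcome {
  if (!capabilities.has('image_in')) {
    return {
      isError: true,
      message: `Fetched an image from ${sourceUrl}, but the active model does not accept image input.`,
    };
  }
  return { isError: false, content: [part] };
}
```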


chatgpt-codex-connector · 2026-06-25T10:22:35Z

+      return {
+        content: '',
+        kind: 'image',
+        image: { mimeType: contentType, base64 },


Restrict fetched images to supported MIME types

This accepts every image/* content type and stores the full header as the MIME type. For URLs returning image/svg+xml, image/avif, or a header with parameters such as image/png; charset=binary, FetchURLTool emits a data: image URL that our Anthropic converter rejects because it only allows image/png, image/jpeg, image/gif, and image/webp (packages/kosong/src/providers/anthropic.ts:418-430). Restrict or sanitize the MIME type before returning kind: 'image', otherwise a successful fetch can make the following model request fail.
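
A minimal sketch of the sanitization suggested here: strip header parameters, lowercase the media type, and accept only the four types the review says the Anthropic converter allows. The names `SUPPORTED_IMAGE_MIME_TYPES` and `normalizeImageMimeType` are hypothetical, and returning `null` for unsupported types is one possible design, not necessarily what the PR should do.

```typescript
// The four types listed in the review comment as accepted by the Anthropic converter.
const SUPPORTED_IMAGE_MIME_TYPES: ReadonlySet<string> = new Set([
  'image/png',
  'image/jpeg',
  'image/gif',
  'image/webp',
]);

// Returns the bare, lowercased media type when supported, otherwise null.
function normalizeImageMimeType(contentType: string | null): string | null {
  if (contentType === null) return null;
  // Drop parameters such as "; charset=binary" before comparing.
  const [bare = ''] = contentType.split(';');
  const mimeType = bare.trim().toLowerCase();
  return SUPPORTED_IMAGE_MIME_TYPES.has(mimeType) ? mimeType : null;
}
```

A caller could treat a `null` result as "not a supported image" and either fall back to the text path or return a tool error.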


- Add PageMetadata to UrlFetchResult (url, mime, title) matching internal FetchResponse.page
- Remove XML wrapping around image_url content, return clean image_url part directly
- Add imageUrl.id from original URL for traceability
- Update LocalFetchURLProvider to include page metadata for all content kinds
- Update tests to match new return shape
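
The return shape after this second commit might resemble the sketch below. Only the field names `url`, `mime`, `title`, and `imageUrl.id` come from the commit message; which fields are optional, and the helper `buildImagePart`, are guesses made for illustration.

```typescript
// Hypothetical shape; the commit message names url, mime, and title only.
interface PageMetadata {
  url: string;
  mime: string;
  title?: string;
}

type ImageUrlPart = {
  type: 'image_url';
  imageUrl: { url: string; id?: string };
};

// Builds a bare image_url part (no XML wrapper) and records the source URL as its id.
function buildImagePart(page: PageMetadata, base64: string): ImageUrlPart {
  return {
    type: 'image_url',
    imageUrl: { url: `data:${page.mime};base64,${base64}`, id: page.url },
  };
}
```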

itxaiohanglover · 2026-06-26T03:51:34Z

Nice work adding multimodal image support to FetchURL — the size check before base64 encoding and keeping the new fields optional for backward compat are both good calls. One small thing: the contentType header value is used directly as the data URL mime type, but it could include parameters like image/png; charset=utf-8. A quick contentType.split(';')[0].trim() would keep the data URL clean, though it's unlikely to cause issues in practice.

bj456736 wants to merge 2 commits into MoonshotAI:main from bj456736:auto-pr-20260625-1800

bj456736 commented Jun 25, 2026

Uh oh!

changeset-bot Bot commented Jun 25, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 25, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 25, 2026

Uh oh!

itxaiohanglover commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		type: 'image_url',
		imageUrl: { url: `data:${image.mimeType};base64,${image.base64}` },

Uh oh!

Conversation

bj456736 commented Jun 25, 2026

Summary

Changes

Why this approach

Testing

Related

Uh oh!

changeset-bot Bot commented Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

itxaiohanglover commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

changeset-bot Bot commented Jun 25, 2026 •

edited

Loading