feat: support URL-fetch uploads on the hosted transport#49
Conversation
On the hosted HTTP transport the server runs remotely with no access to
the client's filesystem, so local path uploads always fail and the only
fallback (inline base64) requires the model to emit the file's bytes —
infeasible for real images.
Add a server-side URL-fetch input form ({"url": "https://..."}): the
server downloads the bytes via httpx and uploads them as InputFile bytes,
so the file no longer has to travel through the model. The fetch is
SSRF-guarded (http/https only; rejects private/loopback/link-local/
reserved IPs incl. the cloud metadata endpoint; re-validates after
redirects; 25 MB cap, 30s timeout, 5 redirects).
Also:
- cap inline base64 at 256 KB with an error pointing to `url`
- gate local-path uploads to stdio; on hosted, reject with guidance to
use `url` or inline base64 (no stdio/self-host wording)
- add an upload sentence to the http instructions
- add a `url` property to the InputFile JSON schema
Upload behavior is selected via a module-level flag set once in
build_mcp_server, so no re-plumbing of the execute call chain.
Greptile SummaryThis PR adds a server-side URL-fetch upload path for the hosted HTTP transport, where the server has no access to the client's filesystem. It introduces an SSRF guard (
Confidence Score: 3/5The URL-fetch path introduces a real SSRF exposure: a redirect from a public host to a private IP reaches the internal network before the re-validation guard fires. The core SSRF mitigation has a structural gap — follow_redirects=True means httpx resolves and connects to every hop before the post-redirect _validate_fetch_url call runs. An attacker who controls a public redirect target can route the hosted server to 169.254.169.254 or any private IP. This should be resolved before the feature goes live on the hosted transport. src/mcp_server_appwrite/server.py — specifically the _fetch_input_file function and its redirect-handling configuration
|
| Filename | Overview |
|---|---|
| src/mcp_server_appwrite/server.py | Adds URL-fetch upload path with SSRF guard, inline size cap, and transport-gated path rejection; the post-redirect re-validation fires after httpx has already connected to the final URL, leaving a redirect-to-private-IP bypass open. |
| src/mcp_server_appwrite/service.py | Schema-only change: adds url property to _input_file_schema and updates descriptions to reflect the new URL-fetch and local-path guidance; no logic changed. |
| tests/unit/test_server.py | Adds 9 new unit tests covering URL object/bare-string fetch, private-IP SSRF rejection, non-http scheme, stream size cap, inline size cap, path gating, and instruction content — but does not cover the redirect-to-private-IP bypass scenario. |
Reviews (1): Last reviewed commit: "feat: support URL-fetch and hardened upl..." | Re-trigger Greptile
| with httpx.Client( | ||
| timeout=_FETCH_TIMEOUT_SECONDS, | ||
| follow_redirects=True, | ||
| max_redirects=_FETCH_MAX_REDIRECTS, | ||
| limits=httpx.Limits(max_connections=1), | ||
| ) as client: | ||
| with client.stream("GET", url) as resp: | ||
| resp.raise_for_status() | ||
| # The final URL after redirects must also be public. | ||
| _validate_fetch_url(str(resp.url)) |
There was a problem hiding this comment.
Post-redirect SSRF validation fires too late
client.stream("GET", url) with follow_redirects=True means httpx follows every redirect internally and opens a TCP connection to the final destination before the context manager body runs. By the time _validate_fetch_url(str(resp.url)) is called on line 436, the HTTP GET request to the final URL (which may be http://169.254.169.254/latest/meta-data/ or any private IP) has already been sent. The check blocks reading the response body but does not prevent the network request from crossing the trust boundary.
A redirect chain like https://legit.example.com/file → http://10.0.0.1/internal would bypass the pre-request validation entirely. The simplest fix is to disable automatic redirect following and either error on any redirect or validate each Location header before following manually.
| declared = resp.headers.get("content-length") | ||
| if declared is not None and declared.isdigit(): |
There was a problem hiding this comment.
Content-Length skip when value contains a sign or spaces
declared.isdigit() returns False for values like " 1024", "+1024", or the rarely-seen negative sentinel -1, so those bypass the early-exit check. In practice a server lying about Content-Length would still be caught by the streaming cap on line 450, but the declared-size check silently skips any non-digit string and might miss a large file until the whole body has been partially buffered.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Problem
On the hosted HTTP transport (
mcp.appwrite.io/mcp), the MCP server runs in Appwrite Cloud with no access to the client's local filesystem. The two existing file-input forms both break there:InputFile.from_path()reads the server's filesystem, so it can never see a file on the user's machine — it always fails on hosted.{filename, content, encoding}→ requires the model to emit the file's bytes as base64 in the tool-call args. A 180 KB PNG is ~240 KB of base64 the model can't faithfully reproduce → the upload corrupts or the request drops mid-generation. Only sub-~1 KB files work.This is structural: MCP has no out-of-band binary channel for tool inputs, so bytes can only reach a remote server via the model's text or via a URL the server fetches.
Change
Add a server-side URL-fetch input form so the model can pass
{"url": "https://..."}(or a bare http(s) string); the server downloads the bytes and uploads them asInputFile.from_bytes. The SDK chunks the bytes, so size isn't model-bound._validate_fetch_url): http/https only; rejects hosts resolving to private / loopback / link-local / reserved / multicast / unspecified addresses (covers the cloud metadata IP169.254.169.254,10/8,127/8,::1, …); re-validates the final URL after redirects. 25 MB streamed cap (handles absent/lyingContent-Length), 30 s timeout, 5 redirects.urlfor larger files.build_mcp_serverlets hosted reject local paths with guidance (steering tourlor inline base64 — no stdio/self-host wording) and adds an upload sentence to the http instructions. stdio behavior is unchanged._input_file_schema()gains aurlproperty and clarified descriptions.No re-plumbing of the
execute_registered_tool → _prepare_arguments → _coerce_argumentchain (the_coerce_argumentsignature the tests assert on is untouched).Tests
9 new unit tests in
tests/unit/test_server.py(network mocked): URL object + bare-string, SSRF private-IP rejection, non-http scheme rejection, fetch + inline size caps, path rejection on http vs allowed on stdio, and http-instruction guidance._UPLOAD_TRANSPORTis reset per test for isolation.All four CI gates pass locally: ruff, black,
unittest discover -s tests/unit(76 tests), anddocker build.Known limitation / follow-up
A file that exists only on the user's local disk still has to be reachable by a public URL first — structural to a remote MCP server. DNS-rebinding (resolve-then-reconnect) is an accepted residual gap; IP-pinning via a custom httpx transport is a possible follow-up.