fix(gemini): avoid duplicate model route for full api_base#26982
Closed
StatPan wants to merge 116 commits into
Closed
fix(gemini): avoid duplicate model route for full api_base#26982StatPan wants to merge 116 commits into
StatPan wants to merge 116 commits into
Conversation
- Add pop_vertex_request_labels / vertex_request_labels_from_litellm_params in common_utils - Vertex embeddings: pass litellm_params, set predict body labels; Gemini uses shared helper - Imagen: top-level labels from metadata; rerank: userLabels for Discovery Engine Rank API - Thread litellm_params through rerank handler and all BaseRerankConfig implementations Made-with: Cursor
merge main
merge litellm_internal_staging
…on-streaming Non-streaming path required len(tool_calls)==1 to unwrap json_tool_call, so mixed user tools leaked the internal tool. Align with Bedrock converse handling: strip internal tools, merge structured JSON into content. Made-with: Cursor
…ags. Add tests to validate the condition improve the conditional readability by naming the plain-tag check explicitly.
…sset roots The unauthenticated ``/get_image`` and ``/get_favicon`` endpoints accept the admin-set env vars ``UI_LOGO_PATH`` and ``LITELLM_FAVICON_URL`` and return whatever bytes they resolve to, with a hard-coded ``image/jpeg`` or ``image/x-icon`` content-type. Two attack shapes: * ``UI_LOGO_PATH=/etc/passwd`` (or any other readable file path) — any unauthenticated caller exfiltrates the file via ``GET /get_image``. The previous gate was ``os.path.exists(logo_path)`` which fires on every readable file. Same shape for the favicon endpoint. * ``UI_LOGO_PATH=http://169.254.169.254/iam`` (or any internal HTTP service the admin pointed at) — the proxy fetches it server-side and streams the response body to the unauthenticated caller. No URL validation, no Content-Type validation; ``application/json`` AWS metadata gets tunneled out under the ``image/jpeg`` wrapper. New helper module ``litellm/proxy/common_utils/static_asset_utils.py``: * ``resolve_local_asset_path(candidate, allowed_roots)`` — returns the resolved absolute path only if it lives within one of the allowed asset roots. Uses ``realpath`` so symlinks pointing outside the roots are caught. * ``fetch_validated_image_bytes(url)`` — runs the URL through ``validate_url`` (rejecting private / cloud-metadata / loopback targets) and only returns the response body if the upstream Content-Type is in a small allowlist of image MIME types. Both ``/get_image`` and ``/get_favicon`` are wired through the helpers. The SSRF gate is enforced unconditionally — these endpoints are unauthenticated, so the admin-facing ``litellm.user_url_validation`` toggle does not apply (an admin who opted out of URL validation for LLM provider paths shouldn't also expose ``/get_image`` to SSRF). Tests: - ``TestResolveLocalAssetPath``: 10 cases covering legitimate paths, ``/etc/passwd``, ``/proc/self/environ``, symlink-out, ``..`` traversal, directories, missing files, and root list edge cases. - ``TestFetchValidatedImageBytes``: 7 cases covering SSRF block, non- image content-type rejection, valid image passthrough, non-200 response, fetch exception, empty URL, and parametrized coverage of every allowed image MIME type. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The unauthenticated ``/get_logo_url`` endpoint returned the ``UI_LOGO_PATH`` env var verbatim. For HTTP(S) URLs this is intended — the dashboard loads the logo directly from a public/internal CDN. For local filesystem paths it was an information disclosure: any caller could fetch ``/get_logo_url`` and read admin-only filesystem details like ``UI_LOGO_PATH=/etc/litellm/secret-config.json``. Now the endpoint returns the URL only when it begins with ``http://`` or ``https://``. For local paths (or unset) it returns an empty string — the dashboard falls back to ``/get_image`` which serves the file via the path-containment guard added in the previous commit. Tests parametrize the disclosure-blocked cases (``/etc/...``, ``/proc/self/environ``, relative paths) and confirm HTTP / HTTPS URLs still pass through unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Align ``/get_favicon``'s allowed-root list with ``/get_image``'s. Both endpoints now accept paths under any of: * ``LITELLM_ASSETS_PATH`` (or its default — ``/var/lib/litellm/assets`` for non-root, the package dir otherwise) * the package's bundled-asset dir (``proxy/_experimental/out`` for the default favicon, ``proxy/`` for the default logo) * the proxy package dir (``current_dir``) as a final fallback Without this, an admin who put a custom favicon under ``LITELLM_ASSETS_PATH`` (e.g. mounted into the container at ``/var/lib/litellm/assets/favicon.ico``) would have the favicon endpoint silently fall back to the default after the previous commit's path-containment guard. The logo endpoint already accepted this root. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pdate legacy tests
Three CI failures from the previous push, all addressed:
* ``lint`` (mypy): ``async_client.get(url, **request_kwargs)`` confused
mypy because ``AsyncHTTPHandler.get``'s second positional arg is typed
``bool | None``. Switched to an explicit branch:
``await async_client.get(rewritten_url, headers={"host": host_header})``
for the HTTP-rewritten case, plain ``get(rewritten_url)`` otherwise.
* ``proxy-infra`` /
``test_get_image_custom_local_logo_bypasses_cache``: the existing
test set ``UI_LOGO_PATH=/app/custom_logo.jpg`` with no
``LITELLM_ASSETS_PATH``, asserting the path was served verbatim. That
was the LFI behaviour the new path-containment guard closes. Updated
the test to set ``LITELLM_ASSETS_PATH=/app`` so the path is inside an
allowed root, and patched the helper's ``realpath`` / ``isfile`` to
go along with the mocked filesystem. Test intent (bypass cache when
``UI_LOGO_PATH`` is local) is preserved.
* ``auth-and-jwt`` / ``test_get_image_cache_logic``: existing test
built a ``Mock`` response without ``headers``, so the new
Content-Type check tripped on ``Mock().split(";")[0]``. Two fixes:
1. Set ``mock_response.headers = {"content-type": "image/jpeg"}``
on the test (matches the real upstream contract — a logo CDN
always sets a Content-Type).
2. Make ``fetch_validated_image_bytes`` defensive: if the
Content-Type header is missing or non-string, treat as non-image
and fall back to default. Closes a subtle hole — pre-fix, an
upstream that omits Content-Type entirely would have served
arbitrary bytes under the ``image/jpeg`` wrapper.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…on cache miss Three review items addressed: * **Veria (Medium): SSRF via redirect.** ``fetch_validated_image_bytes`` was calling ``validate_url(url)`` once and then fetching with the default httpx client, so a 3xx to an internal IP would have been followed unvalidated. Switched to ``async_safe_get`` (the existing SSRF primitive used elsewhere in the codebase) which walks each redirect hop, re-validates, and rejects redirects to blocked networks. Default ``litellm.user_url_validation`` is True so protection is on out of the box. * **Greptile (P2): SVG can embed JS.** Removed ``image/svg+xml`` from the allowed-Content-Type set. The hardcoded response media type (``image/jpeg`` / ``image/x-icon``) means a real SVG body wouldn't render as SVG anyway in modern browsers — the allowlist entry was giving up XSS surface for no actual SVG-rendering benefit. If real SVG support is wanted later, that's a deliberate feature PR with CSP / nosniff bundled. * **Greptile (P2): cache-write OSError drops validated bytes.** When the upstream fetch succeeded but ``open(cache_path, "wb")`` raised (read-only assets dir), the bytes were discarded and the default logo was served — a silent regression for that deployment. Now serve the validated bytes inline via ``Response(...)`` as a fallback before falling back to default. Tests: - Replaced low-level mocks of ``validate_url`` with mocks of ``async_safe_get`` directly, exercising the helper's contract rather than the SSRF primitive's internals. - New ``test_rejects_svg_content_type`` confirms SVG is blocked. - ``test_get_image_cache_logic`` fixture now sets ``mock_response.is_redirect = False`` so ``async_safe_get`` doesn't treat the Mock's truthy attribute as a redirect. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eaner test fixture
Two cleanups from the /simplify review pass:
* ``Response`` was imported inside the ``except OSError`` branch in
``/get_image`` and at the top of ``/get_favicon``. Per the project's
no-inline-imports rule (CLAUDE.md), hoisted to the existing
``from fastapi.responses import (...)`` block at the top of
``proxy_server.py``.
* The test class's ``_patches()`` helper returned a 2-element list of
patch context managers and tests indexed into them via
``self._patches(...)[0], self._patches()[1]`` — two distinct calls
with confusing aliasing semantics. Restructured to:
- module-level ``_patch_async_safe_get(...)`` that returns a single
patch context manager
- autouse fixture that patches ``get_async_httpx_client`` for every
test in the file (it's the same patch in every case)
- small ``_image_response(...)`` factory to deduplicate Mock setup
Tests now read as ``with _patch_async_safe_get(return_value=...):``
with no list-indexing or duplicate Mock construction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…async_safe_get
Variant analysis on the unauthenticated /get_image SSRF surfaced one
related sink in an admin-only endpoint:
``test_hashicorp_vault_connection`` in
``config_override_endpoints.py:402`` calls
``async_client.get(f"{vault_addr}/v1/auth/token/lookup-self")`` with
no SSRF guard. ``vault_addr`` is admin-set, so the threat model is
"admin misconfig (or attacker with admin creds) pivots Vault calls
to cloud metadata or another internal IP."
Same fix shape as the unauthenticated endpoints: wrap in
``async_safe_get`` so each redirect hop is re-validated and private
networks are rejected. Admins running against a legitimate internal
Vault should add the host to ``litellm.user_url_allowed_hosts`` —
the existing escape hatch already used elsewhere in the codebase.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Made-with: Cursor
``Response`` is already imported from the top-level ``fastapi`` package via the multi-line ``from fastapi import (...)`` block at the top of the file (along with ``Depends``, ``HTTPException``, etc.) — ``fastapi.Response`` is the same class that ``fastapi.responses`` re-exports. The earlier ``from fastapi.responses import Response`` addition triggered ruff F811 for redefinition. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… option
Pass-through endpoints configured in
``general_settings.pass_through_endpoints`` defaulted to ``auth: false``
and the safe ``auth: true`` setting was rejected at startup unless the
operator had a LiteLLM Enterprise license. Net result: OSS deployments
had **no safe configuration** — every pass-through admins added without
remembering ``auth: true`` shipped an unauthenticated forwarder, and
remembering ``auth: true`` raised a hard "enterprise-only" error.
Three changes:
* ``litellm/proxy/_types.py`` — flip
``PassThroughGenericEndpoint.auth`` default to ``True``. Operators
who add a pass-through with no explicit ``auth`` value now get a
safe, authenticated forwarder by default. Setting ``auth: false``
remains supported for genuine public-forwarder use cases (e.g.
webhook receivers).
* ``litellm/proxy/pass_through_endpoints/pass_through_endpoints.py``
— drop the ``premium_user`` gate around ``auth: true``. An
unauthenticated forwarder is a deployment choice operators should
be allowed to make explicitly, but the safe option must always be
free. The product-tier decision (which features sit behind the
enterprise license) is separate from "OSS users must always have a
safe option."
* ``litellm/proxy/auth/user_api_key_auth.py`` — the runtime dispatch
pulls pass-through endpoints from ``general_settings`` as raw
dicts, so the Pydantic default doesn't apply. Switched
``endpoint.get("auth")`` to ``endpoint.get("auth", True)`` so a
config dict without an explicit ``auth`` key still requires
authentication at request time.
Tests:
- ``test_passthrough_auth_defaults_to_true`` — Pydantic default is
now safe.
- ``test_passthrough_auth_can_still_be_explicitly_disabled``
— opt-in to ``auth=False`` still works for legitimate
public-forwarder use cases.
- ``test_register_passthrough_with_auth_true_works_for_oss``
— ``premium_user=False`` no longer rejects ``auth=true``.
- ``test_runtime_check_treats_missing_auth_key_as_authenticated``
— raw dict without an ``auth`` key now requires auth (the
previously-unauthenticated forwarder).
- ``test_runtime_check_explicit_auth_false_still_skips_validation``
— explicit opt-in still works.
Closes GHSA-7h34-mmrh-6g58.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit removed the only top-level use of ``CommonProxyErrors`` (the enterprise-gate ``raise ValueError``). Ruff F401 flagged the import as unused; the function-local import at line 2601 in a separate handler is the only remaining caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
chore(proxy): contain UI_LOGO_PATH / LITELLM_FAVICON_URL on unauthenticated asset endpoints
chore(cli): tighten CLI SSO session flow
…cbb6cf [Test] Proxy E2E: Opt In To Client Mock Response For Model Access Tests
The async/sync delete_response_api_handler always passed json=data into
httpx.delete, where data is {} from the transformer. httpx serializes that
to a 2-byte body. The Azure Responses DELETE endpoint now rejects any
request body with code: unexpected_body, breaking
test_basic_openai_responses_delete_endpoint on the llm_responses_api_testing
job. Build the kwargs dict and only set json= when data is truthy.
Add unit tests that patch httpx.delete and assert json/data are not in the
captured kwargs for the Azure DELETE path (sync and async).
…hawking-19bdeb [Fix] Responses API: Omit Empty Body On DELETE
…tHooks Run pre_call_hook on Google generateContent endpoints
…ntMultiPod [Fix] Refresh Redis TTL on counter writes, skip stale in-memory in Redis
…agination Add pagination controls to model health status
…/litellm into litellm_auth_bypass_tag_based_routing
…metadata_labels feat(vertex_ai): propagate metadata labels to embedding, Imagen, rerank
…mode-nonstreaming-mixed-tools fix(anthropic): json response_format + user tools non-streaming
[Infra] Bump Versions
…_based_routing add test(tag-routing): prevent header regex bypass for strict plain t…
Contributor
|
Too many files changed for review. ( |
2 tasks
Author
|
Opened a fully clean replacement PR here: #26983 Reason: #26982 reused the earlier head branch and inherited the wrong ancestry for #26983 is rebuilt from
|
Author
This was referenced May 14, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
/models/{model}:{endpoint}route when Geminiapi_basealready includes the full routeapi_basevalues that already end with/models/{model}by appending only:{endpoint}api_basecasesContext
This retargets the same fix through the OSS staging path after the direct-to-
mainPR hit the repository's main-branch source guard.Prior PR / context:
The bug is still affecting downstream integrations in OpenHands:
Tracking issue:
Test Plan
uv run pytest tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py -k 'custom_proxy or full_route or api_base or credential_project_validation'uv run ruff check litellm/llms/vertex_ai/vertex_llm_base.py tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py