Skip to content

fix(gemini): avoid duplicate model route for full api_base#26982

Closed
StatPan wants to merge 116 commits into
BerriAI:litellm_oss_stagingfrom
StatPan:fix/gemini-full-api-base-duplicate-path
Closed

fix(gemini): avoid duplicate model route for full api_base#26982
StatPan wants to merge 116 commits into
BerriAI:litellm_oss_stagingfrom
StatPan:fix/gemini-full-api-base-duplicate-path

Conversation

@StatPan
Copy link
Copy Markdown

@StatPan StatPan commented May 1, 2026

Summary

  • avoid appending a duplicate /models/{model}:{endpoint} route when Gemini api_base already includes the full route
  • handle api_base values that already end with /models/{model} by appending only :{endpoint}
  • add regression coverage for full-route and model-prefix custom api_base cases

Context

This retargets the same fix through the OSS staging path after the direct-to-main PR hit the repository's main-branch source guard.

Prior PR / context:

The bug is still affecting downstream integrations in OpenHands:

Tracking issue:

Test Plan

  • uv run pytest tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py -k 'custom_proxy or full_route or api_base or credential_project_validation'
  • uv run ruff check litellm/llms/vertex_ai/vertex_llm_base.py tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py

Sameerlite and others added 30 commits April 2, 2026 09:56
- Add pop_vertex_request_labels / vertex_request_labels_from_litellm_params in common_utils
- Vertex embeddings: pass litellm_params, set predict body labels; Gemini uses shared helper
- Imagen: top-level labels from metadata; rerank: userLabels for Discovery Engine Rank API
- Thread litellm_params through rerank handler and all BaseRerankConfig implementations

Made-with: Cursor
…on-streaming

Non-streaming path required len(tool_calls)==1 to unwrap json_tool_call, so mixed user tools leaked the internal tool. Align with Bedrock converse handling: strip internal tools, merge structured JSON into content.

Made-with: Cursor
…ags.

Add tests to validate the condition

 improve the conditional readability by naming the plain-tag check explicitly.
…sset roots

The unauthenticated ``/get_image`` and ``/get_favicon`` endpoints accept
the admin-set env vars ``UI_LOGO_PATH`` and ``LITELLM_FAVICON_URL`` and
return whatever bytes they resolve to, with a hard-coded ``image/jpeg``
or ``image/x-icon`` content-type. Two attack shapes:

* ``UI_LOGO_PATH=/etc/passwd`` (or any other readable file path) — any
  unauthenticated caller exfiltrates the file via ``GET /get_image``.
  The previous gate was ``os.path.exists(logo_path)`` which fires on
  every readable file. Same shape for the favicon endpoint.
* ``UI_LOGO_PATH=http://169.254.169.254/iam`` (or any internal HTTP
  service the admin pointed at) — the proxy fetches it server-side
  and streams the response body to the unauthenticated caller. No
  URL validation, no Content-Type validation; ``application/json``
  AWS metadata gets tunneled out under the ``image/jpeg`` wrapper.

New helper module ``litellm/proxy/common_utils/static_asset_utils.py``:

* ``resolve_local_asset_path(candidate, allowed_roots)`` — returns the
  resolved absolute path only if it lives within one of the allowed
  asset roots. Uses ``realpath`` so symlinks pointing outside the roots
  are caught.
* ``fetch_validated_image_bytes(url)`` — runs the URL through
  ``validate_url`` (rejecting private / cloud-metadata / loopback
  targets) and only returns the response body if the upstream
  Content-Type is in a small allowlist of image MIME types.

Both ``/get_image`` and ``/get_favicon`` are wired through the helpers.
The SSRF gate is enforced unconditionally — these endpoints are
unauthenticated, so the admin-facing ``litellm.user_url_validation``
toggle does not apply (an admin who opted out of URL validation for
LLM provider paths shouldn't also expose ``/get_image`` to SSRF).

Tests:

- ``TestResolveLocalAssetPath``: 10 cases covering legitimate paths,
  ``/etc/passwd``, ``/proc/self/environ``, symlink-out, ``..``
  traversal, directories, missing files, and root list edge cases.
- ``TestFetchValidatedImageBytes``: 7 cases covering SSRF block, non-
  image content-type rejection, valid image passthrough, non-200
  response, fetch exception, empty URL, and parametrized coverage of
  every allowed image MIME type.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The unauthenticated ``/get_logo_url`` endpoint returned the
``UI_LOGO_PATH`` env var verbatim. For HTTP(S) URLs this is intended —
the dashboard loads the logo directly from a public/internal CDN. For
local filesystem paths it was an information disclosure: any caller
could fetch ``/get_logo_url`` and read admin-only filesystem details
like ``UI_LOGO_PATH=/etc/litellm/secret-config.json``.

Now the endpoint returns the URL only when it begins with
``http://`` or ``https://``. For local paths (or unset) it returns an
empty string — the dashboard falls back to ``/get_image`` which
serves the file via the path-containment guard added in the previous
commit.

Tests parametrize the disclosure-blocked cases (``/etc/...``,
``/proc/self/environ``, relative paths) and confirm HTTP / HTTPS URLs
still pass through unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Align ``/get_favicon``'s allowed-root list with ``/get_image``'s. Both
endpoints now accept paths under any of:

* ``LITELLM_ASSETS_PATH`` (or its default — ``/var/lib/litellm/assets``
  for non-root, the package dir otherwise)
* the package's bundled-asset dir (``proxy/_experimental/out`` for the
  default favicon, ``proxy/`` for the default logo)
* the proxy package dir (``current_dir``) as a final fallback

Without this, an admin who put a custom favicon under
``LITELLM_ASSETS_PATH`` (e.g. mounted into the container at
``/var/lib/litellm/assets/favicon.ico``) would have the favicon
endpoint silently fall back to the default after the previous commit's
path-containment guard. The logo endpoint already accepted this root.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pdate legacy tests

Three CI failures from the previous push, all addressed:

* ``lint`` (mypy): ``async_client.get(url, **request_kwargs)`` confused
  mypy because ``AsyncHTTPHandler.get``'s second positional arg is typed
  ``bool | None``. Switched to an explicit branch:
  ``await async_client.get(rewritten_url, headers={"host": host_header})``
  for the HTTP-rewritten case, plain ``get(rewritten_url)`` otherwise.

* ``proxy-infra`` /
  ``test_get_image_custom_local_logo_bypasses_cache``: the existing
  test set ``UI_LOGO_PATH=/app/custom_logo.jpg`` with no
  ``LITELLM_ASSETS_PATH``, asserting the path was served verbatim. That
  was the LFI behaviour the new path-containment guard closes. Updated
  the test to set ``LITELLM_ASSETS_PATH=/app`` so the path is inside an
  allowed root, and patched the helper's ``realpath`` / ``isfile`` to
  go along with the mocked filesystem. Test intent (bypass cache when
  ``UI_LOGO_PATH`` is local) is preserved.

* ``auth-and-jwt`` / ``test_get_image_cache_logic``: existing test
  built a ``Mock`` response without ``headers``, so the new
  Content-Type check tripped on ``Mock().split(";")[0]``. Two fixes:

    1. Set ``mock_response.headers = {"content-type": "image/jpeg"}``
       on the test (matches the real upstream contract — a logo CDN
       always sets a Content-Type).
    2. Make ``fetch_validated_image_bytes`` defensive: if the
       Content-Type header is missing or non-string, treat as non-image
       and fall back to default. Closes a subtle hole — pre-fix, an
       upstream that omits Content-Type entirely would have served
       arbitrary bytes under the ``image/jpeg`` wrapper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…on cache miss

Three review items addressed:

* **Veria (Medium): SSRF via redirect.** ``fetch_validated_image_bytes``
  was calling ``validate_url(url)`` once and then fetching with the
  default httpx client, so a 3xx to an internal IP would have been
  followed unvalidated. Switched to ``async_safe_get`` (the existing
  SSRF primitive used elsewhere in the codebase) which walks each
  redirect hop, re-validates, and rejects redirects to blocked
  networks. Default ``litellm.user_url_validation`` is True so
  protection is on out of the box.

* **Greptile (P2): SVG can embed JS.** Removed ``image/svg+xml`` from
  the allowed-Content-Type set. The hardcoded response media type
  (``image/jpeg`` / ``image/x-icon``) means a real SVG body wouldn't
  render as SVG anyway in modern browsers — the allowlist entry was
  giving up XSS surface for no actual SVG-rendering benefit. If real
  SVG support is wanted later, that's a deliberate feature PR with CSP
  / nosniff bundled.

* **Greptile (P2): cache-write OSError drops validated bytes.** When
  the upstream fetch succeeded but ``open(cache_path, "wb")`` raised
  (read-only assets dir), the bytes were discarded and the default
  logo was served — a silent regression for that deployment. Now
  serve the validated bytes inline via ``Response(...)`` as a fallback
  before falling back to default.

Tests:

- Replaced low-level mocks of ``validate_url`` with mocks of
  ``async_safe_get`` directly, exercising the helper's contract
  rather than the SSRF primitive's internals.
- New ``test_rejects_svg_content_type`` confirms SVG is blocked.
- ``test_get_image_cache_logic`` fixture now sets
  ``mock_response.is_redirect = False`` so ``async_safe_get`` doesn't
  treat the Mock's truthy attribute as a redirect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eaner test fixture

Two cleanups from the /simplify review pass:

* ``Response`` was imported inside the ``except OSError`` branch in
  ``/get_image`` and at the top of ``/get_favicon``. Per the project's
  no-inline-imports rule (CLAUDE.md), hoisted to the existing
  ``from fastapi.responses import (...)`` block at the top of
  ``proxy_server.py``.

* The test class's ``_patches()`` helper returned a 2-element list of
  patch context managers and tests indexed into them via
  ``self._patches(...)[0], self._patches()[1]`` — two distinct calls
  with confusing aliasing semantics. Restructured to:
    - module-level ``_patch_async_safe_get(...)`` that returns a single
      patch context manager
    - autouse fixture that patches ``get_async_httpx_client`` for every
      test in the file (it's the same patch in every case)
    - small ``_image_response(...)`` factory to deduplicate Mock setup

  Tests now read as ``with _patch_async_safe_get(return_value=...):``
  with no list-indexing or duplicate Mock construction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…async_safe_get

Variant analysis on the unauthenticated /get_image SSRF surfaced one
related sink in an admin-only endpoint:
``test_hashicorp_vault_connection`` in
``config_override_endpoints.py:402`` calls
``async_client.get(f"{vault_addr}/v1/auth/token/lookup-self")`` with
no SSRF guard. ``vault_addr`` is admin-set, so the threat model is
"admin misconfig (or attacker with admin creds) pivots Vault calls
to cloud metadata or another internal IP."

Same fix shape as the unauthenticated endpoints: wrap in
``async_safe_get`` so each redirect hop is re-validated and private
networks are rejected. Admins running against a legitimate internal
Vault should add the host to ``litellm.user_url_allowed_hosts`` —
the existing escape hatch already used elsewhere in the codebase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
``Response`` is already imported from the top-level ``fastapi``
package via the multi-line ``from fastapi import (...)`` block at the
top of the file (along with ``Depends``, ``HTTPException``, etc.) —
``fastapi.Response`` is the same class that ``fastapi.responses``
re-exports. The earlier ``from fastapi.responses import Response``
addition triggered ruff F811 for redefinition.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… option

Pass-through endpoints configured in
``general_settings.pass_through_endpoints`` defaulted to ``auth: false``
and the safe ``auth: true`` setting was rejected at startup unless the
operator had a LiteLLM Enterprise license. Net result: OSS deployments
had **no safe configuration** — every pass-through admins added without
remembering ``auth: true`` shipped an unauthenticated forwarder, and
remembering ``auth: true`` raised a hard "enterprise-only" error.

Three changes:

* ``litellm/proxy/_types.py`` — flip
  ``PassThroughGenericEndpoint.auth`` default to ``True``. Operators
  who add a pass-through with no explicit ``auth`` value now get a
  safe, authenticated forwarder by default. Setting ``auth: false``
  remains supported for genuine public-forwarder use cases (e.g.
  webhook receivers).

* ``litellm/proxy/pass_through_endpoints/pass_through_endpoints.py``
  — drop the ``premium_user`` gate around ``auth: true``. An
  unauthenticated forwarder is a deployment choice operators should
  be allowed to make explicitly, but the safe option must always be
  free. The product-tier decision (which features sit behind the
  enterprise license) is separate from "OSS users must always have a
  safe option."

* ``litellm/proxy/auth/user_api_key_auth.py`` — the runtime dispatch
  pulls pass-through endpoints from ``general_settings`` as raw
  dicts, so the Pydantic default doesn't apply. Switched
  ``endpoint.get("auth")`` to ``endpoint.get("auth", True)`` so a
  config dict without an explicit ``auth`` key still requires
  authentication at request time.

Tests:

- ``test_passthrough_auth_defaults_to_true`` — Pydantic default is
  now safe.
- ``test_passthrough_auth_can_still_be_explicitly_disabled``
  — opt-in to ``auth=False`` still works for legitimate
  public-forwarder use cases.
- ``test_register_passthrough_with_auth_true_works_for_oss``
  — ``premium_user=False`` no longer rejects ``auth=true``.
- ``test_runtime_check_treats_missing_auth_key_as_authenticated``
  — raw dict without an ``auth`` key now requires auth (the
  previously-unauthenticated forwarder).
- ``test_runtime_check_explicit_auth_false_still_skips_validation``
  — explicit opt-in still works.

Closes GHSA-7h34-mmrh-6g58.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit removed the only top-level use of
``CommonProxyErrors`` (the enterprise-gate ``raise ValueError``).
Ruff F401 flagged the import as unused; the function-local import at
line 2601 in a separate handler is the only remaining caller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
yuneng-berri and others added 22 commits April 30, 2026 17:10
chore(proxy): contain UI_LOGO_PATH / LITELLM_FAVICON_URL on unauthenticated asset endpoints
…cbb6cf

[Test] Proxy E2E: Opt In To Client Mock Response For Model Access Tests
The async/sync delete_response_api_handler always passed json=data into
httpx.delete, where data is {} from the transformer. httpx serializes that
to a 2-byte body. The Azure Responses DELETE endpoint now rejects any
request body with code: unexpected_body, breaking
test_basic_openai_responses_delete_endpoint on the llm_responses_api_testing
job. Build the kwargs dict and only set json= when data is truthy.

Add unit tests that patch httpx.delete and assert json/data are not in the
captured kwargs for the Azure DELETE path (sync and async).
…hawking-19bdeb

[Fix] Responses API: Omit Empty Body On DELETE
…tHooks

Run pre_call_hook on Google generateContent endpoints
…ntMultiPod

[Fix] Refresh Redis TTL on counter writes, skip stale in-memory in Redis
…agination

Add pagination controls to model health status
…metadata_labels

feat(vertex_ai): propagate metadata labels to embedding, Imagen, rerank
…mode-nonstreaming-mixed-tools

fix(anthropic): json response_format + user tools non-streaming
…_based_routing

add test(tag-routing): prevent header regex bypass for strict plain t…
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 1, 2026

Too many files changed for review. (113 files found, 100 file limit)

@StatPan
Copy link
Copy Markdown
Author

StatPan commented May 1, 2026

Opened a fully clean replacement PR here: #26983

Reason: #26982 reused the earlier head branch and inherited the wrong ancestry for litellm_oss_staging, which pulled in unrelated diff and triggered fork/dependency guard noise.

#26983 is rebuilt from origin/litellm_oss_staging and cherry-picks only the intended Gemini fix/test commits. Current diff is limited to:

  • litellm/llms/vertex_ai/vertex_llm_base.py
  • tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py

@StatPan
Copy link
Copy Markdown
Author

StatPan commented May 14, 2026

Closing as superseded by the clean actionable PR #26983. This branch inherited unrelated diff from the wrong ancestry; #26983 is rebuilt from and should be the review path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.