fix(gemini): avoid duplicate model route for full api_base by StatPan · Pull Request #26982 · BerriAI/litellm

StatPan · 2026-05-01T12:52:31Z

Summary

avoid appending a duplicate /models/{model}:{endpoint} route when Gemini api_base already includes the full route
handle api_base values that already end with /models/{model} by appending only :{endpoint}
add regression coverage for full-route and model-prefix custom api_base cases

Context

This retargets the same fix through the OSS staging path after the direct-to-main PR hit the repository's main-branch source guard.

Prior PR / context:

supersedes fix(gemini): avoid duplicate model route for full api_base #26980 for merge routing purposes
follow-up/replacement for fix(gemini): prevent duplicate model path generation when api_base includes full route #24786, which has been inactive since 2026-03-30

The bug is still affecting downstream integrations in OpenHands:

Tracking issue:

Closes [Bug]: Gemini custom api_base with full model route gets duplicate path appended #26979

Test Plan

uv run pytest tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py -k 'custom_proxy or full_route or api_base or credential_project_validation'
uv run ruff check litellm/llms/vertex_ai/vertex_llm_base.py tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py

- Add pop_vertex_request_labels / vertex_request_labels_from_litellm_params in common_utils - Vertex embeddings: pass litellm_params, set predict body labels; Gemini uses shared helper - Imagen: top-level labels from metadata; rerank: userLabels for Discovery Engine Rank API - Thread litellm_params through rerank handler and all BaseRerankConfig implementations Made-with: Cursor

merge main

merge litellm_internal_staging

…etadata_labels

…on-streaming Non-streaming path required len(tool_calls)==1 to unwrap json_tool_call, so mixed user tools leaked the internal tool. Align with Bedrock converse handling: strip internal tools, merge structured JSON into content. Made-with: Cursor

merge main

merge mian

…ags. Add tests to validate the condition improve the conditional readability by naming the plain-tag check explicitly.

…sset roots The unauthenticated ``/get_image`` and ``/get_favicon`` endpoints accept the admin-set env vars ``UI_LOGO_PATH`` and ``LITELLM_FAVICON_URL`` and return whatever bytes they resolve to, with a hard-coded ``image/jpeg`` or ``image/x-icon`` content-type. Two attack shapes: * ``UI_LOGO_PATH=/etc/passwd`` (or any other readable file path) — any unauthenticated caller exfiltrates the file via ``GET /get_image``. The previous gate was ``os.path.exists(logo_path)`` which fires on every readable file. Same shape for the favicon endpoint. * ``UI_LOGO_PATH=http://169.254.169.254/iam`` (or any internal HTTP service the admin pointed at) — the proxy fetches it server-side and streams the response body to the unauthenticated caller. No URL validation, no Content-Type validation; ``application/json`` AWS metadata gets tunneled out under the ``image/jpeg`` wrapper. New helper module ``litellm/proxy/common_utils/static_asset_utils.py``: * ``resolve_local_asset_path(candidate, allowed_roots)`` — returns the resolved absolute path only if it lives within one of the allowed asset roots. Uses ``realpath`` so symlinks pointing outside the roots are caught. * ``fetch_validated_image_bytes(url)`` — runs the URL through ``validate_url`` (rejecting private / cloud-metadata / loopback targets) and only returns the response body if the upstream Content-Type is in a small allowlist of image MIME types. Both ``/get_image`` and ``/get_favicon`` are wired through the helpers. The SSRF gate is enforced unconditionally — these endpoints are unauthenticated, so the admin-facing ``litellm.user_url_validation`` toggle does not apply (an admin who opted out of URL validation for LLM provider paths shouldn't also expose ``/get_image`` to SSRF). Tests: - ``TestResolveLocalAssetPath``: 10 cases covering legitimate paths, ``/etc/passwd``, ``/proc/self/environ``, symlink-out, ``..`` traversal, directories, missing files, and root list edge cases. - ``TestFetchValidatedImageBytes``: 7 cases covering SSRF block, non- image content-type rejection, valid image passthrough, non-200 response, fetch exception, empty URL, and parametrized coverage of every allowed image MIME type. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The unauthenticated ``/get_logo_url`` endpoint returned the ``UI_LOGO_PATH`` env var verbatim. For HTTP(S) URLs this is intended — the dashboard loads the logo directly from a public/internal CDN. For local filesystem paths it was an information disclosure: any caller could fetch ``/get_logo_url`` and read admin-only filesystem details like ``UI_LOGO_PATH=/etc/litellm/secret-config.json``. Now the endpoint returns the URL only when it begins with ``http://`` or ``https://``. For local paths (or unset) it returns an empty string — the dashboard falls back to ``/get_image`` which serves the file via the path-containment guard added in the previous commit. Tests parametrize the disclosure-blocked cases (``/etc/...``, ``/proc/self/environ``, relative paths) and confirm HTTP / HTTPS URLs still pass through unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Align ``/get_favicon``'s allowed-root list with ``/get_image``'s. Both endpoints now accept paths under any of: * ``LITELLM_ASSETS_PATH`` (or its default — ``/var/lib/litellm/assets`` for non-root, the package dir otherwise) * the package's bundled-asset dir (``proxy/_experimental/out`` for the default favicon, ``proxy/`` for the default logo) * the proxy package dir (``current_dir``) as a final fallback Without this, an admin who put a custom favicon under ``LITELLM_ASSETS_PATH`` (e.g. mounted into the container at ``/var/lib/litellm/assets/favicon.ico``) would have the favicon endpoint silently fall back to the default after the previous commit's path-containment guard. The logo endpoint already accepted this root. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…pdate legacy tests Three CI failures from the previous push, all addressed: * ``lint`` (mypy): ``async_client.get(url, **request_kwargs)`` confused mypy because ``AsyncHTTPHandler.get``'s second positional arg is typed ``bool | None``. Switched to an explicit branch: ``await async_client.get(rewritten_url, headers={"host": host_header})`` for the HTTP-rewritten case, plain ``get(rewritten_url)`` otherwise. * ``proxy-infra`` / ``test_get_image_custom_local_logo_bypasses_cache``: the existing test set ``UI_LOGO_PATH=/app/custom_logo.jpg`` with no ``LITELLM_ASSETS_PATH``, asserting the path was served verbatim. That was the LFI behaviour the new path-containment guard closes. Updated the test to set ``LITELLM_ASSETS_PATH=/app`` so the path is inside an allowed root, and patched the helper's ``realpath`` / ``isfile`` to go along with the mocked filesystem. Test intent (bypass cache when ``UI_LOGO_PATH`` is local) is preserved. * ``auth-and-jwt`` / ``test_get_image_cache_logic``: existing test built a ``Mock`` response without ``headers``, so the new Content-Type check tripped on ``Mock().split(";")[0]``. Two fixes: 1. Set ``mock_response.headers = {"content-type": "image/jpeg"}`` on the test (matches the real upstream contract — a logo CDN always sets a Content-Type). 2. Make ``fetch_validated_image_bytes`` defensive: if the Content-Type header is missing or non-string, treat as non-image and fall back to default. Closes a subtle hole — pre-fix, an upstream that omits Content-Type entirely would have served arbitrary bytes under the ``image/jpeg`` wrapper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…on cache miss Three review items addressed: * **Veria (Medium): SSRF via redirect.** ``fetch_validated_image_bytes`` was calling ``validate_url(url)`` once and then fetching with the default httpx client, so a 3xx to an internal IP would have been followed unvalidated. Switched to ``async_safe_get`` (the existing SSRF primitive used elsewhere in the codebase) which walks each redirect hop, re-validates, and rejects redirects to blocked networks. Default ``litellm.user_url_validation`` is True so protection is on out of the box. * **Greptile (P2): SVG can embed JS.** Removed ``image/svg+xml`` from the allowed-Content-Type set. The hardcoded response media type (``image/jpeg`` / ``image/x-icon``) means a real SVG body wouldn't render as SVG anyway in modern browsers — the allowlist entry was giving up XSS surface for no actual SVG-rendering benefit. If real SVG support is wanted later, that's a deliberate feature PR with CSP / nosniff bundled. * **Greptile (P2): cache-write OSError drops validated bytes.** When the upstream fetch succeeded but ``open(cache_path, "wb")`` raised (read-only assets dir), the bytes were discarded and the default logo was served — a silent regression for that deployment. Now serve the validated bytes inline via ``Response(...)`` as a fallback before falling back to default. Tests: - Replaced low-level mocks of ``validate_url`` with mocks of ``async_safe_get`` directly, exercising the helper's contract rather than the SSRF primitive's internals. - New ``test_rejects_svg_content_type`` confirms SVG is blocked. - ``test_get_image_cache_logic`` fixture now sets ``mock_response.is_redirect = False`` so ``async_safe_get`` doesn't treat the Mock's truthy attribute as a redirect. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…eaner test fixture Two cleanups from the /simplify review pass: * ``Response`` was imported inside the ``except OSError`` branch in ``/get_image`` and at the top of ``/get_favicon``. Per the project's no-inline-imports rule (CLAUDE.md), hoisted to the existing ``from fastapi.responses import (...)`` block at the top of ``proxy_server.py``. * The test class's ``_patches()`` helper returned a 2-element list of patch context managers and tests indexed into them via ``self._patches(...)[0], self._patches()[1]`` — two distinct calls with confusing aliasing semantics. Restructured to: - module-level ``_patch_async_safe_get(...)`` that returns a single patch context manager - autouse fixture that patches ``get_async_httpx_client`` for every test in the file (it's the same patch in every case) - small ``_image_response(...)`` factory to deduplicate Mock setup Tests now read as ``with _patch_async_safe_get(return_value=...):`` with no list-indexing or duplicate Mock construction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…async_safe_get Variant analysis on the unauthenticated /get_image SSRF surfaced one related sink in an admin-only endpoint: ``test_hashicorp_vault_connection`` in ``config_override_endpoints.py:402`` calls ``async_client.get(f"{vault_addr}/v1/auth/token/lookup-self")`` with no SSRF guard. ``vault_addr`` is admin-set, so the threat model is "admin misconfig (or attacker with admin creds) pivots Vault calls to cloud metadata or another internal IP." Same fix shape as the unauthenticated endpoints: wrap in ``async_safe_get`` so each redirect hop is re-validated and private networks are rejected. Admins running against a legitimate internal Vault should add the host to ``litellm.user_url_allowed_hosts`` — the existing escape hatch already used elsewhere in the codebase. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Made-with: Cursor

``Response`` is already imported from the top-level ``fastapi`` package via the multi-line ``from fastapi import (...)`` block at the top of the file (along with ``Depends``, ``HTTPException``, etc.) — ``fastapi.Response`` is the same class that ``fastapi.responses`` re-exports. The earlier ``from fastapi.responses import Response`` addition triggered ruff F811 for redefinition. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… option Pass-through endpoints configured in ``general_settings.pass_through_endpoints`` defaulted to ``auth: false`` and the safe ``auth: true`` setting was rejected at startup unless the operator had a LiteLLM Enterprise license. Net result: OSS deployments had **no safe configuration** — every pass-through admins added without remembering ``auth: true`` shipped an unauthenticated forwarder, and remembering ``auth: true`` raised a hard "enterprise-only" error. Three changes: * ``litellm/proxy/_types.py`` — flip ``PassThroughGenericEndpoint.auth`` default to ``True``. Operators who add a pass-through with no explicit ``auth`` value now get a safe, authenticated forwarder by default. Setting ``auth: false`` remains supported for genuine public-forwarder use cases (e.g. webhook receivers). * ``litellm/proxy/pass_through_endpoints/pass_through_endpoints.py`` — drop the ``premium_user`` gate around ``auth: true``. An unauthenticated forwarder is a deployment choice operators should be allowed to make explicitly, but the safe option must always be free. The product-tier decision (which features sit behind the enterprise license) is separate from "OSS users must always have a safe option." * ``litellm/proxy/auth/user_api_key_auth.py`` — the runtime dispatch pulls pass-through endpoints from ``general_settings`` as raw dicts, so the Pydantic default doesn't apply. Switched ``endpoint.get("auth")`` to ``endpoint.get("auth", True)`` so a config dict without an explicit ``auth`` key still requires authentication at request time. Tests: - ``test_passthrough_auth_defaults_to_true`` — Pydantic default is now safe. - ``test_passthrough_auth_can_still_be_explicitly_disabled`` — opt-in to ``auth=False`` still works for legitimate public-forwarder use cases. - ``test_register_passthrough_with_auth_true_works_for_oss`` — ``premium_user=False`` no longer rejects ``auth=true``. - ``test_runtime_check_treats_missing_auth_key_as_authenticated`` — raw dict without an ``auth`` key now requires auth (the previously-unauthenticated forwarder). - ``test_runtime_check_explicit_auth_false_still_skips_validation`` — explicit opt-in still works. Closes GHSA-7h34-mmrh-6g58. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous commit removed the only top-level use of ``CommonProxyErrors`` (the enterprise-gate ``raise ValueError``). Ruff F401 flagged the import as unused; the function-local import at line 2601 in a separate handler is the only remaining caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Made-with: Cursor

chore(proxy): contain UI_LOGO_PATH / LITELLM_FAVICON_URL on unauthenticated asset endpoints

chore(cli): tighten CLI SSO session flow

…cbb6cf [Test] Proxy E2E: Opt In To Client Mock Response For Model Access Tests

The async/sync delete_response_api_handler always passed json=data into httpx.delete, where data is {} from the transformer. httpx serializes that to a 2-byte body. The Azure Responses DELETE endpoint now rejects any request body with code: unexpected_body, breaking test_basic_openai_responses_delete_endpoint on the llm_responses_api_testing job. Build the kwargs dict and only set json= when data is truthy. Add unit tests that patch httpx.delete and assert json/data are not in the captured kwargs for the Azure DELETE path (sync and async).

…hawking-19bdeb [Fix] Responses API: Omit Empty Body On DELETE

… miss

…tHooks Run pre_call_hook on Google generateContent endpoints

…ntMultiPod [Fix] Refresh Redis TTL on counter writes, skip stale in-memory in Redis

…agination Add pagination controls to model health status

…/litellm into litellm_auth_bypass_tag_based_routing

…metadata_labels feat(vertex_ai): propagate metadata labels to embedding, Imagen, rerank

…mode-nonstreaming-mixed-tools fix(anthropic): json response_format + user tools non-streaming

[Infra] Bump Versions

…_based_routing add test(tag-routing): prevent header regex bypass for strict plain t…

greptile-apps · 2026-05-01T12:52:35Z

Too many files changed for review. (113 files found, 100 file limit)

StatPan · 2026-05-01T13:01:25Z

Opened a fully clean replacement PR here: #26983

Reason: #26982 reused the earlier head branch and inherited the wrong ancestry for litellm_oss_staging, which pulled in unrelated diff and triggered fork/dependency guard noise.

#26983 is rebuilt from origin/litellm_oss_staging and cherry-picks only the intended Gemini fix/test commits. Current diff is limited to:

litellm/llms/vertex_ai/vertex_llm_base.py
tests/test_litellm/llms/vertex_ai/test_vertex_llm_base.py

StatPan · 2026-05-14T06:55:43Z

Closing as superseded by the clean actionable PR #26983. This branch inherited unrelated diff from the wrong ancestry; #26983 is rebuilt from and should be the review path.

Sameerlite and others added 30 commits April 2, 2026 09:56

Merge pull request BerriAI#25521 from BerriAI/main

5805609

merge main

Fix mypy

ebdc34e

Add litellm_metadata support

b19951a

Merge pull request BerriAI#25949 from BerriAI/litellm_internal_staging

b594001

merge litellm_internal_staging

Merge branch 'litellm_internal_staging' into litellm_vertex_request_m…

2b028a6

…etadata_labels

Merge pull request BerriAI#26304 from BerriAI/litellm_internal_staging

47f46b4

merge main

Merge pull request BerriAI#26379 from BerriAI/litellm_internal_staging

9d58e6e

merge main

Fix black formatting

f503c06

Merge pull request BerriAI#26437 from BerriAI/litellm_internal_staging

be41d4b

merge main

Fix black

3337128

fix lint

09b648a

Merge pull request BerriAI#26745 from BerriAI/litellm_internal_staging

d7862e7

merge mian

add test(tag-routing): prevent header regex bypass for strict plain t…

25cf097

…ags. Add tests to validate the condition improve the conditional readability by naming the plain-tag check explicitly.

Add health status pagination controls

1277cbe

Made-with: Cursor

Fix model e2e result count locator

0b9d06a

Made-with: Cursor

Address health pagination review feedback

a7a3f0a

Made-with: Cursor

tighten cli sso session flow

88d8a80

Fix health pagination review issues

181e99b

Made-with: Cursor

yuneng-berri and others added 22 commits April 30, 2026 17:10

Merge pull request BerriAI#26815 from stuxf/fix/get-image-lfi-ssrf

15b7386

chore(proxy): contain UI_LOGO_PATH / LITELLM_FAVICON_URL on unauthenticated asset endpoints

Merge pull request BerriAI#26835 from stuxf/codex/cli-sso-flow-binding

bdcc238

chore(cli): tighten CLI SSO session flow

Merge pull request BerriAI#26941 from BerriAI/litellm_/stoic-jemison-…

326bcd6

…cbb6cf [Test] Proxy E2E: Opt In To Client Mock Response For Model Access Tests

Merge pull request BerriAI#26949 from BerriAI/litellm_/condescending-…

76e43b7

…hawking-19bdeb [Fix] Responses API: Omit Empty Body On DELETE

Refresh Redis TTL on counter writes and skip stale in-memory on Redis…

9f08db9

… miss

Invalidate spend counters on budget reset

fed5f36

Match docstring style on async_increment_cache

ff2a938

Reorder counter invalidation to run after DB write

4e26835

Merge pull request BerriAI#26914 from BerriAI/litellm_googleGenConten…

e4fb325

…tHooks Run pre_call_hook on Google generateContent endpoints

Merge pull request BerriAI#26829 from BerriAI/litellm_budgetEnforceme…

05e6402

…ntMultiPod [Fix] Refresh Redis TTL on counter writes, skip stale in-memory in Redis

Merge pull request BerriAI#26826 from BerriAI/litellm_health_status_p…

ebbe2f4

…agination Add pagination controls to model health status

Merge branch 'litellm_internal_staging' of https://github.com/BerriAI…

7c86e60

…/litellm into litellm_auth_bypass_tag_based_routing

Merge pull request BerriAI#25499 from BerriAI/litellm_vertex_request_…

72ddbce

…metadata_labels feat(vertex_ai): propagate metadata labels to embedding, Imagen, rerank

Merge pull request BerriAI#26222 from BerriAI/litellm_anthropic-json-…

efa33bf

…mode-nonstreaming-mixed-tools fix(anthropic): json response_format + user tools non-streaming

bump: version 0.4.69 → 0.4.70

dd549d9

uv lock

6da13ef

Merge pull request BerriAI#26961 from BerriAI/yj_bump_apr30

9397409

[Infra] Bump Versions

Merge pull request BerriAI#26805 from BerriAI/litellm_auth_bypass_tag…

eab0075

…_based_routing add test(tag-routing): prevent header regex bypass for strict plain t…

fix(gemini): avoid duplicate model route for full api_base

d8418ff

test(gemini): cover full-route custom api_base handling

0d6d94c

test(vertex): restore credential project validation expectation

f981e4a

StatPan mentioned this pull request May 1, 2026

fix(gemini): avoid duplicate model route for full api_base #26980

Closed

2 tasks

taxfree-python mentioned this pull request May 2, 2026

fix(langfuse_otel): serialize pydantic Message objects in set_messages #26978

Open

StatPan closed this May 14, 2026

This was referenced May 14, 2026

Track upstream contribution: Gemini full api_base duplicate route fix StatPan/litellm#1

Open

fix(gemini): avoid duplicate model route for full api_base #26983

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(gemini): avoid duplicate model route for full api_base#26982

fix(gemini): avoid duplicate model route for full api_base#26982
StatPan wants to merge 116 commits into
BerriAI:litellm_oss_stagingfrom
StatPan:fix/gemini-full-api-base-duplicate-path

StatPan commented May 1, 2026

Uh oh!

greptile-apps Bot commented May 1, 2026

Uh oh!

StatPan commented May 1, 2026

Uh oh!

StatPan commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

13 participants

Uh oh!

Conversation

StatPan commented May 1, 2026

Summary

Context

Test Plan

Uh oh!

greptile-apps Bot commented May 1, 2026

Uh oh!

StatPan commented May 1, 2026

Uh oh!

StatPan commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

13 participants