refactor(clients): disambiguate model identity + fix /v1/models stub (#100) by antoinezambelli · Pull Request #101 · antoinezambelli/forge

antoinezambelli · 2026-06-01T22:21:18Z

What

Two related changes to model identity, in two commits:

Refactor: disambiguates the overloaded self.model across all five clients into two unambiguously-named roles.
Fix #100 ("/v1/models endpoint doesn't pass through request to backend and list models"): /v1/models now reports the real backend model instead of a hardcoded "forge" stub, which the refactor makes a clean one-liner.

1. Identity disambiguation

| attribute | meaning |
| --- | --- |
| `self.model` | the wire `"model"` field, sent verbatim to the backend |
| `self.sampling_key` | the registry-lookup key for `apply_sampling_defaults` |

Previously, self.model meant a derived registry stem on VLLMClient but the wire id on every other client. VLLMClient is the only client that changes meaningfully: its wire id moves from self.model_path to self.model (the value sent is byte-identical), the stem moves from self.model to self.sampling_key, and self.model_path is dropped as an attribute. The other four clients keep self.model as-is and gain a self.sampling_key alias. The --model-path CLI flag and the VLLMClient(model_path=...) ctor param are unchanged, so model_path survives only at the locked boundary.
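The split can be pictured with a minimal sketch. The class names, the `derive_sampling_key` helper, and the stem rule (last path component, lowercased) are all hypothetical; the PR does not show how forge derives the registry stem.

```python
from pathlib import PurePosixPath


def derive_sampling_key(model_path: str) -> str:
    """Hypothetical stem rule: last path component, lowercased."""
    return PurePosixPath(model_path).name.lower()


class SketchVLLMClient:
    """Illustrative stand-in, not forge's actual VLLMClient."""

    def __init__(self, model_path: str) -> None:
        # The ctor param keeps its name; only the attribute names change.
        self.model = model_path  # wire id, sent verbatim to the backend
        self.sampling_key = derive_sampling_key(model_path)  # registry key


class SketchOllamaClient:
    """Illustrative stand-in for the clients whose wire id doubles as the key."""

    def __init__(self, model: str) -> None:
        self.model = model  # wire id, unchanged by the refactor
        self.sampling_key = model  # alias, so lookups read an unambiguous name


vllm = SketchVLLMClient("org/Some-Model-8B")
ollama = SketchOllamaClient("some-model:8b")
```

In this sketch the vLLM-style client no longer exposes a `model_path` attribute, while the ollama-style client carries the same string under both names.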

2. Fix #100 — /v1/models stub

_handle_models returned {"id": "forge"} regardless of backend. It now reports self._client.model — the real wire id (served-model-name for vLLM, gguf stem for llama.cpp, model tag for ollama). No fallback: a client without .model raises rather than serving a false id. The identity refactor is what makes this correct — before it, self._client.model was inconsistent across backends.
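A rough sketch of the no-fallback behavior described above. `SketchProxy` and the OpenAI-style list envelope around the id are assumptions for illustration; the PR only states that the handler reads `self._client.model` directly.

```python
from types import SimpleNamespace
from typing import Any


class SketchProxy:
    """Illustrative stand-in for the proxy that owns _handle_models."""

    def __init__(self, client: Any) -> None:
        self._client = client

    def _handle_models(self) -> dict[str, Any]:
        # Direct attribute read, no getattr default: a client without
        # .model raises AttributeError instead of serving a false id.
        model_id = self._client.model
        # Response envelope is an assumed OpenAI-compatible shape.
        return {"object": "list", "data": [{"id": model_id, "object": "model"}]}


# A stand-in client carrying only the wire id.
proxy = SketchProxy(SimpleNamespace(model="some-model:8b"))
response = proxy._handle_models()
```

Using a plain attribute read rather than `getattr(client, "model", "forge")` is the design choice the description calls out: a missing attribute fails loudly instead of reintroducing the stub.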

Also elevates model: str to the LLMClient protocol (sibling of api_format), making the wire-id attribute a documented contract now that all five clients set it uniformly.
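As a sketch of what such a protocol declaration could look like, assuming `typing.Protocol` is used: `LLMClientSketch` is a hypothetical name, any other members of the real protocol are omitted, and `runtime_checkable` is added here only so the example can demonstrate the contract without a type checker.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class LLMClientSketch(Protocol):
    """Hypothetical reduction of the LLMClient protocol to two attributes."""

    api_format: str
    model: str  # wire id, sent verbatim as the "model" field


class GoodClient:
    def __init__(self) -> None:
        self.api_format = "openai"
        self.model = "some-model:8b"


class LegacyClient:
    def __init__(self) -> None:
        self.api_format = "openai"  # no .model attribute


# isinstance on a runtime-checkable protocol only checks attribute presence.
good_ok = isinstance(GoodClient(), LLMClientSketch)
legacy_ok = isinstance(LegacyClient(), LLMClientSketch)
```

The runtime check verifies that the attributes exist, not that they are strings, so it complements a static type checker and does not replace one.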

Compatibility: zero proxy-user impact

CLI flags, ctor kwargs, the wire "model" value, and output schemas are all untouched. The renames are client-internal variable names; the only externally-visible behavior change is the intended one — /v1/models now tells the truth.

Verification

Full tests/unit suite green (1086).
Mock proxy smoke (scripts/smoke_test_proxy.py) extended with /v1/models coverage (none before).
Live smoke against real backends on an 8B Q4 (llama.cpp + ollama): /v1/models reports the real id on both; tool-call round-trips end-to-end (ollama validates the tag, confirming wire-id integrity).

Lands in v0.7.4.

🤖 Generated with Claude Code

Every client now uses two unambiguously-named identity attributes:

- self.model         the wire "model" field, sent verbatim to the backend
- self.sampling_key  the registry-lookup key for apply_sampling_defaults

Previously self.model meant different things across clients: a derived registry-lookup stem on VLLMClient, but the wire id (doubling as the key) on ollama/openai_compat/anthropic/llamafile. That overload was the smell.

VLLMClient is the only client that changes meaningfully: its wire id moves from self.model_path to self.model (the value sent is byte-identical), and the derived stem moves from self.model to self.sampling_key. The other four clients keep self.model exactly as-is and gain a self.sampling_key alias so the registry lookup reads an unambiguous name.

self.model_path is dropped as an attribute (nothing external read it). The --model-path CLI flag and VLLMClient(model_path=...) ctor param are unchanged; model_path lives on only at the locked boundary.

Zero proxy-user impact: CLI flags, ctor kwargs, wire values, and all output schemas (/v1/models, completion model echo, eval JSONL) are untouched. Internal-only; no version bump (rides the next release).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…t protocol

Closes #100.

/v1/models previously returned a hardcoded {"id": "forge"} stub regardless of the backend. It now reports self._client.model, the real wire id every client carries after the identity refactor (served-model-name for vLLM, gguf stem for llama.cpp, model tag for ollama). No fallback: a client lacking .model raises rather than serving a lie.

Elevates model: str to the LLMClient protocol (sibling of api_format), making the wire-id attribute a real contract now that all five clients set it uniformly. No type-checker runs in CI today, so this is a documented contract rather than an enforced one; the direct read is what does the work.

Also extends scripts/smoke_test_proxy.py with /v1/models coverage (absent before): the external "default" placeholder case and the configured-model case.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

antoinezambelli and others added 2 commits June 1, 2026 17:20

antoinezambelli changed the title ~~refactor(clients): disambiguate model identity into model (wire id) + sampling_key~~ refactor(clients): disambiguate model identity + fix /v1/models stub (#100) Jun 1, 2026

antoinezambelli merged commit ad16280 into main Jun 1, 2026
2 checks passed

antoinezambelli deleted the refactor/model-identity-disambiguation branch June 1, 2026 23:05

antoinezambelli merged 2 commits into main from refactor/model-identity-disambiguation
