refactor(clients): disambiguate model identity + fix /v1/models stub (#100)#101
Merged
Merged
Conversation
Every client now uses two unambiguously-named identity attributes: - self.model the wire "model" field, sent verbatim to the backend - self.sampling_key the registry-lookup key for apply_sampling_defaults Previously self.model meant different things across clients: a derived registry-lookup stem on VLLMClient, but the wire id (doubling as the key) on ollama/openai_compat/anthropic/llamafile. That overload was the smell. VLLMClient is the only client that changes meaningfully: its wire id moves from self.model_path to self.model (the value sent is byte-identical), and the derived stem moves from self.model to self.sampling_key. The other four clients keep self.model exactly as-is and gain a self.sampling_key alias so the registry lookup reads an unambiguous name. self.model_path is dropped as an attribute (nothing external read it). The --model-path CLI flag and VLLMClient(model_path=...) ctor param are unchanged model_path lives on only at the locked boundary. Zero proxy-user impact: CLI flags, ctor kwargs, wire values, and all output schemas (/v1/models, completion model echo, eval JSONL) are untouched. Internal-only; no version bump (rides the next release). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t protocol Closes #100. /v1/models previously returned a hardcoded {"id": "forge"} stub regardless of the backend. It now reports self._client.model — the real wire id every client carries after the identity refactor (served-model-name for vLLM, gguf stem for llama.cpp, model tag for ollama). No fallback: a client lacking .model raises rather than serving a lie. Elevates model: str to the LLMClient protocol (sibling of api_format), making the wire-id attribute a real contract now that all five clients set it uniformly. No type-checker runs in CI today, so this is a documented contract rather than an enforced one — the direct read is what does the work. Also extends scripts/smoke_test_proxy.py with /v1/models coverage (absent before): the external "default" placeholder case and the configured-model case. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Two related changes to model identity, in two commits:
self.modelacross all five clients into two unambiguously-named roles./v1/modelsnow reports the real backend model instead of a hardcoded"forge"stub, which the refactor makes a clean one-liner.1. Identity disambiguation
self.model"model"field, sent verbatim to the backendself.sampling_keyapply_sampling_defaultsself.modelpreviously meant a derived registry stem onVLLMClientbut the wire id on every other client.VLLMClientis the only client that changes meaningfully (wire id movesself.model_path→self.model, byte-identical value; stem moves toself.sampling_key;self.model_pathdropped as an attribute). The other four keepself.modelas-is and gain aself.sampling_keyalias. The--model-pathCLI flag andVLLMClient(model_path=...)ctor param are unchanged —model_pathsurvives only at the locked boundary.2. Fix #100 — /v1/models stub
_handle_modelsreturned{"id": "forge"}regardless of backend. It now reportsself._client.model— the real wire id (served-model-name for vLLM, gguf stem for llama.cpp, model tag for ollama). No fallback: a client without.modelraises rather than serving a false id. The identity refactor is what makes this correct — before it,self._client.modelwas inconsistent across backends.Also elevates
model: strto theLLMClientprotocol (sibling ofapi_format), making the wire-id attribute a documented contract now that all five clients set it uniformly.Compatibility: zero proxy-user impact
CLI flags, ctor kwargs, the wire
"model"value, and output schemas are all untouched. The renames are client-internal variable names; the only externally-visible behavior change is the intended one —/v1/modelsnow tells the truth.Verification
tests/unitsuite green (1086).scripts/smoke_test_proxy.py) extended with/v1/modelscoverage (none before)./v1/modelsreports the real id on both; tool-call round-trips end-to-end (ollama validates the tag, confirming wire-id integrity).Lands in v0.7.4.
🤖 Generated with Claude Code