Skip to content

feat: model_override adapter with parameterized adapter config#438

Merged
mcowger merged 6 commits into
mainfrom
adapter-model-override
May 18, 2026
Merged

feat: model_override adapter with parameterized adapter config#438
mcowger merged 6 commits into
mainfrom
adapter-model-override

Conversation

@mcowger
Copy link
Copy Markdown
Owner

@mcowger mcowger commented May 18, 2026

Why

Some LLM providers don't respect reasoning-related fields in the request body. Instead, they expose reasoning variants as separate model names — e.g. deepseek-r1 (with reasoning) vs deepseek-r1-fast (without). When a client sends reasoning: { effort: "none" } or enable_thinking: false, the provider ignores it because the client addressed the "thinking" model, not the "fast" one.

Plexus currently has no way to bridge this gap — the router picks a target model, and that's what gets sent to the provider. There was no mechanism to say "if the client is asking for no reasoning, switch to the fast variant instead."

What This Enables

Conditional model rewriting based on request content. You can now configure rules that say:

"When the target model is deepseek-r1 AND any of these conditions are true (reasoning disabled, thinking turned off, budget set to zero, etc.), rewrite the model name to deepseek-r1-fast before sending it to the provider."

This means:

  • Clients can keep using a single model alias (e.g. glm-5.1) and let Plexus automatically pick the right provider variant based on what features they're requesting
  • Providers that use separate model names for reasoning/non-reasoning variants are now first-class citizens in Plexus — no manual model switching required
  • The rewrite is transparent to the client — they still see the canonical model name in billing/usage tracking

Bonus Fix

The OpenAI chat→chat transformer was silently dropping any fields it didn't explicitly know about (e.g. enable_thinking, chat_template_kwargs, thinking_budget, budget_tokens, thinking). This meant even if you configured conditions for those fields, the adapter could never match them because the fields were gone by the time it saw the payload.

Now, same-API-type transformations (chat→chat) preserve all unknown fields from the original request body, overlaying only the fields that need explicit transformation. This makes the adapter useful for any non-standard field a provider might use.

How to Use It

Configure the model_override adapter on a per-model basis in the provider settings (via the Plexus UI or management API):

  1. Open the provider in the UI, expand the model entry (e.g. zai-org/GLM-5.1-FP8)
  2. Under Model Adapters, enable Model Override
  3. Add rules specifying:
    • Match model — the provider model name to match against (e.g. zai-org/GLM-5.1-FP8)
    • Rewrite to — the model name to send instead (e.g. glm-5.1-fast)
    • Conditions — dotted-path fields and values, any match triggers the rewrite (OR semantics)

Example conditions for detecting "no reasoning" requests:

Field path Value
reasoning.enabled false
reasoning.effort "none"
enable_thinking false
thinking.type "disabled"
chat_template_kwargs.enable_thinking false
thinking_budget 0
budget_tokens 0

If value is left empty, the condition matches on field presence alone (any value).

Implementation Notes

  • Adapter config upgraded from bare strings to uniform { name, options } entries. Legacy format is normalized lazily on read — no DB migration needed, rows self-heal on next save
  • Adapter interface gains optional options parameter on all hooks; existing adapters are backward-compatible
  • Frontend includes an inline rule editor for model_override conditions in the per-model adapter section
  • 1344 backend tests passing, frontend build + typecheck + lint clean

mcowger added 6 commits May 18, 2026 14:07
Introduce a conditional model-override adapter that rewrites the
provider model name based on request payload fields (e.g. switching
to a -fast variant when reasoning is disabled).

Key changes:

- Adapter config format upgraded from bare strings to uniform
  { name, options } entries. Legacy string format is normalized
  lazily on read via z.preprocess() and normalizeAdapterEntries()
  in config-repository — no DB migration needed.

- ProviderAdapter interface gains optional 'options' parameter on
  preDispatch/postDispatch/stream hooks. Existing adapters
  (reasoning_content, suppress_developer_role) updated with
  _options param (backward compatible).

- resolveAdapters() now returns ResolvedAdapter[] (adapter +
  options pair) instead of bare ProviderAdapter[].

- Dispatcher updated to destructure { adapter, options } and pass
  options through to all adapter call sites.

- model_override adapter: evaluates OR-semantics conditions against
  the transformed provider payload using dotted-path field
  resolution. Rewrites payload.model when any condition matches.

- Frontend: KNOWN_ADAPTERS updated, model-level adapter checkbox
  logic adapted for { name, options } format, inline rule editor
  for model_override conditions added to ProviderModelsEditor.

- All 1344 backend tests passing.
…ation

When the OpenAI transformer builds the outbound provider payload for a
chat-to-chat transformation, it now starts from a spread of the original
request body instead of an empty object. Explicitly-mapped fields are
overlaid on top, so transformations (e.g. tool normalization, system
message prepending) still apply correctly, but unknown fields like
enable_thinking, thinking, chat_template_kwargs, thinking_budget, and
budget_tokens are no longer silently dropped.

Cross-API transformations (chat→messages, chat→gemini) remain fully
explicit since they are fundamentally different protocols.

This fixes the model_override adapter's inability to match on non-standard
reasoning fields that were previously stripped during transformation.
UI:
- Make match model input 2x wider (flex: 2) so long model names fit
- Add column headers above conditions: 'Field path (dotted)' and
  'Value (blank = presence check)' with a delete-button spacer
- Shorten condition input placeholders to avoid crowding under headers

Docs:
- Add model_override adapter to CONFIGURATION.md adapters table
- Add dedicated 'Model Override Adapter' section with rule/condition
  field descriptions, JSON config example, and usage notes

OpenAPI:
- Update ProviderConfig.yaml adapter schema to accept { name, options }
  object entries in addition to bare strings
- Add model_override to the available adapters list with description
- Update model-level adapter field to match the new schema
The match model field in a model_override rule is always the model name
of the target being configured — it makes no sense to edit it. Changed
to a read-only shaded div showing the model name (mId) with ellipsis
overflow for long names. New rules auto-populate the model field with mId.
The Input component has w-full (width: 100%) on its inner <input>,
which overrode any flex value set via the style prop. Moving flex to a
wrapper div makes the field-path column genuinely 2x wider than the
value column.
@mcowger mcowger merged commit e7eb887 into main May 18, 2026
1 check passed
@mcowger mcowger deleted the adapter-model-override branch May 18, 2026 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant