Model Routing
How the bot selects the right LLM model for each request through tier-based routing.
See also: Configuration, Quick Start, Deployment
The bot uses a 4-tier model selection strategy that picks the most appropriate model based on task complexity. The tier is determined from multiple sources with clear priority:
1. **User preference** - set via the `/tier` command or the `set_tier` tool
2. **Skill override** - the `model_tier` field in skill YAML frontmatter
3. **Dynamic upgrade** - `DynamicTierSystem` promotes to `coding` when code activity is detected mid-conversation
4. **Fallback** - `balanced` when no tier is explicitly set
5. **Per-user model override** - set via the `/model` command
Operationally, model setup now follows this flow:
- Configure provider profiles in **LLM Providers**
- Maintain capability metadata in **Model Catalog**
- Assign routing and tier slots in **Model Router**
```
User Message
     |
     v
[ContextBuildingSystem] --- Resolves tier from user prefs / active skill
     |                      Priority: force+user > skill > user pref > balanced
     v
[DynamicTierSystem] --- May upgrade to coding if code activity is detected
     |                  (only on iteration > 0, never downgrades)
     v
[ModelSelectionService] --- Resolves actual model for the tier
     |                      (user override > router config fallback)
     v
[ToolLoopExecutionSystem] --- Selects model + reasoning level based on modelTier
     |
     v
LLM API Call
```
| Tier | Reasoning | Typical Use Cases | Default Model |
|---|---|---|---|
| balanced | medium | Greetings, general questions, summarization | openai/gpt-5.1 |
| smart | high | Complex analysis, architecture decisions, multi-step planning | openai/gpt-5.1 |
| coding | medium | Code generation, debugging, refactoring, code review | openai/gpt-5.2 |
| deep | xhigh | Deep scientific or highly technical reasoning | openai/gpt-5.2 |
Each tier is independently configurable.
```json
{
  "modelRouter": {
    "routingModel": "openai/gpt-5.2-codex",
    "routingModelReasoning": "none",
    "balancedModel": "openai/gpt-5.1",
    "balancedModelReasoning": "none",
    "smartModel": "openai/gpt-5.1",
    "smartModelReasoning": "none",
    "codingModel": "openai/gpt-5.2",
    "codingModelReasoning": "none",
    "deepModel": "openai/gpt-5.2",
    "deepModelReasoning": "none",
    "dynamicTierEnabled": true,
    "temperature": 0.7
  }
}
```

Reasoning models may ignore `temperature`. The effective reasoning and token limits are derived from `models/models.json`.
The tier is resolved in ContextBuildingSystem with this priority:
| Priority | Source | Condition |
|---|---|---|
| 1 | User preference + force | `tierForce=true` and `modelTier` set |
| 2 | Skill `model_tier` | Active skill declares a preferred tier |
| 3 | User preference | `modelTier` set without force |
| 4 | Fallback | `balanced` |
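The resolution order above can be sketched as a small function. This is a minimal illustration; the class name, method signature, and parameters are hypothetical and do not reflect the bot's actual API.

```java
// Minimal sketch of the tier resolution priority; names are illustrative,
// not the bot's actual API.
public class TierResolutionSketch {
    public static String resolve(String userTier, boolean tierForce, String skillTier) {
        if (tierForce && userTier != null) return userTier; // 1. forced user preference
        if (skillTier != null) return skillTier;            // 2. skill model_tier
        if (userTier != null) return userTier;              // 3. unforced user preference
        return "balanced";                                  // 4. fallback
    }
}
```

Note how a forced user preference beats the skill tier, while an unforced one yields to it.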
```
/tier
/tier coding
/tier smart force
```
Key behavior:

- `/tier <tier>` clears force
- `/tier <tier> force` locks the tier
- the setting persists in user preferences
The LLM can switch tiers mid-conversation with:
```json
{
  "tier": "coding"
}
```

- blocked if the user locked the tier with force
- applies immediately for the current conversation
- does not persist to user preferences
Users can override the default model for any tier:
```
/model
/model list
/model <tier> <provider/model>
/model <tier> reasoning <level>
/model <tier> reset
```
Key behavior:

- overrides are stored per user
- default reasoning is auto-applied from `models.json`
- the model provider must be in `BOT_MODEL_SELECTION_ALLOWED_PROVIDERS`
- `/tier` picks the active tier, `/model` customizes what each tier points to
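The provider allowlist check can be sketched as follows. This is a hypothetical illustration of filtering a `/model` override against `BOT_MODEL_SELECTION_ALLOWED_PROVIDERS`; the class and method names are invented.

```java
import java.util.Set;

// Hypothetical sketch of filtering a /model override against
// BOT_MODEL_SELECTION_ALLOWED_PROVIDERS; names are invented for illustration.
public class ProviderAllowlistSketch {
    public static boolean isAllowed(String modelId, Set<String> allowedProviders) {
        int slash = modelId.indexOf('/');
        if (slash < 0) return false; // overrides are provider-scoped, e.g. openai/gpt-5.1
        return allowedProviders.contains(modelId.substring(0, slash));
    }
}
```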
Skills can declare:

```yaml
model_tier: coding
```

If the user has not force-locked the tier, the skill tier takes precedence.
DynamicTierSystem can upgrade to coding when the current run shows code activity.
Signals include:
- code file reads and writes
- code-related shell commands
- stack traces in tool results
Rules:
- upgrades only, never downgrades
- skips if already
codingordeep - skips if the user force-locked the tier
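The upgrade rules can be condensed into a single decision function. A minimal sketch, assuming invented names; the iteration guard comes from the flow diagram above (upgrades only happen on iteration > 0).

```java
// Minimal sketch of the dynamic tier upgrade rules; names are illustrative.
public class DynamicTierSketch {
    public static String maybeUpgrade(String currentTier, boolean codeActivity,
                                      boolean userForced, int iteration) {
        if (userForced) return currentTier;               // user force-lock wins
        if (iteration == 0) return currentTier;           // only after the first iteration
        if (currentTier.equals("coding") || currentTier.equals("deep"))
            return currentTier;                           // already high enough, skip
        return codeActivity ? "coding" : currentTier;     // upgrade only, never downgrade
    }
}
```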
You can mix different providers across tiers:
```json
{
  "llm": {
    "providers": {
      "openai": { "apiKey": "sk-proj-...", "apiType": "openai" },
      "anthropic": { "apiKey": "sk-ant-...", "apiType": "anthropic" },
      "google": { "apiKey": "AIza...", "apiType": "gemini" }
    }
  },
  "modelRouter": {
    "balancedModel": "openai/gpt-5.1",
    "smartModel": "anthropic/claude-opus-4-6",
    "codingModel": "openai/gpt-5.2"
  }
}
```

Provider config lookup is based on the model prefix. Protocol dispatch is controlled by `llm.providers.<provider>.apiType`.
Model capabilities are defined in models/models.json.
The dashboard now manages this through Model Catalog and can fetch live suggestions from provider APIs.
Each entry specifies:
| Field | Type | Description |
|---|---|---|
| `provider` | string | Provider profile key used by runtime config and model discovery |
| `displayName` | string | Human-readable label |
| `supportsTemperature` | boolean | Whether to send `temperature` |
| `supportsVision` | boolean | Whether the model supports image inputs |
| `maxInputTokens` | integer | Context limit for non-reasoning models |
| `reasoning` | object | Reasoning config for reasoning-capable models |
Example:
```json
{
  "models": {
    "openai/gpt-5.1": {
      "provider": "openai",
      "displayName": "GPT-5.1",
      "supportsTemperature": false,
      "supportsVision": true,
      "reasoning": {
        "default": "medium",
        "levels": {
          "low": { "maxInputTokens": 1000000 },
          "medium": { "maxInputTokens": 1000000 },
          "high": { "maxInputTokens": 500000 }
        }
      }
    }
  },
  "defaults": {
    "supportsTemperature": true,
    "supportsVision": true,
    "maxInputTokens": 128000
  }
}
```

ModelConfigService resolves a model in this order:

1. exact match, for example `openai/gpt-5.1`
2. stripped provider prefix, for example `gpt-5.1`
3. prefix match, for example `gpt-5.1-preview`
4. fallback to `defaults`

Both plain ids and provider-scoped ids work, but provider-scoped ids are preferred when the same raw model id can appear under multiple providers.
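The four-step lookup above can be sketched roughly like this. The real ModelConfigService may differ in detail; the class and helper names here are invented, and the catalog is reduced to a set of keys for illustration.

```java
import java.util.Set;

// Sketch of the four-step model id lookup; the real ModelConfigService
// may differ in detail. Returns the matched catalog key, or "defaults".
public class ModelLookupSketch {
    static String strip(String id) {
        int slash = id.indexOf('/');
        return slash < 0 ? id : id.substring(slash + 1);
    }

    public static String resolve(String id, Set<String> catalogKeys) {
        if (catalogKeys.contains(id)) return id;       // 1. exact match
        String raw = strip(id);
        for (String key : catalogKeys)                 // 2. stripped provider prefix
            if (strip(key).equals(raw)) return key;
        for (String key : catalogKeys)                 // 3. prefix match
            if (raw.startsWith(strip(key))) return key;
        return "defaults";                             // 4. fallback
    }
}
```

With a catalog containing only `openai/gpt-5.1`, the ids `openai/gpt-5.1`, `gpt-5.1`, and `gpt-5.1-preview` all resolve to that entry, while an unknown id falls back to `defaults`.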
The dashboard can seed catalog entries via:
```
GET /api/models/discover/{provider}
```

ProviderModelDiscoveryService supports:

- OpenAI-compatible `/models`
- Anthropic `/v1/models`
- Gemini `/v1beta/models`
Only provider profiles with configured API keys can be discovered successfully.
```json
{
  "modelRouter": {
    "routingModel": "openai/gpt-5.2-codex",
    "balancedModel": "openai/gpt-5.1",
    "smartModel": "openai/gpt-5.1",
    "codingModel": "openai/gpt-5.2",
    "deepModel": "openai/gpt-5.2",
    "dynamicTierEnabled": true
  }
}
```

Dashboard mapping:

- **LLM Providers** edits `llm.providers`
- **Model Catalog** edits `models/models.json`
- **Model Router** edits `modelRouter`
The bot uses a layered defense against context overflow:
- `AutoCompactionSystem` proactively compacts history
- tool results are truncated before they explode the context
- emergency per-message truncation is applied on context overflow errors
This is why models.json token limits matter beyond just UI selection.
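The per-message truncation layer amounts to clipping oversized content before it enters the context window. A minimal sketch of that idea; the limit, marker string, and class name are invented for this example and are not the bot's actual implementation.

```java
// Illustrative sketch of clipping an oversized tool result before it enters
// the context window; the limit and truncation marker are invented.
public class ToolResultClipSketch {
    public static String clip(String toolResult, int maxChars) {
        if (toolResult.length() <= maxChars) return toolResult;
        return toolResult.substring(0, maxChars) + " [...truncated]";
    }
}
```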
| Class | Purpose |
|---|---|
| `ContextBuildingSystem` | resolves tier and builds prompt context |
| `DynamicTierSystem` | upgrades to coding mid-run when needed |
| `ToolLoopExecutionSystem` | executes the tool loop and final model call logic |
| `AutoCompactionSystem` | prevents context overflow before the LLM call |
| `CommandRouter` | handles `/tier` and `/model` |
| `Langchain4jAdapter` | provider protocol dispatch, tool id remapping, name sanitization |
| `ModelConfigService` | model capability lookups from `models.json` |
| `ProviderModelDiscoveryService` | live provider API discovery for the Model Catalog |
| `ModelSelectionService` | per-user override resolution and provider filtering |
Typical logs:

```
[ContextBuilding] Resolved tier: coding
[LLM] Model tier: coding, selected model: openai/gpt-5.2
[DynamicTier] Detected coding activity, upgrading tier: balanced -> coding
```

Useful commands: `/status`, `/tier`, `/model`
GolemCore Bot -- Apache License 2.0 | GitHub | Issues | Discussions