
[absorption] Port cheap-llm provider configs (Minimax, Kimi, Fireworks) into llm-router #65

Description

@kilo-code-bot

Source

Target

  • KooshaPari/phenoAI — Rust workspace; the llm-router crate is the target for this absorption
  • Per registry/registry.yaml: llm-router is "Multi-provider LLM routing with fallback, retry, and cost policy" — exact overlap with cheap-llm router

Why this is the right home

The cheap-llm Python router is a re-implementation of what llm-router already does. Mapping the capabilities:

| cheap-llm capability | providers/cheap_llm/cheap_llm_mcp/*.py | llm-router equivalent |
| --- | --- | --- |
| Multi-provider routing with fallback | router.py (Router class) | Already exists |
| OpenAI-compatible HTTP client (Minimax/Moonshot/Fireworks all expose OpenAI-shaped APIs) | providers/openai_compat.py (OpenAICompatProvider) | Already exists (uses reqwest + official SDKs) |
| Token counting + cost estimation | ledger.py (PRICING dict + estimate_cost_usd) | Exists in cost-policy module; needs PRICING table ported |
| JSONL audit ledger | ledger.py (Ledger class) | Use thegent-jsonl crate if append-only JSONL is needed; otherwise tracing spans |
| TOML config | config.py (Config, ProviderConfig) | Use phenotype-config TOML loader |
| Provider health probes | providers/openai_compat.py:health() | Add to llm-router health module |
| Live /v1/models discovery | providers/openai_compat.py:list_models() | Add to llm-router models module |
| Retry with exponential backoff | retry.py (with_retry) | ResilienceKit; use backoff crate |

What to port (assets to migrate)

1. Provider configs (PRICING table)

Source: providers/cheap_llm/cheap_llm_mcp/ledger.py lines 11-18

PRICING: dict[str, tuple[float, float]] = {
    "MiniMax-M2": (0.30, 1.20),
    "MiniMax-M2.5": (0.30, 1.20),
    "MiniMax-M2.7": (0.30, 1.20),
    "kimi-k2-turbo-preview": (0.60, 2.50),
    "accounts/fireworks/models/kimi-k2-instruct": (1.00, 3.00),
    "_default": (1.00, 3.00),
}

→ Add as crates/llm-router/src/providers/pricing.rs with USD-per-1M-token rates. Include M2.7, kimi-k2-turbo-preview, and the Fireworks variants.
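A minimal sketch of what the ported pricing.rs might look like. It assumes each tuple in PRICING is (input rate, output rate) in USD per 1M tokens, which the source does not state. The names Rate, DEFAULT_RATE, and rate_for are hypothetical, and estimate_cost_usd borrows only the name of the Python helper, not necessarily its exact behavior.

```rust
// Hypothetical sketch of crates/llm-router/src/providers/pricing.rs.
// ASSUMPTION: tuples in the Python PRICING dict are (input, output) rates
// in USD per 1M tokens. The issue does not spell this out.

/// USD-per-1M-token rates for one model.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rate {
    pub input_per_mtok: f64,
    pub output_per_mtok: f64,
}

/// Fallback rate, mirroring the "_default" entry of the Python dict.
pub const DEFAULT_RATE: Rate = Rate { input_per_mtok: 1.00, output_per_mtok: 3.00 };

/// Rates copied verbatim from ledger.py.
pub const PRICING: &[(&str, Rate)] = &[
    ("MiniMax-M2", Rate { input_per_mtok: 0.30, output_per_mtok: 1.20 }),
    ("MiniMax-M2.5", Rate { input_per_mtok: 0.30, output_per_mtok: 1.20 }),
    ("MiniMax-M2.7", Rate { input_per_mtok: 0.30, output_per_mtok: 1.20 }),
    ("kimi-k2-turbo-preview", Rate { input_per_mtok: 0.60, output_per_mtok: 2.50 }),
    (
        "accounts/fireworks/models/kimi-k2-instruct",
        Rate { input_per_mtok: 1.00, output_per_mtok: 3.00 },
    ),
];

/// Exact-match lookup; unknown models fall back to DEFAULT_RATE.
pub fn rate_for(model: &str) -> Rate {
    PRICING
        .iter()
        .find(|(name, _)| *name == model)
        .map(|(_, rate)| *rate)
        .unwrap_or(DEFAULT_RATE)
}

/// Estimated cost in USD for a single request.
pub fn estimate_cost_usd(model: &str, input_tokens: u64, output_tokens: u64) -> f64 {
    let rate = rate_for(model);
    (input_tokens as f64 * rate.input_per_mtok
        + output_tokens as f64 * rate.output_per_mtok)
        / 1_000_000.0
}
```

With an exact-match lookup like this one, the preset default models MiniMax-M2.7-highspeed and accounts/fireworks/models/minimax-m2p7 are not keys in the table and would be priced at the _default rate. Whether that is intended is worth confirming during the port.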

2. Provider default configs

Source: providers/cheap_llm/cheap_llm_mcp/config.py _defaults() function

  • Minimax: https://api.minimax.io/v1, default model MiniMax-M2.7-highspeed, variants base|highspeed|codex
  • Kimi: https://api.moonshot.ai/v1, default model kimi-k2-turbo-preview, variant turbo
  • Fireworks: https://api.fireworks.ai/inference/v1, default model accounts/fireworks/models/minimax-m2p7, variants minimax|kimi
→ Add to crates/llm-router/src/providers/{minimax,kimi,fireworks}.rs as built-in provider presets.
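One possible shape for these presets, sketched as a single static table for brevity rather than three files. ProviderPreset, PRESETS, and preset are hypothetical names; the real llm-router provider types may look quite different. The URLs, default models, and variants are copied from the list above.

```rust
// Hypothetical sketch: built-in presets for the three cheap-llm providers.
// Struct and function names are illustrative, not existing llm-router API.

/// Connection defaults for one OpenAI-compatible provider.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProviderPreset {
    pub name: &'static str,
    pub base_url: &'static str,
    pub default_model: &'static str,
    pub variants: &'static [&'static str],
}

/// Values taken from the _defaults() function in config.py, as listed in the issue.
pub static PRESETS: [ProviderPreset; 3] = [
    ProviderPreset {
        name: "minimax",
        base_url: "https://api.minimax.io/v1",
        default_model: "MiniMax-M2.7-highspeed",
        variants: &["base", "highspeed", "codex"],
    },
    ProviderPreset {
        name: "kimi",
        base_url: "https://api.moonshot.ai/v1",
        default_model: "kimi-k2-turbo-preview",
        variants: &["turbo"],
    },
    ProviderPreset {
        name: "fireworks",
        base_url: "https://api.fireworks.ai/inference/v1",
        default_model: "accounts/fireworks/models/minimax-m2p7",
        variants: &["minimax", "kimi"],
    },
];

/// Looks up a preset by its short provider name.
pub fn preset(name: &str) -> Option<&'static ProviderPreset> {
    PRESETS.iter().find(|p| p.name == name)
}
```

Keeping the presets as plain static data means a TOML config loaded at runtime could override individual fields without touching the defaults.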

3. Health-probe and model-list patterns

Source: providers/cheap_llm/cheap_llm_mcp/providers/openai_compat.py:health() and list_models()
→ If llm-router already has these, no port is needed. If not, add a minimal Health trait and a list_models helper.
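If the port turns out to be needed, a minimal version could look like the sketch below. This is an assumption-heavy illustration: HealthStatus, the trait methods, models_url, and StaticProvider are all hypothetical, the trait is synchronous for simplicity (a real implementation on reqwest would likely be async), and treating "list_models succeeds with at least one model" as healthy is a guess at the probe's semantics, not the behavior of the Python health() function.

```rust
// Hypothetical sketch of a minimal Health trait plus a list_models hook.
// None of these names are confirmed llm-router API.

/// Outcome of a provider health probe.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum HealthStatus {
    Healthy,
    Unhealthy(String),
}

pub trait Health {
    /// Returns the model ids the provider reports, or an error message.
    fn list_models(&self) -> Result<Vec<String>, String>;

    /// ASSUMPTION: a provider counts as healthy when model listing
    /// succeeds and returns at least one model.
    fn health(&self) -> HealthStatus {
        match self.list_models() {
            Ok(models) if !models.is_empty() => HealthStatus::Healthy,
            Ok(_) => HealthStatus::Unhealthy("provider returned no models".to_string()),
            Err(err) => HealthStatus::Unhealthy(err),
        }
    }
}

/// Builds the model-discovery URL, assuming base_url already ends in /v1.
pub fn models_url(base_url: &str) -> String {
    format!("{}/models", base_url.trim_end_matches('/'))
}

/// In-memory stand-in for a real HTTP-backed provider, useful in tests.
pub struct StaticProvider {
    pub models: Vec<String>,
}

impl Health for StaticProvider {
    fn list_models(&self) -> Result<Vec<String>, String> {
        Ok(self.models.clone())
    }
}
```

A default health method on the trait keeps each provider implementation down to one required method, at the cost of making every probe as expensive as a model listing.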

4. Python server (server.py) and CLI (cli.py)

Source: 6 @mcp.tool decorators + python -m cheap_llm_mcp entry
Do NOT port to llm-router directly. The MCP layer belongs in PhenoMCPServers (or its in-tree replacement). After provider config lands in llm-router, the MCP server should consume llm-router from Python via PyO3 bindings, or re-implement the 6 tools using the new provider configs.

What to EXCLUDE

  • bridge.go (the Go subprocess bridge): dead code, never imported by the phenotype-ops-mcp binary. Exclude.
  • The 6 cheap-llm MCP tool names (cheapllm_*) — these are MCP-layer names; the new MCP server (in PhenoMCPServers) should pick a new naming convention per the org standard (<server>_<verb>_<noun> per PhenoMCP MCP-CATALOG.md).
  • The JSONL ledger format — replace with tracing spans + thegent-jsonl if audit trail is needed.

Status

  • Source identified
  • Target agreed: llm-router crate
  • PRICING table ported to crates/llm-router/src/providers/pricing.rs
  • Minimax provider preset added
  • Kimi provider preset added
  • Fireworks provider preset added
  • Health-probe / list_models integration decision
  • Python MCP server (re-implemented in PhenoMCPServers) consumes new llm-router provider configs
  • phenotype-ops-mcp archived (after all 4 absorptions complete)

Notes

  • llm-router already wraps official Anthropic, OpenAI, and modelcontextprotocol SDKs. The OpenAI-compat path used by cheap-llm providers (Minimax, Moonshot, Fireworks all expose OpenAI-shaped APIs) can probably reuse the existing OpenAI provider.
  • The 6 cheap-llm MCP tools (cheapllm_complete_prompt, cheapllm_stream_completion, cheapllm_check_health, cheapllm_get_cost, cheapllm_list_providers, cheapllm_list_models) are an MCP-server surface, not a router concern. The MCP server (in PhenoMCPServers after absorption) should provide these as thin wrappers over llm-router calls.
