[absorption] Port cheap-llm provider configs (Minimax, Kimi, Fireworks) into llm-router #65

## Source

- Repo: `KooshaPari/phenotype-ops-mcp` (was a fork of `nanovms/ops-mcp` with `cheap-llm-mcp` merged in via PR #47 on 2026-06-11)
- Source path: `providers/cheap_llm/cheap_llm_mcp/`
- Coordination issue: [phenotype-registry#127](https://github.com/KooshaPari/phenotype-registry/issues/127)
- Sister issue: [PhenoMCPServers#28](https://github.com/KooshaPari/PhenoMCPServers/issues/28)

## Target

- `KooshaPari/phenoAI` — Rust workspace; the **`llm-router` crate** is the target for this absorption
- Per `registry/registry.yaml`: `llm-router` is "Multi-provider LLM routing with fallback, retry, and cost policy" — exact overlap with cheap-llm router

## Why this is the right home

The cheap-llm Python router is a re-implementation of what `llm-router` already does. Mapping the capabilities:

| cheap-llm capability | `providers/cheap_llm/cheap_llm_mcp/*.py` | `llm-router` equivalent |
|----------------------|------------------------------------------|-------------------------|
| Multi-provider routing with fallback | `router.py` (Router class) | Already exists |
| OpenAI-compatible HTTP client (Minimax/Moonshot/Fireworks all expose OpenAI-shaped APIs) | `providers/openai_compat.py` (OpenAICompatProvider) | Already exists (uses `reqwest` + official SDKs) |
| Token counting + cost estimation | `ledger.py` (PRICING dict + estimate_cost_usd) | Exists in cost-policy module — needs PRICING table ported |
| JSONL audit ledger | `ledger.py` (Ledger class) | Use `thegent-jsonl` crate if jsonl append-only is needed; otherwise `tracing` spans |
| TOML config | `config.py` (Config, ProviderConfig) | Use `phenotype-config` TOML loader |
| Provider health probes | `providers/openai_compat.py:health()` | Add to `llm-router` health module |
| Live `/v1/models` discovery | `providers/openai_compat.py:list_models()` | Add to `llm-router` models module |
| Retry with exponential backoff | `retry.py` (with_retry) | `ResilienceKit` — use `backoff` crate |
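
To make the retry row concrete, here is a minimal std-only sketch of retry with capped exponential backoff. It is hand-rolled for illustration and is not the `backoff` crate's API. The names `backoff_delay` and `with_retry`, the parameters, and the doubling schedule are all assumptions, since the parameters of the Python `with_retry` are not listed in this issue.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `cap`.
/// The doubling factor is an assumption, not taken from `retry.py`.
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(cap)
}

/// Runs `op` until it succeeds or `max_attempts` calls have failed.
/// `op` always runs at least once. `sleep` is injected so callers can
/// pass `std::thread::sleep` in production and a recorder in tests.
pub fn with_retry<T, E>(
    max_attempts: u32,
    base: Duration,
    cap: Duration,
    mut sleep: impl FnMut(Duration),
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error unchanged.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(attempt, base, cap));
                attempt += 1;
            }
        }
    }
}
```

Passing `std::thread::sleep` as the `sleep` argument gives blocking behaviour. An async router would need an async sleep instead, which this sketch does not cover.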

## What to port (assets to migrate)

### 1. Provider configs (PRICING table)
Source: `providers/cheap_llm/cheap_llm_mcp/ledger.py` lines 11-18
```python
PRICING: dict[str, tuple[float, float]] = {
    "MiniMax-M2": (0.30, 1.20),
    "MiniMax-M2.5": (0.30, 1.20),
    "MiniMax-M2.7": (0.30, 1.20),
    "kimi-k2-turbo-preview": (0.60, 2.50),
    "accounts/fireworks/models/kimi-k2-instruct": (1.00, 3.00),
    "_default": (1.00, 3.00),
}
```
→ Add as `crates/llm-router/src/providers/pricing.rs`, with rates in USD per 1M tokens. Include `MiniMax-M2.7`, `kimi-k2-turbo-preview`, and the Fireworks variants.
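
One possible shape for `pricing.rs`, as a std-only sketch. The rates are copied from the Python `PRICING` dict. The `(input, output)` ordering of each pair, the names `rate_for` and `DEFAULT_RATE`, and the fallback-to-default behaviour are assumptions to confirm against `estimate_cost_usd` in `ledger.py` and against the existing cost-policy module.

```rust
/// (input, output) rates in USD per 1M tokens.
/// ASSUMPTION: the tuple order is (input, output); verify in `ledger.py`.
pub type Rate = (f64, f64);

/// Entries copied from the Python `PRICING` dict, minus `_default`.
pub const PRICING: &[(&str, f64, f64)] = &[
    ("MiniMax-M2", 0.30, 1.20),
    ("MiniMax-M2.5", 0.30, 1.20),
    ("MiniMax-M2.7", 0.30, 1.20),
    ("kimi-k2-turbo-preview", 0.60, 2.50),
    ("accounts/fireworks/models/kimi-k2-instruct", 1.00, 3.00),
];

/// Mirrors the `_default` entry of the Python dict.
pub const DEFAULT_RATE: Rate = (1.00, 3.00);

/// Exact-match lookup; unknown models get `DEFAULT_RATE`.
pub fn rate_for(model: &str) -> Rate {
    PRICING
        .iter()
        .find(|(name, _, _)| *name == model)
        .map(|&(_, input, output)| (input, output))
        .unwrap_or(DEFAULT_RATE)
}

/// Estimated cost in USD for one request.
pub fn estimate_cost_usd(model: &str, input_tokens: u64, output_tokens: u64) -> f64 {
    let (input_rate, output_rate) = rate_for(model);
    (input_tokens as f64 * input_rate + output_tokens as f64 * output_rate) / 1_000_000.0
}
```

Under this sketch, any model without its own entry is billed at `DEFAULT_RATE`. That includes the default models `MiniMax-M2.7-highspeed` and `accounts/fireworks/models/minimax-m2p7`, which do not appear in the table above, so it may be worth adding explicit entries for them during the port.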

### 2. Provider default configs
Source: `providers/cheap_llm/cheap_llm_mcp/config.py` `_defaults()` function
- Minimax: `https://api.minimax.io/v1`, default model `MiniMax-M2.7-highspeed`, variants `base|highspeed|codex`
- Kimi: `https://api.moonshot.ai/v1`, default model `kimi-k2-turbo-preview`, variant `turbo`
- Fireworks: `https://api.fireworks.ai/inference/v1`, default model `accounts/fireworks/models/minimax-m2p7`, variants `minimax|kimi`
→ Add to `crates/llm-router/src/providers/{minimax,kimi,fireworks}.rs` as built-in provider presets.
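
The presets above could be expressed as constants, roughly as sketched below. The struct `ProviderPreset`, its field names, the lookup keys (`"minimax"`, `"kimi"`, `"fireworks"`), and the helper functions are hypothetical. The `/chat/completions` suffix assumes the usual OpenAI-style route and should be checked against each provider. The sketch uses one shared file for brevity, while the actual port would split it across the three modules.

```rust
/// Hypothetical preset shape; the real `llm-router` config type may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct ProviderPreset {
    pub name: &'static str,
    pub base_url: &'static str,
    pub default_model: &'static str,
    pub variants: &'static [&'static str],
}

pub const MINIMAX: ProviderPreset = ProviderPreset {
    name: "minimax",
    base_url: "https://api.minimax.io/v1",
    default_model: "MiniMax-M2.7-highspeed",
    variants: &["base", "highspeed", "codex"],
};

pub const KIMI: ProviderPreset = ProviderPreset {
    name: "kimi",
    base_url: "https://api.moonshot.ai/v1",
    default_model: "kimi-k2-turbo-preview",
    variants: &["turbo"],
};

pub const FIREWORKS: ProviderPreset = ProviderPreset {
    name: "fireworks",
    base_url: "https://api.fireworks.ai/inference/v1",
    default_model: "accounts/fireworks/models/minimax-m2p7",
    variants: &["minimax", "kimi"],
};

pub const PRESETS: &[ProviderPreset] = &[MINIMAX, KIMI, FIREWORKS];

/// Looks up a built-in preset by its (hypothetical) key.
pub fn preset(name: &str) -> Option<&'static ProviderPreset> {
    PRESETS.iter().find(|p| p.name == name)
}

/// ASSUMPTION: OpenAI-style chat route appended to the base URL.
pub fn chat_completions_url(p: &ProviderPreset) -> String {
    format!("{}/chat/completions", p.base_url.trim_end_matches('/'))
}
```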

### 3. Health-probe and model-list patterns
Source: `providers/cheap_llm/cheap_llm_mcp/providers/openai_compat.py:health()` and `list_models()`
→ If `llm-router` already has these, no port is needed. If not, add a minimal `Health` trait and a `list_models` helper.
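
If the trait does need to be added, a minimal synchronous shape might look like the sketch below. Everything here is hypothetical: `HealthStatus`, the method signatures, `StubProvider`, and `first_healthy` are illustration only, and the real probe would issue an HTTP request (for example against `/v1/models`) rather than read a flag. The behaviour of the Python `health()` is not described in this issue, so it is not reproduced.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum HealthStatus {
    Healthy,
    Unhealthy(String),
}

/// Hypothetical minimal trait: one probe plus model discovery.
pub trait Health {
    fn health(&self) -> HealthStatus;
    fn list_models(&self) -> Result<Vec<String>, String>;
}

/// In-memory stand-in for a real HTTP-backed provider, for tests only.
pub struct StubProvider {
    pub name: String,
    pub up: bool,
    pub models: Vec<String>,
}

impl Health for StubProvider {
    fn health(&self) -> HealthStatus {
        if self.up {
            HealthStatus::Healthy
        } else {
            HealthStatus::Unhealthy(format!("{} is unreachable", self.name))
        }
    }

    fn list_models(&self) -> Result<Vec<String>, String> {
        if self.up {
            Ok(self.models.clone())
        } else {
            Err(format!("{} is unreachable", self.name))
        }
    }
}

/// Returns the first provider, in priority order, whose probe succeeds.
pub fn first_healthy<P: Health>(providers: &[P]) -> Option<&P> {
    providers
        .iter()
        .find(|p| p.health() == HealthStatus::Healthy)
}
```

A helper like `first_healthy` shows how a fallback router could consult the probe before dispatching. Whether `llm-router` does this eagerly or only after a failed request is a design decision this sketch leaves open.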

### 4. Python server (`server.py`) and CLI (`cli.py`)
Source: 6 `@mcp.tool` decorators + `python -m cheap_llm_mcp` entry
→ **Do NOT port to `llm-router` directly.** The MCP layer belongs in `PhenoMCPServers` (or its in-tree replacement). After provider config lands in `llm-router`, the MCP server should consume `llm-router` from Python via PyO3 bindings, or re-implement the 6 tools using the new provider configs.

## What to EXCLUDE

- `bridge.go` (Go subprocess bridge): dead code, never imported by the `phenotype-ops-mcp` binary.
- The 6 cheap-llm MCP tool names (`cheapllm_*`): these are MCP-layer names. The new MCP server (in PhenoMCPServers) should pick new names that follow the org standard, `<server>_<verb>_<noun>`, per PhenoMCP MCP-CATALOG.md.
- The JSONL ledger format: replace with `tracing` spans, adding `thegent-jsonl` only if an append-only audit trail is needed.

## Status

- [x] Source identified
- [x] Target agreed: `llm-router` crate
- [ ] PRICING table ported to `crates/llm-router/src/providers/pricing.rs`
- [ ] Minimax provider preset added
- [ ] Kimi provider preset added
- [ ] Fireworks provider preset added
- [ ] Health-probe / list_models integration decision
- [ ] Python MCP server (re-implemented in PhenoMCPServers) consumes new `llm-router` provider configs
- [ ] `phenotype-ops-mcp` archived (after all 4 absorptions complete)

## Notes

- `llm-router` already wraps official Anthropic, OpenAI, and `modelcontextprotocol` SDKs. The OpenAI-compat path used by cheap-llm providers (Minimax, Moonshot, Fireworks all expose OpenAI-shaped APIs) can probably reuse the existing OpenAI provider.
- The 6 cheap-llm MCP tools (cheapllm_complete_prompt, cheapllm_stream_completion, cheapllm_check_health, cheapllm_get_cost, cheapllm_list_providers, cheapllm_list_models) are an MCP-server surface, not a router concern. The MCP server (in PhenoMCPServers after absorption) should provide these as thin wrappers over `llm-router` calls.
