## Summary

Add support for any OpenAI-compatible API provider as a selectable option in the Connect Your AI Agents setup screen. Users would click Login, enter their API key and endpoint, select a model, and Claude Code sessions route through that provider — no external proxy or manual configuration needed.

This includes self-hosted models — any local or on-premises inference server that exposes the standard `/v1/chat/completions` endpoint works out of the box.
## Problem

Claude Code speaks only the Anthropic API format (`/v1/messages`). This limits HolyClaude users to Anthropic (direct) or Ollama (which added Anthropic compatibility). The vast majority of third-party inference providers use the OpenAI API format (`/v1/chat/completions`), which is the de facto industry standard.

Users who want to use these providers today must manually set up an external translation proxy (e.g. LiteLLM), which runs counter to HolyClaude's "just works" philosophy.
## Proposal
HolyClaude handles the API translation internally. When a user selects an OpenAI-compatible provider, a built-in lightweight proxy converts Anthropic ↔ OpenAI format behind the scenes. The user never sees or manages this.
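To make the translation concrete, here is a minimal sketch of the two mappings the built-in proxy would perform. The function names are hypothetical; the field names follow the public Anthropic Messages and OpenAI Chat Completions schemas, and the sketch handles text content only (tool use and streaming deltas would need additional mapping).

```python
from typing import Any


def anthropic_to_openai(req: dict[str, Any], model: str) -> dict[str, Any]:
    """Map an Anthropic /v1/messages request body to an OpenAI
    /v1/chat/completions request body (text content only)."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first chat message.
    if req.get("system"):
        messages.append({"role": "system", "content": req["system"]})
    for m in req["messages"]:
        content = m["content"]
        if isinstance(content, list):  # Anthropic content blocks
            content = "".join(b["text"] for b in content if b.get("type") == "text")
        messages.append({"role": m["role"], "content": content})
    return {
        "model": model,  # the provider-side model name, not Anthropic's
        "messages": messages,
        "max_tokens": req.get("max_tokens", 1024),
        "stream": req.get("stream", False),
    }


def openai_to_anthropic(resp: dict[str, Any]) -> dict[str, Any]:
    """Map an OpenAI chat completion response back to the Anthropic
    /v1/messages response shape Claude Code expects."""
    choice = resp["choices"][0]
    stop_map = {"stop": "end_turn", "length": "max_tokens"}
    return {
        "id": resp["id"],
        "type": "message",
        "role": "assistant",
        "content": [{"type": "text", "text": choice["message"]["content"]}],
        "model": resp["model"],
        "stop_reason": stop_map.get(choice.get("finish_reason"), "end_turn"),
        "usage": {
            "input_tokens": resp["usage"]["prompt_tokens"],
            "output_tokens": resp["usage"]["completion_tokens"],
        },
    }
```

The real proxy would also need to translate streaming (SSE event shapes differ between the two APIs), but the request/response mapping above is the core of the feature.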
### Setup screen

```text
☐ Claude Code        — Not authenticated           [Login]
☐ Cursor             — Command timeout             [Login]
☐ OpenAI Codex       — Codex not configured        [Login]
☐ Gemini             — Gemini CLI not configured   [Login]
☐ OpenAI-Compatible  — Not configured              [Login]   ← NEW
```
### Login flow

1. Click **Login** next to OpenAI-Compatible
2. Enter the API base URL (e.g. `https://api.together.xyz/v1`)
3. Enter the API key
4. Select a model from the provider's catalog
5. Done — Claude Code sessions now use that provider
### Environment variables (for docker-compose users)

```yaml
environment:
  - OPENAI_COMPAT_API_KEY=your_key
  - OPENAI_COMPAT_BASE_URL=https://api.provider.com/v1
  - OPENAI_COMPAT_MODEL=meta-llama/Llama-3.3-70B-Instruct
```
## Providers this would unlock

Any provider exposing `/v1/chat/completions`, including but not limited to:
### Cloud providers

| Provider | Endpoint |
| --- | --- |
| Hyperstack AI Studio | `https://console.hyperstack.cloud/ai/api/v1` |
| Together AI | `https://api.together.xyz/v1` |
| Groq | `https://api.groq.com/openai/v1` |
| Fireworks AI | `https://api.fireworks.ai/inference/v1` |
| Mistral AI | `https://api.mistral.ai/v1` |
| DeepSeek | `https://api.deepseek.com/v1` |
### Self-hosted / local inference

| Server | Example endpoint |
| --- | --- |
| vLLM | `http://localhost:8000/v1` |
| TGI (Text Generation Inference) | `http://localhost:8080/v1` |
| Ollama (OpenAI-compat mode) | `http://localhost:11434/v1` |
| LocalAI | `http://localhost:8080/v1` |
| LM Studio | `http://localhost:1234/v1` |
| llama.cpp server | `http://localhost:8080/v1` |
| Any custom deployment | User-defined |
This means users can run open-weight models (Llama, Mistral, Qwen, DeepSeek, etc.) on their own GPU hardware — whether a local workstation, a home server, or on-prem infrastructure — and use them directly in Claude Code without any cloud dependency.
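During Login, the setup screen could validate a user-supplied endpoint before saving it. One way — a sketch, not an existing HolyClaude function — is to probe `GET {base_url}/models`, the standard OpenAI-style model-catalog endpoint that vLLM, LM Studio, Ollama, and llama.cpp all implement:

```python
import json
import urllib.request


def is_openai_compatible(base_url: str, api_key: str = "") -> bool:
    """Probe GET {base_url}/models; a 200 with a JSON body is taken as
    evidence of an OpenAI-compatible server."""
    req = urllib.request.Request(base_url.rstrip("/") + "/models")
    if api_key:  # local servers often need no key
        req.add_header("Authorization", f"Bearer {api_key}")
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            json.load(resp)  # must parse as JSON
            return resp.status == 200
    except Exception:
        return False  # unreachable, non-JSON, or auth failure
```

The same probe doubles as the source for the "Select a model" dropdown, since the response lists the provider's available model IDs.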
## Why
- "Stop configuring. Start building." — HolyClaude's own tagline. An external proxy contradicts this.
- Cost flexibility — users without an Anthropic subscription get access to open-source models on pay-per-token pricing
- Self-hosted models — teams running models on their own hardware get first-class support with zero cloud dependency
- Fine-tuned models — teams running custom models on their own infrastructure can use them in Claude Code
- One feature, many providers — a single built-in translation layer supports the entire OpenAI-compatible ecosystem
## References
- `ANTHROPIC_BASE_URL` — Claude Code environment variable for overriding the API endpoint