Feature Request: Support OpenAI-Compatible Providers for Claude Code #31

@luis15pt

Description

Summary

Add support for any OpenAI-compatible API provider as a selectable option in the Connect Your AI Agents setup screen. Users would click Login, enter their API key and endpoint, and select a model; Claude Code sessions would then route through that provider, with no external proxy or manual configuration needed.

This includes self-hosted models — any local or on-premises inference server that exposes the standard /v1/chat/completions endpoint works out of the box.

Problem

Claude Code speaks only the Anthropic API format (/v1/messages). This limits HolyClaude users to Anthropic (direct) or Ollama (which added Anthropic API compatibility). The vast majority of third-party inference providers use the OpenAI API format (/v1/chat/completions), which is the de facto industry standard.

Users who want to use these providers today must manually set up an external translation proxy (e.g. LiteLLM), which defeats the purpose of HolyClaude's "just works" philosophy.

Proposal

HolyClaude handles the API translation internally. When a user selects an OpenAI-compatible provider, a built-in lightweight proxy converts Anthropic ↔ OpenAI format behind the scenes. The user never sees or manages this.
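A rough sketch of the kind of translation such a built-in proxy would perform. Field names follow the public Anthropic Messages and OpenAI Chat Completions schemas; the helper function names are hypothetical, and real handling of streaming, tool use, and multimodal content blocks is omitted:

```python
def anthropic_to_openai(req: dict, model: str) -> dict:
    """Convert an Anthropic /v1/messages body to /v1/chat/completions."""
    messages = []
    if "system" in req:  # Anthropic keeps the system prompt top-level
        messages.append({"role": "system", "content": req["system"]})
    for m in req["messages"]:  # user/assistant roles map one-to-one
        content = m["content"]
        if isinstance(content, list):  # flatten Anthropic content blocks
            content = "".join(b.get("text", "") for b in content)
        messages.append({"role": m["role"], "content": content})
    return {
        "model": model,
        "messages": messages,
        "max_tokens": req.get("max_tokens", 1024),
        "temperature": req.get("temperature", 1.0),
    }

def openai_to_anthropic(resp: dict) -> dict:
    """Convert a chat completion response back to Anthropic's shape."""
    choice = resp["choices"][0]
    stop_map = {"stop": "end_turn", "length": "max_tokens"}
    return {
        "id": resp["id"],
        "type": "message",
        "role": "assistant",
        "content": [{"type": "text", "text": choice["message"]["content"]}],
        "stop_reason": stop_map.get(choice["finish_reason"], "end_turn"),
        "usage": {
            "input_tokens": resp["usage"]["prompt_tokens"],
            "output_tokens": resp["usage"]["completion_tokens"],
        },
    }
```

Because both formats are plain JSON over HTTP, this layer can sit in-process in front of every request Claude Code makes, which is why no external proxy is needed.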

Setup screen

☐ Claude Code           — Not authenticated              [Login]
☐ Cursor                — Command timeout                [Login]
☐ OpenAI Codex          — Codex not configured           [Login]
☐ Gemini                — Gemini CLI not configured      [Login]
☐ OpenAI-Compatible     — Not configured                 [Login]   ← NEW

Login flow

  1. Click Login next to OpenAI-Compatible
  2. Enter API base URL (e.g. https://api.together.xyz/v1)
  3. Enter API key
  4. Select a model from the provider's catalog
  5. Done — Claude Code sessions now use that provider
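Step 4 is straightforward to implement because nearly all OpenAI-compatible servers also expose GET /v1/models, which the setup screen could use to populate the model dropdown. A minimal sketch (the function names are hypothetical; the base URL and key in the usage comment are placeholders):

```python
import json
import urllib.request

def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET /v1/models request for an OpenAI-compatible server."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

def list_models(base_url: str, api_key: str) -> list[str]:
    """Fetch the provider's model catalog, e.g. for a setup dropdown."""
    # Example: list_models("https://api.together.xyz/v1", "your_key")
    with urllib.request.urlopen(build_models_request(base_url, api_key), timeout=10) as resp:
        return [m["id"] for m in json.load(resp).get("data", [])]
```

The same request doubles as a credential check: a 401 here tells the setup screen the key is wrong before any Claude Code session starts.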

Environment variables (for docker-compose users)

environment:
  - OPENAI_COMPAT_API_KEY=your_key
  - OPENAI_COMPAT_BASE_URL=https://api.provider.com/v1
  - OPENAI_COMPAT_MODEL=meta-llama/Llama-3.3-70B-Instruct
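On startup, HolyClaude could read these three variables and fail fast with a clear message when one is missing. A sketch under the assumption that the variable names above are adopted (the dataclass is hypothetical):

```python
import os
from dataclasses import dataclass

@dataclass
class OpenAICompatConfig:
    api_key: str
    base_url: str
    model: str

    @classmethod
    def from_env(cls, env=os.environ) -> "OpenAICompatConfig":
        """Load provider settings from the environment, failing fast if unset."""
        try:
            return cls(
                api_key=env["OPENAI_COMPAT_API_KEY"],
                base_url=env["OPENAI_COMPAT_BASE_URL"].rstrip("/"),
                model=env["OPENAI_COMPAT_MODEL"],
            )
        except KeyError as missing:
            raise SystemExit(
                f"OpenAI-compatible provider not configured: {missing} is unset"
            )
```

Normalizing the trailing slash on the base URL avoids the classic `/v1//chat/completions` double-slash bug when paths are appended later.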

Providers this would unlock

Any provider exposing /v1/chat/completions, including but not limited to:

Cloud providers

Provider                Endpoint
Hyperstack AI Studio    https://console.hyperstack.cloud/ai/api/v1
Together AI             https://api.together.xyz/v1
Groq                    https://api.groq.com/openai/v1
Fireworks AI            https://api.fireworks.ai/inference/v1
Mistral AI              https://api.mistral.ai/v1
DeepSeek                https://api.deepseek.com/v1

Self-hosted / local inference

Server                            Example endpoint
vLLM                              http://localhost:8000/v1
TGI (Text Generation Inference)   http://localhost:8080/v1
Ollama (OpenAI-compat mode)       http://localhost:11434/v1
LocalAI                           http://localhost:8080/v1
LM Studio                         http://localhost:1234/v1
llama.cpp server                  http://localhost:8080/v1
Any custom deployment             User-defined

This means users can run open-weight models (Llama, Mistral, Qwen, DeepSeek, etc.) on their own GPU hardware — whether a local workstation, a home server, or on-prem infrastructure — and use them directly in Claude Code without any cloud dependency.

Why

  • "Stop configuring. Start building." — HolyClaude's own tagline. An external proxy contradicts this.
  • Cost flexibility — users without an Anthropic subscription get access to open-source models on pay-per-token pricing
  • Self-hosted models — teams running models on their own hardware get first-class support with zero cloud dependency
  • Fine-tuned models — teams running custom models on their own infrastructure can use them in Claude Code
  • One feature, many providers — a single built-in translation layer supports the entire OpenAI-compatible ecosystem

Labels

enhancement (New feature or request)