
Claude Code with GLM 4.5 🧠

A proxy server that lets you use Claude Code with GLM 4.5, providing a cost-effective alternative while maintaining high-quality code assistance capabilities.

GLM 4.5 CODE banner

What This Does 🔄

This proxy acts as a bridge between the Claude Code client and GLM 4.5:

  1. It intercepts requests from Claude Code intended for Anthropic's API
  2. Transforms these requests into a format compatible with GLM 4.5
  3. Forwards them to the GLM API service
  4. Converts responses back to match Anthropic's expected format
  5. Returns them to the Claude Code client

The result: You can use Claude Code's excellent interface while leveraging the powerful GLM 4.5 model.
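
To make the translation concrete, here is a minimal sketch of the request/response conversion, assuming LiteLLM's OpenAI-compatible passthrough. The model prefix, API base URL, and simplified response shape are illustrative placeholders, not the proxy's actual code:

```python
# Minimal sketch of the translation step (not the project's actual code).
import litellm

def handle_anthropic_request(body: dict) -> dict:
    # Anthropic Messages requests carry an optional top-level "system" field;
    # OpenAI-style APIs expect the system prompt as the first message instead.
    messages = []
    if body.get("system"):
        messages.append({"role": "system", "content": body["system"]})
    messages.extend(body["messages"])

    resp = litellm.completion(
        model="openai/glm-4.5",                            # assumed LiteLLM routing prefix
        api_base="https://open.bigmodel.cn/api/paas/v4",   # assumed GLM endpoint
        messages=messages,
        max_tokens=body.get("max_tokens", 1024),
    )

    # Re-wrap the reply in Anthropic's response shape so Claude Code can read it.
    return {
        "type": "message",
        "role": "assistant",
        "content": [{"type": "text", "text": resp.choices[0].message.content}],
        "stop_reason": "end_turn",
    }
```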

Key Benefits 💰

  • Cost Savings: Use Claude Code with more affordable GLM 4.5
  • High Performance: GLM 4.5 provides excellent code assistance capabilities
  • Unified Model: All requests use the same high-quality GLM 4.5 model
  • Transparency: Use the same Claude Code interface without changing your workflow
  • Enhanced Features: Added /brainstorm command and other customizations
  • High Token Limits: GLM 4.5 supports up to 98K output tokens

Quick Start ⚡

Prerequisites

  • Python 3.9+ (see Technologies below)
  • A GLM API key from https://open.bigmodel.cn
  • Node.js and npm, for installing the Claude Code CLI

Setup 🛠️

  1. Clone this repository:

    git clone https://github.com/kevinnbass/claude-code-deepseek.git
    cd claude-code-deepseek
  2. Install Python dependencies:

    pip install -e .
    # OR with UV (recommended)
    curl -LsSf https://astral.sh/uv/install.sh | sh
    uv pip install -e .
  3. Configure your API keys:

    cp .env.example .env

    Then edit the .env file with your API key:

    GLM_API_KEY=your-glm-key  # Get from https://open.bigmodel.cn
    
  4. Start the proxy server:

    python server.py --always-cot
    # OR with UV
    uv run server.py --always-cot

    Server Options:

    • --always-cot: Recommended flag; improves reasoning by adding Chain-of-Thought prompting to every request
    • Debug mode: enable it by setting DEBUG=true in your .env file

Using with Claude Code 🖥️

  1. Install Claude Code CLI:

    npm install -g @anthropic-ai/claude-code
  2. Connect to your proxy:

    ANTHROPIC_BASE_URL=http://127.0.0.1:8082 claude
  3. Verify proxy connection: You should see the standard Claude Code welcome message with no errors.

  4. Start using it! Your Claude Code client is now connected to the cost-effective GLM 4.5 model.

Features 🌟

Model Mapping 🗺️

| Claude Model | Mapped Model | Provider | Use Case |
|---|---|---|---|
| claude-3-haiku | glm-4.5 | GLM | All tasks - high-quality responses |
| claude-3-sonnet | glm-4.5 | GLM | All tasks - high-quality responses |

Customize via environment variables:

BIG_MODEL=glm-4.5         # Model to use for Sonnet tasks
SMALL_MODEL=glm-4.5       # Model to use for Haiku tasks  
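
A rough sketch of how the mapping above could be driven by these environment variables follows; the helper name is illustrative and the proxy's actual routing code may differ:

```python
import os

BIG_MODEL = os.environ.get("BIG_MODEL", "glm-4.5")
SMALL_MODEL = os.environ.get("SMALL_MODEL", "glm-4.5")

def map_model(requested: str) -> str:
    """Map an incoming Claude model name to the configured target model."""
    if "haiku" in requested:
        return SMALL_MODEL   # e.g. claude-3-haiku -> glm-4.5
    return BIG_MODEL         # sonnet (and anything else) -> glm-4.5
```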

Core Capabilities ✨

  • Text & Code generation - High-quality responses and code
  • Function calling / Tool usage - Full tool support
  • Streaming responses - Real-time streaming
  • Multi-turn conversations - Context preservation
  • System prompts - Full system instruction support
  • Chain-of-Thought - Enhanced reasoning with --always-cot flag
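
For example, the tool-usage support above can be exercised directly against the proxy using Anthropic's Messages request shape. This is a sketch with HTTPX; the tool definition and model name are illustrative, and whether the proxy validates the x-api-key header is an assumption:

```python
import httpx

payload = {
    "model": "claude-3-sonnet-20240229",   # mapped to glm-4.5 by the proxy
    "max_tokens": 1024,
    "tools": [{
        "name": "get_weather",             # hypothetical tool, for illustration only
        "description": "Return the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
}

resp = httpx.post(
    "http://127.0.0.1:8082/v1/messages",
    json=payload,
    headers={"x-api-key": "placeholder", "anthropic-version": "2023-06-01"},
    timeout=60,
)
print(resp.json()["content"])
```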

Cost Comparison & Savings 💰

GLM 4.5 Pricing vs Anthropic Models

Pricing confirmed from https://open.bigmodel.cn/pricing and https://www.anthropic.com/pricing#api

GLM 4.5 Tiered Pricing Structure (converted from yuan to USD at roughly 0.14 USD per yuan):

| Input Length | Output Length | Input Cost (50% off) | Output Cost (50% off) | Input Cost (full price) | Output Cost (full price) |
|---|---|---|---|---|---|
| 0-32K tokens | 0-200K tokens | $0.14/1M | $0.28/1M | $0.28/1M | $0.56/1M |
| 0-32K tokens | 200K+ tokens | $0.21/1M | $0.42/1M | $0.42/1M | $0.84/1M |
| 32-128K tokens | Any | $0.28/1M | $0.56/1M | $0.56/1M | $1.12/1M |

Anthropic Pricing:

| Model | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
|---|---|---|
| Claude Opus 4.1 | $15.00 | $75.00 |
| Claude Sonnet 4 | $3.00 (≤200K) / $6.00 (>200K) | $15.00 (≤200K) / $22.50 (>200K) |

Cost Comparison for Heavy Usage (20M tokens/month - 16M input, 4M output):

| Model | Monthly Cost | Annual Cost |
|---|---|---|
| Claude Opus 4.1 | $540.00 | $6,480 |
| Claude Sonnet 4 | $108.00 | $1,296 |
| GLM 4.5 (50% off) | $3.36 | $40.32 |
| GLM 4.5 (full price) | $6.72 | $80.64 |
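
The monthly figures above can be reproduced with a few lines of arithmetic, using the lowest GLM pricing tier and the 16M input / 4M output split:

```python
def monthly_cost(input_per_m: float, output_per_m: float,
                 input_m: int = 16, output_m: int = 4) -> float:
    """Cost for input_m million input tokens plus output_m million output tokens."""
    return input_m * input_per_m + output_m * output_per_m

print(monthly_cost(15.00, 75.00))   # Claude Opus 4.1   -> 540.0
print(monthly_cost(3.00, 15.00))    # Claude Sonnet 4   -> 108.0
print(monthly_cost(0.14, 0.28))     # GLM 4.5, 50% off  -> 3.36
print(monthly_cost(0.28, 0.56))     # GLM 4.5, full     -> 6.72
```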

Massive Cost Savings

With Current 50% Off Promotion (until August 31, 2025):

  • 99.4% savings vs Claude Opus ($3.36 vs $540.00 monthly)
  • 96.9% savings vs Claude Sonnet ($3.36 vs $108.00 monthly)

Even at Full Price (after August 31, 2025):

  • 98.8% savings vs Claude Opus ($6.72 vs $540.00 monthly)
  • 93.8% savings vs Claude Sonnet ($6.72 vs $108.00 monthly)

Promotional Benefits

🎉 Current Limited-Time Offers:

  • 50% discount on all GLM models until August 31, 2025
  • Free registration credits available (check https://open.bigmodel.cn for current offers)
  • Tiered pricing that scales with usage - pay less for smaller context windows
  • Additional promotional credits available through Zhipu AI platform

Bottom line: GLM 4.5 delivers comparable performance at a fraction of the cost, making advanced AI assistance accessible for individual developers and small teams.

Limitations ⚠️

  • Context window: Smaller than Claude's 200K+ token context window
  • Multimodal content: More limited than Claude's multimodal capabilities

Technical Details 🔧

Architecture

  • API Compatibility Layer: Implements Anthropic's API endpoints
  • Model Routing Logic: Routes requests to appropriate providers
  • Request/Response Transformation: Converts between API formats
  • Custom Command System: Extends functionality with specialized commands
  • Streaming Support: Full event streaming compatibility
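
As a sketch of the streaming path, the proxy would emit Anthropic-style server-sent events. The event names below follow Anthropic's documented streaming protocol, while the generator itself is illustrative rather than the proxy's actual code:

```python
import json

def sse(event: str, data: dict) -> str:
    """Format a single server-sent event."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

def stream_text(chunks):
    """Yield Anthropic-style streaming events for a sequence of text chunks."""
    yield sse("message_start", {"type": "message_start"})
    yield sse("content_block_start", {"type": "content_block_start", "index": 0})
    for chunk in chunks:
        yield sse("content_block_delta",
                  {"type": "content_block_delta", "index": 0,
                   "delta": {"type": "text_delta", "text": chunk}})
    yield sse("content_block_stop", {"type": "content_block_stop", "index": 0})
    yield sse("message_stop", {"type": "message_stop"})
```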

Technologies

  • FastAPI: High-performance web framework
  • LiteLLM: Standardized interface to multiple LLM providers
  • HTTPX: Modern HTTP client for async API communication
  • Python 3.9+: Core language with robust async support

How It Works

  1. The proxy receives requests from Claude Code in Anthropic's format
  2. It routes all requests to GLM 4.5 via the GLM API
  3. It transforms the request format for the GLM API
  4. It optionally adds Chain-of-Thought prompting (when --always-cot flag is used)
  5. It processes the response and converts it back to Anthropic's format
  6. The Claude Code client receives responses in the expected format
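
Step 4 can be sketched as a small prompt-injection helper; the wording of the Chain-of-Thought instruction here is a placeholder, not the proxy's actual prompt:

```python
COT_PROMPT = "Think through the problem step by step before giving your final answer."

def maybe_add_cot(messages: list[dict], always_cot: bool) -> list[dict]:
    """Prepend a Chain-of-Thought instruction when --always-cot is enabled."""
    if not always_cot:
        return messages
    if messages and messages[0]["role"] == "system":
        # Extend an existing system prompt rather than replacing it.
        first = dict(messages[0])
        first["content"] = COT_PROMPT + "\n\n" + first["content"]
        return [first] + messages[1:]
    return [{"role": "system", "content": COT_PROMPT}] + messages
```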

Special Feature: /brainstorm Command ✨

This proxy extends Claude Code with custom slash commands for specialized tasks. The most powerful is /brainstorm, which uses GLM 4.5 with a specialized brainstorming system prompt.

Generate creative ideas for any code challenge or problem:

/brainstorm How can I optimize CI/CD pipelines for our microservices architecture?

The /brainstorm command:

  • Uses a specialized system prompt tailored for code-related brainstorming
  • Powered by GLM 4.5 with enhanced reasoning capabilities
  • Generates diverse, actionable ideas with implementation details and code snippets
  • Includes tradeoffs and considerations for each solution

When to use: Architecture decisions, code optimization, refactoring approaches, testing strategies, and creative problem solving.

Example use cases: API design, database optimization, refactoring approaches, testing strategies, performance optimization, and error handling patterns.
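
One way such a slash command could be handled is sketched below; the detection logic and the brainstorming prompt text are illustrative only, not the proxy's actual implementation:

```python
from typing import Optional, Tuple

# Placeholder wording; the real brainstorming system prompt lives in the proxy.
BRAINSTORM_SYSTEM_PROMPT = (
    "You are a senior engineer brainstorming diverse, actionable solutions. "
    "For each idea, include implementation notes, code snippets, and tradeoffs."
)

def route_slash_command(user_text: str) -> Tuple[str, Optional[str]]:
    """Return (cleaned user text, optional specialized system prompt)."""
    if user_text.startswith("/brainstorm"):
        return user_text[len("/brainstorm"):].strip(), BRAINSTORM_SYSTEM_PROMPT
    return user_text, None
```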

Future planned commands include /debug, /refactor, and /perf. Contributions welcome!

For a comprehensive comparison between Claude and alternative model capabilities, see CAPABILITIES.md. For detailed performance metrics and analysis, see PERFORMANCE_SUMMARY.md.

Contributing 🤝

Contributions are welcome! Please feel free to submit a Pull Request or open an Issue.

License

MIT License
