CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Z.AI Proxy - An OpenAI-compatible API proxy for Z.AI with token pool management. The core implementation (index.js) works in both Cloudflare Workers and Node.js environments.

Architecture

Single-file design: index.js contains all core logic (TokenManager, RequestHandler, ProxyHandler classes) and exports a Cloudflare Workers-compatible fetch handler.

Token pool system:

  • TokenManager maintains a configurable pool of Z.AI authentication tokens
  • Load balancing uses a least-recently-used algorithm across tokens
  • Automatic token refresh on failure (max 3 failures per token before cooling off for 5 minutes)
  • Tokens are fetched from https://chat.z.ai/api/v1/auths/ without credentials
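
The selection and cool-off rules above can be illustrated with a minimal sketch. This is not the TokenManager in index.js: the class name `TokenPoolSketch`, its method names, and the injectable clock are all hypothetical, and fetching or refreshing tokens from the upstream endpoint is deliberately left out.

```javascript
// Hypothetical sketch: names and structure are assumptions, not the code in index.js.
const MAX_FAILURES = 3;            // failures before a token cools off
const COOL_OFF_MS = 5 * 60 * 1000; // 5 minutes

class TokenPoolSketch {
  constructor(tokens, now = () => Date.now()) {
    this.now = now; // injectable clock so cool-off can be exercised in tests
    this.tick = 0;  // monotonically increasing counter used for LRU ordering
    this.entries = tokens.map((token) => ({
      token,
      lastUsed: 0,
      failures: 0,
      coolUntil: 0,
    }));
  }

  // Return the least-recently-used token that is not cooling off, or null.
  acquire() {
    const current = this.now();
    const available = this.entries.filter((e) => e.coolUntil <= current);
    if (available.length === 0) return null;
    const entry = available.reduce((a, b) => (b.lastUsed < a.lastUsed ? b : a));
    entry.lastUsed = ++this.tick;
    return entry.token;
  }

  // Record a failure; after MAX_FAILURES the token sits out for COOL_OFF_MS.
  reportFailure(token) {
    const entry = this.entries.find((e) => e.token === token);
    if (!entry) return;
    entry.failures += 1;
    if (entry.failures >= MAX_FAILURES) {
      entry.coolUntil = this.now() + COOL_OFF_MS;
      entry.failures = 0;
    }
  }
}
```

With two tokens, successive `acquire()` calls alternate between them; once one token has failed three times, only the other is handed out until five minutes have passed.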

Content cleaning:

  • cleanThinkingContent() removes <glm_block>, <details>, <summary> tags and > prefixes from thinking phase
  • cleanAnswerContent() removes <glm_block> tags from answer phase
  • Thinking/answer phases are detected from upstream SSE stream and wrapped in <think></think> tags in the output
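
A rough sketch of what the cleaning steps might look like follows. The function names carry a `Sketch` suffix because they are hypothetical; the real cleanThinkingContent() and cleanAnswerContent() may differ, in particular in whether the text inside a removed tag is kept. Here only the tags themselves are stripped and the inner text is preserved, which is an assumption.

```javascript
// Hypothetical sketch: the real functions in index.js may behave differently.

// Strip <glm_block>, <details>, <summary> tags and leading "> " quote prefixes.
// Assumption: only the tags are removed; the text between them is kept.
function cleanThinkingSketch(text) {
  return text
    .replace(/<\/?(?:glm_block|details|summary)\b[^>]*>/g, "")
    .split("\n")
    .map((line) => line.replace(/^>\s?/, ""))
    .join("\n")
    .trim();
}

// Strip <glm_block> tags from answer-phase text (inner text kept, by assumption).
function cleanAnswerSketch(text) {
  return text.replace(/<\/?glm_block\b[^>]*>/g, "");
}

// Wrap the cleaned thinking text in <think></think> ahead of the answer.
function wrapThinkingSketch(thinking, answer) {
  return `<think>${thinking}</think>${answer}`;
}
```

For example, `cleanThinkingSketch('<details type="reasoning">\n> step one\n> step two\n</details>')` returns the two lines `step one` and `step two` without tags or quote markers.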

Model mapping: OpenAI-style model names map to Z.AI upstream models:

  • glm-4.6 → GLM-4-6-API-V1
  • glm-4.5 → 0727-360B-API
  • glm-4.5-air → 0727-106B-API
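
As an illustration, the mapping can be expressed as a lookup table. The names `MODEL_MAP` and `resolveUpstreamModel` are hypothetical, and returning `null` for an unknown model is an assumption about how validation could work, not a description of index.js.

```javascript
// Hypothetical sketch: identifiers are illustrative, not taken from index.js.
const MODEL_MAP = new Map([
  ["glm-4.6", "GLM-4-6-API-V1"],
  ["glm-4.5", "0727-360B-API"],
  ["glm-4.5-air", "0727-106B-API"],
]);

// Return the upstream model ID, or null when the client-facing name is unknown.
function resolveUpstreamModel(name) {
  return MODEL_MAP.get(name) ?? null;
}
```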

Key Commands

# Run tests (requires TOKEN_POOL_SIZE)
TOKEN_POOL_SIZE=2 node test.js

# Start development server
TOKEN_POOL_SIZE=5 node server.js

# With custom port
PORT=8080 TOKEN_POOL_SIZE=3 node server.js

# Docker
docker build -t z-ai-proxy .
docker run -p 3000:3000 -e TOKEN_POOL_SIZE=10 z-ai-proxy

Configuration

Environment variables:

  • TOKEN_POOL_SIZE (default: 5) - Number of tokens in pool
  • API_KEY (default: sk-z2api-key-2024) - Client auth key
  • PORT (default: 3000) - Server port (Node.js only)
  • SHOW_THINK_TAGS (default: false) - Whether to include <think></think> tags in the output
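
One way these variables and their defaults could be read is sketched below. `loadConfigSketch` is a hypothetical helper, and the parsing rules (base-10 integers, the literal string "true" enabling SHOW_THINK_TAGS) are assumptions rather than the behavior of index.js.

```javascript
// Hypothetical sketch: the parsing rules are assumptions, not the logic in index.js.
function loadConfigSketch(env = {}) {
  // Fall back to the documented default when a value is missing or not a number.
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value ?? "", 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };
  return {
    tokenPoolSize: toInt(env.TOKEN_POOL_SIZE, 5),
    apiKey: env.API_KEY ?? "sk-z2api-key-2024",
    port: toInt(env.PORT, 3000),
    showThinkTags: String(env.SHOW_THINK_TAGS ?? "false").toLowerCase() === "true",
  };
}
```

In Node.js this would be called as `loadConfigSketch(process.env)`; in Cloudflare Workers the `env` object passed to the fetch handler could be used instead.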

Testing

test.js includes comprehensive tests:

  • Health and models endpoints
  • Content cleaning validation against the fixture.txt and fixture_test_output.txt golden files
  • Auth/model validation
  • Empty token pool handling
  • Streaming and non-streaming chat completions (require live Z.AI access)

Content cleaning tests use fixture files to verify SSE stream parsing and HTML tag removal.

Deployment

  • Cloudflare Workers: Deploy index.js directly, set TOKEN_POOL_SIZE in dashboard
  • Node.js: Use server.js as an HTTP wrapper around the index.js worker
  • Docker: Dockerfile includes health checks and default environment variables