
🚀 Toolify-code


Empower any LLM with Function Calling + Visual Admin Interface

English | 简体中文

Quick Start · Features · Documentation · Admin UI · Contributing


📊 Project Origin & Acknowledgments

Based on funnycups/toolify
Special thanks to FunnyCups for creating the excellent Toolify middleware

✨ Key Enhancements

  • 🎨 Web Admin UI - React 19 + TypeScript visual configuration
  • ⚡ Real-time Reload - Config changes take effect instantly
  • 🔄 Multi-Channel Failover - Smart priority-based routing
  • 🌐 Multi-API Support - OpenAI + Anthropic + Gemini formats (3-way conversion!)
  • 🔍 Capability Detection - Test AI provider capabilities automatically
  • 🧠 Reasoning Budget - Smart conversion between reasoning_effort and thinking tokens
  • 📱 Responsive Design - Perfect for mobile and desktop

📖 Introduction

Toolify-code is a powerful LLM function calling middleware proxy designed for enterprise applications. It injects OpenAI-compatible function calling capabilities into any large language model through prompt injection, while providing a modern web-based admin interface for visual configuration management.

✨ Key Features

🎯 Function Calling

  • 🔌 Universal Support - Inject function calling into any LLM
  • 📦 Multi-Function - Execute multiple functions concurrently
  • Flexible Trigger - Initiate calls at any stage
  • 🧠 Think Tag Safe - Seamlessly handles the model's thinking process
  • 🌊 Streaming - Full streaming support with real-time parsing
  • 🎨 Context Enhanced - Improved model understanding

🛡️ Enterprise Features

  • 🔄 Multi-Channel Failover - Smart priority-based routing
  • 🌐 Multi-API Format - OpenAI + Anthropic + Gemini (3-way conversion)
  • 🔍 Capability Detection - Automated testing of AI features
  • 🧠 Reasoning Budget - Intelligent effort/token conversion
  • 🔐 Secure Auth - JWT Token + bcrypt encryption
  • ⚡ Real-time Reload - Zero-downtime config updates
  • 📊 Visual Management - Modern web interface
  • 📱 Responsive - Works on desktop, tablet, mobile

How It Works

  1. Intercept Request: Toolify intercepts API requests (OpenAI/Anthropic/Gemini formats) that include the client's tool definitions.
  2. Format Detection: Automatically detects the source API format from the request structure.
  3. Inject Prompt: Generates a system prompt instructing the LLM how to output function calls in an XML format.
  4. Convert & Proxy: Converts the request to the target format and proxies it to the configured upstream LLM service.
  5. Parse Response: Analyzes the upstream response; if the trigger signal is detected, parses the XML structure to extract function calls (see the parsing sketch after this list).
  6. Format Response: Transforms the tool calls to match the client's expected format and sends them back.
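
As a minimal sketch of step 5, here is how a parser might pull a function call out of a model's XML output. The `<tool_call>`, `<name>`, and `<arguments>` tag names are hypothetical; the real convention is defined by the prompt that function_calling/prompt.py generates.

```python
import json
import xml.etree.ElementTree as ET

def parse_tool_calls(text):
    """Extract tool calls from a response containing XML blocks.

    The <tool_call> tag is a hypothetical convention for this sketch;
    the real tag names come from the injected system prompt.
    """
    calls = []
    start = text.find("<tool_call>")
    while start != -1:
        end = text.find("</tool_call>", start)
        if end == -1:
            break  # incomplete block (e.g. mid-stream); wait for more data
        node = ET.fromstring(text[start:end + len("</tool_call>")])
        calls.append({
            "name": node.findtext("name"),
            "arguments": json.loads(node.findtext("arguments") or "{}"),
        })
        start = text.find("<tool_call>", end)
    return calls

output = ('Looking that up:<tool_call><name>get_weather</name>'
          '<arguments>{"city": "Paris"}</arguments></tool_call>')
print(parse_tool_calls(output))
# [{'name': 'get_weather', 'arguments': {'city': 'Paris'}}]
```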

🌐 Supported API Formats

| Format | Request Endpoint | Response Format | Auth Method |
|--------|------------------|-----------------|-------------|
| OpenAI | POST /v1/chat/completions | OpenAI JSON | Authorization: Bearer |
| Anthropic | POST /v1/messages | Anthropic JSON | x-api-key header |
| Gemini | POST /v1beta/models/{model}:generateContent | Gemini JSON | key parameter |

Format Conversion Matrix:

     OpenAI ←→ Anthropic ←→ Gemini
       ↑           ↑           ↑
       └───────────┴───────────┘
            All directions supported!
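
For illustration, here is the same logical request entering through each of the three endpoints, matching the table above. This sketch assumes Toolify accepts each provider's native request schema; the keys and model names are placeholders.

```python
import requests

BASE = "http://localhost:8000"
KEY = "sk-my-secret-key-1"  # one of your configured allowed_keys

# OpenAI format: Bearer token auth
requests.post(f"{BASE}/v1/chat/completions",
              headers={"Authorization": f"Bearer {KEY}"},
              json={"model": "gpt-4o",
                    "messages": [{"role": "user", "content": "Hi"}]})

# Anthropic format: x-api-key header
requests.post(f"{BASE}/v1/messages",
              headers={"x-api-key": KEY},
              json={"model": "claude-3-5-sonnet-20241022", "max_tokens": 256,
                    "messages": [{"role": "user", "content": "Hi"}]})

# Gemini format: key query parameter
requests.post(f"{BASE}/v1beta/models/gemini-2.0-flash-exp:generateContent",
              params={"key": KEY},
              json={"contents": [{"parts": [{"text": "Hi"}]}]})
```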

🏗️ Architecture

System Architecture Diagram

graph TB
    subgraph "Client Layer"
        Client1[OpenAI SDK Client<br/>chat/completions]
        Client2[Anthropic SDK Client<br/>messages API]
    end

    subgraph "Toolify Middleware"
        subgraph "API Gateway"
            Main[main.py<br/>FastAPI Routes]
            Auth[admin_auth.py<br/>Authentication]
            Config[config_loader.py<br/>Configuration]
        end

        subgraph "Core Processing - toolify_core/"
            Models[models.py<br/>Data Models]
            
            subgraph "Format Conversion"
                Anthropic[anthropic_adapter.py<br/>Anthropic ↔ OpenAI<br/>Format Converter]
            end
            
            subgraph "Request Processing"
                MsgProc[message_processor.py<br/>Message Preprocessing]
                Router[upstream_router.py<br/>Smart Routing]
            end
            
            subgraph "Function Calling Engine"
                FC_Prompt[function_calling/prompt.py<br/>Prompt Generation]
                FC_Parser[function_calling/parser.py<br/>XML Parsing]
                FC_Stream[function_calling/streaming.py<br/>Stream Detection]
            end
            
            subgraph "Streaming & Proxy"
                StreamProxy[streaming_proxy.py<br/>Streaming Handler]
            end
            
            subgraph "Utilities"
                TokenCounter[token_counter.py<br/>Token Counting]
                ToolMap[tool_mapping.py<br/>Tool Call Mapping]
            end
        end
    end

    subgraph "Upstream Services"
        Upstream1[OpenAI API<br/>Priority: 100]
        Upstream2[Backup Service<br/>Priority: 50]
        Upstream3[Fallback Service<br/>Priority: 10]
    end

    subgraph "Admin Interface"
        Frontend[React Admin UI<br/>Configuration Management]
    end

    Client1 -->|OpenAI Format| Main
    Client2 -->|Anthropic Format| Main
    Main -->|Convert to OpenAI| Anthropic
    Anthropic --> Models
    Main --> Auth
    Main --> Models
    Models --> MsgProc
    MsgProc --> FC_Prompt
    FC_Prompt --> Router
    Router -->|Inject Prompt| Upstream1
    Router -.->|Failover| Upstream2
    Router -.->|Failover| Upstream3
    Upstream1 -->|XML Response| StreamProxy
    StreamProxy --> FC_Parser
    StreamProxy --> FC_Stream
    FC_Parser --> ToolMap
    FC_Parser -->|Parse & Convert| Main
    Main -->|Convert back| Anthropic
    Anthropic -->|Anthropic Format| Client2
    Main -->|OpenAI Format| Client1
    
    Main --> TokenCounter
    
    Frontend -->|Admin API| Auth
    Frontend --> Config

    style Main fill:#e1f5ff
    style Anthropic fill:#ffebcd
    style FC_Prompt fill:#ffe1f5
    style FC_Parser fill:#ffe1f5
    style Router fill:#f5ffe1
    style StreamProxy fill:#fff4e1

Request Flow

sequenceDiagram
    participant C as Client
    participant M as Main (FastAPI)
    participant A as Anthropic Adapter
    participant MP as Message Processor
    participant FC as Function Calling
    participant R as Router
    participant U as Upstream LLM
    participant SP as Stream Proxy

    alt OpenAI Format Request
        C->>M: POST /v1/chat/completions
        M->>MP: Preprocess messages
    else Anthropic Format Request
        C->>M: POST /v1/messages
        M->>A: Convert Anthropic → OpenAI
        A->>MP: Converted request
    end
    
    MP->>FC: Generate function prompt
    FC-->>M: Injected system prompt
    M->>R: Find upstream service
    R-->>M: Priority-sorted upstreams
    
    alt Non-Streaming
        M->>U: Forward request (OpenAI format)
        U-->>M: Complete response
        M->>FC: Parse XML if detected
        FC-->>M: Converted tool_calls
        
        alt OpenAI Client
            M-->>C: Standard OpenAI format
        else Anthropic Client
            M->>A: Convert OpenAI → Anthropic
            A-->>C: Anthropic format response
        end
    else Streaming
        M->>SP: Start streaming
        SP->>U: Stream request
        U-->>SP: Streaming chunks
        SP->>FC: Detect & parse on-the-fly
        FC-->>SP: Tool calls chunks
        
        alt OpenAI Client
            SP-->>C: OpenAI stream
        else Anthropic Client
            SP->>A: Convert stream format
            A-->>C: Anthropic stream
        end
    end

Core Module Overview

| Module | Responsibility | Key Features |
|--------|----------------|--------------|
| function_calling/ | Function call engine | Prompt injection, XML parsing, streaming detection |
| models.py | Data validation | Pydantic models for type safety |
| token_counter.py | Token management | Accurate counting for 20+ models |
| upstream_router.py | Service routing | Priority-based failover, smart retry |
| streaming_proxy.py | Stream handling | Real-time parsing, chunk management |
| anthropic_adapter.py | Format conversion | Seamless OpenAI ↔ Anthropic translation |
| message_processor.py | Message prep | Tool result formatting, validation |
| tool_mapping.py | Call tracking | TTL cache, LRU eviction |
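
As one concrete illustration, tool_mapping.py is summarized above as a TTL cache with LRU eviction. A minimal sketch of that data structure follows; the class name, defaults, and details are assumptions, not the actual implementation.

```python
import time
from collections import OrderedDict
from typing import Optional

class ToolCallCache:
    """Illustrative TTL + LRU cache in the spirit of tool_mapping.py."""

    def __init__(self, max_size=1024, ttl=600.0):
        self._data = OrderedDict()  # call_id -> (timestamp, mapping)
        self.max_size = max_size
        self.ttl = ttl

    def put(self, call_id: str, mapping: dict) -> None:
        if call_id in self._data:
            self._data.move_to_end(call_id)
        self._data[call_id] = (time.monotonic(), mapping)
        while len(self._data) > self.max_size:  # LRU eviction
            self._data.popitem(last=False)

    def get(self, call_id: str) -> Optional[dict]:
        item = self._data.get(call_id)
        if item is None:
            return None
        ts, mapping = item
        if time.monotonic() - ts > self.ttl:    # TTL expiry
            del self._data[call_id]
            return None
        self._data.move_to_end(call_id)          # mark as recently used
        return mapping
```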

Installation and Setup

You can run Toolify using Docker Compose or Python directly.

Option 1: Using Docker Compose

This is the recommended way for easy deployment.

Prerequisites

  • Docker and Docker Compose installed.

Steps

  1. Clone the repository:

    git clone https://github.com/ImogeneOctaviap794/Toolify-code.git
    cd Toolify-code
  2. Configure the application:

    Copy the example configuration file and edit it:

    cp config.example.yaml config.yaml

    Edit config.yaml. Make sure to add the admin_authentication section (required for the web admin interface):

    admin_authentication:
      username: "admin"
      password: "$2b$12$..."  # Use init_admin.py to generate
      jwt_secret: "your-secure-random-jwt-secret-min-32-chars"

    Or use the init_admin.py script to generate it automatically:

    python init_admin.py
  3. Start the service:

    docker-compose up -d --build

    This will build the Docker image (including the frontend admin interface) and start the Toolify service in detached mode.

    • API Service: http://localhost:8000
    • Admin Interface: http://localhost:8000/admin

    Note: The frontend will be compiled during Docker build, which may take a few minutes on first build.

Option 2: Using Python

Prerequisites

  • Python 3.8+

Steps

  1. Clone the repository:

    git clone https://github.com/ImogeneOctaviap794/Toolify-code.git
    cd Toolify-code
  2. Install dependencies:

    pip install -r requirements.txt
  3. Configure the application:

    Copy the example configuration file and edit it:

    cp config.example.yaml config.yaml

    Edit config.yaml to set up your upstream services, API keys, and allowed client keys.

  4. Run the server:

    python main.py

Configuration (config.yaml)

Refer to config.example.yaml for detailed configuration options.

  • server: Middleware host, port, and timeout settings.
  • upstream_services: List of upstream LLM providers (e.g., Groq, OpenAI, Anthropic).
    • Define base_url, api_key, supported models, and set one service as is_default: true.
  • client_authentication: List of allowed_keys for clients accessing this middleware.
  • features: Toggle features like logging, role conversion, and API key handling.
    • key_passthrough: Set to true to directly forward the client-provided API key to the upstream service, bypassing the configured api_key in upstream_services.
    • model_passthrough: Set to true to forward all requests directly to the upstream service named 'openai', ignoring any model-based routing rules.
    • prompt_template: Customize the system prompt used to instruct the model on how to use tools.

Usage

Once Toolify is running, configure your client application (e.g., using the OpenAI SDK) to use Toolify's address as the base_url. Use one of the configured allowed_keys for authentication.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # Toolify endpoint
    api_key="sk-my-secret-key-1"          # Your configured client key
)

# The rest of your OpenAI API calls remain the same, including tool definitions.

Toolify handles the translation between the standard OpenAI tool format and the prompt-based method required by LLMs without native function calling support.
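
For example, a standard tools request works unchanged; this continues the snippet above, and the tool schema here is illustrative:

```python
# Continuing with `client` from the snippet above.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4",  # any model your config routes
    messages=[{"role": "user", "content": "Weather in Paris?"}],
    tools=tools,
)

# Tool calls come back in the standard OpenAI shape, even when the
# upstream model has no native function calling.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```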

Multi-Channel Priority & Failover

Toolify Admin supports configuring multiple upstream channels for the same model with priority-based automatic failover, significantly improving service availability and stability.

Features

  • Priority Mechanism: Configure a priority value for each service (a higher number means higher priority, e.g. 100 > 50)
  • No Default Service Required: The is_default requirement has been removed; the highest-priority service is used as the fallback automatically
  • Automatic Failover: Automatically tries the next channel in priority order when a higher-priority channel fails
  • Smart Retry Strategy:
    • For 429 (rate limit) and 5xx (server errors): Automatically switch to backup channel
    • For 400/401/403 (client errors): No retry (would fail on other channels too)
  • Same Model Multi-Channel: Configure multiple OpenAI proxies or mirrors for the same model
  • Transparent Switching: Completely transparent to clients, handles all failover logic automatically

Configuration Example

upstream_services:
  # Primary channel - highest priority
  - name: "openai-primary"
    base_url: "https://api.openai.com/v1"
    api_key: "your-primary-key"
    priority: 100  # Highest priority (higher number = higher priority)
    models:
      - "gpt-4"
      - "gpt-4o"
      - "gpt-3.5-turbo"
  
  # Backup channel - second priority
  - name: "openai-backup"
    base_url: "https://api.openai-proxy.com/v1"
    api_key: "your-backup-key"
    priority: 50  # Second priority
    models:
      - "gpt-4"
      - "gpt-4o"
  
  # Third priority channel
  - name: "openai-fallback"
    base_url: "https://another-proxy.com/v1"
    api_key: "your-fallback-key"
    priority: 10
    models:
      - "gpt-4"

Workflow

  1. A request arrives for the gpt-4 model
  2. The system first tries the priority: 100 channel (openai-primary) - the highest priority
  3. If it returns a 429 or 5xx error, the system automatically switches to the priority: 50 channel (openai-backup)
  4. If that still fails, it continues to the priority: 10 channel (openai-fallback)
  5. An error is returned to the client only when all channels have failed (see the sketch below)
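
A minimal sketch of that retry loop, using httpx for illustration (the helper name is hypothetical; the real logic, including the streaming constraint noted below, lives in upstream_router.py):

```python
import httpx

NO_RETRY = {400, 401, 403}  # client errors: would fail on every channel

def forward_with_failover(payload, upstreams):
    """Try channels in descending priority order (illustrative helper)."""
    last_resp = None
    for service in sorted(upstreams, key=lambda s: s["priority"], reverse=True):
        resp = httpx.post(
            f"{service['base_url']}/chat/completions",
            headers={"Authorization": f"Bearer {service['api_key']}"},
            json=payload,
            timeout=60.0,
        )
        if resp.status_code in NO_RETRY:
            return resp  # client error: no retry
        if resp.status_code == 429 or resp.status_code >= 500:
            last_resp = resp  # rate limit / server error: try next channel
            continue
        return resp  # success
    return last_resp  # all channels exhausted
```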

Notes

  • Priority Rule: A higher number means higher priority (use intervals like 100/50/10 so intermediate priorities can be inserted later)
  • Streaming Requests: Due to the nature of streaming responses, only the highest-priority channel is used (a stream cannot be switched mid-response)
  • Same Priority: Multiple services may share the same priority, in which case they are tried in config file order
  • Model Matching: Only services configured with the same model participate in failover
  • is_default Deprecated: You no longer need to set a default service; the system automatically falls back to the highest-priority one

Web Admin Interface

Toolify provides a modern web-based admin interface for easy configuration management through your browser.

Initialize Admin Account

Before using the admin interface, initialize an admin account:

python init_admin.py

Follow the prompts to enter a username and password. The script will automatically generate a hashed password and JWT secret, then update your config.yaml file.

Alternatively, you can manually add the following configuration to config.yaml:

admin_authentication:
  username: "admin"
  password: "$2b$12$..."  # bcrypt hashed password
  jwt_secret: "your-secure-random-jwt-secret-min-32-chars"
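
If you prefer to generate these values by hand instead of running init_admin.py, here is a minimal sketch using the bcrypt package; it assumes any standard $2b$ bcrypt hash is accepted, as the example above suggests:

```python
import secrets
import bcrypt  # pip install bcrypt

password = "change-me"
hashed = bcrypt.hashpw(password.encode(), bcrypt.gensalt(rounds=12)).decode()
jwt_secret = secrets.token_urlsafe(32)  # comfortably over 32 characters

print(f'password: "{hashed}"')       # paste into admin_authentication
print(f'jwt_secret: "{jwt_secret}"')
```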

Access Admin Interface

  1. Start the Toolify service
  2. Open http://localhost:8000/admin in your browser
  3. Login with your admin credentials

Features

  • 📊 Server Configuration: Manage host, port, and timeout settings
  • 🔄 Upstream Services: Add, edit, and remove upstream LLM service configurations
  • 🔑 Client Authentication: Manage client API keys
  • ⚙️ Feature Configuration: Toggle feature flags and behavior parameters
  • 💾 Real-time Saving: Changes are saved directly to config.yaml
  • 🔐 Secure Authentication: JWT-based secure login system

Frontend Development

If you need to modify the admin interface frontend:

# Install dependencies
cd frontend
npm install

# Development mode (with hot reload)
npm run dev

# Build for production
npm run build

# Or use the build script
cd ..
./build_frontend.sh

Frontend Tech Stack:

  • React 19 + TypeScript
  • Vite build tool
  • Tailwind CSS + shadcn/ui component library

🔍 Capability Detection

Toolify includes a powerful capability detection system that tests AI provider features automatically (a Python call example follows the capability list below):

# Use the capability detection API
POST /api/detect/capabilities
{
  "provider": "openai",  // or "anthropic", "gemini"
  "api_key": "your-key",
  "base_url": "https://api.openai.com/v1",  // optional
  "model": "gpt-4o"  // optional
}

Detectable Capabilities:

  • ✅ Basic chat completion
  • ✅ Streaming responses
  • ✅ Function calling / Tool use
  • ✅ Vision / Image understanding
  • ✅ System messages
  • ✅ JSON mode / Structured output
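
Called from Python, the probe might look like this; the endpoint and request body are documented above, but the response shape shown here is an assumption:

```python
import requests

resp = requests.post(
    "http://localhost:8000/api/detect/capabilities",
    json={"provider": "openai", "api_key": "your-key", "model": "gpt-4o"},
    timeout=120,
)
resp.raise_for_status()
for capability, supported in resp.json().items():  # response shape assumed
    print(capability, "✅" if supported else "❌")
```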

🧠 Reasoning Budget Conversion

Toolify automatically converts reasoning parameters between different formats:

| OpenAI | Anthropic | Gemini | Description |
|--------|-----------|--------|-------------|
| reasoning_effort: "low" | thinkingBudget: 2048 | thinkingBudget: 2048 | Light reasoning |
| reasoning_effort: "medium" | thinkingBudget: 8192 | thinkingBudget: 8192 | Moderate reasoning |
| reasoning_effort: "high" | thinkingBudget: 16384 | thinkingBudget: 16384 | Deep reasoning |

Example:

# Client sends OpenAI format with reasoning_effort
{
  "model": "o1-preview",
  "reasoning_effort": "high",
  "messages": [...]
}

# Toolify automatically converts to Anthropic/Gemini
{
  "model": "claude-3-opus",
  "thinkingBudget": 16384,  # Auto-converted!
  "messages": [...]
}
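
A minimal sketch of the effort-to-budget mapping implied by the table (the helper name is hypothetical):

```python
# Budget values mirror the conversion table above.
EFFORT_TO_BUDGET = {"low": 2048, "medium": 8192, "high": 16384}

def convert_reasoning(params):
    """Translate OpenAI reasoning_effort into thinkingBudget (hypothetical helper)."""
    effort = params.pop("reasoning_effort", None)
    if effort is not None:
        params["thinkingBudget"] = EFFORT_TO_BUDGET[effort]
    return params

print(convert_reasoning({"model": "claude-3-opus", "reasoning_effort": "high"}))
# {'model': 'claude-3-opus', 'thinkingBudget': 16384}
```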

Configuration Examples

Multi-Provider Configuration

upstream_services:
  # OpenAI Service
  - name: "openai-primary"
    service_type: "openai"
    base_url: "https://api.openai.com/v1"
    api_key: "sk-..."
    priority: 100
    models: ["gpt-4", "gpt-4o", "o1-preview"]
  
  # Anthropic Claude Service
  - name: "anthropic-claude"
    service_type: "anthropic"
    base_url: "https://api.anthropic.com"
    api_key: "sk-ant-..."
    priority: 90
    models: ["claude-3-5-sonnet-20241022", "claude-3-opus"]
  
  # Google Gemini Service
  - name: "google-gemini"
    service_type: "gemini"
    base_url: "https://generativelanguage.googleapis.com/v1beta"
    api_key: "AI..."
    priority: 80
    models: ["gemini-2.0-flash-exp", "gemini-1.5-pro"]

Per-Service Function Calling Control

upstream_services:
  - name: "openai-with-injection"
    inject_function_calling: true    # Enable Toolify injection
    optimize_prompt: true             # Use optimized prompt
    
  - name: "openai-native"
    inject_function_calling: false   # Use native function calling API

Model Redirection

upstream_services:
  - name: "openai"
    model_mapping:
      gpt-4: gpt-4o           # Client requests gpt-4 → Actually use gpt-4o
      gpt-3.5: gpt-4o-mini    # Client requests gpt-3.5 → Actually use gpt-4o-mini
      claude-2: claude-3      # Works with any model name
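
The redirection itself amounts to a dictionary lookup with a fallback, roughly as follows (the helper name is hypothetical):

```python
def resolve_model(requested, model_mapping):
    """Apply model_mapping before forwarding upstream (illustrative helper)."""
    return model_mapping.get(requested, requested)

print(resolve_model("gpt-4", {"gpt-4": "gpt-4o", "gpt-3.5": "gpt-4o-mini"}))  # gpt-4o
```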

Prompt Optimization

When enabled, function calling prompts are simplified:

  • Detailed Mode (default): 50,679 chars, ~12,669 tokens (17 tools)
  • Optimized Mode: ~15,000 chars, ~4,000 tokens (17 tools)
  • Savings: 60-70% reduction in prompt tokens ✅

License

This project is licensed under the GPL-3.0-or-later license.
