Empower any LLM with Function Calling + Visual Admin Interface
Quick Start • Features • Documentation • Admin UI • Contributing
Based on funnycups/toolify
Special thanks to FunnyCups for creating the excellent Toolify middleware
- 🎨 Web Admin UI - React 19 + TypeScript visual configuration
- ⚡ Real-time Reload - Config changes take effect instantly
- 🔄 Multi-Channel Failover - Smart priority-based routing
- 🌐 Multi-API Support - OpenAI + Anthropic + Gemini formats (3-way conversion!)
- 🔍 Capability Detection - Test AI provider capabilities automatically
- 🧠 Reasoning Budget - Smart conversion between reasoning_effort and thinking tokens
- 📱 Responsive Design - Perfect for mobile and desktop
Toolify-code is a powerful LLM function-calling middleware proxy designed for enterprise applications. It injects OpenAI-compatible function calling capabilities into Large Language Models via system-prompt injection, and provides a modern web-based admin interface for visual configuration management.
- Intercept Request: Toolify intercepts API requests (OpenAI/Anthropic/Gemini formats), which include the desired tool definitions.
- Format Detection: Automatically detects the source API format based on request structure.
- Inject Prompt: Generates a specific system prompt instructing the LLM how to output function calls using XML format.
- Convert & Proxy: Converts request to target format and proxies to configured upstream LLM service.
- Parse Response: Analyzes upstream response. If trigger signal is detected, parses XML structure to extract function calls.
- Format Response: Transforms tool calls to match the client's expected format and sends back.
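The parse step above can be illustrated with a small sketch. The XML tag names (`<function_call>`, `<name>`, `<arguments>`) are hypothetical here, not necessarily the tags Toolify emits; the point is only to show how an XML block in a model response maps back to an OpenAI-style `tool_calls` entry:

```python
import re
import json
import xml.etree.ElementTree as ET

# Hypothetical model output -- the exact tag names Toolify uses may differ;
# this only illustrates the parse step.
response_text = """Sure, I'll check the weather.
<function_call>
  <name>get_weather</name>
  <arguments>{"city": "Tokyo"}</arguments>
</function_call>"""

def extract_function_calls(text: str) -> list[dict]:
    """Find XML function-call blocks in a model response and convert
    them to OpenAI-style tool_call dicts."""
    calls = []
    for block in re.findall(r"<function_call>.*?</function_call>", text, re.DOTALL):
        root = ET.fromstring(block)
        calls.append({
            "type": "function",
            "function": {
                "name": root.findtext("name"),
                "arguments": root.findtext("arguments"),
            },
        })
    return calls

calls = extract_function_calls(response_text)
args = json.loads(calls[0]["function"]["arguments"])
print(calls[0]["function"]["name"], args["city"])  # get_weather Tokyo
```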
| Format | Request Endpoint | Response Format | Auth Method |
|---|---|---|---|
| OpenAI | `POST /v1/chat/completions` | OpenAI JSON | `Authorization: Bearer` |
| Anthropic | `POST /v1/messages` | Anthropic JSON | `x-api-key` header |
| Gemini | `POST /v1beta/models/{model}:generateContent` | Gemini JSON | `key` parameter |
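Detection of the source format over these endpoints can be sketched as follows. This is an illustrative stand-in keyed on the request path and auth headers, not Toolify's actual detection logic:

```python
def detect_api_format(path: str, headers: dict) -> str:
    """Guess the source API format from the request path and auth headers.
    Illustrative sketch only; Toolify's real detection may differ."""
    if path.startswith("/v1/chat/completions"):
        return "openai"
    if path.startswith("/v1/messages") or "x-api-key" in headers:
        return "anthropic"
    if ":generateContent" in path:
        return "gemini"
    raise ValueError(f"unknown API format for {path}")

print(detect_api_format("/v1/messages", {"x-api-key": "sk-ant-..."}))  # anthropic
```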
Format Conversion Matrix:

```
OpenAI ←→ Anthropic ←→ Gemini
   ↑          ↑          ↑
   └──────────┴──────────┘
  All directions supported!
```
```mermaid
graph TB
    subgraph "Client Layer"
        Client1[OpenAI SDK Client<br/>chat/completions]
        Client2[Anthropic SDK Client<br/>messages API]
    end

    subgraph "Toolify Middleware"
        subgraph "API Gateway"
            Main[main.py<br/>FastAPI Routes]
            Auth[admin_auth.py<br/>Authentication]
            Config[config_loader.py<br/>Configuration]
        end

        subgraph "Core Processing - toolify_core/"
            Models[models.py<br/>Data Models]

            subgraph "Format Conversion"
                Anthropic[anthropic_adapter.py<br/>Anthropic ↔ OpenAI<br/>Format Converter]
            end

            subgraph "Request Processing"
                MsgProc[message_processor.py<br/>Message Preprocessing]
                Router[upstream_router.py<br/>Smart Routing]
            end

            subgraph "Function Calling Engine"
                FC_Prompt[function_calling/prompt.py<br/>Prompt Generation]
                FC_Parser[function_calling/parser.py<br/>XML Parsing]
                FC_Stream[function_calling/streaming.py<br/>Stream Detection]
            end

            subgraph "Streaming & Proxy"
                StreamProxy[streaming_proxy.py<br/>Streaming Handler]
            end

            subgraph "Utilities"
                TokenCounter[token_counter.py<br/>Token Counting]
                ToolMap[tool_mapping.py<br/>Tool Call Mapping]
            end
        end
    end

    subgraph "Upstream Services"
        Upstream1[OpenAI API<br/>Priority: 100]
        Upstream2[Backup Service<br/>Priority: 50]
        Upstream3[Fallback Service<br/>Priority: 10]
    end

    subgraph "Admin Interface"
        Frontend[React Admin UI<br/>Configuration Management]
    end

    Client1 -->|OpenAI Format| Main
    Client2 -->|Anthropic Format| Main
    Main -->|Convert to OpenAI| Anthropic
    Anthropic --> Models
    Main --> Auth
    Main --> Models
    Models --> MsgProc
    MsgProc --> FC_Prompt
    FC_Prompt --> Router
    Router -->|Inject Prompt| Upstream1
    Router -.->|Failover| Upstream2
    Router -.->|Failover| Upstream3
    Upstream1 -->|XML Response| StreamProxy
    StreamProxy --> FC_Parser
    StreamProxy --> FC_Stream
    FC_Parser --> ToolMap
    FC_Parser -->|Parse & Convert| Main
    Main -->|Convert back| Anthropic
    Anthropic -->|Anthropic Format| Client2
    Main -->|OpenAI Format| Client1
    Main --> TokenCounter
    Frontend -->|Admin API| Auth
    Frontend --> Config

    style Main fill:#e1f5ff
    style Anthropic fill:#ffebcd
    style FC_Prompt fill:#ffe1f5
    style FC_Parser fill:#ffe1f5
    style Router fill:#f5ffe1
    style StreamProxy fill:#fff4e1
```
```mermaid
sequenceDiagram
    participant C as Client
    participant M as Main (FastAPI)
    participant A as Anthropic Adapter
    participant MP as Message Processor
    participant FC as Function Calling
    participant R as Router
    participant U as Upstream LLM
    participant SP as Stream Proxy

    alt OpenAI Format Request
        C->>M: POST /v1/chat/completions
        M->>MP: Preprocess messages
    else Anthropic Format Request
        C->>M: POST /v1/messages
        M->>A: Convert Anthropic → OpenAI
        A->>MP: Converted request
    end

    MP->>FC: Generate function prompt
    FC-->>M: Injected system prompt
    M->>R: Find upstream service
    R-->>M: Priority-sorted upstreams

    alt Non-Streaming
        M->>U: Forward request (OpenAI format)
        U-->>M: Complete response
        M->>FC: Parse XML if detected
        FC-->>M: Converted tool_calls
        alt OpenAI Client
            M-->>C: Standard OpenAI format
        else Anthropic Client
            M->>A: Convert OpenAI → Anthropic
            A-->>C: Anthropic format response
        end
    else Streaming
        M->>SP: Start streaming
        SP->>U: Stream request
        U-->>SP: Streaming chunks
        SP->>FC: Detect & parse on-the-fly
        FC-->>SP: Tool calls chunks
        alt OpenAI Client
            SP-->>C: OpenAI stream
        else Anthropic Client
            SP->>A: Convert stream format
            A-->>C: Anthropic stream
        end
    end
```
| Module | Responsibility | Key Features |
|---|---|---|
| function_calling/ | Function call engine | Prompt injection, XML parsing, streaming detection |
| models.py | Data validation | Pydantic models for type safety |
| token_counter.py | Token management | Accurate counting for 20+ models |
| upstream_router.py | Service routing | Priority-based failover, smart retry |
| streaming_proxy.py | Stream handling | Real-time parsing, chunk management |
| anthropic_adapter.py | Format conversion | Seamless OpenAI ↔ Anthropic translation |
| message_processor.py | Message prep | Tool result formatting, validation |
| tool_mapping.py | Call tracking | TTL cache, LRU eviction |
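As an illustration of the `tool_mapping.py` row above, a TTL cache with LRU eviction might look like the following. This is a generic sketch of the technique, not the module's actual interface:

```python
import time
from collections import OrderedDict

class TTLLRUCache:
    """Minimal TTL + LRU cache for tracking tool-call mappings.
    Generic sketch -- not Toolify's actual tool_mapping.py interface."""

    def __init__(self, max_size: int = 1024, ttl_seconds: float = 300.0):
        self.max_size = max_size
        self.ttl = ttl_seconds
        self._data = OrderedDict()  # key -> (expires_at, value)

    def set(self, key, value):
        self._data.pop(key, None)
        self._data[key] = (time.monotonic() + self.ttl, value)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict least recently used

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        expires_at, value = item
        if time.monotonic() > expires_at:
            del self._data[key]  # expired: drop and miss
            return default
        self._data.move_to_end(key)  # mark as recently used
        return value

cache = TTLLRUCache(max_size=2)
cache.set("call_1", {"name": "get_weather"})
cache.set("call_2", {"name": "search"})
cache.set("call_3", {"name": "calc"})  # evicts call_1, the least recently used
print(cache.get("call_1"))  # None
print(cache.get("call_3")["name"])  # calc
```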
You can run Toolify using Docker Compose or Python directly.
This is the recommended way for easy deployment.
- Docker and Docker Compose installed.

- Clone the repository:

  ```bash
  git clone https://github.com/ImogeneOctaviap794/Toolify-code.git
  cd Toolify-code
  ```

- Configure the application:

  Copy the example configuration file and edit it:

  ```bash
  cp config.example.yaml config.yaml
  ```

  Edit `config.yaml`. Make sure to add an `admin_authentication` section (for the web admin interface):

  ```yaml
  admin_authentication:
    username: "admin"
    password: "$2b$12$..."  # Use init_admin.py to generate
    jwt_secret: "your-secure-random-jwt-secret-min-32-chars"
  ```

  Or use the `init_admin.py` script to generate it automatically:

  ```bash
  python init_admin.py
  ```

- Start the service:

  ```bash
  docker-compose up -d --build
  ```

  This builds the Docker image (including the frontend admin interface) and starts the Toolify service in detached mode.

- API Service: http://localhost:8000
- Admin Interface: http://localhost:8000/admin

Note: The frontend is compiled during the Docker build, which may take a few minutes on the first build.
- Python 3.8+

- Clone the repository:

  ```bash
  git clone https://github.com/funnycups/toolify.git
  cd toolify
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Configure the application:

  Copy the example configuration file and edit it:

  ```bash
  cp config.example.yaml config.yaml
  ```

  Edit `config.yaml` to set up your upstream services, API keys, and allowed client keys.

- Run the server:

  ```bash
  python main.py
  ```
Refer to config.example.yaml for detailed configuration options.
- `server`: Middleware host, port, and timeout settings.
- `upstream_services`: List of upstream LLM providers (e.g., Groq, OpenAI, Anthropic).
  - Define `base_url`, `api_key`, and supported `models`, and set one service as `is_default: true`.
- `client_authentication`: List of `allowed_keys` for clients accessing this middleware.
- `features`: Toggle features like logging, role conversion, and API key handling.
  - `key_passthrough`: Set to `true` to forward the client-provided API key directly to the upstream service, bypassing the `api_key` configured in `upstream_services`.
  - `model_passthrough`: Set to `true` to forward all requests directly to the upstream service named `openai`, ignoring any model-based routing rules.
  - `prompt_template`: Customize the system prompt that instructs the model how to use tools.
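Taken together, a minimal `config.yaml` might look like the sketch below. Values are placeholders, and `config.example.yaml` remains the authoritative reference for field names and defaults:

```yaml
server:
  host: "0.0.0.0"
  port: 8000

upstream_services:
  - name: "openai"
    base_url: "https://api.openai.com/v1"
    api_key: "sk-..."
    models: ["gpt-4o"]
    is_default: true

client_authentication:
  allowed_keys:
    - "sk-my-secret-key-1"

features:
  key_passthrough: false
  model_passthrough: false
```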
Once Toolify is running, configure your client application (e.g., using the OpenAI SDK) to use Toolify's address as the base_url. Use one of the configured allowed_keys for authentication.
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # Toolify endpoint
    api_key="sk-my-secret-key-1"          # Your configured client key
)

# The rest of your OpenAI API calls remain the same, including tool definitions.
```

Toolify handles the translation between the standard OpenAI tool format and the prompt-based method required by LLMs that lack native function calling support.
Toolify Admin supports configuring multiple upstream channels for the same model with priority-based automatic failover, significantly improving service availability and stability.
- Priority Mechanism: Configure a `priority` value for each service (higher number = higher priority, e.g. 100 > 50)
- No Default Service Required: The `is_default` requirement has been removed; the highest-priority service is used as the fallback automatically
- Automatic Failover: When a high-priority channel fails, the next-priority channel is tried automatically
- Smart Retry Strategy:
  - 429 (rate limit) and 5xx (server errors): automatically switch to a backup channel
  - 400/401/403 (client errors): no retry (the request would fail on other channels too)
- Same Model, Multiple Channels: Configure multiple OpenAI proxies or mirrors for the same model
- Transparent Switching: Completely transparent to clients; all failover logic is handled automatically
```yaml
upstream_services:
  # Primary channel - highest priority
  - name: "openai-primary"
    base_url: "https://api.openai.com/v1"
    api_key: "your-primary-key"
    priority: 100  # Highest priority (higher number = higher priority)
    models:
      - "gpt-4"
      - "gpt-4o"
      - "gpt-3.5-turbo"

  # Backup channel - second priority
  - name: "openai-backup"
    base_url: "https://api.openai-proxy.com/v1"
    api_key: "your-backup-key"
    priority: 50  # Second priority
    models:
      - "gpt-4"
      - "gpt-4o"

  # Third priority channel
  - name: "openai-fallback"
    base_url: "https://another-proxy.com/v1"
    api_key: "your-fallback-key"
    priority: 10
    models:
      - "gpt-4"
```

With this configuration, a request for the `gpt-4` model is handled as follows:

- The system first tries the `priority: 100` channel (openai-primary), the highest priority
- If it returns 429 or a 5xx error, the system automatically switches to the `priority: 50` channel (openai-backup)
- If that also fails, it continues to the `priority: 10` channel (openai-fallback)
- An error is returned to the client only when all channels have failed
- Priority Rule: Higher number = higher priority (recommend using intervals like 100/50/10 for easy insertion of intermediate priorities)
- Streaming Requests: Due to the nature of streaming responses, always uses highest priority channel (cannot switch mid-stream)
- Same Priority: Multiple services can have same priority, in which case they're tried in config file order
- Model Matching: Only services configured with the same model participate in failover
- is_default Deprecated: No longer need to set default service, system automatically uses highest priority service as fallback
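The failover rules above can be sketched in a few lines. This is a simplified illustration of the routing strategy, not the actual `upstream_router.py` code:

```python
# Simplified illustration of priority-based failover; not the actual
# upstream_router.py implementation.
NON_RETRYABLE = {400, 401, 403}  # client errors: retrying elsewhere won't help

def route_request(services, model, send):
    """Try services supporting `model` from highest to lowest priority.
    `send(service)` returns (status_code, body); 429/5xx trigger failover."""
    candidates = sorted(
        (s for s in services if model in s["models"]),
        key=lambda s: s["priority"],
        reverse=True,
    )
    last_status = None
    for service in candidates:
        status, body = send(service)
        if status == 200:
            return body
        if status in NON_RETRYABLE:
            raise RuntimeError(f"client error {status}; not retrying")
        last_status = status  # 429 or 5xx: fall through to the next channel
    raise RuntimeError(f"all channels failed (last status: {last_status})")

# Simulate: primary (priority 100) is rate-limited, backup (priority 50) is healthy.
services = [
    {"name": "openai-backup", "priority": 50, "models": ["gpt-4"]},
    {"name": "openai-primary", "priority": 100, "models": ["gpt-4"]},
]
send = lambda s: (429, None) if s["name"] == "openai-primary" else (200, "ok")
print(route_request(services, "gpt-4", send))  # ok
```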
Toolify provides a modern web-based admin interface for easy configuration management through your browser.
Before using the admin interface, initialize an admin account:
```bash
python init_admin.py
```

Follow the prompts to enter a username and password. The script automatically generates a hashed password and JWT secret, then updates your `config.yaml` file.

Alternatively, you can manually add the following configuration to `config.yaml`:

```yaml
admin_authentication:
  username: "admin"
  password: "$2b$12$..."  # bcrypt hashed password
  jwt_secret: "your-secure-random-jwt-secret-min-32-chars"
```

- Start the Toolify service
- Open http://localhost:8000/admin in your browser
- Login with your admin credentials
- 📊 Server Configuration: Manage host, port, and timeout settings
- 🔄 Upstream Services: Add, edit, and remove upstream LLM service configurations
- 🔑 Client Authentication: Manage client API keys
- ⚙️ Feature Configuration: Toggle feature flags and behavior parameters
- 💾 Real-time Saving: Changes are saved directly to `config.yaml`
- 🔐 Secure Authentication: JWT-based secure login system
If you need to modify the admin interface frontend:
```bash
# Install dependencies
cd frontend
npm install

# Development mode (with hot reload)
npm run dev

# Build for production
npm run build

# Or use the build script
cd ..
./build_frontend.sh
```

Frontend Tech Stack:
- React 19 + TypeScript
- Vite build tool
- Tailwind CSS + shadcn/ui component library
Toolify includes a powerful capability detection system to test AI provider features automatically:
```
# Use the capability detection API
POST /api/detect/capabilities
{
  "provider": "openai",                     // or "anthropic", "gemini"
  "api_key": "your-key",
  "base_url": "https://api.openai.com/v1",  // optional
  "model": "gpt-4o"                         // optional
}
```

Detectable Capabilities:
- ✅ Basic chat completion
- ✅ Streaming responses
- ✅ Function calling / Tool use
- ✅ Vision / Image understanding
- ✅ System messages
- ✅ JSON mode / Structured output
Toolify automatically converts reasoning parameters between different formats:
| OpenAI | Anthropic | Gemini | Description |
|---|---|---|---|
| `reasoning_effort: "low"` | `thinkingBudget: 2048` | `thinkingBudget: 2048` | Light reasoning |
| `reasoning_effort: "medium"` | `thinkingBudget: 8192` | `thinkingBudget: 8192` | Moderate reasoning |
| `reasoning_effort: "high"` | `thinkingBudget: 16384` | `thinkingBudget: 16384` | Deep reasoning |
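The effort-to-budget mapping in the table amounts to a simple lookup. The sketch below is illustrative; the function and constant names are hypothetical, not Toolify's internal API:

```python
# Budget values follow the conversion table above; names are hypothetical.
EFFORT_TO_BUDGET = {"low": 2048, "medium": 8192, "high": 16384}

def convert_reasoning_params(request: dict) -> dict:
    """Translate an OpenAI-style reasoning_effort into an
    Anthropic/Gemini-style thinkingBudget."""
    converted = dict(request)
    effort = converted.pop("reasoning_effort", None)
    if effort is not None:
        converted["thinkingBudget"] = EFFORT_TO_BUDGET[effort]
    return converted

out = convert_reasoning_params({"model": "o1-preview", "reasoning_effort": "high"})
print(out)  # {'model': 'o1-preview', 'thinkingBudget': 16384}
```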
Example:

```python
# Client sends OpenAI format with reasoning_effort
{
  "model": "o1-preview",
  "reasoning_effort": "high",
  "messages": [...]
}

# Toolify automatically converts to Anthropic/Gemini
{
  "model": "claude-3-opus",
  "thinkingBudget": 16384,  # Auto-converted!
  "messages": [...]
}
```

Multi-provider configuration example:

```yaml
upstream_services:
  # OpenAI Service
  - name: "openai-primary"
    service_type: "openai"
    base_url: "https://api.openai.com/v1"
    api_key: "sk-..."
    priority: 100
    models: ["gpt-4", "gpt-4o", "o1-preview"]

  # Anthropic Claude Service
  - name: "anthropic-claude"
    service_type: "anthropic"
    base_url: "https://api.anthropic.com"
    api_key: "sk-ant-..."
    priority: 90
    models: ["claude-3-5-sonnet-20241022", "claude-3-opus"]

  # Google Gemini Service
  - name: "google-gemini"
    service_type: "gemini"
    base_url: "https://generativelanguage.googleapis.com/v1beta"
    api_key: "AI..."
    priority: 80
    models: ["gemini-2.0-flash-exp", "gemini-1.5-pro"]
```

Function calling injection can be toggled per service:

```yaml
upstream_services:
  - name: "openai-with-injection"
    inject_function_calling: true   # Enable Toolify injection
    optimize_prompt: true           # Use optimized prompt

  - name: "openai-native"
    inject_function_calling: false  # Use native function calling API
```

Model names can also be remapped per service:

```yaml
upstream_services:
  - name: "openai"
    model_mapping:
      gpt-4: gpt-4o         # Client requests gpt-4 → actually use gpt-4o
      gpt-3.5: gpt-4o-mini  # Client requests gpt-3.5 → actually use gpt-4o-mini
      claude-2: claude-3    # Works with model names
```

When enabled, function calling prompts are simplified:
- Detailed Mode (default): 50,679 chars, ~12,669 tokens (17 tools)
- Optimized Mode: ~15,000 chars, ~4,000 tokens (17 tools)
- Savings: 60-70% reduction in prompt tokens ✅
This project is licensed under the GPL-3.0-or-later license.