Configuration
Complete guide to configuring RubyLLM::Agents.
The install generator creates `config/initializers/ruby_llm_agents.rb`:

```ruby
RubyLLM::Agents.configure do |config|
  # Default Settings
  config.default_model = "gemini-2.0-flash"
  config.default_temperature = 0.0
  config.default_timeout = 60
  config.default_streaming = false

  # Caching
  config.cache_store = Rails.cache

  # Execution Logging
  config.async_logging = true
  config.retention_period = 30.days

  # Anomaly Detection
  config.anomaly_cost_threshold = 5.00
  config.anomaly_duration_threshold = 10_000

  # Dashboard
  config.dashboard_auth = ->(controller) { true }
  config.dashboard_parent_controller = "ApplicationController"
end
```

As of v2.1.0, you can configure all LLM provider API keys directly in the `RubyLLM::Agents.configure` block. No separate `RubyLLM.configure` call or `ruby_llm.rb` initializer is needed:
```ruby
RubyLLM::Agents.configure do |config|
  # API Keys — forwarded to RubyLLM automatically
  config.openai_api_key = ENV["OPENAI_API_KEY"]
  config.anthropic_api_key = ENV["ANTHROPIC_API_KEY"]
  config.gemini_api_key = ENV["GOOGLE_API_KEY"]
  config.deepseek_api_key = ENV["DEEPSEEK_API_KEY"]

  # Custom endpoints
  config.openai_api_base = "https://custom-endpoint.example.com"
  config.ollama_api_base = "http://localhost:11434"

  # Agent settings
  config.default_model = "gpt-4o"
  config.default_temperature = 0.0
end
```

These attributes are set on `RubyLLM::Agents.configure` but forwarded to `RubyLLM.config` automatically:
| Attribute | Description |
|---|---|
| `openai_api_key` | OpenAI API key |
| `anthropic_api_key` | Anthropic API key |
| `gemini_api_key` | Google Gemini API key |
| `deepseek_api_key` | DeepSeek API key |
| `openrouter_api_key` | OpenRouter API key |
| `bedrock_api_key` | AWS Bedrock access key |
| `bedrock_secret_key` | AWS Bedrock secret key |
| `bedrock_session_token` | AWS Bedrock session token |
| `bedrock_region` | AWS Bedrock region |
| `mistral_api_key` | Mistral API key |
| `perplexity_api_key` | Perplexity API key |
| `xai_api_key` | xAI (Grok) API key |
| `gpustack_api_key` | GPUStack API key |
| `openai_api_base` | Custom OpenAI-compatible endpoint |
| `openai_organization_id` | OpenAI organization ID |
| `openai_project_id` | OpenAI project ID |
| `gemini_api_base` | Custom Gemini endpoint |
| `gpustack_api_base` | Custom GPUStack endpoint |
| `ollama_api_base` | Ollama server URL |
| `vertexai_project_id` | Google Vertex AI project ID |
| `vertexai_location` | Google Vertex AI location |
| `request_timeout` | HTTP request timeout for RubyLLM |
| `max_retries` | HTTP-level retries for RubyLLM |
If you previously had a separate `config/initializers/ruby_llm.rb`:

```ruby
# BEFORE (two initializers):

# config/initializers/ruby_llm.rb
RubyLLM.configure do |config|
  config.openai_api_key = ENV["OPENAI_API_KEY"]
end

# config/initializers/ruby_llm_agents.rb
RubyLLM::Agents.configure do |config|
  config.default_model = "gpt-4o"
end
```

```ruby
# AFTER (single initializer):

# config/initializers/ruby_llm_agents.rb
RubyLLM::Agents.configure do |config|
  config.openai_api_key = ENV["OPENAI_API_KEY"] # Forwarded to RubyLLM
  config.default_model = "gpt-4o"
end
```

Note: Running `rails generate ruby_llm_agents:upgrade` will suggest consolidating if it detects a separate `ruby_llm.rb` initializer.
| Option | Default | Description |
|---|---|---|
| `default_model` | `"gemini-2.0-flash"` | Default LLM model for agents |
| `default_temperature` | `0.0` | Default temperature (0.0-2.0) |
| `default_timeout` | `60` | Default request timeout in seconds |
| `default_streaming` | `false` | Enable streaming by default |
| `default_thinking` | `nil` | Default thinking config (e.g., `{ effort: :medium }`) |
```ruby
config.default_model = "gpt-4o"
config.default_temperature = 0.7
config.default_timeout = 120
config.default_streaming = true
config.default_thinking = nil # Or { effort: :medium } to enable globally
```

Configure default thinking/reasoning behavior:
```ruby
# Disable thinking by default (recommended)
config.default_thinking = nil

# Enable medium-effort thinking for all agents
config.default_thinking = { effort: :medium }

# Enable with token budget
config.default_thinking = { effort: :high, budget: 10000 }
```

See Thinking for details on supported providers and per-agent configuration.
| Option | Default | Description |
|---|---|---|
| `cache_store` | `Rails.cache` | Cache store for agent responses |

```ruby
# Use Rails default cache
config.cache_store = Rails.cache

# Use memory store
config.cache_store = ActiveSupport::Cache::MemoryStore.new

# Use Redis
config.cache_store = ActiveSupport::Cache::RedisCacheStore.new(
  url: ENV['REDIS_URL']
)
```

| Option | Default | Description |
|---|---|---|
| `async_logging` | `true` | Use background jobs for logging |
| `retention_period` | `30.days` | How long to keep execution records |
| `persist_prompts` | `true` | Store prompts in database |
| `persist_responses` | `true` | Store responses in database |
```ruby
config.async_logging = Rails.env.production?
config.retention_period = 90.days
config.persist_prompts = true
config.persist_responses = true
```

| Option | Default | Description |
|---|---|---|
| `anomaly_cost_threshold` | `5.00` | Alert if execution costs more (in dollars) |
| `anomaly_duration_threshold` | `10_000` | Alert if execution takes longer (in ms) |
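The two thresholds reduce to a simple per-execution predicate. A stand-alone sketch of that comparison (illustrative only, not the gem's internal detection logic):

```ruby
# Illustrative predicate mirroring the two thresholds above;
# not RubyLLM::Agents' actual implementation.
COST_THRESHOLD_USD    = 5.00
DURATION_THRESHOLD_MS = 10_000

anomalous = lambda do |cost_usd, duration_ms|
  cost_usd > COST_THRESHOLD_USD || duration_ms > DURATION_THRESHOLD_MS
end

anomalous.call(0.42, 12_500) # => true  (too slow)
anomalous.call(0.42, 800)    # => false
```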
```ruby
config.anomaly_cost_threshold = 1.00
config.anomaly_duration_threshold = 5_000
```

| Option | Default | Description |
|---|---|---|
| `dashboard_auth` | `->(_) { true }` | Authentication proc |
| `dashboard_parent_controller` | `"ApplicationController"` | Parent controller class |
| `dashboard_per_page` | `25` | Items per page in lists |
| `dashboard_recent_executions` | `10` | Recent executions on overview |
```ruby
# Require admin access
config.dashboard_auth = ->(controller) {
  controller.current_user&.admin?
}

# Use HTTP Basic Auth
config.dashboard_auth = ->(controller) {
  controller.authenticate_or_request_with_http_basic do |user, pass|
    user == ENV['ADMIN_USER'] && pass == ENV['ADMIN_PASS']
  end
}

# Inherit from custom controller
config.dashboard_parent_controller = "Admin::BaseController"
```

```ruby
config.budgets = {
  # Global limits
  global_daily: 100.0,
  global_monthly: 2000.0,

  # Per-agent limits
  per_agent_daily: {
    "ExpensiveAgent" => 50.0,
    "CheapAgent" => 5.0
  },
  per_agent_monthly: {
    "ExpensiveAgent" => 500.0
  },

  # Enforcement mode: :hard or :soft
  enforcement: :hard
}
```

See Budget Controls for details.
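The enforcement modes differ in what happens at the cap: a `:hard` cap blocks the call, while a `:soft` cap warns but lets it proceed. A minimal stand-alone sketch of that decision (`BudgetGuard` is a hypothetical class for illustration, not part of the gem):

```ruby
# Hypothetical sketch of :hard vs :soft enforcement semantics.
class BudgetGuard
  Result = Struct.new(:allowed, :warning)

  def initialize(daily_limit:, enforcement: :hard)
    @daily_limit = daily_limit
    @enforcement = enforcement
    @spent_today = 0.0
  end

  def record(cost)
    @spent_today += cost
  end

  def check(estimated_cost)
    return Result.new(true, nil) if @spent_today + estimated_cost <= @daily_limit

    case @enforcement
    when :hard then Result.new(false, "daily budget exceeded") # block the call
    when :soft then Result.new(true, "daily budget exceeded")  # proceed, but warn
    end
  end
end

hard = BudgetGuard.new(daily_limit: 1.0, enforcement: :hard)
hard.record(0.9)
hard.check(0.2).allowed # => false (call is blocked)

soft = BudgetGuard.new(daily_limit: 1.0, enforcement: :soft)
soft.record(0.9)
soft.check(0.2).allowed # => true (call proceeds with a warning)
```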
```ruby
config.on_alert = ->(event, payload) {
  case event
  when :budget_hard_cap
    Slack::Notifier.new(ENV['SLACK_WEBHOOK']).ping("Budget exceeded")
  when :breaker_open
    PagerDuty.trigger(summary: "Circuit breaker opened")
  end
}
```

See Alerts for details.
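Because the handler is a plain proc receiving `(event, payload)`, it can be written and unit-tested in isolation before being assigned to `config.on_alert`. A sketch that formats events into log lines (the event names match those above; the payload keys are assumptions for illustration):

```ruby
# Payload keys (:agent, :cost) are assumed for illustration;
# consult the Alerts guide for the actual payload shape.
build_alert_message = lambda do |event, payload|
  case event
  when :budget_hard_cap
    "ALERT: budget exceeded by #{payload[:agent]} ($#{payload[:cost]})"
  when :breaker_open
    "ALERT: circuit breaker opened for #{payload[:agent]}"
  else
    "ALERT: #{event}"
  end
end

build_alert_message.call(:budget_hard_cap, agent: "ExpensiveAgent", cost: 6.2)
# => "ALERT: budget exceeded by ExpensiveAgent ($6.2)"
```

In the initializer you would then assign the lambda (or a wrapper that ships the message somewhere) to `config.on_alert`.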
| Option | Default | Description |
|---|---|---|
| `async_max_concurrency` | `10` | Maximum concurrent operations for batch processing |

```ruby
# Configure concurrent operation limit
config.async_max_concurrency = 20
```

To use async features, add the async gem to your Gemfile:

```ruby
gem 'async', '~> 2.0'
```

See Async/Fiber for details.
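Conceptually, `async_max_concurrency` acts as a semaphore: at most N operations are in flight at once, and further work blocks until a slot frees up. The gem's implementation is fiber-based via the async gem, but the invariant can be sketched with plain threads and a `SizedQueue` standing in for the semaphore (illustrative only):

```ruby
# Thread-based sketch of a concurrency cap; the gem itself uses fibers.
LIMIT = 5
semaphore = SizedQueue.new(LIMIT) # holds at most LIMIT tokens
lock = Mutex.new
running = 0
peak = 0

threads = 20.times.map do
  Thread.new do
    semaphore.push(true) # blocks once LIMIT tasks are in flight
    lock.synchronize { running += 1; peak = [peak, running].max }
    sleep 0.01           # simulate an LLM call
    lock.synchronize { running -= 1 }
    semaphore.pop
  end
end
threads.each(&:join)

peak <= LIMIT # => true: concurrency never exceeded the cap
```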
Inject custom middleware into the execution pipeline for all agents:

```ruby
config.use_middleware AuditMiddleware
config.use_middleware RateLimitMiddleware, before: RubyLLM::Agents::Pipeline::Middleware::Cache
config.use_middleware TracingMiddleware, after: RubyLLM::Agents::Pipeline::Middleware::Tenant
```

Clear all registered middleware:

```ruby
config.clear_middleware!
```

See Custom Middleware for writing middleware, per-agent registration, and examples.
```ruby
# config/initializers/ruby_llm_agents.rb
RubyLLM::Agents.configure do |config|
  # Common settings
  config.default_model = "gpt-4o"

  if Rails.env.production?
    config.async_logging = true
    config.dashboard_auth = ->(c) { c.current_user&.admin? }
    config.budgets = {
      global_daily: 100.0,
      enforcement: :hard
    }
  else
    config.async_logging = false
    config.dashboard_auth = ->(_) { true }
  end
end
```

Agents can override defaults:
```ruby
class MyAgent < ApplicationAgent
  model "claude-3-5-sonnet"  # Override default model
  temperature 0.7            # Override default temperature
  timeout 120                # Override default timeout
  cache 1.hour               # Enable caching
  streaming true             # Enable streaming
end
```

| Option | Type | Default | Description |
|---|---|---|---|
| `default_model` | String | `"gemini-2.0-flash"` | Default LLM model |
| `default_temperature` | Float | `0.0` | Default temperature (0.0-2.0) |
| `default_timeout` | Integer | `60` | Request timeout in seconds |
| `default_streaming` | Boolean | `false` | Enable streaming by default |
| `default_tools` | Array | `[]` | Default tools for all agents |
| `default_thinking` | Hash | `nil` | Default thinking config (e.g., `{effort: :medium}`) |
| `default_retries` | Hash | `{max: 0}` | Default retry configuration |
| `default_fallback_models` | Array | `[]` | Default fallback models |
| `default_total_timeout` | Integer | `nil` | Default total timeout |
| `default_embedding_model` | String | `"text-embedding-3-small"` | Default embedding model |
| `default_embedding_dimensions` | Integer | `nil` | Default embedding dimensions |
| `default_embedding_batch_size` | Integer | `100` | Default batch size for embeddings |
| `track_embeddings` | Boolean | `true` | Track embedding executions |
| `default_transcription_model` | String | `"whisper-1"` | Default transcription model |
| `track_transcriptions` | Boolean | `true` | Track transcription executions |
| `default_tts_provider` | Symbol | `:openai` | Default TTS provider |
| `default_tts_model` | String | `"tts-1"` | Default TTS model |
| `default_tts_voice` | String | `"nova"` | Default TTS voice |
| `track_speech` | Boolean | `true` | Track TTS executions |
| `elevenlabs_api_key` | String | `nil` | ElevenLabs API key (required for `:elevenlabs` provider) |
| `elevenlabs_api_base` | String | `"https://api.elevenlabs.io"` | ElevenLabs API base URL |
| `tts_model_pricing` | Hash | `{}` | Custom TTS pricing overrides per 1K characters |
| `default_tts_cost` | Float | `0.015` | Fallback cost per 1K characters for unknown models |
| `transcription_model_pricing` | Hash | `{}` | Custom transcription pricing overrides per minute |
| `pricing_cache_ttl` | Integer | `86400` | Multi-source pricing cache TTL in seconds (24h) |
| `portkey_pricing_enabled` | Boolean | `true` | Enable Portkey AI as pricing source |
| `openrouter_pricing_enabled` | Boolean | `true` | Enable OpenRouter as pricing source |
| `helicone_pricing_enabled` | Boolean | `true` | Enable Helicone as pricing source |
| `llmpricing_enabled` | Boolean | `true` | Enable LLM Pricing AI as pricing source |
| `async_logging` | Boolean | `true` | Log executions via background job |
| `retention_period` | Duration | `30.days` | Execution record retention |
| `cache_store` | Cache | `Rails.cache` | Custom cache store |
| `budgets` | Hash | `nil` | Budget configuration |
| `on_alert` | Proc | `nil` | Alert handler callback |
| `persist_prompts` | Boolean | `true` | Store prompts in executions |
| `persist_responses` | Boolean | `true` | Store responses in executions |
| `multi_tenancy_enabled` | Boolean | `false` | Enable multi-tenancy |
| `tenant_resolver` | Proc | `-> { nil }` | Returns current tenant ID |
| `dashboard_parent_controller` | String | `"ApplicationController"` | Dashboard controller parent |
| `dashboard_auth` | Proc | `->(_) { true }` | Custom auth lambda |
| `dashboard_per_page` | Integer | `25` | Dashboard records per page |
| `dashboard_recent_executions` | Integer | `10` | Dashboard recent executions |
| `anomaly_cost_threshold` | Float | `5.00` | Cost anomaly threshold (USD) |
| `anomaly_duration_threshold` | Integer | `10_000` | Duration anomaly threshold (ms) |
| `job_retry_attempts` | Integer | `3` | Background job retries |
| `async_max_concurrency` | Integer | `10` | Max concurrent operations for async batch |
| Option | Default | Description |
|---|---|---|
| `default_image_model` | `"gpt-image-1"` | Default image generation model |
| `default_image_size` | `"1024x1024"` | Default image size |
| `default_image_quality` | `"standard"` | Default quality (standard, hd) |
| `default_image_style` | `"vivid"` | Default style (vivid, natural) |
| `track_image_generation` | `true` | Track image operations |
| Option | Default | Description |
|---|---|---|
| `tool_result_max_length` | `10_000` | Max chars for tool results |
| Option | Default | Description |
|---|---|---|
| `root_directory` | `"agents"` | Root directory under `app/` |
| `root_namespace` | `nil` | Module namespace for agents |
For testing:

```ruby
# Reset to defaults
RubyLLM::Agents.reset_configuration!

# Then reconfigure
RubyLLM::Agents.configure do |config|
  config.async_logging = false
end
```

- Installation - Initial setup and API key configuration
- Budget Controls - Cost limits
- Alerts - Notification setup
- Multi-Tenancy - Tenant isolation
- Async/Fiber - Concurrent execution
- Dashboard - Monitoring UI
- Custom Middleware - Pipeline middleware injection
- Migration - Upgrading between versions