feat: Add support for grok-4-fast model via LiteLLM proxy by johnbean393 · Pull Request #154 · bytebot-ai/bytebot

johnbean393 · 2025-10-12T17:40:45Z

Description

This PR adds support for the new grok-4-fast model via OpenRouter through the bytebot-llm-proxy (LiteLLM).

Changes

Removed max_tokens parameter from proxy service Chat Completion requests
Removed reasoning_effort parameter from proxy service Chat Completion requests

These model-specific parameters were causing compatibility issues with the grok-4-fast model. With them removed, the proxy service works with grok-4-fast and other models that don't support these parameters, while LiteLLM handles model-specific parameter mapping automatically.
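As a rough illustration of the change described above, the sketch below strips the two parameters before a request is forwarded. It is not the code from this PR: `ChatCompletionParams`, `STRIPPED_PARAMS`, and `buildProxyRequest` are hypothetical names, and the request shape is only assumed to follow the OpenAI-style Chat Completions format.

```typescript
// Hypothetical request shape, assumed to follow the OpenAI-style
// Chat Completions format. Not taken from the bytebot codebase.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatCompletionParams {
  model: string;
  messages: ChatMessage[];
  max_tokens?: number;
  reasoning_effort?: string;
  [key: string]: unknown;
}

// The two parameters this PR stops sending through the proxy.
const STRIPPED_PARAMS = ["max_tokens", "reasoning_effort"] as const;

// Returns a copy of the request without the model-specific parameters,
// leaving any parameter mapping to the proxy. The input is not mutated.
function buildProxyRequest(
  params: ChatCompletionParams,
): Record<string, unknown> {
  const request: Record<string, unknown> = { ...params };
  for (const key of STRIPPED_PARAMS) {
    delete request[key];
  }
  return request;
}
```

Calling `buildProxyRequest` with a payload that sets `max_tokens` and `reasoning_effort` returns the same payload without those two keys; every other field passes through unchanged.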

Testing

Verified that grok-4-fast model works correctly through the LiteLLM proxy
Confirmed backward compatibility with existing models

Remove max_tokens and reasoning_effort parameters from proxy service to improve compatibility with grok-4-fast model through OpenRouter. These model-specific parameters were causing issues with the new model.

- Add proxy.model-info.ts to dynamically fetch context windows from OpenRouter API
- Update tasks.controller.ts to use async extractContextWindow function
- Replace hardcoded 128K context window with dynamic values from OpenRouter
- Implement caching layer (1-hour TTL) to minimize API calls
- Fix Dockerfile to properly handle Prisma in Alpine Linux

Benefits:
- Grok 4 Fast now correctly reports 2M token context window
- Claude Sonnet 4.5 reports 1M tokens instead of 200K
- Gemini 2.5 models report 1048576 tokens
- All models automatically get accurate, up-to-date context windows
- Improves agent performance by preventing premature summarization

Fixes context window inaccuracies by prioritizing:
1. LiteLLM model_info (when available)
2. OpenRouter API context_length (when LiteLLM returns null)
3. Default fallback (128K)

Related to Grok 4 Fast support
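The lookup priority and 1-hour cache described in this commit message could be sketched roughly as follows. This is not the contents of proxy.model-info.ts: `createContextWindowResolver`, `Lookup`, and the injected `now` clock are hypothetical, the two lookups are passed in as functions rather than making real HTTP calls, and treating "128K" as 128,000 tokens is an assumption.

```typescript
// Assumption: "128K" is taken to mean 128,000 tokens.
const DEFAULT_CONTEXT_WINDOW = 128_000;
const CACHE_TTL_MS = 60 * 60 * 1000; // 1-hour TTL

// A lookup returns the context window for a model, or null if unknown.
type Lookup = (model: string) => Promise<number | null>;

interface CacheEntry {
  value: number;
  expiresAt: number;
}

// Builds a resolver that tries LiteLLM first, then OpenRouter, then the
// default, and caches the result per model for CACHE_TTL_MS.
function createContextWindowResolver(
  fromLiteLLM: Lookup,
  fromOpenRouter: Lookup,
  now: () => number = Date.now,
): (model: string) => Promise<number> {
  const cache = new Map<string, CacheEntry>();

  return async function resolveContextWindow(model: string): Promise<number> {
    const cached = cache.get(model);
    if (cached && cached.expiresAt > now()) {
      return cached.value;
    }

    // Priority: LiteLLM model_info, then OpenRouter context_length,
    // then the default fallback.
    const value =
      (await fromLiteLLM(model)) ??
      (await fromOpenRouter(model)) ??
      DEFAULT_CONTEXT_WINDOW;

    cache.set(model, { value, expiresAt: now() + CACHE_TTL_MS });
    return value;
  };
}
```

In this sketch the resolved value is cached, including the default fallback, so a model unknown to both sources is only retried after the TTL expires. Whether the PR caches at that level or only caches the OpenRouter response is not stated in the description.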

johnbean393 added 2 commits October 12, 2025 13:39

feat: add support for grok-4-fast model via LiteLLM proxy

0b38380


johnbean393 wants to merge 2 commits into bytebot-ai:main from johnbean393:feat/support-grok-4-fast-model
