# llm-batch-translate

Batch file translation tool using LLM APIs with litellm.

## Features

- Translate single files or entire directories
- Support for 25+ languages
- 100+ LLM providers via litellm (Anthropic, OpenAI, DashScope, DeepSeek, etc.)
- Configurable context window for large files
- Chunk-based translation for long documents
- Preserves formatting, code blocks, and structure
- CLI and Python API
## Installation

```bash
pip install git+https://github.com/xjsongphy/llm_batch_translate.git
```

Or install from source:

```bash
git clone https://github.com/xjsongphy/llm_batch_translate.git
cd llm-batch-translate
pip install -e .
```

## Quick Start

```bash
# Run setup command (copies example config to default location)
llm-translate setup

# Or manually:
# mkdir -p ~/.config/llm-batch-translate
# cp .config/example.config.yml ~/.config/llm-batch-translate/config.yml
# Edit the file and add your API key
```

Edit `~/.config/llm-batch-translate/config.yml`:
```yaml
llm:
  # Your API key
  api_key: "your-api-key-here"
  # API base URL (optional - auto-detected from model)
  api_url: ""
  # Model (use provider prefix for non-Anthropic)
  model: "claude-3-5-sonnet-20241022"
  max_tokens: 8192
  context_window: 200000

translation:
  source_lang: "en"
  target_lang: "zh-cn"
```

## Supported Providers

| Provider | Model Format | Example |
|---|---|---|
| Anthropic | `claude-*` (no prefix) | `claude-3-5-sonnet-20241022` |
| OpenAI | `openai/*` | `openai/gpt-4` |
| DashScope/Qwen | `openai/*` | `openai/qwen-turbo-latest` |
| DeepSeek | `deepseek/*` | `deepseek/deepseek-chat` |
See litellm providers for all 100+ options.
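As an illustration of the prefix convention in the table, here is a tiny hypothetical helper (`litellm_model` and `PROVIDER_PREFIXES` are not part of this package) that assembles the `model:` config value:

```python
# Hypothetical helper (not part of llm-batch-translate): build a litellm
# model string following the prefix convention in the table above.
PROVIDER_PREFIXES = {
    "anthropic": "",         # Anthropic models take no prefix
    "openai": "openai/",
    "dashscope": "openai/",  # DashScope/Qwen uses the OpenAI-compatible route
    "deepseek": "deepseek/",
}

def litellm_model(provider: str, model: str) -> str:
    """Return the string to put in the `model:` config field."""
    return PROVIDER_PREFIXES[provider.lower()] + model
```

For example, `litellm_model("deepseek", "deepseek-chat")` yields `"deepseek/deepseek-chat"`.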
## Usage

```bash
# Single file
llm-translate input.txt -o output.txt -s en -t zh-cn

# Multiple files to directory
llm-translate *.md -d translated/

# Directory recursively
llm-translate ./docs/ -d translated_docs/ -r
```

## Configuration

The tool searches for config files in this order:

1. `~/.config/llm-batch-translate/config.yml`
2. `~/.llm-batch-translate.yml`
3. `./config.yml` (current directory)
```yaml
llm:
  # Required: Your API key
  api_key: "sk-ant-api03-xxxxxxxxxxxx"
  # Optional: API URL
  api_url: "https://api.anthropic.com"
  # Optional: Model name
  model: "claude-3-5-sonnet-20241022"
  # Optional: Max output tokens
  max_tokens: 8192
  # Optional: Request timeout
  timeout: 60
  # Optional: Context window size
  context_window: 200000
  # Optional: Reserved tokens for output
  reserved_output_tokens: 16384

translation:
  # Optional: Default source language
  source_lang: "en"
  # Optional: Default target language
  target_lang: "zh-cn"
  # Optional: Custom prompt template
  prompt_template: |
    Please translate from {source} to {target}:
    {text}
  # Optional: Chunk settings for large files
  chunk_size: 50000
  chunk_overlap: 500
```

## CLI Reference

```bash
# Setup: Copy example config to default location
llm-translate setup

# Translate files
llm-translate translate [OPTIONS] FILES...

# List supported languages
llm-translate languages

# Show configuration
llm-translate config

# Show config file locations
llm-translate config --show-path
```

### `translate`

```bash
llm-translate translate input.txt -o output.txt -s en -t zh-cn
```
Options:

```
-s, --source TEXT       Source language code
-t, --target TEXT       Target language code
-o, --output PATH       Output file (single input)
-d, --output-dir PATH   Output directory (multiple inputs)
-r, --recursive         Process directories recursively
-p, --pattern TEXT      File pattern (default: *)
-e, --extensions TEXT   File extensions to include
-c, --config PATH       Path to config file
```

## Examples

```bash
# Setup: Initialize config file
llm-translate setup

# Translate English to Chinese
llm-translate translate README.md -o README_zh.md -s en -t zh-cn

# Translate all markdown files in a directory
llm-translate translate ./docs/ -d docs_ja/ -s en -t ja -e .md -r

# Use custom config file
llm-translate translate -c custom.yml file.txt -o out.txt

# Check supported languages
llm-translate languages

# View current config
llm-translate config
```

## Supported Languages

| Code | Language | Code | Language |
|---|---|---|---|
| `en` | English | `zh` | Chinese |
| `zh-cn` | Simplified Chinese | `zh-tw` | Traditional Chinese |
| `ja` | Japanese | `ko` | Korean |
| `es` | Spanish | `fr` | French |
| `de` | German | `it` | Italian |
| `pt` | Portuguese | `ru` | Russian |
| `ar` | Arabic | `hi` | Hindi |
| `vi` | Vietnamese | `th` | Thai |
| `nl` | Dutch | `pl` | Polish |
| `tr` | Turkish | `uk` | Ukrainian |
| `cs` | Czech | `sv` | Swedish |
| `da` | Danish | `fi` | Finnish |
| `no` | Norwegian | | |
Run `llm-translate languages` for the full list.
## Python API

```python
from llm_batch_translate import Config, Translator

# Load configuration
config = Config.from_file("config.yml")

# Create translator
translator = Translator(config)

# Translate text
result = translator.translate("Hello, world!", source="en", target="zh-cn")
print(result)

# Translate file
result = translator.translate_file("input.txt", "output.txt", source="en", target="ja")
if result.success:
    print(f"Translated {result.chunks_count} chunks")

# Translate directory
results = translator.translate_directory(
    "./docs/",
    "./docs_translated/",
    source="en",
    target="zh-cn",
    recursive=True,
    extensions=[".md", ".txt"],
)
```

## Context Window

The context window determines how much text can be processed at once.
The effective input size is calculated as:

```
max_input = context_window - max_tokens - reserved_output_tokens
```

For models with larger context windows, adjust the values in the config file:

```yaml
llm:
  context_window: 200000
  max_tokens: 16384
  reserved_output_tokens: 16384
```

## Supported File Types

- Text files (`.txt`)
- Markdown (`.md`)
- Source code (`.py`, `.js`, `.ts`, `.html`, `.css`, etc.)
- Config files (`.json`, `.yaml`, `.toml`, etc.)
- Any plain text format
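Returning to the effective-input formula from the Context Window section: it is plain arithmetic, and with the larger-window values shown there it works out as follows.

```python
# Worked example of: max_input = context_window - max_tokens - reserved_output_tokens,
# using the example values from the Context Window section above.
context_window = 200_000
max_tokens = 16_384
reserved_output_tokens = 16_384

max_input = context_window - max_tokens - reserved_output_tokens
print(max_input)  # 167232 tokens of the window remain for input text
```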
## Troubleshooting

**`FileNotFoundError: No config file found`**

Create a config file:

```bash
# Use setup command
llm-translate setup

# Or manually:
# mkdir -p ~/.config/llm-batch-translate
# cp .config/example.config.yml ~/.config/llm-batch-translate/config.yml
```

**`ValueError: api_key is required in config file`**

Edit your config file and add the API key.

For large files, the text is automatically chunked. Adjust the chunk size in the config:

```yaml
translation:
  chunk_size: 100000
  chunk_overlap: 1000
```

## License

MIT License - see LICENSE file for details.