Merged
README.md: 42 changes (25 additions, 17 deletions)
@@ -17,7 +17,7 @@

---

Works with any OpenAI-compatible endpoint — local llama.cpp servers, OpenAI, Groq, DeepSeek, OpenRouter, and more.
Works with any OpenAI-compatible endpoint — local servers, OpenAI, Groq, DeepSeek, OpenRouter, and more.

Two interfaces share the same pipeline:

@@ -26,8 +26,10 @@ Two interfaces share the same pipeline:

## Highlights

- **Batched translation** — sends ~15 subtitle blocks at a time so small models don't drift, skip short lines, or merge split sentences.
- **Strict validation** — every batch is checked for block count, numbering, and unchanged timestamps; failures retry with back-off.
- **Batched translation** — sends ~10 subtitle blocks at a time so small models don't drift, skip short lines, or merge split sentences.
- **Cast & register prepass** — a pre-scan extracts characters, recurring terms, and the written register so every batch translates names and formality consistently.
- **Strict validation** — every batch is checked for block count, numbering, and unchanged timestamps; failures retry with back-off and recursively split on repeated failure.
- **Auto-detect source language** — omit the source and the model infers it from the text, so mixed-language batches translate to a single target cleanly.
- **Any OpenAI-compatible provider** — local or cloud, no vendor lock-in.
- **Parallelism** — translate many batches per file and many files at once.
- **Live progress** — per-file progress bars in the web app, an in-place status line (elapsed / ETA / throughput) in the CLI.
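
The batching bullet above comes down to fixed-size chunking before anything touches the model. A minimal sketch; `make_batches` is a hypothetical helper illustrating the documented default of 10, not code from this PR:

```python
def make_batches(blocks: list, batch_size: int = 10) -> list[list]:
    """Split parsed subtitle blocks into batches of at most batch_size.

    batch_size=10 mirrors the documented default; small batches keep
    weak models from drifting, skipping, or merging lines mid-file.
    """
    return [blocks[i:i + batch_size] for i in range(0, len(blocks), batch_size)]
```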
@@ -40,7 +42,7 @@ npm install
ng serve
```

Open http://localhost:4200, drop in one or more subtitle files, pick source/target languages and a provider, and download translated files individually or as a ZIP.
Open http://localhost:4200, drop in one or more subtitle files, pick a target language (source defaults to Auto-detect) and a provider, and download translated files individually or as a ZIP.

## Command line

@@ -49,16 +51,20 @@ cd cli

# Option A — pip
pip install -r requirements.txt
python translora.py movie.srt -s English -t Arabic \
python translora.py movie.srt -t Arabic \
--api-url http://127.0.0.1:8080/v1/chat/completions

# Option B — uv (faster, auto-manages the venv)
uv sync
uv run translora.py movie.srt -s English -t Arabic \
uv run translora.py movie.srt -t Arabic \
--api-url http://127.0.0.1:8080/v1/chat/completions

# Explicit source language (skip auto-detect)
python translora.py movie.srt -s English -t Arabic \
--api-url http://127.0.0.1:8080/v1/chat/completions

# Cloud provider, whole folder in parallel
python translora.py ./subs/ -s English -t Arabic \
# Cloud provider, whole folder in parallel (source auto-detected per file)
python translora.py ./subs/ -t Arabic \
--api-url https://api.openai.com/v1/chat/completions \
--api-key sk-... --model gpt-4.1-mini -c 10 -pf 3
```
@@ -67,28 +73,31 @@ Frequently used flags:

| Flag | Description |
| --- | --- |
| `-s, --source` / `-t, --target` | Source and target language names |
| `-t, --target` | Target language name (required) |
| `-s, --source` | Source language (optional; omit to auto-detect — useful for mixed-language batches) |
| `--api-url` | OpenAI-compatible `/v1/chat/completions` endpoint |
| `--api-key` | API key; use `none` for local servers |
| `--model` | Model name (optional for local) |
| `--batch-size` | Subtitle blocks per batch (default **15**) |
| `-c, --concurrency` | Parallel batches per file (default **1**) |
| `--batch-size` | Subtitle blocks per batch (default **10**) |
| `-c, --concurrency` | Parallel batches per file (default **1** — raise for cloud providers) |
| `-pf, --parallel-files` | Files translated in parallel (default **1**) |
| `--max-retries` | Retries per batch (default **5**) |
| `--force` | Re-translate even if the output exists |
| `-v, --verbose` | Show retry/validation warnings (hidden by default) |
| `-o, --output` | Output path (single file only) |

Set `NO_COLOR=1` to disable ANSI colors; output auto-falls back to plain lines when piped.
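
These flags map one-to-one onto the `TranslationConfig` dataclass that appears later in this diff, so a run can also be configured from Python. A sketch under that assumption; the import path is inferred from the `cli/core/` layout and the endpoint values are placeholders:

```python
from core.config import TranslationConfig  # path assumed from cli/core/config.py

cfg = TranslationConfig(
    source_lang="",     # "" means auto-detect, same as omitting -s
    target_lang="Arabic",
    api_url="http://127.0.0.1:8080/v1/chat/completions",
    api_key="none",     # local servers need no key
    batch_size=10,      # --batch-size
    concurrency=4,      # -c, worth raising for cloud providers
    max_retries=5,      # --max-retries
)
```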

## How it works

Small and medium LLMs have known failure modes on long subtitle files: skipping one-word blocks (`"Oh!"`, `"Hmm."`), merging sentences split across two blocks for timing, and drifting mid-file. TransLora defends against that with a five-step pipeline:
Small and medium LLMs have known failure modes on long subtitle files: skipping one-word blocks (`"Oh!"`, `"Hmm."`), merging sentences split across two blocks for timing, drifting mid-file, and switching dialect or formality between batches. TransLora defends against that with a six-step pipeline:

1. Parse the subtitle file into numbered blocks with timestamps (SRT, VTT, ASS, SSA, SBV, SUB).
2. Split blocks into batches small enough that the model can't drift.
3. Send each batch with a structure-preserving system prompt.
4. Validate the response: block count in = out, numbers and timestamps untouched.
5. Retry failed batches up to `--max-retries` before flagging the file, then stitch the validated batches back in order.
2. Pre-scan the file with one extra LLM call to extract the cast, recurring terms, and the written register (e.g. Modern Standard Arabic, peninsular Spanish, polite Japanese). The relevant slice is attached to each batch so names and formality stay consistent across the whole file.
3. Split blocks into batches small enough that the model can't drift.
4. Send each batch with a structure-preserving system prompt.
5. Validate the response: block count in = out, numbers and timestamps untouched. Repeated failures recursively split the batch down to singletons before giving up.
6. Retry failed batches up to `--max-retries` before flagging the file, then stitch the validated batches back in order.
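
The checks in step 5 are mechanical enough to sketch. A standalone approximation, where `Block` and `check_batch` are hypothetical stand-ins (the field names follow the `SubtitleBlock` visible in this PR's diff, but this is not the repo's `validate_batch`):

```python
from dataclasses import dataclass

@dataclass
class Block:
    number: int
    timestamp: str
    text: str

def check_batch(source: list[Block], output: list[Block]) -> str | None:
    """Return an error message, or None when the batch passes all checks."""
    if len(output) != len(source):
        return f"expected {len(source)} blocks, got {len(output)}"
    for src, out in zip(source, output):
        if out.number != src.number:
            return f"block {src.number}: numbering changed"
        if out.timestamp != src.timestamp:
            return f"block {src.number}: timestamp changed"
    return None  # counts, numbers, and timestamps all intact
```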

## Providers

@@ -128,7 +137,6 @@ Anything else that speaks the OpenAI chat-completions protocol will work the same.
## Roadmap

- Side-by-side preview and per-block editing in the web app
- Translation memory for character-voice consistency across a file
- General document/text translation beyond subtitles

## License
cli/core/batch_runner.py: 124 changes (82 additions, 42 deletions)
@@ -1,10 +1,4 @@
"""Per-batch HTTP call, response sanitizing, and retry loop.

This is the "send one batch, get it back validated" layer. It knows how
to talk to an OpenAI-compatible chat endpoint and how to recover from
transient failures. Everything above this layer (translator.py) just
asks for batches and stitches them together.
"""
"""Per-batch HTTP call, response sanitizing, and retry loop."""

from __future__ import annotations

@@ -14,7 +8,8 @@

import httpx

from .srt_parser import SubtitleBlock, parse_srt, serialize_srt, validate_batch
from .context_pass import FileContext
from .srt_parser import SubtitleBlock, parse_lite, serialize_lite, validate_batch
from .config import TranslationConfig
from .prompt import SYSTEM_PROMPT

@@ -25,16 +20,11 @@


class FileTranslationError(Exception):
"""A batch used up all its retries — the whole file is considered failed."""

"""A batch exhausted its retries; the whole file is considered failed."""

# ---------------------------------------------------------------------------
# Input sanitization — users paste URLs/keys in all kinds of shapes.
# ---------------------------------------------------------------------------

def sanitize_api_url(url: str) -> str:
"""Drop credential query params like `?key=...` so we don't authenticate
twice when the user pastes a pre-keyed URL."""
"""Drop credential query params so we don't authenticate twice."""
url = (url or "").strip()
if not url:
return url
@@ -49,7 +39,6 @@ def sanitize_api_url(url: str) -> str:


def sanitize_api_key(key: str) -> str:
"""Strip whitespace, surrounding quotes, and any `Bearer ` prefix."""
k = (key or "").strip()
if (k.startswith('"') and k.endswith('"')) or \
(k.startswith("'") and k.endswith("'")):
@@ -60,7 +49,6 @@ def sanitize_api_key(key: str) -> str:


def strip_markdown_fences(text: str) -> str:
"""LLMs sometimes wrap output in ```...``` despite being told not to."""
text = text.strip()
if text.startswith("```"):
text = re.sub(r"^```[a-zA-Z]*\n?", "", text)
@@ -69,29 +57,23 @@ def strip_markdown_fences(text: str) -> str:


def is_retryable_http(code: int) -> bool:
"""Retry on timeout / rate-limit / server errors. Everything else is fatal."""
return code in (408, 429) or code >= 500


# ---------------------------------------------------------------------------
# HTTP call + retry
# ---------------------------------------------------------------------------

async def call_chat_api(
client: httpx.AsyncClient,
batch_srt: str,
system_prompt: str,
user_message: str,
cfg: TranslationConfig,
block_count: int,
max_tokens: int,
) -> str:
"""POST one batch to the OpenAI-compatible chat endpoint, return raw text."""
body: dict = {
"messages": [
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content":
f"Translate from {cfg.source_lang} to {cfg.target_lang}:\n\n{batch_srt}"},
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_message},
],
"temperature": 0.1,
"max_tokens": max(block_count, 1) * 120,
"max_tokens": max(max_tokens, 1),
"stream": False,
"cache_prompt": True,
}
@@ -109,25 +91,67 @@ async def call_chat_api(
return resp.json()["choices"][0]["message"]["content"]


def _build_user_message(
cfg: TranslationConfig,
batch_wire: str,
file_context: FileContext | None,
batch: list[SubtitleBlock],
) -> str:
if cfg.source_lang:
header = f"Translate from {cfg.source_lang} to {cfg.target_lang}:"
else:
header = f"Translate to {cfg.target_lang}:"
if file_context is not None:
ctx = file_context.render_for_batch(batch)
if ctx:
return f"Glossary for this scene:\n{ctx}\n\n{header}\n\n{batch_wire}"
return f"{header}\n\n{batch_wire}"


_ATTEMPTS_BEFORE_SPLIT = 2


async def translate_batch_with_retry(
client: httpx.AsyncClient,
batch_idx: int,
batch: list[SubtitleBlock],
cfg: TranslationConfig,
file_context: FileContext | None = None,
_split_path: str = "",
) -> list[SubtitleBlock]:
"""Translate one batch; retry on transient errors; raise on exhaustion."""
batch_srt = serialize_srt(batch)
label = f"Batch {batch_idx + 1}"
"""Translate one batch; on repeated validation failure, halve and recurse.

Persistent count mismatches usually mean the model is deterministically
merging two adjacent similar-looking blocks. Halving keeps terminating
because at N=1 a count mismatch is impossible.
"""
batch_wire = serialize_lite(batch)
user_msg = _build_user_message(cfg, batch_wire, file_context, batch)
label = f"Batch {batch_idx + 1}" + (f".{_split_path}" if _split_path else "")
first_block = batch[0].number

for attempt in range(1, cfg.max_retries + 1):
tag = f"attempt {attempt}/{cfg.max_retries}"
can_split = len(batch) > 1
attempts = _ATTEMPTS_BEFORE_SPLIT if can_split else cfg.max_retries
hit_validation_failure = False
> **Copilot AI** commented on lines +133 to +135 (Apr 21, 2026):
>
> `attempts` is set to `_ATTEMPTS_BEFORE_SPLIT` for any batch with >1 block, which limits retries for transient HTTP/network errors as well as validation failures. This can reduce resilience to 429/5xx spikes. Consider keeping `cfg.max_retries` for request failures, and only triggering split-after-N when validation keeps failing.
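
A runnable sketch of the reviewer's suggestion: give transport errors the full retry budget and let only validation failures count toward the split. Every name here (the exception class, the callable parameters) is hypothetical, not code from this PR:

```python
import asyncio

class TransientError(Exception):
    """Stand-in for retryable HTTP/network failures (408/429/5xx)."""

async def run_with_split_budget(
    call,                                   # async () -> str; raises TransientError
    validate,                               # (str) -> bool
    split,                                  # async () -> str; recurses on halves
    can_split: bool,
    max_transport_retries: int = 5,
    validation_attempts_before_split: int = 2,
) -> str:
    transport_failures = 0
    validation_failures = 0
    while True:
        try:
            raw = await call()              # may raise TransientError
        except TransientError:
            transport_failures += 1
            if transport_failures >= max_transport_retries:
                raise                       # transport budget exhausted
            await asyncio.sleep(min(transport_failures, 3))
            continue
        if validate(raw):
            return raw                      # validated translation
        validation_failures += 1            # only these count toward splitting
        if can_split and validation_failures >= validation_attempts_before_split:
            return await split()            # halve and recurse
        if validation_failures >= max_transport_retries:
            raise RuntimeError("batch never validated")
```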

for attempt in range(1, attempts + 1):
tag = f"attempt {attempt}/{attempts}"
try:
raw = await call_chat_api(client, batch_srt, cfg, len(batch))
output = parse_srt(strip_markdown_fences(raw))
raw = await call_chat_api(
client, SYSTEM_PROMPT, user_msg, cfg, max(len(batch), 1) * 120,
)
output = parse_lite(strip_markdown_fences(raw))
if len(output) == len(batch):
output = [
SubtitleBlock(number=batch[i].number,
timestamp=batch[i].timestamp,
text=output[i].text)
> **Copilot AI** commented on lines +144 to +148 (Apr 21, 2026):
>
> Here the parsed lite output is rewritten with the input batch's numbers/timestamps before validation. That prevents `validate_batch` from catching incorrect numbering or reordered blocks (those fields get overwritten), which can silently misalign text with timestamps. Prefer validating the returned numbering/order first, and then reattaching timestamps by matching on the returned block number (or only overwriting timestamps).
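
A sketch of the reviewer's alternative ordering: verify the returned numbering first, then reattach only the timestamps, keyed by block number. It assumes `SubtitleBlock` is the dataclass visible in this diff; the helper name is hypothetical:

```python
from dataclasses import replace

def reattach_timestamps(batch, output):
    """Validate the model's numbering/order, then restore source timestamps.

    Both arguments are lists of SubtitleBlock-like dataclass instances
    with number, timestamp, and text fields.
    """
    expected = [b.number for b in batch]
    returned = [o.number for o in output]
    if returned != expected:
        raise ValueError(f"numbering mismatch: {returned} != {expected}")
    timestamps = {b.number: b.timestamp for b in batch}
    # Only the timestamp is overwritten; numbering survives for validation.
    return [replace(o, timestamp=timestamps[o.number]) for o in output]
```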
for i in range(len(batch))
]
check = validate_batch(batch, output)
if check.ok:
return output
hit_validation_failure = True
cfg.warn(f" {label} validation failed ({tag}): {check.error}")

except httpx.HTTPStatusError as e:
@@ -139,19 +163,35 @@ async def translate_batch_with_retry(
raise FileTranslationError(
f"{label} (block {first_block}) HTTP {code}: {snippet}"
)
if code == 429 and attempt < cfg.max_retries:
if code == 429 and attempt < attempts:
delay = 2 ** attempt
cfg.warn(f" Rate limited waiting {delay}s...")
cfg.warn(f" Rate limited - waiting {delay}s...")
await asyncio.sleep(delay)
continue

except Exception as e: # network error, JSON decode error, etc.
except Exception as e:
cfg.warn(f" {label} request failed ({tag}): {e}")

# Small back-off before the next attempt (1s, 2s, 3s cap).
if attempt < cfg.max_retries:
if attempt < attempts:
await asyncio.sleep(min(attempt, 3))

if hit_validation_failure and can_split:
mid = len(batch) // 2
left, right = batch[:mid], batch[mid:]
cfg.warn(
f" {label} splitting {len(batch)} -> {len(left)} + {len(right)} blocks"
)
left_path = (_split_path + "L") if _split_path else "L"
right_path = (_split_path + "R") if _split_path else "R"
# Sequential: parallel halves would oversubscribe the outer semaphore.
left_result = await translate_batch_with_retry(
client, batch_idx, left, cfg, file_context, left_path,
)
right_result = await translate_batch_with_retry(
client, batch_idx, right, cfg, file_context, right_path,
)
return left_result + right_result

raise FileTranslationError(
f"{label} (block {first_block}) failed all {cfg.max_retries} retries"
f"{label} (block {first_block}) failed all {attempts} retries"
)
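
Worked example of the split labels above, following the `_split_path` scheme in the diff: if Batch 3 keeps failing validation it splits into Batch 3.L and Batch 3.R; if Batch 3.L still fails it splits again into Batch 3.LL and Batch 3.LR, terminating at single blocks, where a count mismatch cannot occur.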
cli/core/config.py: 21 changes (11 additions, 10 deletions)
@@ -8,25 +8,26 @@
DEFAULT_MAX_RETRIES = 5


def _default_warn(msg: str) -> None:
def _silent_warn(msg: str) -> None:
pass


def _stderr_warn(msg: str) -> None:
print(msg, file=sys.stderr)


@dataclass
class TranslationConfig:
"""Everything a translation run needs beyond the file paths.

Bundled so we aren't threading 8+ arguments through every helper.
`warn` lets callers intercept retry/validation messages so they can be
routed around a live progress line instead of clobbering it.
"""
source_lang: str
"""Per-run config. `warn` is the retry/validation sink — silent by default,
rebindable by callers so it can route around a live progress line."""
source_lang: str # "" means auto-detect
target_lang: str
api_url: str
api_key: str
model: str | None = None
batch_size: int = 15
batch_size: int = 10
concurrency: int = 1
max_retries: int = DEFAULT_MAX_RETRIES
quiet: bool = False
warn: Callable[[str], None] = field(default=_default_warn)
verbose: bool = False
warn: Callable[[str], None] = field(default=_silent_warn)
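
Because `warn` is a plain callable field, a caller can rebind it per run: silent by default, `_stderr_warn` for verbose mode, or any sink that writes above a live progress line. A usage sketch against the dataclass shown above (constructor values are placeholders):

```python
cfg = TranslationConfig(
    source_lang="",   # auto-detect
    target_lang="Arabic",
    api_url="http://127.0.0.1:8080/v1/chat/completions",
    api_key="none",
)
cfg.warn = _stderr_warn  # route retry/validation messages to stderr
```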