Add PaddleOCR-VL-1.5 to the model registry by davanstrien · Pull Request #24 · davanstrien/ocr-bench

davanstrien · 2026-06-26T14:43:12Z

What

Adds PaddleOCR-VL-1.5 (PaddlePaddle/PaddleOCR-VL-1.5, 0.9B) to MODEL_REGISTRY, completing the PaddleOCR-VL pair alongside 1.6 (added in #21).

Why

Follow-up to #21, which added 1.6 and flagged 1.5 as an easy drop-in. Unlike 1.6 (Qwen3.5/flashinfer, which needs the prebuilt vllm/vllm-openai image on a100), the 1.5 script uses transformers batch inference (no vLLM/flashinfer), so it runs on the default uv-script image on l4x1.

Changes

run.py — register paddleocr-vl-1.5 as a standard model (l4x1, no image/python/env). Script default --task-mode is ocr (markdown), directly comparable to the other OCR models.
test_run.py — registry count 9→10; guard that 1.5 is standard (l4x1, no image-mode kwargs).
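
For readers who have not seen the registry, the difference between a standard entry and an image-mode entry can be sketched roughly as follows. This is an illustration only: the `ModelConfig` class, its field names, the `is_standard` helper, and the 1.6 script filename are hypothetical and are not taken from run.py; only the model keys, the l4x1/a100 flavors, and the vllm/vllm-openai image come from the description above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelConfig:
    """Hypothetical registry entry; the real structure in run.py may differ."""

    script: str
    flavor: str = "l4x1"            # default hardware for standard models
    image: Optional[str] = None     # set only for image-mode models
    python: Optional[str] = None    # set only for image-mode models
    env: dict = field(default_factory=dict)

    def is_standard(self) -> bool:
        # Standard means: no image, python, or env overrides.
        return self.image is None and self.python is None and not self.env


MODEL_REGISTRY = {
    # Standard drop-in: default uv-script image on l4x1.
    "paddleocr-vl-1.5": ModelConfig(script="paddleocr-vl-1.5.py"),
    # Image-mode: prebuilt vLLM image on a100 (script filename is assumed).
    "paddleocr-vl-1.6": ModelConfig(
        script="paddleocr-vl-1.6.py",
        flavor="a100",
        image="vllm/vllm-openai",
    ),
}

entry = MODEL_REGISTRY["paddleocr-vl-1.5"]
print(entry.flavor, entry.is_standard())
```

The guard in test_run.py is described as checking the same two properties: the l4x1 flavor and the absence of image-mode kwargs.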

Verification

uv run ruff check src/ tests/ — clean
uv run --extra viewer pytest tests/ -q — 253 passed
ocr-bench run … --list-models → paddleocr-vl-1.5 shows l4x1 (not image-mode)
ocr-bench run … --models paddleocr-vl-1.5 paddleocr-vl-1.6 --dry-run → 1.5 standard image, 1.6 image-mode
Smoke test on HF Jobs against davanstrien/encyclopaedia-britannica-1771 — results added as a comment below

🤖 Generated with Claude Code

1.5 uses transformers batch inference (no vLLM/flashinfer), so it runs on the default uv-script image on l4x1; no image-mode config needed, unlike 1.6.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

davanstrien · 2026-06-26T14:56:29Z

Closing — smoke-tested on HF Jobs against davanstrien/encyclopaedia-britannica-1771 and PaddleOCR-VL-1.5 errors on every page:

[OCR ERROR: 'PaddleOCRVLImageProcessor' object has no attribute 'min_pixels']
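
A failure like this passes linting, unit tests, and the dry run, and only shows up in the model output. A minimal sketch of a smoke check that would flag it, assuming the per-page outputs are available as a list of strings and that failed pages start with the `[OCR ERROR:` marker shown above (the `error_rate` helper is hypothetical, not part of ocr-bench):

```python
OCR_ERROR_PREFIX = "[OCR ERROR:"


def error_rate(pages):
    """Return the fraction of pages whose text is an OCR error marker."""
    if not pages:
        return 0.0
    failed = sum(
        1 for text in pages if text.strip().startswith(OCR_ERROR_PREFIX)
    )
    return failed / len(pages)


# Every page failing with the same message, as in this smoke test.
message = (
    "[OCR ERROR: 'PaddleOCRVLImageProcessor' object has no attribute "
    "'min_pixels']"
)
print(error_rate([message, message, message]))
```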

The paddleocr-vl-1.5.py uv-script (uv-scripts/ocr) reads processor.image_processor.min_pixels, which transformers ≥5.0 removed (the script pins transformers>=5.0.0). The registry wiring in this PR is correct, but the underlying model script is non-functional on the current image, so adding it would only record OCR errors.
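
One way the script could tolerate the missing attribute is a guarded read with a fallback. The sketch below is not the upstream fix and has not been run against the model: `read_pixel_bound` and `FALLBACK_MIN_PIXELS` are hypothetical, the fallback value is a placeholder, and stand-in objects are used instead of a real `PaddleOCRVLImageProcessor`.

```python
from types import SimpleNamespace

# Placeholder value for illustration only; not a documented default.
FALLBACK_MIN_PIXELS = 3136


def read_pixel_bound(image_processor, name, default):
    """Read an optional attribute, falling back when it is absent or None."""
    value = getattr(image_processor, name, None)
    return default if value is None else value


# Stand-ins for the two cases: attribute present, and attribute removed.
with_attr = SimpleNamespace(min_pixels=12544)
without_attr = SimpleNamespace()

print(read_pixel_bound(with_attr, "min_pixels", FALLBACK_MIN_PIXELS))
print(read_pixel_bound(without_attr, "min_pixels", FALLBACK_MIN_PIXELS))
```

Whether a fallback value yields correct OCR output would still need to be checked against the model; the guard only avoids the AttributeError.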

PaddleOCR-VL-1.6 (merged in #21) is verified working on Jobs and covers PaddleOCR in the bench. Can revisit 1.5 if/when the script is fixed upstream.

Add PaddleOCR-VL-1.5 to the model registry

7471d2f

davanstrien closed this Jun 26, 2026

davanstrien deleted the feat/add-paddleocr-vl-1-5 branch June 26, 2026 14:57

davanstrien wants to merge 1 commit into main from feat/add-paddleocr-vl-1-5
