ocr-tool is an installable Agent Skill for extracting text from PDFs with pdfocr.
It provides one workflow:
- extracts text from full PDFs or selected page ranges
- reuses cached OCR output when the same file and page selection are requested again
- installs pdfocr when needed
- returns cleaned extracted text for downstream skills or tasks
This skill is intentionally limited to extraction. Use another skill after OCR if you want notes, flashcards, quizzes, or other transformed outputs.
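The cache reuse described in the workflow above depends on recognizing "the same file and page selection". The README does not say how the skill does this, so the following is only a minimal sketch of one plausible approach: hash the file contents together with the page selection. The function name `cache_key` and the key format are hypothetical and are not taken from ocr-tool or pdfocr.

```python
import hashlib
from pathlib import Path
from typing import Optional, Tuple


def cache_key(pdf_path: str, pages: Optional[Tuple[int, int]] = None) -> str:
    """Hypothetical cache key: SHA-256 over the file bytes and the page selection.

    `pages` is an inclusive (first, last) range, or None for the full PDF.
    """
    digest = hashlib.sha256()
    # Hash the file contents so a renamed copy of the same PDF still hits the cache.
    digest.update(Path(pdf_path).read_bytes())
    # Mix in the page selection so "pages 8-20" and "full PDF" get distinct keys.
    selection = "all" if pages is None else f"{pages[0]}-{pages[1]}"
    digest.update(b"\0" + selection.encode("utf-8"))
    return digest.hexdigest()
```

Under this scheme, requesting pages 8-20 of lecture1.pdf twice yields the same key, while requesting the full file or an edited copy yields a different one.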
- pdfocr: Required for PDF-to-text extraction.
- DeepInfra API Key: Required by pdfocr.
  - Set it via DEEPINFRA_API_KEY (recommended).
  - Or provide it via config.json next to the pdfocr executable.
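Before running OCR, it can help to check which of the two credential sources listed above is available. The sketch below is illustrative only: the function name `find_credential_source` is hypothetical, it assumes pdfocr is discoverable on PATH, and it only checks that a config.json file exists beside the executable without reading it, since the file's schema is not described here.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional


def find_credential_source(
    env: Mapping[str, str] = os.environ,
    executable: Optional[str] = None,
) -> Optional[str]:
    """Return "env", "config", or None depending on where a key could come from."""
    # Preferred source: the DEEPINFRA_API_KEY environment variable.
    if env.get("DEEPINFRA_API_KEY"):
        return "env"
    # Fallback: a config.json file in the same directory as the pdfocr executable.
    # Assumption: symlinks are resolved to find the executable's real directory.
    exe = executable or shutil.which("pdfocr")
    if exe and (Path(exe).resolve().parent / "config.json").is_file():
        return "config"
    return None
```

A return value of None means neither source was found, which is a reasonable point to stop and ask the user for a key instead of starting an OCR run that would fail.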
Codex recommends installing non-built-in skills with $skill-installer.
Prompt Codex with:
$skill-installer install the skill from repo planetis-m/study-assistant with path ocr-tool
Alternatively, clone or copy ocr-tool into your agent's scanned skills path.
Invoke the skill explicitly using $ocr-tool in your prompts:
Use $ocr-tool to extract text from lecture1.pdf.
Use $ocr-tool to OCR pages 8-20 of lecture1.pdf and return only the cleaned text.
Use $ocr-tool on this PDF, then use $study-assistant in study-notes mode on the extracted text.