yusufyzzc/Screen-OCR-Translator

OCR Screen Translator

Capture any part of your screen, extract text with OCR and translate it instantly.


About the project

This is a lightweight Windows desktop tool built for one fast workflow:

  1. Select an area on screen.
  2. Run OCR on the captured image.
  3. Translate the extracted text.

No tab switching, no copy-paste between multiple apps.

Preview: intro.mp4

Features

  • Global hotkey capture from anywhere in Windows
  • In-memory OCR pipeline (no temporary screenshot files required)
  • Instant translation using Google Translate (default)
  • Translator architecture ready for additional providers
  • Copy and save actions for fast output handling
  • Desktop executable build with PyInstaller

Keyboard shortcuts

Shortcut      Action
Ctrl+Alt+T    Capture area and translate
Ctrl+Alt+C    Copy output
Ctrl+Alt+S    Save output to file
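
The shortcuts above map directly onto the keyboard library's add_hotkey call. The following is a minimal sketch, not the project's actual code: the action names (capture_and_translate, copy_output, save_output) and the register_hotkeys helper are hypothetical, and the registrar is passed in so the logic can be exercised without installing global hooks.

```python
from typing import Callable, Dict, List

# Shortcut strings from the table, written in the format the `keyboard` library parses.
# The action names on the right are hypothetical placeholders.
HOTKEYS: Dict[str, str] = {
    "ctrl+alt+t": "capture_and_translate",
    "ctrl+alt+c": "copy_output",
    "ctrl+alt+s": "save_output",
}


def register_hotkeys(
    handlers: Dict[str, Callable[[], None]],
    add_hotkey: Callable[[str, Callable[[], None]], object],
) -> List[str]:
    """Bind every shortcut to its handler and return the shortcuts that were bound."""
    bound = []
    for combo, action in HOTKEYS.items():
        add_hotkey(combo, handlers[action])
        bound.append(combo)
    return bound


def run_with_keyboard_library() -> None:
    """Wire the table to the real library (assumes `pip install keyboard`)."""
    import keyboard  # third-party, imported lazily

    handlers = {
        name: (lambda n=name: print(f"{n} triggered")) for name in HOTKEYS.values()
    }
    register_hotkeys(handlers, keyboard.add_hotkey)
    keyboard.wait()  # block forever so the hooks stay alive
```

Passing add_hotkey as a parameter keeps the mapping testable with a fake registrar. In the real app the handlers would call into the capture, copy and save code paths.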

How it works

  1. The app starts and requests Administrator permission on Windows.
  2. Global hotkeys are registered using the keyboard library.
  3. You capture a region with the snipping overlay.
  4. OCR extracts text using Tesseract.
  5. The selected translator returns translated output to the UI.
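
Steps 3 to 5 can be sketched as a small in-memory pipeline. This is an illustration under stated assumptions rather than the repository's implementation: the function names grab_region, run_ocr and capture_and_translate are hypothetical, the region is assumed to come from the snipping overlay, and the translator is any callable that takes text and returns text. Only mss.mss().grab, PIL.Image.frombytes and pytesseract.image_to_string are real library calls.

```python
from typing import Callable, Dict

Region = Dict[str, int]  # keys: left, top, width, height (the dict format mss.grab accepts)


def grab_region(region: Region):
    """Capture a screen region straight into a PIL image, with no file on disk."""
    import mss  # third-party, imported lazily
    from PIL import Image  # third-party (Pillow)

    with mss.mss() as sct:
        shot = sct.grab(region)
    # mss hands back raw BGRA bytes; convert them to an RGB image in memory.
    return Image.frombytes("RGB", shot.size, shot.bgra, "raw", "BGRX")


def run_ocr(image, lang: str = "eng") -> str:
    """Extract text with Tesseract (the Tesseract binary must be installed)."""
    import pytesseract  # third-party, imported lazily

    return pytesseract.image_to_string(image, lang=lang).strip()


def capture_and_translate(
    region: Region,
    translate: Callable[[str], str],
    capture: Callable[[Region], object] = grab_region,
    ocr: Callable[[object], str] = run_ocr,
) -> str:
    """Run capture -> OCR -> translate; return an empty string if OCR finds no text."""
    image = capture(region)
    text = ocr(image)
    if not text:
        return ""
    return translate(text)
```

The capture and OCR stages are injectable, so the flow can be checked with stand-in functions on a machine without a display or Tesseract.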

Tech stack

Layer             Technology
Desktop bridge    Eel
OCR               Tesseract + pytesseract
Screen capture    mss + tkinter
Translation       requests + Google endpoint
Hotkeys           keyboard
Packaging         PyInstaller

Quick start

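1) Prerequisites

  • Windows
  • Python 3 with pip
  • Tesseract OCR installed locally (pytesseract is a wrapper and needs the Tesseract executable)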

2) Install dependencies

pip install -r requirements.txt

3) Run app

python main.py

Note: On startup, Windows may show a UAC prompt so global hotkeys can work system-wide.
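
One common way to produce that UAC prompt from Python is to check for elevation and relaunch through ShellExecuteW with the "runas" verb. The sketch below shows that general technique. It is an assumption about how such a check can look, not a copy of what main.py does, and the helper names are hypothetical.

```python
import ctypes
import os
import sys


def is_admin() -> bool:
    """Return True when the current process already has elevated rights."""
    if sys.platform != "win32":
        # Non-Windows fallback so the sketch can be exercised anywhere.
        return hasattr(os, "geteuid") and os.geteuid() == 0
    try:
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    except (AttributeError, OSError):
        return False


def relaunch_as_admin() -> bool:
    """Ask Windows to restart this script elevated; this is what shows the UAC prompt."""
    params = " ".join(f'"{arg}"' for arg in sys.argv)
    # ShellExecuteW(hwnd, operation, file, parameters, directory, show_cmd);
    # a return value greater than 32 means the launch succeeded.
    result = ctypes.windll.shell32.ShellExecuteW(
        None, "runas", sys.executable, params, None, 1
    )
    return result > 32


def ensure_admin() -> bool:
    """Return True if elevated; otherwise try to relaunch elevated and exit this process."""
    if is_admin():
        return True
    if sys.platform == "win32" and relaunch_as_admin():
        sys.exit(0)  # the elevated copy takes over
    return False
```

If the user declines the prompt, ensure_admin returns False and the caller can decide whether to continue without system-wide hotkeys.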

Build executable

python build.py

Trusted GitHub Releases workflow

To improve user trust for downloaded binaries:

  1. Build with safer defaults:
     python build.py
  2. Upload these generated metadata files from dist/ with your release assets:
       • main.zip (or your custom zip name)
       • RELEASE-SHA256.txt
       • RELEASE-NOTES-TEMPLATE.md
       • SIGNING-COMMAND-TEMPLATE.txt
  3. In release notes, include checksum verification steps (PowerShell):
     Get-FileHash .\ScreenOCRTranslator.exe -Algorithm SHA256
  4. Prefer code-signed binaries for fewer SmartScreen warnings. SIGNING-COMMAND-TEMPLATE.txt contains ready-to-edit signtool examples.
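
For users without PowerShell, the same checksum can be computed with Python's standard library. This is a minimal sketch: the helper names sha256_of and verify are hypothetical, and the expected value is assumed to be the hex digest published with the release (the exact layout of RELEASE-SHA256.txt is not assumed here).

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA256 digest of a file as uppercase hex, like Get-FileHash prints it."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large executables do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def verify(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().upper()
```

A caller would pass the downloaded file and the published digest, for example verify("ScreenOCRTranslator.exe", published_hash), and refuse to run the binary when the result is False.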

Cross-platform standalone (macOS)

Yes, it is possible, but the build must run on macOS, because PyInstaller does not cross-compile.

  • Windows build produces Windows binaries only.
  • For macOS users, build on a Mac machine (or macOS CI runner) with the same source code.
  • Typical macOS output is .app (and optionally .dmg/.zip).
  • To avoid Gatekeeper warnings on macOS, use Apple Developer code signing and notarization.

Suggested release assets:

  • ScreenOCRTranslator-win-x64.zip (built on Windows)
  • ScreenOCRTranslator-macos-arm64.zip (built on Apple Silicon Mac)
  • ScreenOCRTranslator-macos-x64.zip (built on Intel Mac)
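
A build script could derive these asset names from the machine it runs on. The sketch below is one hypothetical way to do that with the standard platform module. The release_asset_name helper and its lookup tables are assumptions for illustration and are not part of build.py.

```python
import platform
from typing import Optional

# Map platform.system() / platform.machine() values to the suffixes used above.
_OS_TAGS = {"Windows": "win", "Darwin": "macos"}
_ARCH_TAGS = {"AMD64": "x64", "x86_64": "x64", "arm64": "arm64", "ARM64": "arm64"}


def release_asset_name(
    system: Optional[str] = None,
    machine: Optional[str] = None,
    app: str = "ScreenOCRTranslator",
) -> str:
    """Build '<app>-<os>-<arch>.zip' for the given (or current) platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        return f"{app}-{_OS_TAGS[system]}-{_ARCH_TAGS[machine]}.zip"
    except KeyError as exc:
        raise ValueError(f"unsupported build platform: {system}/{machine}") from exc
```

Called without arguments on a build runner, it names the asset for that runner. Explicit arguments make the mapping easy to check.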

GitHub Actions release automation

This repository includes a workflow at .github/workflows/release-build.yml that:

  • Builds standalone ZIP assets for Windows and macOS
  • Generates per-platform SHA256 checksum files
  • Uploads assets automatically when a Release is published

Extend with another translator

  1. Open screen_translator/translator.py.
  2. Create a class that extends BaseTranslator.
  3. Implement translate.
  4. Register the class in TRANSLATORS.

The UI list is populated dynamically.
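
The four steps above might look roughly like this. The real BaseTranslator and TRANSLATORS live in screen_translator/translator.py and their exact signatures are not shown in this README, so the stand-in base class, the translate(text, source, target) signature, and the dict-shaped registry below are all assumptions. GlossaryTranslator is a made-up offline provider used only to keep the sketch runnable.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Type


class BaseTranslator(ABC):
    """Stand-in for the project's base class (assumed interface)."""

    name = "base"

    @abstractmethod
    def translate(self, text: str, source: str, target: str) -> str:
        """Return `text` translated from `source` to `target`."""


class GlossaryTranslator(BaseTranslator):
    """Hypothetical provider: word-by-word lookup in a local glossary."""

    name = "Glossary (demo)"

    def __init__(self, glossary: Optional[Dict[str, str]] = None) -> None:
        self.glossary = glossary or {}

    def translate(self, text: str, source: str, target: str) -> str:
        # Replace known words and leave unknown words untouched.
        return " ".join(self.glossary.get(word.lower(), word) for word in text.split())


# Assumed registry shape: display name -> translator class.
TRANSLATORS: Dict[str, Type[BaseTranslator]] = {
    GlossaryTranslator.name: GlossaryTranslator,
}
```

A real provider would replace the glossary lookup with a call to its translation service and keep the same translate signature, so the UI can treat every registered class uniformly.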

License

This project is licensed under the MIT License.
