English to Chinese Translator

An English-to-Chinese translation tool built on llama.cpp. It translates typed text as well as text recognized from screen captures (OCR), and runs inference on the CPU.


Project Structure

llama/ - Contains llama.cpp library files (.dll, .lib) and header files
- Version: llama.cpp at git commit 9e10bd2ea
tessdata/ - Contains Tesseract OCR data files
- Includes: chi_sim.traineddata (Chinese), eng.traineddata (English)
resource_icon/ - Application icon resources
- Includes: nail.svg (pin icon, provided by Alibaba Cloud Icon)
Source code files:
- widget.cpp - Main interface and core functionality implementation
- configmanager.cpp - Configuration management
- screencapture.cpp - Screen capture functionality
- settingsdialog.cpp - Settings dialog

Technology Stack

Compilation Toolchain: MSVC (Microsoft (R) C/C++ Optimizing Compiler Version 19.44.35207.1 for x86, which ships with Visual Studio 2022 version 17.14)
Operating System: Windows 11
C++ Library Management: vcpkg
OCR Engine: Tesseract 5.5.2
- Dependencies:
  - Leptonica 1.87.0 (required image processing library for Tesseract)
AI Model: Tencent HY-MT1.5 (specifically HY-MT1.5-1.8B-GGUF; the model file is approximately 2 GB)
- Model address: https://huggingface.co/tencent/HY-MT1.5-1.8B-GGUF
- In theory, the 7B version can also be used

Features

Text Translation: Enter English text to get its Chinese translation
Real-time Translation: Translation is triggered automatically once typing stops
Screen Capture OCR: Capture a screen area; the text in it is recognized and translated automatically
Global Hotkey: A configurable global hotkey starts the OCR function quickly
Window Stay-on-Top: The window can be pinned above other windows
Configuration Management: The model path, hotkey, and other settings can be customized
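The "Real-time Translation" behavior above is a form of debouncing: translation waits until no new input has arrived for a short quiet period. The following is a minimal standard-library sketch of that idea, not the project's actual code. The class name `Debouncer`, its methods, and the quiet period are all hypothetical. In a Qt application a single-shot `QTimer` restarted on every text change would typically play this role; whether this project does so is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

// Hypothetical helper (not from the project): decides when a pause in typing
// is long enough to trigger a translation.
class Debouncer {
public:
    using Clock = std::chrono::steady_clock;

    explicit Debouncer(std::chrono::milliseconds quiet_period)
        : quiet_period_(quiet_period) {
        assert(quiet_period.count() >= 0);
    }

    // Call on every edit of the input text; restarts the quiet-period countdown.
    void on_input(Clock::time_point now) { last_input_ = now; }

    // Call periodically. Returns true once per pause, as soon as the quiet
    // period has elapsed since the most recent edit.
    bool should_fire(Clock::time_point now) {
        if (!last_input_ || now - *last_input_ < quiet_period_) {
            return false;
        }
        last_input_.reset();  // consume the pending edit so it fires only once
        return true;
    }

private:
    std::chrono::milliseconds quiet_period_;
    std::optional<Clock::time_point> last_input_;
};
```

Passing the current time in explicitly keeps the logic independent of any UI framework and easy to test with synthetic timestamps.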

System Requirements

Memory: At least 8 GB RAM (CPU inference)

Inference Performance

For reference, these are the figures from one of my test runs using CPU inference:

First Inference

First Token Time: 3999.03 ms
Total Generated Tokens: 29
Average Throughput: 13.9494 tokens/sec
Average Time per Token: 69.2159 ms

Subsequent Inferences

First Token Time: 1233.4 ms
Total Generated Tokens: 37
Average Throughput: 19.1257 tokens/sec
Average Time per Token: 50.8727 ms
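For readers who want to collect comparable numbers, here is a hedged sketch of how such metrics could be derived from three timestamps. The definitions used here are assumptions: first-token time is measured from the start of the request, and throughput and per-token time cover only the period after the first token. This README does not state how its own figures were computed, so they may follow different definitions. The names `GenerationStats` and `compute_stats` are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

using StatsClock = std::chrono::steady_clock;
using Millis = std::chrono::duration<double, std::milli>;

// Hypothetical container for the three figures reported per inference.
struct GenerationStats {
    double first_token_ms = 0.0;     // request start until the first token
    double tokens_per_second = 0.0;  // generated tokens per second of decoding
    double ms_per_token = 0.0;       // decoding time divided by token count
};

// Assumed definitions: decoding time runs from the first token to the last.
inline GenerationStats compute_stats(StatsClock::time_point request_start,
                                     StatsClock::time_point first_token,
                                     StatsClock::time_point last_token,
                                     std::size_t n_tokens) {
    assert(request_start <= first_token && first_token <= last_token);

    GenerationStats stats;
    stats.first_token_ms = Millis(first_token - request_start).count();

    const double decode_ms = Millis(last_token - first_token).count();
    if (n_tokens > 0 && decode_ms > 0.0) {
        const double n = static_cast<double>(n_tokens);
        stats.ms_per_token = decode_ms / n;
        stats.tokens_per_second = 1000.0 * n / decode_ms;
    }
    return stats;
}
```

In practice the timestamps would come from `StatsClock::now()` calls placed before the request, after the first generated token, and after the last one.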

Quick Start

Ensure Visual Studio 2022 17.4 or higher is installed
Install necessary dependencies using vcpkg
Compile the project
Run the generated executable
Configure model path in settings

Developer Notes

Before building, adjust the following paths to match your environment:

vcpkg path in CMakeLists.txt: Change it to the local installation path of Tesseract and the other vcpkg packages (I used absolute paths)
ggml library and header file paths: Change them to match where these files are actually installed on your machine

Usage Instructions

Text Translation: Enter English text in the input box. Translation starts automatically, or you can click the "Translate" button to trigger it manually
OCR Translation: Press the global hotkey (default: Ctrl+Shift+O) to start a screen capture, then select the area to recognize. The text is recognized and translated automatically
Settings: Click the "Settings" button to configure the model path, hotkey, and other options
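Because the model path is entered by the user, a quick sanity check before loading can catch a wrong file early. The sketch below is an illustration only and is not taken from the project: the function name `looks_like_gguf_model` is hypothetical, and it relies on the fact that GGUF files begin with the four ASCII bytes `GGUF`. It does not validate the rest of the file.

```cpp
#include <array>
#include <filesystem>
#include <fstream>
#include <system_error>

// Hypothetical pre-load check: true if the path is a regular file whose
// first four bytes are the GGUF magic "GGUF".
inline bool looks_like_gguf_model(const std::filesystem::path& model_path) {
    std::error_code ec;
    if (!std::filesystem::is_regular_file(model_path, ec)) {
        return false;  // missing, a directory, or not accessible
    }

    std::ifstream in(model_path, std::ios::binary);
    std::array<char, 4> magic{};
    in.read(magic.data(), static_cast<std::streamsize>(magic.size()));

    return in.gcount() == 4 &&
           magic[0] == 'G' && magic[1] == 'G' &&
           magic[2] == 'U' && magic[3] == 'F';
}
```

A settings dialog could call such a check when the path is saved and show a warning instead of failing later during model loading.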

Notes

The first run has to load the model, so initialization may take longer than usual
OCR functionality depends on the training data in the tessdata folder
Translation quality depends on the quality of the model used

License

This project is open source under the Tencent HY Community License Agreement.
