An English-to-Chinese translation tool based on llama.cpp. It supports translation of typed text and of text recognized from screen captures (OCR), and runs inference on the CPU.
- llama/: Contains llama.cpp library files (.dll, .lib) and header files
  - Version: llama.cpp git hash 9e10bd2ea
- tessdata/: Contains Tesseract OCR data files
  - Includes: chi_sim.traineddata (Chinese), eng.traineddata (English)
- resource_icon/: Application icon resources
  - Includes: nail.svg (pin icon, provided by Alibaba Cloud Icon)
- Source code files:
  - widget.cpp: Main interface and core functionality implementation
  - configmanager.cpp: Configuration management
  - screencapture.cpp: Screen capture functionality
  - settingsdialog.cpp: Settings dialog
- Compilation Toolchain: MSVC (Microsoft (R) C/C++ Optimizing Compiler Version 19.44.35207.1 for x86, corresponding to Visual Studio 2022 version 17.14)
- Operating System: Windows 11
- C++ Library Management: vcpkg
- OCR Engine: Tesseract 5.5.2
  - Dependency: Leptonica 1.87.0 (image processing library required by Tesseract)
- AI Model: Tencent HY-MT1.5 (specifically HY-MT1.5-1.8B-GGUF; the model file is approximately 2 GB)
  - Model address: https://huggingface.co/tencent/HY-MT1.5-1.8B-GGUF
  - The 7B version should also work in theory
- Text Translation: Input English text to get Chinese translation
- Real-time Translation: Automatically triggers translation after input stops
- Screen Capture OCR: Capture screen area, automatically recognize text and translate
- Global Hotkey: Set a global hotkey to start OCR quickly
- Window Stay-on-Top: Keep the window above other windows
- Configuration Management: Customize the model path, hotkey, and other settings
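
The "translate after input stops" behavior is commonly implemented as a debounce. The following is a minimal sketch in standard C++, not the project's actual code: the class name `InputDebouncer`, its methods, and the 500 ms quiet period used in the usage note are all hypothetical. The clock value is passed in explicitly so the logic can be driven by any UI timer.

```cpp
#include <chrono>
#include <optional>
#include <string>
#include <utility>

// Hypothetical helper (not taken from the project's source): it holds the
// latest input and releases it only after a quiet period with no edits.
class InputDebouncer {
public:
    using Clock = std::chrono::steady_clock;

    explicit InputDebouncer(std::chrono::milliseconds quietPeriod)
        : quietPeriod_(quietPeriod) {}

    // Call on every edit; this restarts the quiet-period countdown.
    void onTextChanged(std::string text, Clock::time_point now) {
        pending_ = std::move(text);
        lastEdit_ = now;
    }

    // Call periodically, for example from a UI timer. Returns the text to
    // translate once the quiet period has elapsed since the last edit, and
    // clears it so that each burst of edits triggers only one translation.
    std::optional<std::string> poll(Clock::time_point now) {
        if (!pending_ || now - lastEdit_ < quietPeriod_) {
            return std::nullopt;
        }
        std::optional<std::string> ready = std::move(pending_);
        pending_.reset();
        return ready;
    }

private:
    std::chrono::milliseconds quietPeriod_;
    std::optional<std::string> pending_;
    Clock::time_point lastEdit_{};
};
```

With a quiet period of 500 ms, typing "hel" and then "hello" 200 ms later yields nothing when polled 400 ms after the last edit, and yields "hello" exactly once when polled 500 ms or more after it.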
- Memory: At least 8GB RAM (CPU inference)
Performance reference from my own tests using CPU inference. Two sample runs:
- Run 1:
  - First Token Time: 3999.03 ms
  - Total Generated Tokens: 29
  - Average Throughput: 13.9494 tokens/sec
  - Average Time per Token: 69.2159 ms
- Run 2:
  - First Token Time: 1233.4 ms
  - Total Generated Tokens: 37
  - Average Throughput: 19.1257 tokens/sec
  - Average Time per Token: 50.8727 ms
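
For readers who want to collect similar numbers, here is a minimal sketch of how such metrics could be derived from three timestamps. The struct, function, and parameter names are hypothetical, and the definitions (throughput over the whole elapsed time, including the wait for the first token) are an assumption; they are not necessarily how the figures above were measured and will not necessarily reproduce them.

```cpp
#include <cstddef>

// Hypothetical result type; all times are in milliseconds.
struct GenerationStats {
    double firstTokenMs = 0.0;     // request start -> first token
    double tokensPerSecond = 0.0;  // generated tokens / total elapsed time
    double msPerToken = 0.0;       // total elapsed time / generated tokens
};

// startMs:        when the request was submitted
// firstTokenAtMs: when the first token was produced
// endMs:          when the last token was produced
// tokenCount:     number of generated tokens
GenerationStats computeStats(double startMs, double firstTokenAtMs,
                             double endMs, std::size_t tokenCount) {
    GenerationStats stats;
    stats.firstTokenMs = firstTokenAtMs - startMs;

    const double totalMs = endMs - startMs;
    // Guard against division by zero for empty or instantaneous runs.
    if (tokenCount == 0 || totalMs <= 0.0) {
        return stats;
    }
    stats.tokensPerSecond = static_cast<double>(tokenCount) / (totalMs / 1000.0);
    stats.msPerToken = totalMs / static_cast<double>(tokenCount);
    return stats;
}
```

For example, 40 tokens generated over 2000 ms with the first token arriving after 400 ms give 20 tokens/sec and 50 ms per token under these definitions.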
- Ensure Visual Studio 2022 17.4 or higher is installed
- Install necessary dependencies using vcpkg
- Compile the project
- Run the generated executable
- Configure model path in settings
For developers, please note the following path configurations:
- vcpkg paths in CMakeLists.txt: I used absolute paths, so change them to your local installation paths for Tesseract and the other packages
- ggml-related library and header file paths: Change these to match your actual installation location
- Text Translation: Enter English text in the input box. Translation starts automatically, or you can click the "Translate" button to trigger it manually
- OCR Translation: Press the global hotkey (default: Ctrl+Shift+O) to start a screen capture, then select the area to recognize. The text is recognized and translated automatically
- Settings: Click the "Settings" button to configure model path, hotkey and other options
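
Because the hotkey is configurable, a string such as "Ctrl+Shift+O" has to be turned into modifiers and a key at some point. The following is a minimal sketch of one way to do that in standard C++. The `Hotkey` struct and `parseHotkey` function are hypothetical and are not taken from the project; the accepted modifier names and the single-character key restriction are assumptions made for the example.

```cpp
#include <cctype>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical representation of a parsed hotkey.
struct Hotkey {
    bool ctrl = false;
    bool shift = false;
    bool alt = false;
    char key = '\0';  // a single letter or digit, stored upper-cased
};

// Parses strings like "Ctrl+Shift+O". Returns std::nullopt for unknown
// parts, more than one key, or a missing key.
std::optional<Hotkey> parseHotkey(const std::string& spec) {
    Hotkey hk;
    std::istringstream in(spec);
    std::string part;
    while (std::getline(in, part, '+')) {
        if (part == "Ctrl") {
            hk.ctrl = true;
        } else if (part == "Shift") {
            hk.shift = true;
        } else if (part == "Alt") {
            hk.alt = true;
        } else if (part.size() == 1 && hk.key == '\0' &&
                   std::isalnum(static_cast<unsigned char>(part[0]))) {
            hk.key = static_cast<char>(
                std::toupper(static_cast<unsigned char>(part[0])));
        } else {
            return std::nullopt;
        }
    }
    if (hk.key == '\0') {
        return std::nullopt;
    }
    return hk;
}
```

On Windows, flags like these could then be mapped to the modifier constants accepted by `RegisterHotKey` (`MOD_CONTROL`, `MOD_SHIFT`, `MOD_ALT`); whether the project registers its hotkey this way is not stated here.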
- The first run has to load the model, so initialization may take longer than usual
- OCR depends on the training data in the tessdata folder
- Translation quality depends on the quality of the model used
This project is open source under the Tencent HY Community License Agreement.