feat: support multiple OCR parameter sets and improve text selection accuracy #86

Open
tvone wants to merge 6 commits into SWHL:main from tvone:feature

tvone commented Jul 10, 2025

Changes

  • Added support for multiple OCR parameter sets (ocr_params_list) instead of a single config.
  • Improved detection accuracy by testing various configurations (e.g., Det.limit_side_len min/max) to reduce missed detections of short or long text lines within images.
  • Multiprocessing support (multiprocessing.Pool) can optionally be used to speed up processing by running different parameter sets in parallel.
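
As a rough illustration of what a list of parameter sets might look like, here is a minimal sketch. Only the key `Det.limit_side_len` and the min/max idea come from this description; the `Det.limit_type` key, the concrete values, and the helper `normalize_params` are assumptions made for the example and are not taken from the PR diff.

```python
from typing import Any, Dict, List, Optional

Params = Dict[str, Any]

# Hypothetical parameter sets. Only "Det.limit_side_len" is named in the PR
# description; "Det.limit_type" and every value below are assumptions.
OCR_PARAMS_LIST: List[Params] = [
    {"Det.limit_type": "min", "Det.limit_side_len": 736},
    {"Det.limit_type": "max", "Det.limit_side_len": 2000},
]


def normalize_params(
    ocr_params_list: Optional[List[Params]] = None,
) -> List[Optional[Params]]:
    """Return one entry per OCR engine to build.

    When no list is given, a single None entry stands for "one engine with
    default settings", so the caller can loop over the result either way.
    """
    if not ocr_params_list:
        return [None]
    return list(ocr_params_list)
```

Each entry would then be used to build one OCR engine, and an empty or missing list falls back to a single default engine.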

Notes

  • Default behavior remains the same if ocr_params_list is not provided.
  • Parallel execution is optional and only applied if explicitly implemented by the user.
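
Since parallel execution is left to the user, one way it could be wired up with `multiprocessing.Pool` is sketched below. The worker `recognize` is a stand-in that fakes an OCR result so the example runs without any OCR library; `run_all`, the image path, and the parameter values are hypothetical names chosen for illustration, not part of RapidVideOCR.

```python
from multiprocessing import Pool
from typing import Any, Dict, List, Tuple

Params = Dict[str, Any]


def recognize(task: Tuple[str, Params]) -> str:
    """Stand-in for one OCR pass over one image with one parameter set.

    A real worker would build an OCR engine from `params` and run it on
    `img_path`. Here the result is faked: a larger limit_side_len simply
    "finds" more text, purely for illustration.
    """
    img_path, params = task
    side = params.get("Det.limit_side_len", 0)
    return "subtitle line" if side >= 1000 else "subtitle"


def run_all(img_path: str, params_list: List[Params], processes: int = 1) -> List[str]:
    """Run every parameter set on the same image, optionally in parallel."""
    tasks = [(img_path, params) for params in params_list]
    if processes <= 1:
        # Sequential path: same results, no worker processes.
        return [recognize(task) for task in tasks]
    with Pool(processes=processes) as pool:
        # pool.map preserves the order of `tasks` in its result list.
        return pool.map(recognize, tasks)


if __name__ == "__main__":
    params_list = [{"Det.limit_side_len": 736}, {"Det.limit_side_len": 2000}]
    print(run_all("frame_0001.jpg", params_list, processes=2))
```

Keeping the worker a top-level function matters here because `Pool` needs to pickle it. In a real setup each worker process would also load its own copy of the OCR models, so memory use grows with the number of processes.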

tvone force-pushed the feature branch 2 times, most recently from a8f7a86 to a2fa3f4 (Compare), July 10, 2025 17:09
SWHL requested a review from Copilot, July 11, 2025 00:04
Copilot AI (Contributor) left a comment

Pull Request Overview

Enhance OCRProcessor to accept multiple OCR configurations and select the most comprehensive text result per image, adjusting the entry point to use the new ocr_params_list.

  • Refactored OCRProcessor.__init__ and get_ocr_result to handle a list of parameter sets.
  • Updated single_rec to iterate over all engines and pick the longest text output.
  • Changed RapidVideOCRInput and RapidVideOCR in main.py to use ocr_params_list.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
rapid_videocr/ocr_processor.py | Support multiple OCR parameter sets in single_rec, updated engine initialization and result logic.
rapid_videocr/main.py | Replaced ocr_params with ocr_params_list in input schema and processor instantiation.

Comments suppressed due to low confidence (1)

rapid_videocr/ocr_processor.py:61

  • Add unit tests to verify that single_rec selects the longest text among multiple OCR parameter sets, covering this conditional branch.
                if max_txt_len < len(txts):
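
A test along the lines suggested above might look like the following sketch. The helper `pick_longest` is hypothetical: it only mirrors the quoted comparison `max_txt_len < len(txts)` and assumes that each engine returns a sequence of recognized text lines and that `max_txt_len` starts at 0. The actual `single_rec` in the PR may be structured differently.

```python
from typing import List, Optional, Sequence


def pick_longest(candidates: List[Sequence[str]]) -> Optional[Sequence[str]]:
    """Return the candidate with the greatest len(), or None if all are empty.

    The strict "<" means that on a tie the earliest candidate is kept.
    """
    best: Optional[Sequence[str]] = None
    max_txt_len = 0  # assumed starting value
    for txts in candidates:
        if max_txt_len < len(txts):
            max_txt_len = len(txts)
            best = txts
    return best


def test_pick_longest() -> None:
    # The result with the most recognized lines is selected.
    assert pick_longest([["a"], ["a", "b"], ["c"]]) == ["a", "b"]
    # Ties keep the first result because the comparison is strict.
    assert pick_longest([["first"], ["second"]]) == ["first"]
    # Nothing recognized by any engine yields None.
    assert pick_longest([[], []]) is None
    assert pick_longest([]) is None


test_pick_longest()
```

Note that `len(txts)` counts lines if `txts` is a list and characters if it is a single string, so the selection criterion depends on which type the OCR engine returns.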

SWHL (Owner) commented Jul 22, 2025

Thanks, and I will merge it later.

tvone (Author) commented Jul 22, 2025

> Thanks, and I will merge it later.

Thank you.

SWHL (Owner) commented Sep 10, 2025

I'm really sorry for the late response; I've only just found the time to look into RapidVideOCR recently. I've carefully reviewed your submission. The core idea is to instantiate multiple OCR instances with different configurations simultaneously, perform multiple recognitions on the same image, and select the longest result as the final output.

However, this approach would lead to increased resource consumption. Out of curiosity, could you please share what specific scenario prompted you to consider this solution?

SWHL added the feature_request (request for a new feature) label Sep 10, 2025