Added a new LanguageKind.UiAutomation to HistoryInfo and implemented the UiAutomationLang class for "UI Automation Text" language support. Introduced UiAutomationOptions record to configure UI Automation traversal and filtering behavior.
Added support for UI Automation as a selectable OCR language. Integrated UiAutomationLang into language selection, caching, and kind/type checks. Introduced UIAutomationUtilities for extracting text from screen regions, points, and windows using Windows UI Automation APIs. Updated OcrUtilities to route requests to UIAutomationUtilities when appropriate, with fallback logic to traditional OCR. Added CaptureLanguageUtilities for language enumeration and compatibility checks. Improved settings import/export robustness to handle property-based settings. These changes enable text extraction from UI elements as an alternative to image-based OCR.
Added user-configurable settings and UI controls for UI Automation text extraction, including toggles for enabling UI Automation, fallback to OCR, traversal mode, offscreen element inclusion, and focus preference. Updated language picker to use OCR language by default and persist selection. Improved language selection experience and settings persistence.
- Introduce UI Automation as a new OCR language mode, including traversal options.
- Centralize language loading and selection logic using CaptureLanguageUtilities.
- Unify language dropdown population for all OCR modes (Tesseract, Windows AI, UI Automation).
- Update UI to reflect table output support based on selected language.
- Invalidate OCR language cache on language reset for accurate UI updates.
- Track static vs. live image sources in GrabFrame; notify user if UI Automation is selected with a static image.
- Update OCR logic to use UI Automation APIs when appropriate; skip image-based corrections for UI Automation.
- Refactor and simplify code for better maintainability and clarity.
Expanded test coverage for CaptureLanguageUtilities and UIAutomationUtilities, including language matching, selection, table output support, text normalization, deduplication, window selection logic, control type handling, and point sampling. Also added tests for UiAutomationLang handling in LanguageService and HistoryInfo.
- Introduce UiAutomationOverlayItem/Snapshot models and enum for overlay representation and metadata.
- Add overlay extraction methods to UIAutomationUtilities, including deduplication, sorting, and metadata helpers.
- Support overlay snapshot extraction for regions, with optional window exclusion.
- Refine region/point text extraction to handle excluded windows and improve accuracy with overlays.
- Improve element text extraction: restrict Name fallback to specific control types and skip if visible text descendants exist.
- Add ImageSource-to-Bitmap conversion and utility for live UIA source requirement.
- Refactor history service to better handle image paths and deduplication.
Expanded test coverage for CaptureLanguageUtilities and UIAutomationUtilities, including new tests for RequiresLiveUiAutomationSource, TryClipBounds, TryAddUniqueOverlayItem, and SortOverlayItems. Added ImageMethodsTests to verify ImageSourceToBitmap behavior. Updated ShouldUseNameFallback tests and improved using directives.
Enables rendering of UI Automation overlays in the GrabFrame window, allowing users to view and interact with detected UI elements when a UI Automation language is selected. Adds logic to capture overlays, render them as word borders, and fall back to OCR when overlays are unavailable. Introduces user feedback messaging for unsupported scenarios, improves language selection synchronization, and refactors word border management. Updates XAML to include a message border for user notifications. Also fixes bitmap handling and ensures robust state management when switching between live and static image modes.
Update all references from "UI Automation" to "Direct Text" in both code and UI. This includes changing the abbreviated name to "DT" and updating display, native, and culture names in UiAutomationLang. Adjust UI labels, descriptions, and toggle switches in LanguageSettings.xaml to reflect the new terminology. No functional changes, only terminology updates for clarity.
Pull request overview
This PR adds a new “Direct Text” capture mode backed by UI Automation, allowing Text Grab to read accessible UI text from live controls when available, with configurable fallback to OCR when it isn’t.
Changes:
- Introduces UI Automation-based text extraction + overlay snapshot rendering (with settings for traversal/offscreen/focus preference and OCR fallback).
- Updates Grab Frame and Fullscreen Grab to route capture and overlays through Direct Text when that language is selected.
- Centralizes language list/selection persistence logic (including legacy persisted language matching) and adds unit tests for the new utilities.
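The routing described above (Direct Text first, OCR only as a configured fallback) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `DirectTextRoutingSketch`, the method `GetTextAsync`, and all four parameters are hypothetical stand-ins for the real settings and capture calls.

```csharp
using System;
using System.Threading.Tasks;

public static class DirectTextRoutingSketch
{
    // All parameters are hypothetical; they are not Text Grab's actual API.
    public static async Task<string> GetTextAsync(
        bool isDirectTextSelected,
        bool fallbackToOcr,
        Func<Task<string>> readViaUiAutomation,
        Func<Task<string>> readViaOcr)
    {
        // Any other language goes straight to image-based OCR.
        if (!isDirectTextSelected)
            return await readViaOcr();

        string uiaText = await readViaUiAutomation();
        if (!string.IsNullOrWhiteSpace(uiaText))
            return uiaText;

        // UI Automation produced nothing; only run OCR if the user enabled fallback.
        return fallbackToOcr ? await readViaOcr() : string.Empty;
    }
}
```

Passing the two readers as delegates keeps the decision logic testable without a live window or a screen capture.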
Reviewed changes
Copilot reviewed 28 out of 29 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| Text-Grab/Views/GrabFrame.xaml.cs | Adds Direct Text rendering paths, live/static-source handling, and message UI logic. |
| Text-Grab/Views/GrabFrame.xaml | Adds an in-frame message border for non-blocking status/errors. |
| Text-Grab/Views/FullscreenGrab.xaml.cs | Moves language persistence/state (table support) to shared utilities. |
| Text-Grab/Views/FullscreenGrab.SelectionStyles.cs | Captures UIA snapshot for GrabFrame and supports Direct Text for region grabs. |
| Text-Grab/Views/EditTextWindow.xaml.cs | Persists selected language and avoids culture-setting for non-Global languages. |
| Text-Grab/Utilities/UIAutomationUtilities.cs | New UIA extraction and overlay snapshot implementation. |
| Text-Grab/Utilities/SettingsImportExportUtilities.cs | Adds reflection fallback for settings import/export and adjusts type conversion flow. |
| Text-Grab/Utilities/OcrUtilities.cs | Adds UIA-first capture paths with OCR fallback and excluded-handle support. |
| Text-Grab/Utilities/ImageMethods.cs | Adds ImageSource→Bitmap helper used by history/save flows. |
| Text-Grab/Utilities/CaptureLanguageUtilities.cs | New shared language list/persistence helpers and UIA compatibility checks. |
| Text-Grab/Services/LanguageService.cs | Adds UiAutomation language kind/tag handling and cache invalidation behavior. |
| Text-Grab/Services/HistoryService.cs | Adjusts history overwrite/save flow to preserve/assign image paths more robustly. |
| Text-Grab/Properties/Settings.settings | Adds user settings for enabling/configuring Direct Text behavior. |
| Text-Grab/Properties/Settings.Designer.cs | Generated settings accessors for new Direct Text settings. |
| Text-Grab/Pages/LanguageSettings.xaml.cs | Adds UI for Direct Text settings and persists them to user settings. |
| Text-Grab/Pages/LanguageSettings.xaml | Adds Direct Text configuration section (toggles + traversal mode). |
| Text-Grab/Models/UiAutomationOverlaySnapshot.cs | New model for storing a captured UIA overlay snapshot. |
| Text-Grab/Models/UiAutomationOverlayItem.cs | New model for individual UIA overlay items + source classification. |
| Text-Grab/Models/UiAutomationOptions.cs | New model for UIA traversal/filter options. |
| Text-Grab/Models/UiAutomationLang.cs | New ILanguage implementation representing “Direct Text”. |
| Text-Grab/Models/HistoryInfo.cs | Rehydrates UiAutomation language kind when reading history. |
| Text-Grab/Enums.cs | Adds LanguageKind.UiAutomation and UiAutomationTraversalMode. |
| Text-Grab/Controls/LanguagePicker.xaml.cs | Persists selected language and uses persisted OCR language as initial selection. |
| Text-Grab/Controls/LanguagePicker.xaml | Generalizes item template for ILanguage-backed items. |
| Text-Grab/App.config | Adds default values for Direct Text settings (and includes FsgSelectionStyle). |
| Tests/UiAutomationUtilitiesTests.cs | Unit tests for UIA helper logic (normalize/dedup/sorting/window selection). |
| Tests/LanguageServiceTests.cs | Adds coverage for UiAutomation language kind/tag and history rehydration. |
| Tests/ImageMethodsTests.cs | Tests new ImageSource→Bitmap conversion helper. |
| Tests/CaptureLanguageUtilitiesTests.cs | Tests persisted-language matching and UIA live-source requirements. |
Files not reviewed (1)
- Text-Grab/Properties/Settings.Designer.cs: Language not supported
Comments suppressed due to low confidence (1)
Text-Grab/Utilities/OcrUtilities.cs:113
`GetTextFromAbsoluteRectAsync` allocates a `System.Drawing.Bitmap` via `GetRegionOfScreenAsBitmap` but never disposes it. Since `Bitmap` is `IDisposable`, this can leak GDI handles/memory during repeated captures; wrap it in a `using` (or `using var`) so it is disposed after OCR completes.
Rectangle selectedRegion = rect.AsRectangle();
Bitmap bmp = ImageMethods.GetRegionOfScreenAsBitmap(selectedRegion);
return GetStringFromOcrOutputs(await GetTextFromImageAsync(bmp, language));
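A minimal sketch of the disposal pattern this comment suggests. To stay runnable without GDI, it uses a hypothetical `FakeBitmap` in place of `System.Drawing.Bitmap`, and the helpers `CaptureRegion` and `RecognizeAsync` are illustrative stand-ins for the real capture and OCR calls.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for System.Drawing.Bitmap so the sketch runs anywhere.
public sealed class FakeBitmap : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

public static class DisposalSketch
{
    // Exposes the most recent capture so callers can inspect it in this sketch.
    public static FakeBitmap? LastCapture { get; private set; }

    // Hypothetical stand-in for ImageMethods.GetRegionOfScreenAsBitmap.
    private static FakeBitmap CaptureRegion()
    {
        var bmp = new FakeBitmap();
        LastCapture = bmp;
        return bmp;
    }

    // Hypothetical stand-in for the OCR call; it only reads the bitmap.
    private static Task<string> RecognizeAsync(FakeBitmap bmp) =>
        Task.FromResult(bmp.IsDisposed ? "<disposed>" : "recognized text");

    public static async Task<string> GetTextAsync()
    {
        // 'using var' disposes bmp when the method exits, after the await completes.
        using var bmp = CaptureRegion();
        return await RecognizeAsync(bmp);
    }
}
```

Because the method awaits the OCR call before returning, the bitmap stays alive for the whole operation. This only holds when nothing keeps a reference to the bitmap after the method returns.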
Text-Grab/Views/GrabFrame.xaml.cs
Outdated
RectanglesCanvas.Children.Clear();
wordBorders.Clear();
ClearRenderedWordBorders();
ClearRenderedWordBorders(); is indented as if it were inside the if (!hasLoadedImageSource) block, but without braces it will always execute. This is easy to misread and could lead to accidental logic changes later—either fix indentation or add braces to make the intended scope explicit.
ClearRenderedWordBorders();
ClearRenderedWordBorders();
Text-Grab/Utilities/OcrUtilities.cs
Outdated
int thisCorrectedTop = (int)absPosPoint.Y + selectedRegion.Top;

Rectangle correctedRegion = new(thisCorrectedLeft, thisCorrectedTop, selectedRegion.Width, selectedRegion.Height);
Bitmap bmp = ImageMethods.GetRegionOfScreenAsBitmap(correctedRegion);
GetRegionsTextAsTableAsync creates Bitmap bmp = ImageMethods.GetRegionOfScreenAsBitmap(...) and then creates a scaled bitmap from it, but the original bmp is never disposed. This can leak GDI handles; consider wrapping bmp in a using once scaledBitmap has been created.
Bitmap bmp = ImageMethods.GetRegionOfScreenAsBitmap(correctedRegion);
using Bitmap bmp = ImageMethods.GetRegionOfScreenAsBitmap(correctedRegion);
public static async Task<(IOcrLinesWords?, double)> GetOcrResultFromRegionAsync(Rectangle region, ILanguage language)
{
    language = GetCompatibleOcrLanguage(language);
    Bitmap bmp = ImageMethods.GetRegionOfScreenAsBitmap(region);
GetOcrResultFromRegionAsync allocates a Bitmap (bmp = GetRegionOfScreenAsBitmap(region)) and never disposes it (only the scaled bitmap is disposed). This can leak GDI resources when called repeatedly; wrap bmp in a using and dispose it after scaling / OCR is complete.
}

if (ocrResultOfWindow is null)
    return;
In DrawOcrRectanglesAsync, if ocrResultOfWindow ends up null the method returns without resetting isDrawing back to false. That will leave the GrabFrame stuck in a permanent "drawing" state and prevent future redraw/search updates. Consider wrapping the body in a try/finally (or ensuring all early returns set isDrawing = false and restart the appropriate timers).
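The try/finally approach suggested here might look like the sketch below. It is an assumption-laden illustration rather than the GrabFrame code: `DrawingStateSketch`, `DrawAsync`, and the `getResult` delegate are hypothetical, and the timer restarts the comment mentions are left out.

```csharp
using System;
using System.Threading.Tasks;

public sealed class DrawingStateSketch
{
    // Mirrors the isDrawing flag named in the review comment.
    public bool IsDrawing { get; private set; }

    // 'getResult' is a hypothetical stand-in for the OCR / UI Automation call.
    public async Task<bool> DrawAsync(Func<Task<string?>> getResult)
    {
        if (IsDrawing)
            return false; // a draw is already in progress

        IsDrawing = true;
        try
        {
            string? result = await getResult();
            if (result is null)
                return false; // early return; the finally block still resets the flag

            // ... render word borders from 'result' here ...
            return true;
        }
        finally
        {
            // Runs on every exit path, including exceptions and early returns.
            IsDrawing = false;
        }
    }
}
```

Putting the reset in `finally` means new early returns added later cannot leave the flag stuck.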
@copilot open a new pull request to apply changes based on the comments in this thread
@TheJoeFin I've opened a new pull request, #631, to work on those changes. Once the pull request is ready, I'll request review from you.
…ents Co-authored-by: TheJoeFin <7809853+TheJoeFin@users.noreply.github.com>
Disposing selectionBitmap with a using statement caused app crashes. Now, the bitmap is not disposed immediately, and a comment was added to highlight the issue and the need for further investigation.
Fix bitmap disposal leaks and isDrawing stuck state from PR review
Refined LanguagePicker to filter out internal OCR engine languages (UiAutomationLang, WindowsAiLang) and instead use the current keyboard input language for selection when needed. Updated imports and clarified parameter naming in GlobalLang. Changed UiAutomationLang tag and display values for clarity. This ensures the picker only shows real, user-facing languages and improves user experience.
This introduces a new feature that uses UI Automation to grab text directly from the UI element when possible and falls back to OCR when it is not.