OCR drops Hindi/Devanagari and degrades romanized Hinglish — no Indic script support in Apple Vision #81

**Summary.** On macOS, the screen→markdown OCR completely misses Devanagari (Hindi) text and partially garbles romanized Hinglish. This makes Familiar's screen-capture path largely unusable for Indian users, whose Slack messages, notes, and work documents mix English, romanized Hinglish, and Devanagari.

**Root cause.** The OCR uses Apple Vision's `VNRecognizeTextRequest`, whose supported recognition languages contain **no Indic scripts at all**. The list below was verified on macOS 26 (request revision 3, `accurate` recognition level):

```
en-US, fr-FR, it-IT, de-DE, es-ES, pt-BR, zh-Hans, zh-Hant, yue-Hans, yue-Hant,
ko-KR, ja-JP, ru-RU, uk-UA, th-TH, vi-VN, ar-SA, ars-SA, tr-TR, id-ID, cs-CZ,
da-DK, nl-NL, no-NO, ms-MY, pl-PL, ro-RO, sv-SE
```

There is no `hi-IN` and no entry for any Devanagari, Tamil, Telugu, or Bengali language, so Devanagari on screen yields nothing or garbage. Romanized Hinglish is recognized as Latin text, but `usesLanguageCorrection` (which corrects toward English) distorts the non-English tokens.
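The gap can be checked mechanically against the list above. The following is a minimal sketch, not part of Familiar: the helper `primary_subtags` and the `INDIC` table are illustrative names, and the language string is copied verbatim from the verified output.

```python
# Language tags reported by Vision on macOS 26 (revision 3, accurate),
# copied from the list in this issue.
VISION_LANGUAGES = """
en-US, fr-FR, it-IT, de-DE, es-ES, pt-BR, zh-Hans, zh-Hant, yue-Hans, yue-Hant,
ko-KR, ja-JP, ru-RU, uk-UA, th-TH, vi-VN, ar-SA, ars-SA, tr-TR, id-ID, cs-CZ,
da-DK, nl-NL, no-NO, ms-MY, pl-PL, ro-RO, sv-SE
"""


def primary_subtags(tag_list: str) -> set[str]:
    """Return the lowercase primary language subtag of each comma-separated tag."""
    tags = (t.strip() for t in tag_list.replace("\n", " ").split(","))
    return {t.split("-")[0].lower() for t in tags if t}


# Hypothetical table of the Indic languages named in this issue (ISO 639-1 codes).
INDIC = {"hi": "Hindi", "ta": "Tamil", "te": "Telugu", "bn": "Bengali"}

supported = primary_subtags(VISION_LANGUAGES)
missing = sorted(code for code in INDIC if code not in supported)

for code in missing:
    print(f"{INDIC[code]} ({code}) is not in Vision's recognition languages")
```

All four Indic codes end up in `missing`, which matches the observation that Vision has nothing to recognize Devanagari with.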

**Compounding factor.** Captures are downscaled to roughly half the display resolution in each dimension (e.g. 960×540 from a 1920×1080 display), which leaves about a quarter of the original pixels, so small text degrades even in supported languages. Larger on-screen fonts noticeably improve results, which points to capture resolution as a second lever.
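To make the loss concrete, here is a small worked calculation for the 1920×1080 to 960×540 case. The 12 px glyph height is an assumed example value, not a measurement, and `downscale_stats` is a hypothetical helper.

```python
def downscale_stats(display, capture, glyph_px):
    """Return (linear scale, fraction of pixels kept, glyph height after scaling)."""
    display_w, display_h = display
    capture_w, capture_h = capture
    linear_scale = capture_w / display_w
    pixel_fraction = (capture_w * capture_h) / (display_w * display_h)
    return linear_scale, pixel_fraction, glyph_px * linear_scale


# Assumption: small UI text rendered about 12 px tall on the display.
linear, fraction, glyph = downscale_stats((1920, 1080), (960, 540), glyph_px=12)
print(f"linear scale {linear}, pixels kept {fraction}, glyph height {glyph} px")
```

Under that assumption the recognizer sees glyphs about 6 px tall, which is consistent with larger fonts giving better results.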

**Note:** the clipboard-mirror path preserves full UTF-8 (Devanagari + emoji tested, byte-exact), so the gap is specifically the OCR pipeline.

**Suggestions.**
1. Allow configuring/raising capture resolution (or expose an OCR-fidelity setting).
2. For scripts Vision can't handle, fall back to a Devanagari-capable OCR engine (e.g. Tesseract with `hin`, or a configurable engine). At minimum, let users add recognition-language hints.
3. Short term: document the Indic-script limitation so Indian users know to rely on the clipboard path.
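One way suggestion 2 could look is a small routing step driven by user-supplied language hints. This is only a sketch under stated assumptions: `plan_ocr`, the hint format, and the returned dictionary are hypothetical and do not reflect Familiar's actual configuration. The Tesseract codes (`hin`, `tam`, `tel`, `ben`) are its standard traineddata names, joined with `+` as Tesseract's `-l` option expects.

```python
# Primary subtags of the Vision languages listed in this issue.
VISION_PRIMARY = {
    "en", "fr", "it", "de", "es", "pt", "zh", "yue", "ko", "ja", "ru", "uk", "th",
    "vi", "ar", "ars", "tr", "id", "cs", "da", "nl", "no", "ms", "pl", "ro", "sv",
}

# ISO 639-1 code -> Tesseract traineddata name for the scripts named above.
TESSERACT_CODES = {"hi": "hin", "ta": "tam", "te": "tel", "bn": "ben"}


def plan_ocr(language_hints):
    """Split hints into those Vision handles and those needing a fallback engine."""
    vision, fallback, unsupported = [], [], []
    for hint in language_hints:
        primary = hint.split("-")[0].lower()
        if primary in VISION_PRIMARY:
            vision.append(hint)
        elif primary in TESSERACT_CODES:
            fallback.append(TESSERACT_CODES[primary])
        else:
            unsupported.append(hint)
    return {
        "vision": vision,                  # passed to Vision as recognition languages
        "tesseract": "+".join(fallback),   # e.g. "hin" or "hin+tam"; empty if unused
        "unsupported": unsupported,        # hints neither engine is mapped for
    }


print(plan_ocr(["en-US", "hi-IN"]))
```

A user who sets hints `en-US` and `hi-IN` would get English routed to Vision and Hindi routed to the fallback engine, while hints with no mapping are reported rather than silently dropped.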

Environment: macOS 26 (Tahoe), Familiar v0.0.70, Apple Silicon.

