fix(i18n): re-translate Italian locale (was ~51% Spanish) + contamination guard #166
Merged
…mination guard

src/data/it.json had been populated from es.json and only partially re-translated: 632 long strings were byte-identical Spanish, and the _protected brand map mistranslated Claude→Claudio / Anthropic→Antropico / Claude Code→Codice Claudio, silently breaking runtime brand-term restoration for Italian, which is our #1 install market (verified via CWS CSV; growth driven by an Italian dev-blog recommendation).

- Re-translated every contaminated string (exact es-copies + Spanish/hybrid forms) from the English source keys via the extension's own GT endpoint, restored brand/technical terms to canonical English, rebuilt _protected with correct Italian wrong-forms. it↔es overlap: 51% → 0.1% (parity with 10 others).
- New guard scripts/check-locale-contamination.js (npm run check:locales, wired into ci.yml) fails when a locale shares >8% of long strings with another. It catches the wrong-language bug class that check-i18n/dict-coverage miss (they verify key/shape, not language).

Gates green: 488 tests, lint, i18n, dict-coverage, glossary, academy, guard.
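To illustrate why a mistranslated `_protected` map breaks restoration, here is a minimal sketch of runtime brand-term restoration. It assumes `_protected` maps each canonical term to a list of wrong forms that machine translation tends to produce; the function name `restoreBrandTerms` and the sample sentence are hypothetical, and the extension's real implementation may differ.

```javascript
// Hypothetical sketch, not the extension's actual code.
// protectedMap: { canonicalTerm: [wrongForm, ...] } (assumed shape of _protected).
function restoreBrandTerms(text, protectedMap) {
  const pairs = [];
  for (const [canonical, wrongForms] of Object.entries(protectedMap)) {
    for (const wrong of wrongForms) pairs.push([wrong, canonical]);
  }
  // Replace longer wrong forms first so "Codice Claudio" is handled
  // before the shorter "Claudio" it contains.
  pairs.sort((x, y) => y[0].length - x[0].length);

  let result = text;
  for (const [wrong, canonical] of pairs) {
    result = result.split(wrong).join(canonical);
  }
  return result;
}

// Correct Italian map: canonical English terms keyed to Italian wrong forms.
const protectedIt = {
  Claude: ['Claudio'],
  Anthropic: ['Antropico'],
  'Claude Code': ['Codice Claudio'],
};

console.log(restoreBrandTerms('Usa Codice Claudio di Antropico', protectedIt));
```

If the map is inverted or keyed by the mistranslations themselves, no wrong form ever matches and the translated text keeps "Claudio" and "Antropico", which is consistent with the silent failure described above.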
heznpc added a commit that referenced this pull request on Jun 3, 2026
…rids (#167)

#166 re-translated the exact es-copies and token-matched Spanish, but a language-detection (franc) + Spanish-only-character ([áíóúñ¿¡]) re-audit found ~89 residual contaminated strings it missed: partially-Italianized hybrids like "Conectando tus Strumenti", "Configurazione de tokens máximos", "Qué información recopila Skilljar..." that are neither exact copies nor caught by token regexes. Single heuristics keep leaking.

Since the file is a botched es-copy with no hand-curation worth preserving, regenerated EVERY body string (1,101 leaves) from the English source keys via the extension's own GT endpoint, with brand-term restoration.

Verified clean by three independent methods converging to ~0:
- Spanish-only characters [áíóúñ¿¡]: 26 → 0
- franc=spa (long strings): only the 8-string false-positive floor (all confirmed correct Italian, e.g. "Pianifica i tuoi prossimi passi con Claude")
- exact es-copy: 1 ("Claude con Amazon Bedrock", identical & valid in both)

Gates green: validate, i18n, dict-coverage, glossary, academy, check:locales, 488 tests, lint. _protected and brand terms (Claude/Anthropic/Claude Code/Cowork/Skilljar) preserved.
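The Spanish-only-character audit can be sketched as a walk over the locale dictionary that flags any value matching the character class `[áíóúñ¿¡]` quoted in the commit message. This is an illustrative sketch only: the function name, the rule of skipping `_`-prefixed metadata keys, and the English keys in the sample are assumptions, not the audit script actually used.

```javascript
// Character class taken from the commit message; the /i flag also covers
// uppercase forms such as "Á" or "Ñ".
const SPANISH_ONLY = /[áíóúñ¿¡]/i;

// Hypothetical helper: collect every string value containing one of the
// flagged characters, together with the key path that leads to it.
function findSpanishCharStrings(node, path = [], hits = []) {
  if (typeof node === 'string') {
    if (SPANISH_ONLY.test(node)) hits.push({ path, value: node });
  } else if (node && typeof node === 'object') {
    for (const [key, value] of Object.entries(node)) {
      if (key.startsWith('_')) continue; // assumption: _meta/_protected are not body strings
      findSpanishCharStrings(value, [...path, key], hits);
    }
  }
  return hits;
}

// Sample data: the values are quoted in the commit message, the keys are made up.
const sample = {
  _meta: { note: 'máximos' },
  'Max tokens setting': 'Configurazione de tokens máximos',
  'Connecting your tools': 'Conectando tus Strumenti',
  'Plan your next steps with Claude': 'Pianifica i tuoi prossimi passi con Claude',
};

for (const hit of findSpanishCharStrings(sample)) {
  console.log(hit.path.join(' > '), '=>', hit.value);
}
```

Note that "Conectando tus Strumenti" passes this check because it contains no flagged character, which is why the commit pairs the character scan with language detection and an exact-copy comparison rather than relying on a single heuristic.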
heznpc added a commit that referenced this pull request on Jun 9, 2026
…tency) (#181)

A verified readiness audit found the code is done (505 tests, 0 open issues) but front-door docs had drifted. Fixes (all factual/compliance, not the deferred strategy docs):

Factual errors (were misleading users/owner):
- README Installation said the CWS listing "was removed ... not currently available" (full delisting). It is actually live as v1.0.1 in all locales except the US (removed 2026-05-12 over the old icon). Corrected to match POSITIONING (the source of truth).
- RELEASE_CHECKLIST pointed at store-assets/promotion/ drafts that were purged and no longer exist. Removed the dead pointer (drafts are kept off-repo).

Stale (now closed):
- CHANGELOG [Unreleased] was missing #167/#170/#172/#174/#175/#176/#179/#180; added them.
- it.json _meta.translation_provenance + lastUpdated (and the matching constants.js comment, README locale-table cell) still said "v1, Spanish-derived regex", although it was re-translated from English in #166/#167 (overlap now 0.1%). Updated; regenerated plugin data accordingly.
- TESTING.md listed "E2E flows" under "What is NOT tested", but the Playwright E2E suite exists and runs in CI. Reframed to describe what E2E covers.
- PRIVACY_POLICY "Last updated" dateline was April 11 despite June changes.

Gates green: 505 tests, lint, prettier, validate, check:plugin/dicts/locales/i18n/dict-coverage, full E2E (17).

Deferred strategy docs (POSITIONING, quarter-focus) untouched; they are owned by the separate doc-cleanup session.
heznpc added a commit that referenced this pull request on Jun 9, 2026
Mechanical version cut so the dashboard upload ships everything merged since the 3.5.39 tag (#166–#194: Italian re-translation + locale guard, protected-terms CJK fix, tutor stream fixes, Gemini-verify guard, FAB icon + reset-button host-CSS fixes, "Claude(Claude)" gloss collapse, chip alignment, shadow-root isolation, doc consistency).

- manifest.json / package.json / package-lock.json → 3.5.40
- 11 src/data/*.json _meta.version → 3.5.40 (check:dict-coverage enforces the match); claude-plugin terms data regenerated in sync
- CHANGELOG: [Unreleased] → [3.5.40] - 2026-06-10; fresh [Unreleased] added
- RELEASE_CHECKLIST refreshed for v3.5.40 (status block, prepared list, gate counts 520/520 + 19 e2e; historical 3.5.39 mentions kept)
- STORE_LISTING → v3.5.40; "What's new" gains the 3.5.40 user-facing items (tutor-button icon, chip alignment, "Claude(Claude)" fix, style isolation)
- README installation note + version markers (npm run docs) → 3.5.40

Artifacts rebuilt and verified at 3.5.40: store-assets/skillbridge-bundled.zip (CWS upload) + skillbridge.zip (raw fallback). Both are gitignored; upload from local disk.

Gates: 520 unit, full E2E 19/19, lint, prettier, validate, all check:* green (check:cws-drift intentionally fails until the dashboard upload).
What
`src/data/it.json`, the Italian locale dictionary, had been built from `es.json` and only partially re-translated. 632 of its long strings were byte-identical Spanish, and the `_protected` brand map mistranslated brand names (`Claude → Claudio`, `Anthropic → Antropico`, `Claude Code → Codice Claudio`), which silently broke runtime brand-term restoration for Italian users.

This matters because Italy is our #1 install market (verified against the CWS CSV exports), with growth traced to an Italian developer blog that recommends SkillBridge specifically as the Academy translation solution. The strongest market was therefore getting Spanish course content.
The existing checks (`check-i18n`, `check-dict-coverage`, 488 unit tests) all passed on the broken file because they verify key/shape parity, not that values are in the right language.

Fix
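- Re-translated every contaminated string (exact es-copies plus Spanish/hybrid forms) from the English source keys via the extension's own GT endpoint (`translate.googleapis.com/translate_a/single`, `tl=it`), then restored brand/technical terms to canonical English.
- Rebuilt `_protected` with the correct Italian wrong-forms (`Claude → ["Claudio"]`, `Anthropic → ["Antropico"]`, …) so runtime restoration works for Italian.

Re-prevention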
New guard `scripts/check-locale-contamination.js` (`npm run check:locales`, wired into `ci.yml`) fails when any locale shares >8% of its long strings with another. Clean locales sit at ≤2.1%; the contaminated Italian file was 51%. This closes the blind spot that let the bug ship.

Gates
488 tests · lint · check:i18n · check:dict-coverage · check:glossary · check:academy · check:locales — all green.
Found while auditing for pre-CWS-republication hardening. Locale content fix; no logic changes.
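For readers who want to see the shape of the contamination guard, the following is a rough sketch of a pairwise overlap check between locale dictionaries. Only the >8% threshold comes from this PR; the 20-character cutoff for a "long" string, the rule of skipping `_`-prefixed keys, and all function names are assumptions for illustration and may not match `scripts/check-locale-contamination.js`.

```javascript
const MIN_LENGTH = 20; // assumed cutoff for a "long" string
const THRESHOLD = 0.08; // the >8% failure threshold stated in this PR

// Collect long string leaves from a (possibly nested) locale dictionary.
function longStrings(node, out = []) {
  if (typeof node === 'string') {
    if (node.length >= MIN_LENGTH) out.push(node);
  } else if (node && typeof node === 'object') {
    for (const [key, value] of Object.entries(node)) {
      if (key.startsWith('_')) continue; // assumption: skip _meta/_protected
      longStrings(value, out);
    }
  }
  return out;
}

// Fraction of locale A's long strings that also appear verbatim in locale B.
function overlapRatio(a, b) {
  const own = longStrings(a);
  if (own.length === 0) return 0;
  const other = new Set(longStrings(b));
  return own.filter((s) => other.has(s)).length / own.length;
}

// Compare every ordered pair of locales and report those above the threshold.
function findContamination(locales) {
  const failures = [];
  const names = Object.keys(locales);
  for (const a of names) {
    for (const b of names) {
      if (a === b) continue;
      const ratio = overlapRatio(locales[a], locales[b]);
      if (ratio > THRESHOLD) failures.push({ locale: a, other: b, ratio });
    }
  }
  return failures;
}

// Tiny made-up locales: "it" still carries one Spanish string copied from "es".
const locales = {
  es: { a: 'Conectando tus herramientas ahora', b: 'Planifica tus próximos pasos con Claude' },
  it: { a: 'Conectando tus herramientas ahora', b: 'Pianifica i tuoi prossimi passi con Claude' },
  fr: { a: 'Connexion de vos outils maintenant', b: 'Planifiez vos prochaines étapes avec Claude' },
};

const failures = findContamination(locales);
for (const f of failures) {
  console.log(`${f.locale} shares ${(f.ratio * 100).toFixed(1)}% of long strings with ${f.other}`);
}
// A CI wrapper would typically set a non-zero exit code when failures is non-empty.
```

Restricting the comparison to long strings is what keeps the baseline low: short labels and brand-only strings are legitimately identical across languages, whereas full sentences rarely are.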
🤖 Generated with Claude Code