fix(i18n): fully regenerate Italian body — #166 left ~89 residual hybrids #167
Merged
…rids

#166 re-translated the exact es-copies and token-matched Spanish, but a language-detection (franc) + Spanish-only-character ([áíóúñ¿¡]) re-audit found ~89 residual contaminated strings it missed: partially-Italianized hybrids like "Conectando tus Strumenti", "Configurazione de tokens máximos", and "Qué información recopila Skilljar..." that are neither exact copies nor caught by token regexes. Single heuristics keep leaking.

Since the file is a botched es-copy with no hand-curation worth preserving, regenerated EVERY body string (1,101 leaves) from the English source keys via the extension's own GT endpoint, with brand-term restoration.

Verified clean by three independent methods converging to ~0:
- Spanish-only characters [áíóúñ¿¡]: 26 → 0
- franc=spa (long strings): only the 8-string false-positive floor (all confirmed correct Italian, e.g. "Pianifica i tuoi prossimi passi con Claude")
- exact es-copy: 1 ("Claude con Amazon Bedrock", identical & valid in both)

Gates green: validate, i18n, dict-coverage, glossary, academy, check:locales, 488 tests, lint. _protected and brand terms (Claude/Anthropic/Claude Code/Cowork/Skilljar) preserved.
heznpc added a commit that referenced this pull request on Jun 9, 2026
…tency) (#181)

A verified readiness audit found the code is done (505 tests, 0 open issues) but front-door docs had drifted. Fixes (all factual/compliance, not the deferred strategy docs):

Factual errors (were misleading users/owner):
- README Installation said the CWS listing "was removed ... not currently available" (full delisting). It is actually live as v1.0.1 in all locales except the US (removed 2026-05-12 over the old icon). Corrected to match POSITIONING (the source of truth).
- RELEASE_CHECKLIST pointed at store-assets/promotion/ drafts that were purged and no longer exist. Removed the dead pointer (drafts are kept off-repo).

Stale (now closed):
- CHANGELOG [Unreleased] was missing #167/#170/#172/#174/#175/#176/#179/#180; added them.
- it.json _meta.translation_provenance + lastUpdated (and the matching constants.js comment, README locale-table cell) still said "v1, Spanish-derived regex", although it was re-translated from English in #166/#167 (overlap now 0.1%). Updated; regenerated plugin data accordingly.
- TESTING.md listed "E2E flows" under "What is NOT tested", but the Playwright E2E suite exists and runs in CI. Reframed to describe what E2E covers.
- PRIVACY_POLICY "Last updated" dateline was April 11 despite June changes.

Gates green: 505 tests, lint, prettier, validate, check:plugin/dicts/locales/i18n/dict-coverage, full E2E (17).

Deferred strategy docs (POSITIONING, quarter-focus) untouched; they are owned by the separate doc-cleanup session.
Why
While scoping further hardening, a re-audit of `it.json` with franc language detection plus a Spanish-only-character scan (`[áíóúñ¿¡]`; Italian doesn't use these) found that #166 was incomplete: ~89 contaminated strings survived it. They're partially-Italianized hybrids that defeat single heuristics:

- Conectando tus Strumenti
- Configurazione de tokens máximos
- Qué información recopila Skilljar sobre mi actividad de apprendimento?

These are neither exact `es.json` copies (so the contamination guard's exact-match check missed them) nor caught by token regexes. Honest note: my #166 "it.json is clean" claim was wrong; single-method verification wasn't enough.

Fix
The file is a botched es-copy with no hand-curation worth preserving, so I regenerated every body string (1,101 leaves) from the English source keys via the extension's own GT endpoint, with brand-term restoration — no detection gaps possible.
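The regeneration step described above can be sketched roughly as follows. This is a minimal illustration, not the script used in this PR: it assumes the locale file is a nested object whose leaf keys are the English source strings, that underscore-prefixed keys (such as `_meta` and `_protected`) are metadata to copy untouched, and that `translate` is a caller-supplied async function standing in for the GT endpoint, whose real interface is not shown here. The names `maskBrands`, `restoreBrands`, and `regenerate`, and the `__B0__`-style placeholders, are hypothetical.

```javascript
// Brand terms named in the PR. Longer terms come first so that
// "Claude Code" is masked before the shorter "Claude".
const BRAND_TERMS = ["Claude Code", "Anthropic", "Skilljar", "Cowork", "Claude"];

// Replace each brand term with an opaque placeholder (hypothetical format)
// so the translator has nothing to "translate".
function maskBrands(text) {
  let masked = text;
  BRAND_TERMS.forEach((term, i) => {
    masked = masked.split(term).join(`__B${i}__`);
  });
  return masked;
}

// Put the original brand terms back after translation.
function restoreBrands(text) {
  let restored = text;
  BRAND_TERMS.forEach((term, i) => {
    restored = restored.split(`__B${i}__`).join(term);
  });
  return restored;
}

// Rebuild every string leaf from its English key, ignoring the old value.
// Assumption: keys starting with "_" are metadata and are copied as-is.
async function regenerate(node, translate) {
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    if (key.startsWith("_")) {
      out[key] = value;
    } else if (typeof value === "string") {
      out[key] = restoreBrands(await translate(maskBrands(key)));
    } else if (value && typeof value === "object") {
      out[key] = await regenerate(value, translate);
    } else {
      out[key] = value; // numbers, booleans, null: nothing to translate
    }
  }
  return out;
}
```

Because the old values are never read, no contaminated string can survive by slipping past a detector. One caveat with this sketch: a real translation service may alter the placeholders, so a production version would need to check that every placeholder round-trips.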
Verification — three independent methods converge to ~0
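- `[áíóúñ¿¡]`: 26 → 0
- franc = Spanish (long strings): only the 8-string false-positive floor, all confirmed correct Italian (e.g. "Pianifica i tuoi prossimi passi con Claude")
- exact `es.json` copy: 1 ("Claude con Amazon Bedrock", identical and valid in both languages)

Brand terms (Claude / Anthropic / Claude Code / Cowork / Skilljar) and `_protected` preserved. Quality sample: "Cos'è Skilljar e perché accedo?", "Procedura dettagliata sull'utilizzo di Claude Code...".

Gates (all green): validate · i18n · dict-coverage · glossary · academy · check:locales · 488 tests · lint.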
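Two of the three checks, the Spanish-only-character scan and the exact `es.json` copy comparison, are simple enough to sketch without dependencies. The code below is an illustrative sketch, not the project's actual audit script: the function names, the `" > "` path separator, and the sample keys are assumptions, and the franc language-detection pass is left out to keep the example self-contained.

```javascript
// Characters the PR treats as Spanish-only (not used in Italian).
const SPANISH_ONLY = /[áíóúñ¿¡]/;

// Yield [path, text] for every string leaf, skipping underscore-prefixed
// metadata keys (assumption: _meta and _protected are not body strings).
function* leaves(node, path = []) {
  for (const [key, value] of Object.entries(node)) {
    if (key.startsWith("_")) continue;
    if (typeof value === "string") {
      yield [[...path, key].join(" > "), value];
    } else if (value && typeof value === "object") {
      yield* leaves(value, [...path, key]);
    }
  }
}

// Report leaves with Spanish-only characters and leaves identical to es.json.
function audit(it, es) {
  const esByPath = new Map(leaves(es));
  const spanishChars = [];
  const esCopies = [];
  for (const [path, text] of leaves(it)) {
    if (SPANISH_ONLY.test(text)) spanishChars.push(path);
    if (esByPath.get(path) === text) esCopies.push(path);
  }
  return { spanishChars, esCopies };
}

// Sample data: the strings come from this PR, the keys are made up.
const it = {
  body: {
    tokens: "Configurazione de tokens máximos",
    bedrock: "Claude con Amazon Bedrock",
    plan: "Pianifica i tuoi prossimi passi con Claude",
  },
};
const es = { body: { bedrock: "Claude con Amazon Bedrock" } };

console.log(audit(it, es));
// spanishChars: ["body > tokens"], esCopies: ["body > bedrock"]
```

The two lists overlap only partly by design: a hybrid such as "Conectando tus Strumenti" contains none of the flagged characters and is not an exact copy, which is why the PR adds language detection as a third, independent check.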
🤖 Generated with Claude Code