
Hi TN: Implement Roman semiotic class #442

Open
shrpawar-alt wants to merge 7 commits into NVIDIA:staging/hi_tn_v3 from shrpawar-alt:hi-tn-roman

Conversation

@shrpawar-alt

Contributor

What does this PR do?

Implements a Hindi Roman numeral semiotic class so that Roman numerals appearing in Devanagari context (e.g., 'भास्कर-II', 'XIIवीं कक्षा') are normalized correctly.
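
To illustrate the kind of input this targets, here is a minimal plain-Python sketch. It is not the grammar added in this PR. The function names, the regular expressions, and the rule that a numeral must touch Devanagari text (directly, or across a hyphen or space on its left) are all assumptions made for illustration. It only maps each Roman numeral to an integer; choosing the Hindi spoken form is left out.

```python
import re

# Values of the seven Roman numeral symbols.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Runs of Roman letters that are not part of a longer Latin-script word.
CANDIDATE = re.compile(r"(?<![A-Za-z])[IVXLCDM]+(?![A-Za-z])")

# Canonical Roman numerals from 1 to 3999 (rejects forms such as "IIII" or "VX").
CANONICAL = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")

# Any character in the Devanagari Unicode block.
DEVANAGARI = re.compile(r"[\u0900-\u097F]")


def roman_to_int(numeral: str) -> int:
    """Convert a canonical Roman numeral to an integer."""
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN_VALUES[ch]
        nxt = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        # A smaller symbol before a larger one is subtracted (e.g. the I in IV).
        total += -value if value < nxt else value
    return total


def find_roman_numerals(text: str) -> list[tuple[str, int]]:
    """Return (numeral, value) pairs for Roman numerals next to Devanagari text."""
    results = []
    for match in CANDIDATE.finditer(text):
        numeral = match.group()
        if not CANONICAL.fullmatch(numeral):
            continue
        # Hypothetical context rule: skip a hyphen or space on the left side.
        before = text[: match.start()].rstrip("- ")
        after = text[match.end():]
        touches_devanagari = bool(
            (before and DEVANAGARI.match(before[-1]))
            or (after and DEVANAGARI.match(after[0]))
        )
        if touches_devanagari:
            results.append((numeral, roman_to_int(numeral)))
    return results


print(find_roman_numerals("भास्कर-II"))
print(find_roman_numerals("XIIवीं कक्षा"))
```

The context check in this sketch keeps an isolated Latin "I" in English text from being read as the numeral one.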

Before your PR is "Ready for review"

Pre checks:

  • Have you signed your commits? Use git commit -s to sign.
  • Do all unittests finish successfully before sending PR?
    1. pytest or (if your machine does not have GPU) pytest --cpu from the root folder (given you marked your test cases accordingly @pytest.mark.run_only_on('CPU')).
    2. Sparrowhawk tests bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
  • If you are adding a new feature: Have you added test cases for both pytest and Sparrowhawk here.
  • Have you added __init__.py for every folder and subfolder, including data folder which has .TSV files?
  • Have you followed codeQL results and removed unused variables and imports (report is at the bottom of the PR in github review box) ?
  • Have you added the correct license header Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
  • If you copied nemo_text_processing/text_normalization/en/graph_utils.py your header's second line should be Copyright 2015 and onwards Google, Inc.. See an example here.
  • Remove import guards (try import: ... except: ...) if not already done.
  • If you added a new language or a new feature please update the NeMo documentation (lives in different repo).
  • Have you added your language support to tools/text_processing_deployment/pynini_export.py.

PR Type:

  • New Feature
  • Bugfix
  • Documentation
  • Test

If you haven't finished some of the above items you can still open "Draft" PR.

shrpawar-alt and others added 4 commits June 23, 2026 07:08
Signed-off-by: Shreyas Pawar <shrpawar@nvidia.com>
Signed-off-by: Shreyas Pawar <shrpawar@nvidia.com>
Signed-off-by: Shreyas Pawar <shrpawar@nvidia.com>
@shrpawar-alt shrpawar-alt marked this pull request as ready for review June 23, 2026 12:36
Comment thread tests/nemo_text_processing/hi/test_roman.py Outdated
Comment thread tests/nemo_text_processing/hi/test_roman.py Outdated
Comment thread nemo_text_processing/text_normalization/hi/taggers/tokenize_and_classify.py Outdated
Comment thread nemo_text_processing/text_normalization/hi/verbalizers/roman.py Outdated
Comment thread nemo_text_processing/text_normalization/hi/data/roman/roman_to_spoken.tsv Outdated
shrpawar-alt and others added 3 commits June 25, 2026 10:23
Signed-off-by: Shreyas Pawar <shrpawar@nvidia.com>
Signed-off-by: Shreyas Pawar <shrpawar@nvidia.com>
2 participants