Prevent emojis from being spoken aloud by avatar#95

Open
attilin29 wants to merge 1 commit into main from fix/emoji-tts-filter

Conversation

@attilin29
Collaborator

Add EmojiTextFilter to strip emoji characters before TTS synthesis, and strengthen the system prompt to explicitly instruct the LLM not to use emojis (since they are read aloud as their names by TTS engines).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@gemini-code-assist (Bot) left a comment
Contributor

Code Review

This pull request introduces an EmojiTextFilter to strip emojis from text before TTS synthesis and updates system instructions to discourage their use. Feedback suggests expanding the emoji regex to include missing ranges like Zero Width Joiners and combining characters to ensure more comprehensive filtering. Additionally, it is recommended to replace emojis with a single space and normalize whitespace to prevent unnatural pauses in the TTS output.

Comment on lines +6 to +23
_EMOJI_RE = re.compile(
    "["
    "\U0001F600-\U0001F64F"
    "\U0001F300-\U0001F5FF"
    "\U0001F680-\U0001F6FF"
    "\U0001F1E0-\U0001F1FF"
    "\U00002600-\U000026FF"
    "\U00002700-\U000027BF"
    "\U0000FE00-\U0000FE0F"
    "\U0001F900-\U0001F9FF"
    "\U0001FA70-\U0001FAFF"
    "\U00002300-\U000023FF"
    "\U00002B50-\U00002B55"
    "\U0001F004"
    "\U0001F0CF"
    "]+",
    flags=re.UNICODE,
)

medium

The current regex for identifying emojis is missing several important ranges and special characters. Specifically, it does not include the Zero Width Joiner (\u200D), which is frequently used in complex emoji sequences (like family or profession emojis), nor does it cover combining characters like the keycap symbol (\u20E3). Additionally, many symbols in the \U0001F000-\U0001F2FF range (Mahjong, Dominoes, Playing Cards, Enclosed Alphanumeric/Ideographic Supplements) are omitted. Consider using a more comprehensive range or adding these specific characters to prevent them from being read aloud by the TTS engine.
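The additions suggested above could look roughly like the following. This is a sketch under assumptions, not the PR's code: the names `_EMOJI_RE_EXTENDED` and `strip_emoji` are hypothetical, and the single range U+1F000 to U+1F2FF is used to subsume the regional indicators, U+1F004, and U+1F0CF that the original pattern lists separately.

```python
import re

# Hypothetical extension of the PR's pattern; names are illustrative only.
_EMOJI_RE_EXTENDED = re.compile(
    "["
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F300-\U0001F5FF"  # miscellaneous symbols and pictographs
    "\U0001F680-\U0001F6FF"  # transport and map symbols
    "\U0001F000-\U0001F2FF"  # Mahjong, dominoes, cards, enclosed supplements
    "\U00002600-\U000026FF"  # miscellaneous symbols
    "\U00002700-\U000027BF"  # dingbats
    "\U0000FE00-\U0000FE0F"  # variation selectors
    "\U0001F900-\U0001F9FF"  # supplemental symbols and pictographs
    "\U0001FA70-\U0001FAFF"  # symbols and pictographs extended-A
    "\U00002300-\U000023FF"  # miscellaneous technical
    "\U00002B50-\U00002B55"  # star, circles
    "\u200D"                 # zero width joiner (joins emoji sequences)
    "\u20E3"                 # combining enclosing keycap
    "]+"
)


def strip_emoji(text: str) -> str:
    """Remove every run of characters matched by the extended pattern."""
    return _EMOJI_RE_EXTENDED.sub("", text)


# A family emoji is three pictographs joined by U+200D; all parts are removed.
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
print(repr(strip_emoji("a" + family + "b")))
```

Two caveats apply to this sketch. A keycap sequence such as "1" followed by U+FE0F and U+20E3 loses only the combining marks, so the base digit remains and would still be spoken. U+200D is also used legitimately in some Indic scripts, so stripping it unconditionally may alter non-emoji text.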

Comment on lines +32 to +33
    async def filter(self, text: str) -> str:
        return _EMOJI_RE.sub("", text).strip()

medium

Replacing emojis with an empty string leaves the whitespace on either side in place, so "Hello 😃 world" becomes "Hello  world" (two spaces between the words), which might cause unnatural pauses in some TTS engines. The trailing .strip() only trims the ends of the string, not the interior. It is better to replace each emoji run with a single space and then normalize the whitespace so that a clean string is passed to the synthesis service.

Suggested change
-    async def filter(self, text: str) -> str:
-        return _EMOJI_RE.sub("", text).strip()
+    async def filter(self, text: str) -> str:
+        # Replace emojis with a space to avoid merging words
+        text = _EMOJI_RE.sub(" ", text)
+        # Collapse multiple spaces and strip leading/trailing whitespace
+        return " ".join(text.split())
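
To make the difference between the two versions concrete, here is a small standalone comparison. It is only an illustration: `_DEMO_EMOJI_RE` is a reduced stand-in for the PR's `_EMOJI_RE` (emoticons block only), and `strip_only` and `strip_and_normalize` are hypothetical synchronous helpers that mirror the bodies of the original and suggested `filter` methods.

```python
import re

# Stand-in pattern for the demo; the PR's _EMOJI_RE covers many more ranges.
_DEMO_EMOJI_RE = re.compile("[\U0001F600-\U0001F64F]+")


def strip_only(text: str) -> str:
    """Original behaviour: delete emojis, then trim the ends of the string."""
    return _DEMO_EMOJI_RE.sub("", text).strip()


def strip_and_normalize(text: str) -> str:
    """Suggested behaviour: swap emojis for a space, then collapse whitespace."""
    return " ".join(_DEMO_EMOJI_RE.sub(" ", text).split())


samples = [
    "Hello \U0001F603 world",   # emoji surrounded by spaces
    "Great!\U0001F603Thanks",   # emoji with no surrounding spaces
]
for sample in samples:
    print(repr(strip_only(sample)), "vs", repr(strip_and_normalize(sample)))
```

One design consequence worth noting: `" ".join(text.split())` collapses every kind of whitespace, including newlines and tabs, into single spaces. That is probably fine for text headed to speech synthesis, but it would matter if the filtered string were also reused for display.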
