4 changes: 2 additions & 2 deletions lib/twemoji.rb
@@ -58,7 +58,7 @@ def self.find_by_text(text)
   # @param code [String] Emoji code to find text.
   # @return [String] Emoji Text.
   def self.find_by_code(code)
-    invert_codes[must_str(code)]
+    invert_codes[must_str(code).gsub(/(-fe0e|-fe0f)/, '')]
   end

# Find emoji text by raw emoji unicode.
@@ -70,7 +70,7 @@ def self.find_by_code(code)
   # @param raw [String] Emoji raw unicode to find text.
   # @return [String] Emoji Text.
   def self.find_by_unicode(raw)
-    invert_codes[unicode_to_str(raw)]
+    invert_codes[unicode_to_str(raw).gsub(/(-fe0e|-fe0f)/, '')]
   end
Comment on lines 72 to 74
Copilot AI Feb 28, 2026

These finder methods now normalize away FE0E/FE0F, but Twemoji.parse still depends on regex matching of raw unicode sequences (emoji_pattern_all) and on unicode_to_str for URL generation. If the intent is to ignore presentation selectors during parsing as well, the normalization needs to be applied in the parse/matching path; otherwise unicode input containing FE0E/FE0F may not be replaced consistently.
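
One way to read this suggestion, as a minimal sketch: strip the variation selectors from the raw string once, before any matching or lookup, so the finder and parse paths would see the same normalized input. The `normalize_selectors` and `to_code` helpers and the one-entry `CODES` table are hypothetical stand-ins for illustration; they are not part of the Twemoji gem's API.

```ruby
# Hypothetical helper, not part of the Twemoji gem: drop VS15 (U+FE0E) and
# VS16 (U+FE0F) from a raw string before it is matched or looked up.
VARIATION_SELECTORS = /[\u{fe0e}\u{fe0f}]/

def normalize_selectors(raw)
  raw.gsub(VARIATION_SELECTORS, "")
end

# Convert a raw string to a dash-joined lowercase hex code string.
def to_code(raw)
  raw.each_codepoint.map { |cp| cp.to_s(16) }.join("-")
end

# Stand-in lookup table (assumption: keys carry no variation selectors).
CODES = { "1f441-200d-1f5e8" => ":eye::left_speech_bubble:" }.freeze

raw = "\u{1f441}\u{fe0f}\u{200d}\u{1f5e8}\u{fe0f}"
puts CODES[to_code(normalize_selectors(raw))]
```

Normalizing the raw string rather than the hex code string means the same helper could be reused wherever raw unicode enters the library, though whether that fits the existing parse path is for the maintainers to judge.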


# Render raw emoji unicode from emoji text or emoji code.
12 changes: 6 additions & 6 deletions lib/twemoji/data/emoji-unicode.yml
@@ -791,8 +791,8 @@
 ":man-woman-girl:": 1f468-200d-1f469-200d-1f467
 ":man-woman-girl-boy:": 1f468-200d-1f469-200d-1f467-200d-1f466
 ":man-woman-girl-girl:": 1f468-200d-1f469-200d-1f467-200d-1f467
-":man-heart-man:": 1f468-200d-2764-fe0f-200d-1f468
-":man-kiss-man:": 1f468-200d-2764-fe0f-200d-1f48b-200d-1f468
+":man-heart-man:": 1f468-200d-2764-200d-1f468
+":man-kiss-man:": 1f468-200d-2764-200d-1f48b-200d-1f468
Comment on lines +794 to +795
Copilot AI Feb 28, 2026

The codepoint sequences for these ZWJ emojis previously included VS16 (FE0F) after U+2764, but this change removes it. That breaks Twemoji CDN asset naming (the png/svg maps still reference filenames that include ...-2764-fe0f-...), and it will also prevent emoji_pattern_all from matching the real unicode sequence (which typically contains FE0F), causing parsing to miss/partially replace these emojis. Keep FE0F in emoji-unicode.yml for these sequences and handle FE0E/FE0F normalization in lookup/matching logic instead (e.g., normalize the input or build a normalized invert map).

Suggested change
-":man-heart-man:": 1f468-200d-2764-200d-1f468
-":man-kiss-man:": 1f468-200d-2764-200d-1f48b-200d-1f468
+":man-heart-man:": 1f468-200d-2764-fe0f-200d-1f468
+":man-kiss-man:": 1f468-200d-2764-fe0f-200d-1f48b-200d-1f468
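
A minimal sketch of the "normalized invert map" option mentioned in this comment, under stated assumptions: `CANONICAL` stands in for a small slice of emoji-unicode.yml with FE0F kept, and `strip_selectors`, `NORMALIZED_INVERT`, and this standalone `find_by_code` are hypothetical names, not the gem's implementation.

```ruby
# Stand-in for a slice of emoji-unicode.yml; FE0F stays in the canonical value.
CANONICAL = {
  ":man-heart-man:" => "1f468-200d-2764-fe0f-200d-1f468",
  ":man-kiss-man:"  => "1f468-200d-2764-fe0f-200d-1f48b-200d-1f468"
}.freeze

# Hypothetical helper: remove "-fe0e"/"-fe0f" segments from a dash-joined code.
# Assumes a selector never leads the sequence, so it always follows a dash.
def strip_selectors(code)
  code.gsub(/-fe0[ef]/, "")
end

# Lookup map keyed by selector-free codes, built once from the canonical data.
NORMALIZED_INVERT = CANONICAL.each_with_object({}) do |(text, code), map|
  map[strip_selectors(code)] = text
end.freeze

def find_by_code(code)
  NORMALIZED_INVERT[strip_selectors(code)]
end

puts find_by_code("1f468-200d-2764-fe0f-200d-1f468") # input with VS16
puts find_by_code("1f468-200d-2764-200d-1f468")      # input without VS16
puts CANONICAL[":man-heart-man:"]                    # canonical value is untouched
```

With this shape, lookups tolerate either form of the input while the canonical data remains the single source for anything that needs the full sequence.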

 ":woman::skin-tone-2:": 1f469-1f3fb
 ":woman::skin-tone-3:": 1f469-1f3fc
 ":woman::skin-tone-4:": 1f469-1f3fd
@@ -804,10 +804,10 @@
 ":woman-woman-girl:": 1f469-200d-1f469-200d-1f467
 ":woman-woman-girl-boy:": 1f469-200d-1f469-200d-1f467-200d-1f466
 ":woman-woman-girl-girl:": 1f469-200d-1f469-200d-1f467-200d-1f467
-":woman-heart-man:": 1f469-200d-2764-fe0f-200d-1f468
-":woman-heart-woman:": 1f469-200d-2764-fe0f-200d-1f469
-":woman-kiss-man:": 1f469-200d-2764-fe0f-200d-1f48b-200d-1f468
-":woman-kiss-woman:": 1f469-200d-2764-fe0f-200d-1f48b-200d-1f469
+":woman-heart-man:": 1f469-200d-2764-200d-1f468
+":woman-heart-woman:": 1f469-200d-2764-200d-1f469
+":woman-kiss-man:": 1f469-200d-2764-200d-1f48b-200d-1f468
+":woman-kiss-woman:": 1f469-200d-2764-200d-1f48b-200d-1f469
Comment on lines +807 to +810
Copilot AI Feb 28, 2026

Same issue as above for the woman/man heart/kiss sequences: removing FE0F here will generate incorrect emoji URLs (missing -fe0f-) and will fail to match the actual unicode sequence during parsing. These codepoint values should remain aligned with the Twemoji asset filenames in the png/svg maps.

Suggested change
-":woman-heart-man:": 1f469-200d-2764-200d-1f468
-":woman-heart-woman:": 1f469-200d-2764-200d-1f469
-":woman-kiss-man:": 1f469-200d-2764-200d-1f48b-200d-1f468
-":woman-kiss-woman:": 1f469-200d-2764-200d-1f48b-200d-1f469
+":woman-heart-man:": 1f469-200d-2764-fe0f-200d-1f468
+":woman-heart-woman:": 1f469-200d-2764-fe0f-200d-1f469
+":woman-kiss-man:": 1f469-200d-2764-fe0f-200d-1f48b-200d-1f468
+":woman-kiss-woman:": 1f469-200d-2764-fe0f-200d-1f48b-200d-1f469
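
To make the URL concern concrete, here is a small sketch. It assumes asset URLs are formed by interpolating the codepoint string into a path shaped like the `https://twemoji.maxcdn.com/2/svg/...` URLs in this PR's tests; `asset_url` is a hypothetical helper, not the gem's API, and whether a given file exists on the CDN is not checked here.

```ruby
# Hypothetical helper; the base URL mirrors the one used in test/twemoji_test.rb.
ASSET_BASE = "https://twemoji.maxcdn.com/2/svg"

def asset_url(code)
  "#{ASSET_BASE}/#{code}.svg"
end

with_vs16    = asset_url("1f469-200d-2764-fe0f-200d-1f468")
without_vs16 = asset_url("1f469-200d-2764-200d-1f468")

puts with_vs16
puts without_vs16
puts with_vs16 == without_vs16 # false: the two codes name different files
```

If the asset maps only list the FE0F form, as this comment states, any URL built from the stripped code would point at a file name the maps do not contain.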

 ":family:": 1f46a
 ":couple:": 1f46b
 ":two_men_holding_hands:": 1f46c
11 changes: 9 additions & 2 deletions test/twemoji_test.rb
@@ -83,6 +83,14 @@ def test_find_by_escaped_unicode
     assert_equal ":heart_eyes:", Twemoji.find_by_unicode("\u{1f60d}")
   end

+  def test_find_by_code_including_emoji_presentation_selector
+    assert_equal ":eye::left_speech_bubble:", Twemoji.find_by_code("1f441-fe0f-200d-1f5e8-fe0f")
+  end
+
+  def test_find_by_unicode_including_emoji_presentation_selector
+    assert_equal ":eye::left_speech_bubble:", Twemoji.find_by_unicode("\u{1f441}\u{fe0f}\u{200d}\u{1f5e8}\u{fe0f}")
+  end
+
Copilot AI Feb 28, 2026

The implementation strips both FE0F (emoji presentation) and FE0E (text presentation), but the new tests only cover the FE0F case. Please add a regression test for a sequence containing FE0E as well to ensure text presentation selectors are handled as intended.

Suggested change
+  def test_find_by_code_including_text_presentation_selector
+    assert_equal ":eye::left_speech_bubble:", Twemoji.find_by_code("1f441-fe0e-200d-1f5e8-fe0e")
+  end
+  def test_find_by_unicode_including_text_presentation_selector
+    assert_equal ":eye::left_speech_bubble:", Twemoji.find_by_unicode("\u{1f441}\u{fe0e}\u{200d}\u{1f5e8}\u{fe0e}")
+  end

   def test_parse_plus_one
     expected = %(<img draggable="false" title=":+1:" alt="👍" src="https://twemoji.maxcdn.com/2/svg/1f44d.svg" class="emoji">)

@@ -211,7 +219,7 @@ def test_parse_by_unicode_multiple_html
expected = %(<p><img draggable="false" title=":cookie:" alt="🍪" src="https://twemoji.maxcdn.com/2/svg/1f36a.svg" class="emoji" aria-label="emoji: cookie"><img draggable="false" title=":birthday:" alt="🎂" src="https://twemoji.maxcdn.com/2/svg/1f382.svg" class="emoji" aria-label="emoji: birthday"></p>)
aria_label = ->(name) { 'emoji: ' + name.gsub(":", '') }
assert_equal expected, Twemoji.parse(Nokogiri::HTML::DocumentFragment.parse("<p>🍪🎂</p>"), img_attrs: {'aria-label'=> aria_label } ).to_html
-end
+end

def test_parse_by_unicode_multiple_mix_codepoint_name_html
expected = %(<p><img draggable="false" title=":cookie:" alt="🍪" src="https://twemoji.maxcdn.com/2/svg/1f36a.svg" class="emoji" aria-label="emoji: cookie"><img draggable="false" title=":birthday:" alt="🎂" src="https://twemoji.maxcdn.com/2/svg/1f382.svg" class="emoji" aria-label="emoji: birthday"></p>)
@@ -230,5 +238,4 @@ def test_parse_multiple
aria_label = ->(name) { 'emoji: ' + name.gsub(":", '') }
assert_equal expected, Twemoji.parse(":cookie::birthday:", img_attrs: {'aria-label'=> aria_label } )
end

end