Fix multi-byte character handling in chat #1039

Open
anf3is wants to merge 2 commits into ReactiveDrop:reactivedrop_beta from anf3is:fix/chat-cut-utf8

anf3is commented Apr 30, 2026

Related to #855

Problems

  1. Message validation may truncate input in the middle of a multi-byte UTF-8 sequence, producing invalid text.
  2. Chat input operates on UTF-16 wchar code units, but message sending uses UTF-8 char bytes.
  3. The chat input length doesn't reflect what will actually be sent: non-Latin letters in most languages typically take up a single wchar but 2 to 3 char.
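The mismatch in problems 2 and 3 can be illustrated with a small sketch. The helper names `Utf8Units` and `Utf16Units` are hypothetical and not taken from the PR; they only compute how many code units one Unicode code point needs in each encoding.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of UTF-8 bytes ("char" units) needed to
// encode one Unicode code point.
constexpr std::size_t Utf8Units(char32_t cp) {
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// Hypothetical helper: number of UTF-16 code units ("wchar" units on
// platforms with a 16-bit wchar_t) needed for the same code point.
constexpr std::size_t Utf16Units(char32_t cp) {
    return cp < 0x10000 ? 1 : 2;
}

// 'ø' (U+00F8) is one UTF-16 unit but two UTF-8 bytes.
static_assert(Utf16Units(U'\u00F8') == 1 && Utf8Units(U'\u00F8') == 2, "U+00F8");
// 'Ⓐ' (U+24B6) is one UTF-16 unit but three UTF-8 bytes.
static_assert(Utf16Units(U'\u24B6') == 1 && Utf8Units(U'\u24B6') == 3, "U+24B6");
```

A limit counted in UTF-16 units therefore under-counts the bytes that are eventually sent for any text outside ASCII.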

Goals

  • Fix truncation in message validation
  • Fix char limit calculation in chat box input
  • Non-goal: truncation that splits grapheme clusters (several code points rendered as one character) is not fixed here
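A minimal sketch of the first goal, truncating on a code point boundary, might look like the following. `TruncateUtf8` and `IsContinuationByte` are hypothetical names, not the functions changed in this PR, and the sketch assumes the input is already valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True for UTF-8 continuation bytes (bit pattern 10xxxxxx).
inline bool IsContinuationByte(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

// Hypothetical helper: returns `text` cut to at most `maxBytes` bytes
// without splitting a code point. Assumes `text` is valid UTF-8.
inline std::string TruncateUtf8(const std::string& text, std::size_t maxBytes) {
    if (text.size() <= maxBytes) {
        return text;
    }
    std::size_t cut = maxBytes;
    // text[cut] is the first byte that would be dropped. If it is a
    // continuation byte, the cut falls inside a multi-byte sequence, so
    // back up until the cut sits just before that sequence's lead byte.
    while (cut > 0 && IsContinuationByte(static_cast<unsigned char>(text[cut]))) {
        --cut;
    }
    return text.substr(0, cut);
}
```

This only protects individual code points. As stated in the goals, a grapheme made of several code points can still be split.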

Test strings

Bytes

  • the digits 123… mark byte position 120 + n (counting from 1)
  • | marks the string cutoff
  • x marks one byte of a multi-byte UTF-8 code point
_128_chars______________________________________________________________________________________________________7|8__ok_12345678
ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ_____7|8____ok_12345678
ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ____6x|x___nok_123456ø
ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ___5xx|x___nok_12345Ⓐ
ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ__4xxx|_____ok_1234Ⓐ

Chars

128 characters ("Ⓐ"*41+"Ⓑ"*(126-41)+"78"), 380 bytes in UTF-8.

ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷⒷ78

Examples

Before

SuperCake: _128_chars______________________________________________________________________________________________________7|8__ok_1234567
SuperCake: ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ_____7|8____ok_1234567
SuperCake: ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ____6x|x___nok_123456�
SuperCake: ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ___5xx|x___nok_12345�
SuperCake: ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ__4xxx|_____ok_1234Ⓐ
SuperCake: ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒷ�溰p䷗ট溰p�

After

SuperCake: _128_chars______________________________________________________________________________________________________7|8__ok_1234567
SuperCake: ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ_____7|8____ok_1234567
SuperCake: ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ____6x|x___nok_123456
SuperCake: ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ___5xx|x___nok_12345
SuperCake: ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶ__4xxx|_____ok_1234Ⓐ
SuperCake: ⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒶⒷ7

anf3is added 2 commits April 29, 2026 14:55
Because of the `Send` method, max char limit should be based on utf-8
byte count, not utf-16 chars.
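
The idea in this commit message can be sketched as follows. This is not the PR's code: `Utf8ByteCount` and `TryAppend` are hypothetical names, `std::u16string` stands in for the chat box's wchar buffer, and the 127-byte limit is an assumption inferred from the cutoff shown in the test strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: UTF-8 byte count of a UTF-16 string.
// Assumption: an unpaired surrogate is counted as 3 bytes (the size of
// the U+FFFD replacement character).
inline std::size_t Utf8ByteCount(const std::u16string& s) {
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t u = s[i];
        if (u < 0x80) {
            bytes += 1;
        } else if (u < 0x800) {
            bytes += 2;
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < s.size() &&
                   s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            bytes += 4;  // surrogate pair encodes one 4-byte code point
            ++i;         // skip the low surrogate
        } else {
            bytes += 3;
        }
    }
    return bytes;
}

// Hypothetical helper: appends `typed` to `input` only if the result still
// fits in `maxBytes` once converted to UTF-8. Returns whether it was added.
inline bool TryAppend(std::u16string& input, const std::u16string& typed,
                      std::size_t maxBytes) {
    std::u16string candidate = input + typed;
    if (Utf8ByteCount(candidate) > maxBytes) {
        return false;
    }
    input = std::move(candidate);
    return true;
}
```

Under these assumptions, feeding the "Chars" test string one character at a time with a 127-byte limit accepts 41 "Ⓐ", one "Ⓑ" and the "7", which is consistent with the last "After" line.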