Make constrainByteLength work by tsdko · Pull Request #751 · ynoproject/forest-orb

tsdko · 2026-02-09T16:36:44Z

Might require more testing; I have run the tests on Firefox and Chromium and tested the input manually with Firefox (and IME input with Mozc on Linux) but have not tested on other platforms.

Should hopefully prevent overly long non-ASCII messages from getting eaten during send attempts.

The current implementation in master could probably have worked with a bit more space in buf (enough to fit the next largest UTF-8 character) and by comparing `written` instead of `read`, since `read` is counted in UTF-16 code units rather than bytes. It would still behave oddly when the caret is not at the end of the string, though: on regular input the caret is forced to the end, and if you paste something that makes the entire string too long, the existing text at the end gets cut off.
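For reference, here is a minimal sketch of the `read`/`written` distinction, using the standard `TextEncoder.encodeInto` API. The helper name `exceedsByteLimit` and the 4 bytes of slack are assumptions made for illustration (4 is the largest UTF-8 encoding of a single code point); this is not the code in master.

```javascript
// Sketch only: exceedsByteLimit is a hypothetical helper, not code from the repository.
const encoder = new TextEncoder();

function exceedsByteLimit(text, maxBytes) {
  // 4 spare bytes so that one more code point of any size always fits in the buffer.
  const buf = new Uint8Array(maxBytes + 4);
  const { written } = encoder.encodeInto(text, buf);
  // `written` counts UTF-8 bytes; `read` would count UTF-16 code units instead.
  return written > maxBytes;
}

// A single emoji: 2 UTF-16 code units read, 4 UTF-8 bytes written.
const { read, written } = encoder.encodeInto("🐱", new Uint8Array(7));
console.log(read, written);

console.log(exceedsByteLimit("abcdefg", 7)); // 7 bytes, fits
console.log(exceedsByteLimit("abcd🐱", 7));  // 8 bytes, too long
```

Without the spare bytes, `encodeInto` simply stops before a code point that does not fit, so `written` can never exceed the buffer size and an overflow cannot be detected this way.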

The behavior of the built-in `maxlength` attribute is not very consistent across browsers: if the user attempts to replace currently selected text and not even one character from the replacement string fits, Firefox preserves the selection while Chromium discards it instead. This implementation discards the selection.

Shortcoming: hitting the length limit breaks undo (it does nothing afterwards). This is a problem with the current implementation as well; it's just a bit more hidden there, as ASCII inputs get properly constrained via HTML `maxlength`.
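A small illustration of why `maxlength` is enough for ASCII but not for other text, assuming `maxlength` counts UTF-16 code units the same way `String.prototype.length` does. The `lengths` helper is hypothetical and only there for the comparison.

```javascript
// Illustration only: compares UTF-16 length with UTF-8 byte length.
const utf8 = new TextEncoder();

function lengths(text) {
  return {
    codeUnits: text.length,          // what a maxlength-style limit counts (assumed)
    bytes: utf8.encode(text).length, // what a byte limit counts
  };
}

for (const sample of ["abcdefg", "あいう", "🐱🦈"]) {
  const { codeUnits, bytes } = lengths(sample);
  console.log(`${sample}: ${codeUnits} code units, ${bytes} bytes`);
}
```

For pure ASCII the two numbers are equal, so a code-unit limit doubles as a byte limit; for the other samples the byte count is two to three times larger.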

Test code

// "|" is the caret
const tests = [
  // below the byte limit, unchanged
  "🐱|",       "🐱",
  "あい|",     "あい",
  "abc🐱|",    "abc🐱",
  "abcdあ|",   "abcdあ",
  "abcdefg|",  "abcdefg",
  // above the byte limit, truncated
  "abcdefgh|", "abcdefg",
  "あabcde|",  "あabcd",
  "abcdeあ|",  "abcde",
  "abcd🐱|",   "abcd",
  "あいう|",   "あい",
  "🐱🦈|",     "🐱",
  // above the byte limit, caret in the middle of the string
  "abcd|efgh", "abcefgh",
  "あb|cdef",  "あcdef",
  "abc|deあ",  "abdeあ",
  "abc|d🐱",   "abd🐱",
  "あい|う",   "あう",
  "🐱|🦈",     "🦈",
];
const cbl = constrainByteLength(7);
for (let i = 0; i < tests.length; i += 2) {
  const sel = tests[i].indexOf('|');
  console.assert(sel >= 0, `no caret in ${tests[i]}`);
  const inVal = tests[i].substring(0, sel) + tests[i].substring(sel+1);
  const event = {target: {value: inVal, selectionStart: sel, selectionEnd: sel}};
  cbl(event);
  const actual = event.target.value, expected = tests[i+1];
  console.assert(expected === actual, `expected ${expected}, got ${actual}`);
}
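Judging only by the expected values in the table above, the behavior being tested looks like "drop whole characters immediately before the caret until the string fits". A minimal sketch of that idea follows; `trimBeforeCaret` is a hypothetical name and this is an inference from the tests, not the implementation in this PR.

```javascript
// Sketch only: inferred from the test expectations, not taken from the PR diff.
const enc = new TextEncoder();
const byteLength = (s) => enc.encode(s).length;

function trimBeforeCaret(value, caret, maxBytes) {
  let head = value.slice(0, caret); // text before the caret
  const tail = value.slice(caret);  // text after the caret, left untouched
  while (head.length > 0 && byteLength(head + tail) > maxBytes) {
    // Remove one whole code point: two code units if head ends in a surrogate pair.
    const last = head.charCodeAt(head.length - 1);
    const prev = head.length >= 2 ? head.charCodeAt(head.length - 2) : 0;
    const isPair = last >= 0xdc00 && last <= 0xdfff && prev >= 0xd800 && prev <= 0xdbff;
    head = head.slice(0, head.length - (isPair ? 2 : 1));
  }
  // If the tail alone is over the limit, the result is still too long.
  return { value: head + tail, caret: head.length };
}

console.log(trimBeforeCaret("abcdefgh", 4, 7)); // caret after "abcd"
console.log(trimBeforeCaret("🐱🦈", 2, 7));     // caret after the first emoji
```

Trimming before the caret rather than at the end of the string keeps the text that was already present after the caret, which is the case the description calls out for pasted input.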


zebraed · 2026-02-10T00:45:59Z

I can help test this on another platform; please wait a little while.

tsdko wants to merge 1 commit into ynoproject:master from tsdko:fix-constrain-byte-length
