Fix box misalignment for emoji with variation selectors #133

Open
forty wants to merge 1 commit into ascii-boxes:master from forty:forty/add-vs16-support

@forty commented May 21, 2026

Characters like ⚠️ are actually two codepoints: the base symbol (⚠ U+26A0) plus an invisible "variation selector" (U+FE0F) that tells the terminal to render it as a wide emoji.

Boxes counted the base symbol as 1 column wide and ignored the selector, but terminals display it as 2 columns. This made the right border shift 1 position too far for each affected character.

Characters like 💡 that are inherently wide were handled correctly — only the ones that get promoted to wide by the variation selector were miscounted.

Common affected characters: ⚠️, ℹ️, ✅, ❤️, 1⃣ and other keycap emoji.

Disclaimer: this MR was done with the help of "AI". Though I have carefully reviewed and amended the code and made sure it was tested correctly, I'm not a C developer nor a Unicode expert, just some guy with an ASCII box that was not aligned :)
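
For illustration, the correction described above can be sketched in C roughly as follows. This is not the code from this PR: `toy_char_width` and `display_width` are hypothetical names, and the toy lookup only stands in for a real per-character width function (for example one backed by libunistring). The sketch also assumes that every narrow character followed by U+FE0F is displayed 2 columns wide, which is a simplification; a stricter version would only promote valid emoji variation sequences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VS16 0xFE0Fu /* VARIATION SELECTOR-16: requests emoji presentation */

/* Hypothetical stand-in for a real width lookup. It only knows the few
 * ranges needed for this illustration. */
int toy_char_width(uint32_t cp)
{
    if (cp == VS16 || cp == 0x20E3u) {
        return 0; /* selector and combining enclosing keycap take no column */
    }
    if (cp >= 0x1F300u && cp <= 0x1FAFFu) {
        return 2; /* inherently wide pictographs such as U+1F4A1 */
    }
    return 1;     /* everything else is treated as narrow here */
}

/* Display width in columns of a code point sequence. A narrow character
 * that is directly followed by VS16 is counted as 2 columns (assumption). */
size_t display_width(const uint32_t *s, size_t len)
{
    assert(s != NULL || len == 0);

    size_t cols = 0;
    int prev_width = 0; /* width of the preceding code point */

    for (size_t i = 0; i < len; i++) {
        int w = toy_char_width(s[i]);
        if (s[i] == VS16 && prev_width == 1) {
            cols += 1; /* promote the narrow base character to wide */
        }
        cols += (size_t)w;
        prev_width = w;
    }
    return cols;
}
```

With this sketch, the sequence U+26A0 U+FE0F counts as 2 columns, a bare U+26A0 counts as 1, and U+1F4A1 stays at 2 with or without a following selector.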

@tsjensen

Member

Thank you for this contribution! It looks good at first glance, but I will need more time to properly review.

For the next two or three weeks, I won't be able to do that. But after that, I'll get the review done and get back to you.

@tsjensen

Member

Question (mostly to myself): Why does libunistring's u32_strwidth() not return the correct number of columns? Rather than correcting for this problem, I would first like to understand why it exists in the first place. Is it a bug in libunistring? Are we missing some configuration? There should really be a way to invoke libunistring that yields the correct width.

One of us can ask their AI. 🙂 But it will be good to have this answer for the review.

@tsjensen

Member

The root cause is certainly with libunistring, whose u32_strwidth() returns the wrong result. I have verified this with the latest libunistring (1.4.2). I just sent an email to bug-libunistring@gnu.org, where they want their bug reports. Let's see if I get a reply. Maybe I've made a mistake in using libunistring properly.

If it turns out that it's a proper bug but they can't fix it, then I'll merge this. Other possible futures are: They tell me how to invoke libunistring so we get the correct width, or they provide a 1.4.3 with a fix.
