Update varchar() to count bytes not chars#961

Open
lachlangh wants to merge 2 commits into r-dbi:main from lachlangh:main

Conversation

@lachlangh

Fixes #960

This PR changes varchar() to measure the input length in bytes (nchar(x, type = "bytes")) rather than characters.
This matches SQL Server’s definition of VARCHAR(n) as the number of bytes, and prevents truncation when inserting multibyte UTF-8 strings.
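To make the difference concrete, here is a minimal base R sketch comparing the two ways of measuring a string. The sample strings are illustrative assumptions and are not taken from this PR or its test.

```r
# Illustrative strings (not from the PR): plain ASCII, one 2-byte
# character (e-acute), and three 3-byte CJK characters.
x <- enc2utf8(c("abc", "caf\u00e9", "\u65e5\u672c\u8a9e"))

n_chars <- nchar(x, type = "chars")  # number of characters
n_bytes <- nchar(x, type = "bytes")  # number of bytes in the UTF-8 encoding

print(data.frame(n_chars, n_bytes))
```

A byte count is never smaller than the corresponding character count, so a width derived from bytes can only stay the same or grow relative to one derived from characters.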

Counting bytes instead of characters should not adversely affect other database backends.

Includes a minimal test verifying the multibyte case.
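As a rough idea of what such a check could look like, the sketch below uses a hypothetical helper, `varchar_sketch()`, which is not the package's `varchar()` and not the test included in this PR. It assumes the column width is simply the longest value in the input.

```r
# Hypothetical helper, NOT the odbc implementation: derive a VARCHAR width
# from the longest non-missing value, measured in bytes or in characters.
varchar_sketch <- function(x, type = c("bytes", "chars")) {
  type <- match.arg(type)
  x <- enc2utf8(as.character(x[!is.na(x)]))
  width <- if (length(x)) max(nchar(x, type = type)) else 1L
  paste0("VARCHAR(", max(width, 1L), ")")
}

# Ten copies of a 2-byte character: 10 characters, 20 bytes in UTF-8.
x <- c("a", strrep("\u00e9", 10))

by_chars <- varchar_sketch(x, "chars")
by_bytes <- varchar_sketch(x, "bytes")

# A column sized by characters would be too narrow for the 20-byte value.
stopifnot(by_chars == "VARCHAR(10)", by_bytes == "VARCHAR(20)")
```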

@detule
Collaborator

detule commented Oct 29, 2025

Thanks again for your submission.

I see our varchar method gets used for NetezzaSQL as well. Off the top of my head, I can't imagine your change adversely affecting their workflow, so it feels fine to merge without doing extra work to check there.

LGTM.

Development

Successfully merging this pull request may close these issues.

varchar() measures characters instead of bytes, causing truncation with multibyte UTF-8 strings (SQL Server)

2 participants