Use cp1252 as default encoding for Strings#300
Use cp1252 as default encoding for Strings#300hkalteBr wants to merge 1 commit intostlehmann:masterfrom
Conversation
For Strings with special characters the length needs to be of the utf-8 encoded value or else there's an 1797 ADS Error.
@hkalteBr would you mind bringing an example for this? As strings are 1-Byte characters there should be the same length for type string and type byte. |
|
I'm using the attribute utf-8 encoding for some strings: {attribute 'TcEncoding':='UTF-8'} |
|
This is a hard one. Honestly, I didn't know about this attribute. It will be hard to read values encoded this way because it is not determined which size a character has. Also we don't know the encoding on the client side. |
|
Thanks for looking into this. I remember having the encoding issue with symbol comments. I think we already changed the encoding to cp1252 for them. But we use utf-8 for the normal read/write operations at the moment. Would you mind sharing the link to the documentation? |
|
https://infosys.beckhoff.com/english.php?content=../content/1033/tc3_plc_intro/2529327243.html&id= |
|
My initial guess is that no-one really thought about the encoding too much at that time. To address this issue I suppose we could add an attribute to the read/write functions for string_encoding. Alternatively this attribute could be set in the Connection object. |
|
It is good practice to open an issue before opening a PR so I created #301 on this topic where we can further discuss the issue. |
stlehmann
left a comment
There was a problem hiding this comment.
String encoding is an issue concerning most of the functions for read/write activities. So finally they all need to be altered. Also I feel we need to add some tests concerning encoding.
| buf[offset: offset + data_symbols[data_name].size] = value | ||
| elif data_symbols[data_name].dataType == ADST_STRING: | ||
| buf[offset: offset + len(value)] = value.encode("utf-8") | ||
| buf[offset: offset + len(value.encode("utf-8"))] = value.encode("utf-8") |
There was a problem hiding this comment.
This is a good starting point. We should introduce a constant DEFAULT_TCENCODING = "cp1252". Also we should buffer the encoded value to avoid multiple encodings.


For Strings with special characters the length needs to be of the utf-8 encoded value or else there's an 1797 ADS Error.