Make string view and char handling platform-aware (introduce ZAC_CHAR)#8
Make string view and char handling platform-aware (introduce ZAC_CHAR)#8WJiangzhi wants to merge 1 commit into
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the library's character and string view handling by making it platform-aware. It addresses potential issues with Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces platform-aware character handling by defining ZAC_CHAR and updating the ZAC_SV string view wrapper. The changes correctly address potential issues with char signedness across different platforms, which is a solid improvement. My review focuses on improving code consistency and maintainability. I've suggested adding noexcept to constructors where appropriate, using s.data() consistently, and simplifying preprocessor logic to reduce repetition. These changes are recommended for both ZACLib_Types.hpp and the single-header variant ZACLib_single.hpp.
| #if defined(__ANDROID__) || defined(__arm__) || defined(__aarch64__) | ||
| ZAC_SV(const char* s, const size_t n) | ||
| : std::basic_string_view<ZAC_CHAR>(reinterpret_cast<const ZAC_CHAR*>(s), n) {} | ||
|
|
||
| ZAC_SV(const char* s) | ||
| : std::basic_string_view<ZAC_CHAR>( | ||
| reinterpret_cast<const ZAC_CHAR*>(s), | ||
| s ? std::char_traits<char>::length(s) : 0) {} | ||
| #endif |
There was a problem hiding this comment.
The preprocessor condition defined(__ANDROID__) || defined(__arm__) || defined(__aarch64__) is repeated multiple times in this file. To improve maintainability and avoid potential inconsistencies, consider defining a single macro for this check where ZAC_CHAR is defined, and reusing it.
For example:
// At file scope
#if defined(__ANDROID__) || defined(__arm__) || defined(__aarch64__)
#define ZAC_CHAR_IS_UNSIGNED 1
#else
#define ZAC_CHAR_IS_UNSIGNED 0
#endif
// ... then later
#if ZAC_CHAR_IS_UNSIGNED
// ...
#endif| ZAC_SV(const std::string& s) : m_data(reinterpret_cast<const ZAC_CHAR*>(s.c_str())), | ||
| m_size(s.size()) {} // 模仿std::string_view,不禁止隐式构造 |
There was a problem hiding this comment.
For consistency with std::string_view and the C++17 version of ZAC_SV, several constructors in the pre-C++17 implementation should be marked noexcept as they don't perform any operations that can throw. This includes this constructor, and the ones at lines 55 and 59.
Additionally, for consistency with the C++17 version and modern C++ practice, consider using s.data() instead of s.c_str() in this constructor.
ZAC_SV(const std::string& s) noexcept : m_data(reinterpret_cast<const ZAC_CHAR*>(s.data())),
m_size(s.size()) {} // 模仿std::string_view,不禁止隐式构造| #if defined(__ANDROID__) || defined(__arm__) || defined(__aarch64__) | ||
| ZAC_SV(const char* s, const size_t n) | ||
| : std::basic_string_view<ZAC_CHAR>(reinterpret_cast<const ZAC_CHAR*>(s), n) {} | ||
|
|
||
| ZAC_SV(const char* s) | ||
| : std::basic_string_view<ZAC_CHAR>( | ||
| reinterpret_cast<const ZAC_CHAR*>(s), | ||
| s ? std::char_traits<char>::length(s) : 0) {} | ||
| #endif |
There was a problem hiding this comment.
The preprocessor condition defined(__ANDROID__) || defined(__arm__) || defined(__aarch64__) is repeated multiple times in this file. To improve maintainability and avoid potential inconsistencies, consider defining a single macro for this check where ZAC_CHAR is defined, and reusing it.
For example:
// At file scope
#if defined(__ANDROID__) || defined(__arm__) || defined(__aarch64__)
#define ZAC_CHAR_IS_UNSIGNED 1
#else
#define ZAC_CHAR_IS_UNSIGNED 0
#endif
// ... then later
#if ZAC_CHAR_IS_UNSIGNED
// ...
#endif| ZAC_SV(const std::string& s) : m_data(reinterpret_cast<const ZAC_CHAR*>(s.c_str())), | ||
| m_size(s.size()) {} // 模仿std::string_view,不禁止隐式构造 |
There was a problem hiding this comment.
For consistency with std::string_view and the C++17 version of ZAC_SV, several constructors in the pre-C++17 implementation should be marked noexcept as they don't perform any operations that can throw. This includes this constructor, and the ones at lines 71 and 75.
Additionally, for consistency with the C++17 version and modern C++ practice, consider using s.data() instead of s.c_str() in this constructor.
ZAC_SV(const std::string& s) noexcept : m_data(reinterpret_cast<const ZAC_CHAR*>(s.data())),
m_size(s.size()) {} // 模仿std::string_view,不禁止隐式构造
Motivation
charsignedness across platforms (Android/ARM) and eliminate UB from mixingcharandunsigned charwhen indexing trie and appending slices tostd::string.ZAC_SVabstraction so library APIs work correctly with bothcharandunsigned charstorage across toolchains.Description
ZAC_CHARthat isunsigned charon__ANDROID__/__arm__/__aarch64__andcharotherwise.charin the string-view wrapper withZAC_SVbased onstd::basic_string_view<ZAC_CHAR>for C++17 and a compatible customZAC_SVfor pre-C++17, adding constructors that acceptstd::stringandconst char*with appropriatereinterpret_casts.for (const ZAC_CHAR i : from)and cast tounsigned charwhen computing trie indices, ensuring correct byte values are used for 0..255 indexing.reinterpret_cast<const char*>when constructingstd::stringoutputs and when callingresult.append(...)fromZAC_SV::data()to safely convert the library's internal view bytes intostd::stringbytes.ZACLib.cpp/ZACLib_Types.hpp) and the single-header variant (ZACLib_single.hpp).Testing
cmake --build .which completed successfully on the tested x86_64 toolchain.Codex Task