Skip to content

fix/wrong-type-at-arm-char-unsigned#10

Closed
WJiangzhi wants to merge 3 commits into
masterfrom
fix/wrong-type-at-arm-char-unsigned
Closed

fix/wrong-type-at-arm-char-unsigned#10
WJiangzhi wants to merge 3 commits into
masterfrom
fix/wrong-type-at-arm-char-unsigned

Conversation

@WJiangzhi
Copy link
Copy Markdown
Member

@WJiangzhi WJiangzhi commented Mar 6, 2026

添加NDK,Make,VS模块支持
将单头文件版本废弃
修改CMakeLists.txt
修改错误的unsigned char判断

Fixes #9

@WJiangzhi WJiangzhi self-assigned this Mar 6, 2026
@WJiangzhi WJiangzhi added bug Something isn't working enhancement New feature or request labels Mar 6, 2026
@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

此拉取请求主要旨在提升ZACLib库的跨平台兼容性和构建系统支持。它通过引入对Android NDK、Makefile和Visual Studio构建系统的支持,扩展了库的可用性。同时,此PR还解决了一个关键的字符类型问题,即在ARM架构上char类型可能默认为unsigned char,通过引入条件编译的ZAC_CHAR类型来确保正确的行为。此外,单头文件版本的库已被标记为废弃。

Highlights

  • 新增NDK、Make、VS模块支持: 为库添加了Android NDK、Makefile和Visual Studio项目文件,以增强跨平台构建能力。
  • 废弃单头文件版本: 将ZACLib_single.hpp移动到Deprecated/ZACLib_single.hpp目录,标志其为废弃版本。
  • CMakeLists.txt更新: 修改了主CMakeLists.txt文件,并调整了示例项目的C++标准版本。
  • 修复ARM平台字符类型问题: 引入ZAC_CHAR类型别名,根据平台(特别是ARM/Android)条件性地定义为unsigned char或char,并相应地更新了代码中的类型使用和类型转换。
Changelog
  • ZACLib/Android.mk
    • 添加了NDK构建配置。
  • ZACLib/CMakeLists.txt
    • 更新了C++标准注释。
    • 移除了冗余的set(CMAKE_CXX_STANDARD 11)
  • ZACLib/Deprecated/ZACLib_single.hpp
    • ZACLib/ZACLib_single.hpp重命名而来。
    • 根据平台定义了ZAC_CHAR类型(ARM/Android为unsigned char,否则为char)。
    • 修改了ZAC_SV以使用ZAC_CHAR,并添加了平台特定的构造函数和类型转换。
    • 更新了data()begin()end()cbegin()cend()operator[]以返回const ZAC_CHAR*const ZAC_CHAR&
    • AddRuleReplace方法中循环变量的类型从char更改为ZAC_CHAR
    • outputs.emplace_backresult.append调用添加了reinterpret_cast<const char*>
    • static_cast<unsigned char>(input[i])更改为static_cast<unsigned char>(input[i])
  • ZACLib/Makefile
    • 添加了Makefile构建配置。
  • ZACLib/ZACLib.cpp
    • 移除了ReSharper注释。
    • 修改了AddRule以使用ZAC_SV("")而不是ZAC_SV("", 0)
    • AddRule中循环变量的类型从char更改为ZAC_CHAR
    • trie[node].next[c]trie[node].output_id赋值添加了static_cast<Node::value_type>
    • outputs.emplace_backresult.append调用添加了ArmCastChar宏。
    • int f更改为const int f
    • const auto invalid_output更改为constexpr auto invalid_output
    • while (cursor < upper_bound)添加了// NOLINT注释。
    • const unsigned char c = input[i]更改为const auto c = static_cast<unsigned char>(input[i])
  • ZACLib/ZACLib.hpp
    • 移除了ReSharper注释。
  • ZACLib/ZACLib.vcxproj
    • 添加了Visual Studio项目文件。
  • ZACLib/ZACLib_Types.hpp
    • 引入了CHAR_UNSIGNED宏用于条件编译。
    • 根据CHAR_UNSIGNEDZAC_CHAR定义为unsigned charchar
    • 定义了ArmCastChar宏用于条件reinterpret_cast
    • 修改了ZAC_SV以使用std::basic_string_view<ZAC_CHAR>,并添加了平台特定的构造函数和类型转换。
    • 更新了data()begin()end()cbegin()cend()operator[]以返回const ZAC_CHAR*const ZAC_CHAR&
    • 更新了Node结构体以使用value_typenext_typesize_type,并为kInvalidOutput使用了constexpr
    • 移除了// ReSharper disable CppVariableCanBeMadeConstexpr并添加了// ReSharper enable CppNonExplicitConvertingConstructor
  • examples/windows/CMakeLists.txt
    • CMAKE_CXX_STANDARD从23更改为11。
  • examples/windows/test.cpp
    • 更新了ZACLib_single.hpp的包含路径为Deprecated/ZACLib_single.hpp
    • std::unordered_map<std::string_view, std::string>更改为std::unordered_map<std::string, std::string>
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This PR aims to add build support for NDK, Make, and VS, deprecate the single-header version, and address char type issues on ARM architectures. However, it introduces several narrowing conversions where size_t values are cast to int, which can lead to negative indices and out-of-bounds memory access on 64-bit systems, posing a Denial of Service (DoS) risk. Additionally, there are general issues such as spelling errors, build script errors, and C++ code that might cause compilation failures or undefined behavior, including potential compilation errors related to the ARM char fix when interacting with std::string::append and std::strlen.

Comment thread ZACLib/ZACLib_Types.hpp
Comment on lines 9 to 17
#if __cplusplus >= 201703L
#include <string_view>
#include HAS_STRING_VIEW
#endif

#ifdef HAS_STRING_VIEW
#include <string_view>
#else
#include <cstring>
#include <cstring>
#endif
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

预处理指令 #include HAS_STRING_VIEW 是无效的C++语法,会导致编译失败。如果意图是根据宏来条件性地包含头文件,应该使用 #ifdef。然而,之前基于 __cplusplus 宏的检查是标准且更健壮的方式来判断是否支持 C++17 的 string_view。建议恢复之前的实现方式。

#if __cplusplus >= 201703L
#include <string_view>
#else
#include <cstring>
#endif

Comment thread ZACLib/Makefile
all: $(TARGET)

$(TARGET): $(OBJ)
$(CXX) $(OBJ) -o $(TARGET)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

此命令 $(CXX) $(OBJ) -o $(TARGET) 会创建一个可执行文件,而不是静态库。根据项目的其他构建文件(CMake, Android.mk),目标应该是生成一个静态库。要创建静态库(通常是 .a 文件),应该使用 ar 工具。例如:ar rcs libZACLib.a ZACLib.o。当前的 Makefile 无法正确生成库文件。

Comment thread ZACLib/ZACLib_Types.hpp
ZAC_SV(const std::string& s) : m_data(s.c_str()),
m_size(s.size()) {} // 模仿std::string_view,不禁止隐式构造
ZAC_SV(const ZAC_CHAR* d) : m_data(d),
m_size(d ? std::strlen(ArmCastChar(d)) : 0) {}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

std::strlen 函数期望一个 const char* 类型的参数。当 CHAR_UNSIGNED 被定义时,ZAC_CHARunsigned chard 的类型是 const unsigned char*。将 const unsigned char* 传递给 std::strlen 的行为是未定义的。为了确保代码的可移植性和正确性,应该总是将指针转换为 const char*

                                    m_size(d ? std::strlen(reinterpret_cast<const char*>(d)) : 0) {}

Comment thread ZACLib/ZACLib.cpp

if (from.size() > trie[node].match_len) {
trie[node].output_id = outputs.size(); // NOLINT(*-narrowing-conversions)
trie[node].output_id = static_cast<Node::value_type>(outputs.size());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

This line involves a critical narrowing conversion. Casting outputs.size() (a size_t) to Node::value_type (an int) can lead to a negative output_id if outputs.size() exceeds INT_MAX. This negative output_id, when later used as an index for the outputs vector in the Do method (line 132), will result in a massive out-of-bounds memory access on 64-bit systems, causing an immediate crash (Denial of Service). It's essential to prevent this overflow and handle cases where the number of patterns exceeds the capacity of Node::value_type.

if (outputs.size() >= static_cast<size_t>(std::numeric_limits<Node::value_type>::max())) {
    // Handle error: Too many patterns
    return;
}
trie[node].output_id = static_cast<Node::value_type>(outputs.size());

Comment thread ZACLib/ZACLib.cpp
const auto c = static_cast<unsigned char>(i);
if (trie[node].next[c] == -1) {
trie[node].next[c] = trie.size(); // NOLINT(*-narrowing-conversions)
trie[node].next[c] = static_cast<Node::value_type>(trie.size());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

A narrowing conversion occurs here when casting trie.size() (which is size_t) to Node::value_type (which is int). On 64-bit systems, the number of nodes in the Trie can exceed the maximum value of a 32-bit signed integer (INT_MAX, approximately 2.1 billion). If this happens, the value will wrap around to a negative integer. Since this value is later used as an index into the trie vector (e.g., on line 28: node = trie[node].next[c]), a negative value will result in an out-of-bounds memory access, leading to a crash (Denial of Service) or potentially more severe memory corruption.

if (trie.size() >= static_cast<size_t>(std::numeric_limits<Node::value_type>::max())) {
    // Handle error: Trie too large
    return;
}
trie[node].next[c] = static_cast<Node::value_type>(trie.size());

Comment thread ZACLib/ZACLib.cpp
const auto c = static_cast<unsigned char>(i);
if (trie[node].next[c] == -1) {
trie[node].next[c] = trie.size(); // NOLINT(*-narrowing-conversions)
trie[node].next[c] = static_cast<Node::value_type>(trie.size());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

Similar to the issue in Replace::AddRule, this narrowing conversion from size_t to int can lead to out-of-bounds memory access if the number of nodes in the Trie exceeds INT_MAX. This would cause a crash when the negative index is used in the Do method.

if (trie.size() >= static_cast<size_t>(std::numeric_limits<Node::value_type>::max())) {
    // Handle error
    return;
}
trie[node].next[c] = static_cast<Node::value_type>(trie.size());

Comment thread ZACLib/ZACLib.cpp
for (const unsigned char c : from) {
if (trie[node].next[c] == -1) {
trie[node].next[c] = trie.size(); // NOLINT(*-narrowing-conversions)
trie[node].next[c] = static_cast<Node::value_type>(trie.size());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

This narrowing conversion from size_t to int can lead to out-of-bounds memory access if the number of nodes in the Trie exceeds INT_MAX, causing a crash in the Do method.

if (trie.size() >= static_cast<size_t>(std::numeric_limits<Node::value_type>::max())) {
    // Handle error
    return;
}
trie[node].next[c] = static_cast<Node::value_type>(trie.size());

Comment thread ZACLib/CMakeLists.txt
target_include_directories(ZACLib PUBLIC ".")
set_target_properties(ZACLib PROPERTIES
CXX_STANDARD 11
CXX_STANDARD 11 # or higner
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

这里有一个拼写错误,“higner” 应该是 “higher”。

    CXX_STANDARD 11 # or higher

trie[node].output_id = outputs.size(); // NOLINT(*-narrowing-conversions)
trie[node].match_len = from.size();
outputs.emplace_back(to.data(), to.size());
outputs.emplace_back(reinterpret_cast<const char*>(to.data()), to.size());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

ZAC_CHARunsigned char 时,to.data() 返回 const char*reinterpret_cast<const char*> 是不必要的,但更重要的是,这与 ZACLib_Types.hpp 中的 ArmCastChar 宏不一致,后者在需要时进行转换。为了保持一致性和清晰性,建议使用在 ZACLib_Types.hpp 中定义的 ArmCastChar 宏。

                outputs.emplace_back(ArmCastChar(to.data()), to.size());

Comment thread examples/windows/test.cpp Outdated
ZACLib::ZAC_SV input;

const std::unordered_map<std::string_view, std::string> GrenadeType = {
const std::unordered_map<std::string, std::string> GrenadeType = {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

当使用 C++11 标准时,std::string_view 不可用,这会导致编译错误。将 std::unordered_map 的键类型从 std::string_view 改为 std::string 是正确的修复方法,因为它确保了与 C++11 的兼容性。然而,考虑到性能,如果键是从字符串字面量构造的,使用 const char* 作为键类型并提供自定义的哈希和等于函数可能是更高效的选择。但对于当前情况,std::string 是一个安全且正确的选择。

@WJiangzhi WJiangzhi closed this Mar 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Android平台部分情况下无法正确匹配

1 participant