Security bug: safe_extract raises an exception type that violates its contract on Python 3.12+ (4 checkpoint tests fail) #5

@Zld1994

Problem

The safe_extract() function in agentManager/engine/checkpoint.py claims to raise ValueError on a path traversal attack, but on Python 3.12+ it actually raises tarfile.OutsideDestinationError, which is not a ValueError.

Failing tests (4)

FAILED tests/unit/test_checkpoint.py::TestSafeExtract::test_safe_extract_path_traversal_attack
FAILED tests/unit/test_checkpoint.py::TestSafeExtract::test_safe_extract_absolute_path_attack
FAILED tests/unit/test_checkpoint.py::TestSafeExtract::test_safe_extract_nested_traversal
FAILED tests/unit/test_checkpoint.py::TestLoadCheckpointWithRecovery::test_load_checkpoint_with_malicious_paths

Every failure has the same form:

Failed: DID NOT RAISE <class 'ValueError'>
tarfile.OutsideDestinationError: '../../evil.py' would be extracted outside ...
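
The failure can be reproduced outside the test suite with a minimal sketch like the one below. The helper names `make_traversal_archive` and `extraction_error_type` are hypothetical and do not come from the repository; the member name mirrors the one in the error message above. It builds a one-member archive in memory, extracts it with `filter="data"`, and reports which exception class comes back.

```python
import io
import tarfile
import tempfile


def make_traversal_archive(name: str = "../../evil.py") -> bytes:
    """Build an in-memory tar archive whose single member tries to escape the destination."""
    payload = b"print('pwned')\n"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def extraction_error_type(archive: bytes):
    """Extract with filter='data' and return the exception class raised, or None."""
    with tempfile.TemporaryDirectory() as dest:
        with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
            try:
                tar.extractall(path=dest, filter="data")
            except Exception as exc:
                return type(exc)
    return None


# Extraction filters are missing on older interpreters, so only run where they exist.
if hasattr(tarfile, "data_filter"):
    raised = extraction_error_type(make_traversal_archive())
    print("raised:", raised)
    print("is a ValueError:", raised is not None and issubclass(raised, ValueError))
```

On an interpreter with extraction filters, the reported class should be a tarfile.FilterError subclass rather than ValueError, which is exactly what pytest.raises(ValueError) trips over.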

Root cause

The current implementation takes two different branches depending on the Python version:

def safe_extract(tar, path):
    target_path = Path(path).resolve()

    # Python 3.12+
    if sys.version_info >= (3, 12):
        tar.extractall(path=path, filter='data')   # ← raises OutsideDestinationError / FilterError
        return

    # Python <3.12: manual validation, raises ValueError
    for member in tar.getmembers():
        if member.name.startswith('/'):
            raise ValueError(...)
        if '..' in member.name:
            raise ValueError(...)
        ...

The docstring promises Raises: ValueError, but the 3.12 branch does not honor that.
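
The mismatch follows from the exception hierarchy in the standard library: tarfile.FilterError derives from tarfile.TarError, not from ValueError, so an `except ValueError` clause (or `pytest.raises(ValueError)`) never matches it. A quick check, guarded because these classes only exist on interpreters that ship extraction filters:

```python
import tarfile

# FilterError and its subclasses are only present where extraction filters are available.
if hasattr(tarfile, "FilterError"):
    for exc_type in (tarfile.FilterError, tarfile.OutsideDestinationError):
        ancestry = [cls.__name__ for cls in exc_type.__mro__]
        print(exc_type.__name__, "->", ancestry)
        print("  subclass of ValueError:", issubclass(exc_type, ValueError))
```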

Proposed fix

Recommended, option A: run the manual validation on every Python version before calling extractall, or, in the 3.12 branch, catch FilterError and re-raise it as ValueError.

import sys
import tarfile
from pathlib import Path

def safe_extract(tar, path):
    target_path = Path(path).resolve()
    # Validate manually first so that every Python version behaves the same way
    for member in tar.getmembers():
        if member.name.startswith('/'):
            raise ValueError(f'Absolute path detected in archive: {member.name}')
        if '..' in member.name:
            raise ValueError(f'Path traversal detected: {member.name} ...')
        member_path = (target_path / member.name).resolve()
        try:
            member_path.relative_to(target_path)
        except ValueError:
            raise ValueError(f'Path traversal detected: {member.name} ...') from None
    # Then extract, using the 3.12+ data filter as a second layer of protection.
    # Note: the filter can still raise tarfile.FilterError for cases the loop
    # above does not check, such as links or special files.
    if sys.version_info >= (3, 12):
        tar.extractall(path=path, filter='data')
    else:
        tar.extractall(path=path)
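
The second half of option A (re-wrapping FilterError) is not shown in the snippet above. A minimal sketch of it, covering only the 3.12+ branch, might look like the following. The names `safe_extract_wrapped` and `build_demo_archive` and the error message text are hypothetical, not taken from the repository.

```python
import io
import sys
import tarfile
import tempfile


def safe_extract_wrapped(tar: tarfile.TarFile, path: str) -> None:
    """Hypothetical 3.12+ variant: keep the ValueError contract by re-wrapping FilterError."""
    if sys.version_info < (3, 12):
        # Assumption: older interpreters keep using the manual checks from the issue.
        raise NotImplementedError("this sketch only covers the Python 3.12+ branch")
    try:
        tar.extractall(path=path, filter="data")
    except tarfile.FilterError as exc:
        # OutsideDestinationError and the other filter errors derive from FilterError.
        raise ValueError(f"Unsafe archive member rejected: {exc}") from exc


def build_demo_archive() -> bytes:
    """Build an in-memory archive with one member that points outside the destination."""
    payload = b"x = 1\n"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo(name="../../evil.py")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


if sys.version_info >= (3, 12):
    with tempfile.TemporaryDirectory() as dest:
        with tarfile.open(fileobj=io.BytesIO(build_demo_archive()), mode="r") as tar:
            try:
                safe_extract_wrapped(tar, dest)
            except ValueError as exc:
                print("rejected:", type(exc.__cause__).__name__)
```

One trade-off to keep in mind: extractall applies the filter member by member, so members that precede the offending one may already be on disk when the ValueError is raised. Validating every member up front rejects the archive before anything is written.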

Alternatively, option B: update the tests so that pytest.raises accepts both (ValueError, tarfile.FilterError). This makes the API contract vaguer, so it is not recommended.

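Impact

  • Severity: high (this is security-related code, and inconsistent behavior across Python versions is risky)
  • Scope: 4 tests fail; on 3.12+, any code that relies on safe_extract raising ValueError will not catch the exception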

Labels: security (Security-related issues)