
Fix tarfile symlink/hardlink escape in safe_extract#45

Merged
rilma merged 2 commits into features/upgrade-ci-cd from copilot/sub-pr-40-again
Feb 21, 2026

Copilot AI commented Feb 21, 2026

The previous safe_extract validated only each member's path string, which left it open to symlink/hardlink escape attacks: a crafted tarball could plant a symlink inside the target directory and then extract a later member through it, writing anywhere on the filesystem the process can reach.
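
The escape described above can be reproduced in memory without extracting anything. This is a minimal sketch, not the repository's code: `path_looks_safe` is a hypothetical stand-in for a string-only path check, and `/tmp` is an arbitrary link target chosen for illustration. It suggests why a path check alone is not enough: both members of the crafted archive pass it, and only the member-type check flags the link.

```python
import io
import os
import tarfile


def path_looks_safe(directory, name):
    # Hypothetical string-only check: is directory/name lexically inside directory?
    base = os.path.abspath(directory)
    target = os.path.abspath(os.path.join(directory, name))
    return os.path.commonpath([base, target]) == base


def build_crafted_tar():
    # Build an archive holding a symlink followed by a file "inside" that symlink.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        link = tarfile.TarInfo("escape")
        link.type = tarfile.SYMTYPE
        link.linkname = "/tmp"  # assumed target outside the destination
        tar.addfile(link)

        payload = b"written outside the target directory\n"
        evil = tarfile.TarInfo("escape/evil.txt")
        evil.size = len(payload)
        tar.addfile(evil, io.BytesIO(payload))
    buf.seek(0)
    return buf


with tarfile.open(fileobj=build_crafted_tar()) as tar:
    members = tar.getmembers()

# Every member name looks safe as a string...
path_check = {m.name: path_looks_safe("dest", m.name) for m in members}
# ...but the first member is a link, which the type check catches.
link_check = {m.name: m.issym() or m.islnk() for m in members}
```

If such an archive were extracted naively, `escape` would be created as a symlink first and `escape/evil.txt` would then be written through it.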

Changes

  • Python ≥3.11.4 / ≥3.12: passes filter=tarfile.data_filter to extractall(). This is Python's built-in extraction filter, which rejects path traversal, symlinks and hardlinks that point outside the destination, and device files at the stdlib level. The code detects it with hasattr(), so any interpreter that ships the filter takes this path.
  • Python 3.11.0–3.11.3 fallback: explicitly rejects any TarInfo member for which issym(), islnk(), or isdev() is true, then runs the path-traversal check; only verified regular files and directories are passed to extractall()
import os
import tarfile

# is_within_directory() is not shown in this excerpt.

def safe_extract(tar, path=".", *, numeric_owner=False):
    # Use the built-in extraction filter when available (Python >=3.11.4 / >=3.12)
    if hasattr(tarfile, "data_filter"):
        tar.extractall(path, numeric_owner=numeric_owner, filter=tarfile.data_filter)
        return

    safe_members = []
    for member in tar.getmembers():
        # Reject symlinks, hardlinks and special files to prevent
        # escape-via-link attacks even when the path looks safe.
        if member.issym() or member.islnk() or member.isdev():
            raise ValueError(f"Unsafe tar member type: {member.name}")
        member_path = os.path.join(path, member.name)
        if not is_within_directory(path, member_path):
            raise ValueError("Attempted Path Traversal in Tar File")
        safe_members.append(member)

    tar.extractall(path, safe_members, numeric_owner=numeric_owner)
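
The fallback path calls is_within_directory, which this description does not show. One plausible shape for such a helper is sketched below; the body is an assumption for illustration, not the repository's implementation. It compares resolved paths with `os.path.commonpath` rather than a raw string prefix, so that a sibling directory such as `dest2` is not mistaken for being inside `dest`.

```python
import os


def is_within_directory(directory, target):
    # Assumed implementation: resolve both paths, then require that the
    # resolved directory is the common ancestor of the two.
    abs_directory = os.path.realpath(directory)
    abs_target = os.path.realpath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


inside = is_within_directory("dest", os.path.join("dest", "a", "b.txt"))
traversal = is_within_directory("dest", os.path.join("dest", "..", "etc", "passwd"))
sibling = is_within_directory("dest", os.path.join("dest2", "file.txt"))
```

A check like this only inspects paths, so it complements the member-type check in the fallback loop and does not replace it.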

Co-authored-by: rilma <14822065+rilma@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Update CI/CD pipeline with testing improvements" to "Fix tarfile symlink/hardlink escape in safe_extract" Feb 21, 2026
Copilot AI requested a review from rilma February 21, 2026 21:54
rilma marked this pull request as ready for review February 21, 2026 21:59
rilma merged commit ecd31b3 into features/upgrade-ci-cd Feb 21, 2026
rilma deleted the copilot/sub-pr-40-again branch February 21, 2026 21:59