Skip to content

Fix path/branch extraction removing every /blob/ and /tree/ marker, not just the leading one#149

Closed
patchwright wants to merge 1 commit into
nephila:masterfrom
patchwright:fix-blob-tree-prefix-strip
Closed

Fix path/branch extraction removing every /blob/ and /tree/ marker, not just the leading one#149
patchwright wants to merge 1 commit into
nephila:masterfrom
patchwright:fix-blob-tree-prefix-strip

Conversation

@patchwright

Copy link
Copy Markdown

Description

path and branch are extracted from the matched URL by removing the leading
/blob/, /tree/, /-/blob/ or /-/tree/ marker. The code did this with
str.replace(marker, ""), which removes every occurrence of the marker, not
just the leading one. When the file path or branch name legitimately contains the
same segment again, the result is silently corrupted.

For example:

>>> import giturlparse
>>> giturlparse.parse("https://github.com/owner/repo/blob/main/src/blob/utils.py").path
'main/srcutils.py'        # expected: 'main/src/blob/utils.py'
>>> giturlparse.parse("https://github.com/owner/repo/tree/feature/tree/x").branch
'featurex'                # expected: 'feature/tree/x'

The leading marker is already guaranteed by the preceding startswith(...) check,
so the fix is to slice off exactly that prefix instead of calling replace:

data["path"] = data["path_raw"][len("/blob/"):]

Slicing is used (rather than str.removeprefix) to remain compatible with the
declared python_requires = >=3.8.

Fixed in both the GitHub and GitLab platforms.

References

No existing issue.

Checklist

  • Code lint checked via inv lint (ruff, black, isort all clean on the changed files)
  • Tests added (regression cases for nested /blob/ paths and /tree/ branch names on both GitHub and GitLab; verified to fail on the previous code and pass with the fix)
  • Changelog fragment added under changes/

@protoroto

Copy link
Copy Markdown
Member

@patchwright Hi! Thanks for pointing this out. In order to merge this (and to make the CI pass), I've created #150 this issue. Would you be so kind to:

  • rename the changes/149.bugfix in changes/150.bugfix (to mimick the issue number)
  • rename your branch to bugfix/issue-150-fix-blob-tree-prefix-strip
    Then the CI will not complain and I'll merge and release this as soon as possible.
    Thanks again for contributing!

GitHubPlatform and GitLabPlatform built `path`/`branch` with
str.replace(marker, ""), which removes *every* occurrence of the
marker rather than only the leading one. A URL whose file path or
branch name contained the same segment again was silently corrupted:

    parse('.../blob/main/src/blob/utils.py').path  -> 'main/srcutils.py'
    parse('.../tree/feature/tree/x').branch        -> 'featurex'

The leading marker is already guaranteed by the preceding startswith
check, so slice it off instead. Slicing (rather than str.removeprefix)
keeps the declared Python 3.8 compatibility. Adds regression tests for
nested /blob/ paths and /tree/ branch names on both GitHub and GitLab.
@patchwright patchwright force-pushed the fix-blob-tree-prefix-strip branch from 8d9dcb9 to 24b4612 Compare June 14, 2026 15:31
@patchwright patchwright deleted the fix-blob-tree-prefix-strip branch June 14, 2026 15:32
@patchwright patchwright reopened this Jun 14, 2026
@patchwright patchwright restored the fix-blob-tree-prefix-strip branch June 14, 2026 15:33
@patchwright

Copy link
Copy Markdown
Author

@protoroto Renamed the fragment to changes/150.bugfix (commit 24b4612) — towncrier should pass now.

On the branch rename: renaming the head branch of a fork-sourced PR closes the PR, so I kept the original branch name to leave this open and mergeable. The branch name doesn't affect the checks; if you'd still like it renamed I can push under the new name and reopen a fresh PR — happy to do that.

@protoroto

Copy link
Copy Markdown
Member

@patchwright Hi! Thanks for your quick response! Sadly, towncrier need the branch to be formatted in a specific way

Branch fix-blob-tree-prefix-strip does not respect the '<type>/(<optional-task-type>-)<number>-description' format

so if you would be so kind to re-open this pr with the new branch name, I'll be more than happy to merge and release it as soon as possible!

Thanks again!

@patchwright

Copy link
Copy Markdown
Author

@protoroto Done — re-opened as #152 from branch bugfix/issue-150-fix-blob-tree-prefix-strip, which matches the <type>/(<optional-task-type>-)<number>-description format towncrier expects. Same commit and diff, changelog fragment is changes/150.bugfix. Closing this one in favour of #152. Thanks for your patience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants