
fix(input): support scp-style SSH Git URLs in host validation#208

Open: rodboev wants to merge 3 commits into NVIDIA:main from rodboev:fix/scp-ssh-git-url-202


@rodboev rodboev commented Jun 25, 2026


Summary

scp-style SSH Git URLs (git@github.com:org/repo.git) are accepted by _is_git_url() but then rejected by _validate_url_host() with the misleading error "URL has no valid hostname". This makes every GitHub/GitLab/Bitbucket SSH clone URL in this form unusable, even though git clone supports the format natively.

Closes #202

Root cause

_validate_url_host() (input_handler.py:163-181) uses urlparse(url).hostname for host extraction. scp-style syntax has no // authority separator, so urlparse().hostname returns None, host becomes "", and the function raises at line 171 before reaching the allowlist or SSRF checks. The _clone_git() call at line 190 would handle the format correctly if validation passed.
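The urlparse behavior described above can be reproduced in isolation. This is a minimal illustration using only the standard library; the variable names are made up for the demonstration and are not taken from input_handler.py.

```python
from urllib.parse import urlparse

scp_url = "git@github.com:org/repo.git"
https_url = "https://github.com/org/repo.git"
ssh_url = "ssh://git@github.com/org/repo.git"

# Without a "//" authority separator, urlparse treats the whole
# scp-style string as a path, so there is no netloc and no hostname.
scp_parts = urlparse(scp_url)
scp_host = scp_parts.hostname

# URLs with an explicit scheme and "//" parse as expected.
https_host = urlparse(https_url).hostname
ssh_host = urlparse(ssh_url).hostname

print(repr(scp_parts.netloc), repr(scp_parts.path), scp_host)
print(https_host, ssh_host)
```

Only the scp-style form loses its host; the ssh:// form of the same remote parses normally, which is why the bug is specific to the user@host:path syntax.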

Diff Notes

  • src/skillspector/input_handler.py: add import re; add private _extract_scp_host() helper matching ^[^@/]+@([^:/]+):.+$; add scp fallback inside _validate_url_host() after urlparse — scp host flows through existing allowlist and _is_private_ip() SSRF checks unchanged. No change to _is_git_url(), _clone_git(), or ALLOWED_GIT_HOSTS.
  • tests/unit/test_input_handler.py: seven new test cases covering scp detection, valid-host extraction, clone invocation for a valid scp host, disallowed-host rejection, private-IP rejection, https regression, and the SSRF gate for scp-extracted hosts (mocked).

Before / After

| Input | Before | After |
| --- | --- | --- |
| git@github.com:org/repo.git | ValueError: URL has no valid hostname | host github.com extracted, clone proceeds |
| git@malicious.internal:org/repo.git | ValueError: URL has no valid hostname | host extracted, fails allowlist, ValueError |
| git@169.254.169.254:org/repo.git | ValueError: URL has no valid hostname | host extracted, fails allowlist, ValueError |
| https://github.com/org/repo.git | "github.com" (unchanged) | "github.com" (scp fallback not reached) |
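The "After" column can be approximated with a standalone sketch of the validation flow. This is not the code from input_handler.py: the allowlist contents, the allowlist error message, and the is_private_ip_literal stand-in (which checks IP literals only and does no DNS lookup) are all assumptions. Only the "URL has no valid hostname" message and the "private/internal IP" wording come from this PR.

```python
import ipaddress
import re
from typing import Callable
from urllib.parse import urlparse

ALLOWED_GIT_HOSTS = {"github.com", "gitlab.com", "bitbucket.org"}  # assumed contents
_SCP_RE = re.compile(r"^[^@/]+@([^:/]+):.+$")


def is_private_ip_literal(host: str) -> bool:
    """Hypothetical stand-in for _is_private_ip(): IP literals only, no DNS."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local


def validate_url_host(
    url: str, is_private: Callable[[str], bool] = is_private_ip_literal
) -> str:
    host = urlparse(url).hostname or ""
    if not host:
        # scp fallback: only tried when urlparse found no hostname.
        match = _SCP_RE.match(url)
        if match:
            host = match.group(1)
    if not host:
        raise ValueError("URL has no valid hostname")
    if host not in ALLOWED_GIT_HOSTS:
        raise ValueError(f"Host not in allowlist: {host}")  # message is made up
    if is_private(host):
        raise ValueError(f"Host resolves to a private/internal IP: {host}")
    return host


for candidate in (
    "git@github.com:org/repo.git",
    "git@malicious.internal:org/repo.git",
    "git@169.254.169.254:org/repo.git",
    "https://github.com/org/repo.git",
):
    try:
        print(candidate, "->", validate_url_host(candidate))
    except ValueError as exc:
        print(candidate, "-> ValueError:", exc)
```

The private-IP checker is passed in as a parameter so the gate can be exercised without network access, loosely mirroring the mocked SSRF test. With the default stand-in, an allowlisted host name never trips that gate, whereas the real check presumably inspects resolved addresses.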

Scope

No expansion of ALLOWED_GIT_HOSTS. No change to SSRF protections or their order. No new runtime dependency (re is stdlib). _is_git_url() and _clone_git() unchanged.

Verification

  • uv sync --all-extras — exit 0
  • .\.venv\Scripts\python.exe -m pytest tests/unit/test_input_handler.py -v: 13 passed, 1 warning, 0.50s (6 original + 7 new)
  • uv run ruff check src/skillspector/input_handler.py tests/unit/test_input_handler.py — exit 0
  • uv run ruff format --check src/skillspector/input_handler.py tests/unit/test_input_handler.py — 2 files already formatted, exit 0

Proof matrix:

| Test | Base (7bc9c0f) | Head |
| --- | --- | --- |
| test_validate_url_host_scp_extracts_github | raises ValueError | returns "github.com" |
| test_scp_valid_host_clones | raises before clone | subprocess called with scp URL |
| test_scp_url_is_git_url | True | True |
| test_scp_disallowed_host_raises | raises ValueError | raises ValueError (allowlist) |
| test_scp_private_ip_raises | raises ValueError | raises ValueError |
| test_https_url_unchanged | "github.com" | "github.com" |
| test_scp_ssrf_gate_fires | N/A (new test) | raises ValueError matching "private/internal IP" |

CI: Lint & Test (Python 3.12), Lint & Test (Python 3.13), and DCO Check are pending. They may need maintainer approval to run if fork checks are gated.

Notes

DCO sign-off required on every commit (git commit -s). Known existing failures (tests/nodes/test_meta_analyzer.py) are unrelated to this surface.

Development

Successfully merging this pull request may close these issues.

[BUG] scp-style SSH Git URLs (git@host:org/repo.git) are accepted as Git URLs, then rejected with "URL has no valid hostname"
