Skip to content

fix(marketplace): propagate marketplace --ref to string sources; fix GitLab slash-ref proxy filenames#1880

Open
chkp-roniz wants to merge 8 commits into
microsoft:mainfrom
chkp-roniz:fix/gitlab-nested-path-support
Open

fix(marketplace): propagate marketplace --ref to string sources; fix GitLab slash-ref proxy filenames#1880
chkp-roniz wants to merge 8 commits into
microsoft:mainfrom
chkp-roniz:fix/gitlab-nested-path-support

Conversation

@chkp-roniz

Copy link
Copy Markdown
Contributor

Summary

  • Ref propagationapm install plugin@marketplace now correctly fetches from the marketplace's registered --ref branch instead of silently falling back to the repository's default branch. resolve_marketplace_plugin was not propagating source.ref to downstream resolution calls. Covers both GitHub-family hosts (ref appended to canonical as #ref) and GitLab-hosted marketplaces (ref injected into DependencyReference). Guards prevent double-injection when a plugin's dict source already declares its own ref, and skip main/HEAD as implicit defaults. Mirrors #1824.
  • GitLab slash-ref proxy filenames — GitLab archive URLs with slash-containing branch names (e.g. feat/my-feature) now produce correctly-formed Artifactory-proxyable filenames. The slash is preserved in the path segment (as the GitLab archive API requires) but replaced with - in the archive filename, matching GitLab's own naming convention. Previously, PROXY_REGISTRY_ONLY=1 installs from such branches returned HTTP 404 because the generated filename contained a literal slash. This is the proxy scenario that #1824 did not cover.

What #1824 was missing

#1824 fixed ref propagation for the resolution path but left a gap in the Artifactory proxy path: build_artifactory_archive_url emitted GitLab archive filenames with a literal / when the branch contained a slash. Artifactory rejects such filenames with HTTP 404 under PROXY_REGISTRY_ONLY=1. This PR completes the fix.

Test plan

  • 7 new regression tests in TestMarketplaceRegisteredRefPropagation — GitHub-family and GitLab ref propagation, main/HEAD exclusion, version_spec override precedence, dict-source non-overwrite
  • 2 new tests in TestBuildArtifactoryArchiveUrl — slash-ref filename normalisation and single-component ref stability
  • 9/9 new tests pass; 0 pre-existing test failures introduced (4 pre-existing failures on main are unrelated)
  • CHANGELOG updated in [Unreleased] > Fixed

Follow-ups (not blockers, carried from #1824 review panel)

  • Defense-in-depth _validate_ref() call on injected refs
  • Additional HEAD-exclusion regression test (covered here in test_github_string_source_head_ref_not_appended)

🤖 Generated with Claude Code

chkp-roniz and others added 6 commits May 25, 2026 15:27
Mirrors the native GitLab resolver pattern -- but uses HEAD on Artifactory
archive URLs as the existence signal, so no separate metadata API is
needed.  Covers both routing modes:

  Mode 1 -- explicit FQDN deps (host/artifactory/key/owner/repo/...).
  Mode 2 -- bare shorthand under PROXY_REGISTRY_URL+PROXY_REGISTRY_ONLY.

The resolver walks candidate (owner, repo, virtual_path) splits
shallow-first and locks in the first one whose archive responds 2xx-3xx.
For unambiguous paths it returns the parse-time ref unchanged.  When
every candidate is rejected it raises -- distinguishing "missing repo"
from auth (401/403) errors so users get an actionable hint instead of
silently anchoring on a wrong guess.  ``allow_redirects=False`` on the
HEAD call keeps the Bearer token from leaking cross-host.

Removes the parse-time marker-segment heuristic
(_VIRTUAL_PATH_ROOT_SEGMENTS / _ARTIFACTORY_VIRTUAL_MARKERS /
_ARTIFACTORY_VIRTUAL_FILE_EXTENSIONS) that previously decided the
boundary based on hard-coded directory names like ``skills/`` or
``prompts/``.  Parse-time now defaults to all-as-repo with a structural
file-extension rule on the last segment; the install-time resolver is
the authoritative source of the boundary.

Also fixes a pre-existing URL-roundtrip bug where ``to_github_url`` ->
``parse`` folded the ``artifactory/<key>`` prefix into ``repo_url``,
causing the downloader to build double-prefixed archive URLs and 404.
``_validate_url_repo_path`` now strips the prefix before returning the
bare ``owner/repo`` slug.
- parse_artifactory_path: reject ``owner//virtual`` (empty segment before
  the explicit boundary) instead of falling through and returning an empty
  ``repo`` slug.
- _proxy_routing_target: thread the proxy ``scheme`` (https/http) through
  the resolver into ``build_artifactory_archive_url`` so installs that
  intentionally route through an ``http://`` proxy (under
  ``PROXY_REGISTRY_ALLOW_HTTP=1``) probe over the same transport.
- _CandidateStatus.EXISTS: update inline comment to say ``2xx or 3xx``
  (redirect-inclusive), matching the actual classification in
  ``_candidate_archive_status``.
- CHANGELOG: split the parse-time-behavior note into explicit-FQDN vs
  bare-shorthand modes (the all-as-repo default applies only under
  ``PROXY_REGISTRY_ONLY``; Mode 1 stays shallow).
- package_resolution: clarify the ``direct_virtual_resolved`` docstring --
  GitLab path gates on ``is_virtual``, Artifactory path gates on probe
  rebuild; both signal "persist as structured ``git:`` + ``path:`` entry".
- Add two regression tests for the empty-segment edge case (parser +
  iterator).
…h-support

# Conflicts:
#	CHANGELOG.md
#	src/apm_cli/models/dependency/reference.py
Adds focused regression-trap tests for the new install-pipeline
boundary resolver (microsoft#1472) covering the contracts a future refactor
could silently break:

- allow_redirects=False on every HEAD probe (token-leak guard against
  proxy-issued cross-host redirects)
- Mode 2 Authorization header is sourced from RegistryConfig
  (proxy bearer), never from AuthResolver (which returns the
  github.com PAT and would leak it to the proxy host)
- 403 (like 401) classifies as AUTH; mixed 401/non-auth-4xx demotes
  the candidate set to MISSING so the user-facing error stays
  accurate (avoids misreporting a missing repo as auth failure)
- 429 and other non-{401,403} 4xx are MISSING, not AUTH (guards
  against accidental broadening of the AUTH set)
- _split_owner_repo subgroup folding for 3+ segment GitLab boundaries
  (group/sub1/sub2/repo correctly folds to owner=group,
  repo=sub1/sub2/repo)

Both mutation-break gates were exercised: flipping allow_redirects
to True and rewiring Mode 2 to AuthResolver each made the
corresponding test fail; restoring the source code made them pass.

Also applies ruff format pass to bring three files in line with
the canonical formatter (pre-existing drift on the PR branch that
would have failed CI lint check).

Co-authored-by: chkp-roniz <chkp-roniz@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Conflict resolution:
- CHANGELOG.md: combined the [Unreleased] entries from origin/main
  (semver ranges, ``apm deps why``, install exit-code BREAKING change,
  pinned-constraint fix) with this PR's Artifactory entries.

Panel follow-ups addressed inline:

- doc-writer: added ``(microsoft#1472)`` suffix to every CHANGELOG entry
  belonging to this PR, per ``.github/instructions/changelog.instructions.md``.

- supply-chain-security-expert / devx-ux-expert (convergent):
  the boundary probe used to silently swallow transport errors
  (``requests.RequestException``), which could lock in a wrong owner/repo
  split or surface a misleading "missing repo" error during a transient
  DNS/TLS/timeout outage.  Added a fourth candidate status
  ``_CandidateStatus.INCONCLUSIVE`` that fires when every URL shape for a
  candidate raised a transport error (no HTTP response observed at all).
  The resolver now fails closed on all-INCONCLUSIVE: raises a
  network-specific ``ValueError`` carrying the underlying exception and
  the proxy host, instead of mis-classifying as MISSING.  Per-URL-shape
  transport errors are still tolerated as long as at least one URL shape
  per candidate produced an HTTP response.

  Two regression-trap tests landed in
  ``tests/unit/install/test_artifactory_resolver.py``:
    * ``TestInconclusiveFailsClosed::test_all_transport_errors_raise_network_specific_error``
    * ``TestInconclusiveFailsClosed::test_partial_transport_failure_does_not_fail_closed``
…GitLab slash-ref proxy filenames

Two related fixes for marketplace ref handling:

1. `apm install plugin@marketplace` now fetches from the registered --ref
   branch instead of silently falling back to the repo default branch.
   `resolve_marketplace_plugin` was not propagating `source.ref` downstream.
   Covers GitHub-family hosts (ref appended to canonical) and GitLab hosts
   (ref injected into DependencyReference). Guards skip main/HEAD and respect
   explicit version_spec overrides. Mirrors microsoft#1824.

2. GitLab archive URLs with slash-containing branch names (e.g. feat/my-feature)
   now produce Artifactory-proxyable filenames: slash preserved in the path
   segment, replaced with '-' in the filename. Previously PROXY_REGISTRY_ONLY=1
   installs returned HTTP 404 because the filename contained a literal slash.
   This is the proxy scenario absent from microsoft#1824.

9 regression tests added; 0 pre-existing failures introduced.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 22, 2026 18:44

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves dependency resolution in two main areas: (1) propagating a marketplace-registered --ref through to plugin resolution so installs don’t silently fall back to the default branch, and (2) making Artifactory/GitLab archive URL handling more robust (including slash-containing refs and nested subgroup repo paths) by introducing an install-time Artifactory boundary probe and updating parsing/orchestration accordingly.

Changes:

  • Propagate marketplace source.ref into downstream plugin canonical/dependency-reference resolution (GitHub-family #ref suffix; GitLab via DependencyReference.reference).
  • Add deterministic Artifactory boundary probing (HEAD archive URLs) and wire it into install-time dependency resolution, including support for nested GitLab subgroup repos behind a proxy.
  • Normalize GitLab archive filenames for slash-containing refs and adjust Artifactory URL building and parsing to avoid malformed proxy paths.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
src/apm_cli/marketplace/resolver.py Threads marketplace --ref into plugin resolution and adds guard rails around ref injection/override.
src/apm_cli/utils/github_host.py Adds Artifactory boundary candidate enumeration, refines Artifactory parsing, and fixes GitLab archive filename construction (basename + slash-ref normalization).
src/apm_cli/install/artifactory_resolver.py Introduces install-time Artifactory boundary resolver that probes archive URLs to confirm the correct owner/repo/virtual split.
src/apm_cli/install/package_resolution.py Plumbs the Artifactory boundary resolver into dependency reference resolution and persistence decisions.
src/apm_cli/models/dependency/reference.py Adds Artifactory probe ref constructor and adjusts shorthand parsing/URL validation for proxy/nested paths.
src/apm_cli/deps/artifactory_orchestrator.py Updates owner/repo splitting to preserve nested subgroup repo slugs.
src/apm_cli/commands/install.py Wires Artifactory boundary resolution into apm install and updates persistence condition naming.
tests/unit/marketplace/test_marketplace_resolver.py Adds regression tests for marketplace registered-ref propagation (GitHub-family + GitLab).
tests/unit/test_artifactory_support.py Expands Artifactory parsing/probing/URL-construction tests, including nested subgroup and slash-ref behaviors.
tests/unit/install/test_artifactory_resolver.py Adds focused regression traps for redirect/token-audience/error-classification invariants in the Artifactory resolver.
tests/unit/deps/test_artifactory_orchestrator.py Adds tests ensuring subgroup folding in _split_owner_repo is preserved.
docs/src/content/docs/enterprise/registry-proxy.md Documents nested-group repo behavior behind the registry proxy and the deterministic boundary probe.
CHANGELOG.md Adds Unreleased notes for the new Artifactory boundary probing behavior and the marketplace/GitLab fixes.

Comment on lines +94 to +118
urls = build_artifactory_archive_url(host, prefix, owner, repo, ref, scheme=scheme)
saw_auth = False
saw_any_response = False
last_exc: BaseException | None = None
for url in urls:
try:
# ``allow_redirects=False`` keeps the Bearer token from leaking to
# any host the proxy might redirect us to. 3xx is still existence
# proof -- the server confirmed the resource by issuing a redirect.
r = requests.head(
url, headers=headers, timeout=timeout, verify=verify, allow_redirects=False
)
except requests.RequestException as exc:
last_exc = exc
continue
saw_any_response = True
if 200 <= r.status_code < 400:
return _CandidateStatus.EXISTS, None
if r.status_code in (401, 403):
saw_auth = True
if not saw_any_response:
return _CandidateStatus.INCONCLUSIVE, last_exc
if saw_auth:
return _CandidateStatus.AUTH, None
return _CandidateStatus.MISSING, None
Comment on lines +752 to +755
# GitLab keeps the raw ref (including slashes) as a path segment but replaces
# slashes with dashes in the archive *filename*. E.g. branch
# ``feat/my-feature`` → ``.../-/archive/feat/my-feature/repo-feat-my-feature.zip``.
f"{base}/-/archive/{ref}/{repo_basename}-{ref.replace('/', '-')}.zip",
Comment thread src/apm_cli/marketplace/resolver.py Outdated
)
# ---- Ref override / propagation ----
# Priority:
# 1. Explicit version_spec from the caller (``apm install plugin@mkt#v2``) — always
Comment thread src/apm_cli/marketplace/resolver.py Outdated
# Guard: only when no ref is already embedded in the canonical (dict sources that
# declare their own ref embed it during ``_resolve_github_source``), only for
# in-marketplace string sources, and only for non-default refs (``main`` /
# ``HEAD`` are implicit — embedding them would be redundant noise).

When a marketplace is added with ``apm marketplace add --ref feat/x owner/repo``,
subsequent ``apm install plugin@marketplace`` must download the plugin from
``feat/x`` — not from the repo's default branch. This applies to both:
Comment thread tests/unit/test_artifactory_support.py Outdated
Comment on lines +343 to +344
— the slash stays in the *path segment* (how GitLab addresses the archive) but is
replaced with a dash in the *filename* (how GitLab names the zip).
Comment thread tests/unit/test_artifactory_support.py Outdated
assert dep.virtual_path == "some-subdir"


# ── Artifactory boundary probe (mirrors native GitLab probing pattern) ──
Comment thread tests/unit/test_artifactory_support.py Outdated
assert resolved.is_artifactory() is False


# ── ArtifactoryOrchestrator: multi-segment repo support ──
Comment thread tests/unit/test_artifactory_support.py Outdated
assert repo == "shared-modules/pkg-utils"


# ── token_manager.py: Artifactory token support ──
Comment thread CHANGELOG.md

### Added

- Deterministic Artifactory boundary probe: at install time, `_resolve_artifactory_boundary` HEAD-probes the candidate archive URLs and rebuilds the dependency reference at the proxy-verified split. Covers both explicit-FQDN (`host/artifactory/key/owner/repo/...`) and bare-shorthand-under-proxy deps. Distinguishes "missing repo" from auth (401/403) errors; uses `allow_redirects=False` so the bearer token can't leak cross-host. Mirrors the native GitLab probing pattern but without a separate metadata API. (#1472)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CHANGELOG entries at line 12 (Artifactory boundary probe, semver ranges, etc.) are pre-existing in the [Unreleased] section -- they came from earlier commits already on this branch before this PR's work started. Our PR adds only the two new Fixed bullets at lines 33-34 (ref propagation + slash-ref proxy filename). The broader scope you see is the cumulative branch diff vs main; it is not new work introduced here.

chkp-roniz and others added 2 commits June 22, 2026 22:00
…h ASCII

Addresses Copilot review comments on PR microsoft#1880.  Repo encoding rules
require source files to stay within printable ASCII to avoid
Windows cp1252 UnicodeEncodeError in some environments.

- github_host.py: U+2192 arrow (->)
- resolver.py: two U+2014 em-dashes (--)
- test_marketplace_resolver.py: U+2014 em-dash and two U+2500 separators
- test_artifactory_support.py: U+2014 em-dash and three U+2500 separators

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_candidate_archive_status() was returning AUTH whenever any URL shape
returned 401/403, even if other shapes returned non-auth 4xx (e.g. 404).
This misreported a boundary-resolution failure as an authentication
failure, giving users a misleading error message.

Per the module spec: AUTH is returned only when *every* observed response
was 401/403 with no non-auth 4xx seen.  A mix means at least one URL shape
was definitively absent -- the candidate is MISSING and the unresolved-
boundary error path is triggered instead.

Added saw_non_auth_4xx flag; AUTH gate now requires
`saw_auth and not saw_non_auth_4xx`.

Addresses Copilot review comment on PR microsoft#1880.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants