Skip to content

Feature/676 url based marketplace#691

Open
Vicente-Pastor wants to merge 6 commits intomicrosoft:mainfrom
Vicente-Pastor:feature/676-url-based-marketplace
Open

Feature/676 url based marketplace#691
Vicente-Pastor wants to merge 6 commits intomicrosoft:mainfrom
Vicente-Pastor:feature/676-url-based-marketplace

Conversation

@Vicente-Pastor
Copy link
Copy Markdown

@Vicente-Pastor Vicente-Pastor commented Apr 13, 2026

Description

Adds support for URL-based marketplace registration, enabling apm marketplace add <URL> to register Agent Skills discovery indexes served from arbitrary HTTPS endpoints (per the Agent Skills Discovery RFC v0.2.0).

This extends the existing GitHub-based marketplace system to support any HTTPS-hosted index, with full caching, conditional refresh (ETag/Last-Modified), digest integrity verification, and provenance tracking.

Fixes #676

Type of change

  • New feature
  • Documentation

What's New

Core Features

  • apm marketplace add <URL> — register an Agent Skills discovery index from any HTTPS URL
  • Bare-origin auto-resolutionhttps://example.comhttps://example.com/.well-known/agent-skills/index.json
  • Agent Skills index parser — parses RFC v0.2.0 indexes with strict $schema validation, name validation (1-64 chars, lowercase alphanumeric + hyphens), and digest format enforcement
  • Format auto-detection — automatically detects Agent Skills ("skills" key) vs legacy marketplace.json ("plugins" key)
  • skill-md and archive source types — resolver handles both RFC-defined artifact types

Security

  • HTTPS enforced at both command layer and fetch layer
  • Post-redirect scheme validation (prevents HTTPS→HTTP downgrade)
  • SHA-256 digest computation and verification on every fetch
  • Per-skill digest format validation (sha256:{64 hex chars})

Caching & Performance

  • URL sources cached with same 1-hour TTL as GitHub sources
  • ETag / Last-Modified conditional requests (304 Not Modified)
  • Stale-while-revalidate fallback on network errors
  • SHA-256-based cache keys for URL sources (avoids filesystem character issues)

Provenance

  • source_url and source_digest fields on MarketplaceManifest and LockedDependency
  • Provenance preserved through cache hits, 304, and stale fallbacks

Command Updates

  • marketplace list — shows URL instead of owner/repo for URL sources
  • marketplace remove — correct cache key derivation for URL sources
  • marketplace update — uses source-aware cache clearing
  • marketplace browse — handles URL sources in display

Files Changed

File Type Summary
src/apm_cli/marketplace/models.py Modified MarketplaceSource URL fields, parse_agent_skills_index(), name/digest validators
src/apm_cli/commands/marketplace.py Modified URL detection, _resolve_index_url(), list/remove/update URL display, error handling
src/apm_cli/marketplace/client.py Modified _fetch_url_direct(), FetchResult, _detect_index_format(), _parse_manifest(), SHA-256 cache keys, ETag/conditional
src/apm_cli/marketplace/resolver.py Modified Non-GitHub URL pass-through, skill-md/archive handling, empty URL guard
src/apm_cli/marketplace/archive.py New Archive download/extraction with path traversal and decompression bomb safety checks
src/apm_cli/deps/lockfile.py Modified source_url / source_digest on LockedDependency
tests/.../test_marketplace_url_models.py New Agent Skills parser + URL model tests
tests/.../test_marketplace_url_client.py New URL fetch, cache, digest, ETag tests
tests/.../test_marketplace_url_commands.py New URL add/list/remove command tests
tests/.../test_marketplace_url_resolver.py New URL source resolution tests
tests/.../test_marketplace_archive.py New Archive extraction safety tests

Testing

  • Tested locally
  • All existing tests pass
  • Added tests for new functionality
  • 364 tests passing across 11 test files (7 new + 4 modified)
  • Full TDD approach — all tests written before implementation
  • Parametrized test cases for edge cases (name validation, digest format, URL schemes, archive safety)

Backward Compatibility

  • ✅ All existing GitHub marketplace workflows unchanged
  • from_dict() handles legacy dicts without source_type key
  • to_dict() for GitHub sources omits source_type (preserves existing format)
  • ✅ Existing tests pass with minimal adaptations (tuple return from _read_cache)

Vicente-Pastor and others added 4 commits April 13, 2026 15:29
…skills_index

- TestMarketplaceSourceURL: covers source_type='url' creation, immutability,
  is_url_source helper, to_dict/from_dict serialization, and roundtrip for both
  URL and GitHub sources (backward-compat preservation)
- TestParseAgentSkillsIndex: covers happy-path parsing of skill-md and archive
  entries, multi-skill indexes, empty lists, $schema version enforcement,
  skill name validation (RFC: 1-64 chars, lowercase alnum + hyphens), and
  mixed valid/invalid entry filtering

Part of issue microsoft#676 (URL-based marketplace sources).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…crosoft#676)

Implements RFC v0.2.0 Agent Skills discovery for arbitrary HTTPS endpoints
alongside the existing GitHub-hosted marketplace.json sources.

Core changes:
- models.py: extend MarketplaceSource with source_type='url' and url field;
  add parse_agent_skills_index() for RFC v0.2.0 index format with strict
  $schema enforcement, skill-name validation, and sha256 digest verification
- client.py: add _fetch_url_direct() (HTTPS-only, conditional GET, ETag/
  Last-Modified caching, digest mismatch detection); add _detect_index_format()
  and _parse_manifest() dispatcher; on_stale_warning callback on network errors
- commands/marketplace.py: add URL branch to 'marketplace add'; HTTPS enforced
  at command layer; .well-known auto-resolution via _resolve_index_url()
- marketplace/archive.py (new): safe download and extraction of .tar.gz / .zip
  archive skill packages with path-traversal and decompression-bomb guards
- deps/lockfile.py: extend provenance tracking for URL-sourced skills
- resolver.py: URL source support in skill resolution path

Tests (new files):
- test_marketplace_url_client.py  — _fetch_url_direct, caching, 304 handling
- test_marketplace_url_commands.py — marketplace add URL branch, HTTPS enforcement
- test_marketplace_url_resolver.py — resolver URL source path
- test_marketplace_archive.py     — archive extraction safety cases

Tests (extended):
- test_marketplace_url_models.py  — MarketplaceSource URL fields + round-trips
- test_marketplace_client.py      — stale-cache / on_stale_warning coverage
- test_lockfile_provenance.py     — URL provenance fields
- test_marketplace_install_integration.py — updated for new source shape
- test_marketplace_resolver.py    — pruned redundant cases covered elsewhere

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- S1: validate resp.url scheme post-redirect in _fetch_url_direct
- DR1: from_dict raises ValueError on unknown source_type
- DR3: add Optional[Tuple[Dict, str]] return type to _read_cache
- DR4: preserve query string in _resolve_index_url
- DR5: document GitHub kwargs behavior in _parse_manifest docstring
- DR6: case-insensitive scheme detection in add command
- DR7: PEP 8 two blank lines after FetchResult class

364 tests passing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Vicente-Pastor
Copy link
Copy Markdown
Author

@microsoft-github-policy-service agree

with patch("apm_cli.commands.marketplace._get_console", return_value=None):
result = runner.invoke(marketplace, ["list"])

assert "https://example.com" in result.output
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Refactored the assertion to use a full URL variable instead of substring matching. The test now compares against the exact expected URL, eliminating the CodeQL false positive. See commit fa8b7a4.

# Provide "n" so it cancels without actually removing
result = runner.invoke(marketplace, ["remove", "example-skills"], input="n\n")

assert "https://example.com" in result.output
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Same approach -- replaced substring URL check with exact variable comparison. See commit fa8b7a4.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends APM's marketplace subsystem to support registering and consuming Agent Skills discovery indexes from arbitrary HTTPS URLs, alongside the existing GitHub-backed marketplace flow.

Changes:

  • Add a new URL-based marketplace source type (source_type="url") with .well-known index auto-resolution and updated marketplace list/remove/update behaviors.
  • Implement direct HTTPS index fetching with SHA-256 digest capture, conditional refresh (ETag/Last-Modified), stale-while-revalidate behavior, and format auto-detection (Agent Skills vs legacy marketplace.json).
  • Add provenance fields (source_url, source_digest) to marketplace manifests and lockfile dependencies, plus new archive download/extraction support and tests.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
src/apm_cli/commands/marketplace.py Adds URL detection + .well-known resolution; updates list/remove/update UX for URL sources.
src/apm_cli/marketplace/models.py Extends MarketplaceSource; adds Agent Skills index parser + provenance fields.
src/apm_cli/marketplace/client.py Adds URL fetch path with caching, digest/ETag handling, and format auto-detection.
src/apm_cli/marketplace/resolver.py Allows non-GitHub HTTPS URL plugin sources; supports skill-md and archive source types.
src/apm_cli/marketplace/archive.py New archive download + safe extraction utilities for type: "archive".
src/apm_cli/deps/lockfile.py Adds source_url / source_digest provenance to LockedDependency.
tests/unit/marketplace/test_marketplace_url_models.py New unit tests for URL source model + Agent Skills parser validation.
tests/unit/marketplace/test_marketplace_url_client.py New unit tests for URL fetch, cache behavior, digest, and conditional refresh.
tests/unit/marketplace/test_marketplace_url_commands.py New CLI tests for marketplace add/list/remove/update URL behavior and error messages.
tests/unit/marketplace/test_marketplace_url_resolver.py New resolver tests for URL marketplaces and Agent Skills source types.
tests/unit/marketplace/test_marketplace_archive.py New tests for archive extraction safety (traversal, bombs, links).
tests/unit/marketplace/test_marketplace_resolver.py Removes old URL-source resolver tests superseded by new URL resolver test file.
tests/unit/marketplace/test_marketplace_install_integration.py Refactors marketplace ref parsing tests into a parametric form.
tests/unit/marketplace/test_marketplace_client.py Adapts tests to updated _read_cache() return shape.
tests/unit/marketplace/test_lockfile_provenance.py Adds lockfile provenance field coverage for URL/index digest.

for s in sources:
table.add_row(s.name, f"{s.owner}/{s.repo}", s.branch, s.path)
if s.is_url_source:
table.add_row(s.name, s.url, "—", "—")
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Rich table uses the Unicode em dash character ("—") for URL sources (Branch/Path). Project encoding rules require printable ASCII for all CLI output; this can trigger UnicodeEncodeError on Windows cp1252. Replace these placeholders with ASCII (e.g., "-", "n/a", or empty strings).

Suggested change
table.add_row(s.name, s.url, "", "")
table.add_row(s.name, s.url, "-", "-")

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. All em-dashes replaced with ASCII -- or - across all PR files. See commit fa8b7a4.

Comment on lines +260 to +275
from urllib.parse import urlparse

if urlparse(url).scheme != "https":
raise MarketplaceFetchError(url, "URL sources must use HTTPS")
try:
headers = {"User-Agent": "apm-cli"}
if etag:
headers["If-None-Match"] = etag
if last_modified:
headers["If-Modified-Since"] = last_modified
resp = requests.get(url, headers=headers, timeout=30)
# Guard against HTTPS→HTTP redirect (S1)
final_url = getattr(resp, "url", None)
if isinstance(final_url, str) and urlparse(final_url).scheme != "https":
raise MarketplaceFetchError(
url, "Redirect to non-HTTPS URL rejected"
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_fetch_url_direct() enforces HTTPS with urlparse(url).scheme != "https", which is case-sensitive if callers pass mixed/upper-case schemes (e.g. HTTPS://...). To honor RFC 3986 scheme case-insensitivity (and match the CLI tests), compare urlparse(...).scheme.lower() for both the initial URL and the post-redirect resp.url check.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added .lower() to both scheme checks in _fetch_url_direct() (initial URL and post-redirect resp.url). See commit fa8b7a4.

Comment on lines +90 to +93
# Non-GitHub HTTPS URL — return as-is (CDN, arbitrary HTTPS host, etc.)
if url.startswith("https://"):
return url

Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_resolve_url_source() only accepts non-GitHub URLs when url.startswith("https://"), which rejects valid HTTPS URLs with mixed/upper-case schemes. Consider parsing with urlparse and validating scheme.lower() == "https" instead, so URL sources are consistently handled case-insensitively.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Replaced url.startswith("https://") with urlparse(url).scheme.lower() == "https" for RFC 3986 compliance. See commit fa8b7a4.

Comment on lines 71 to 90
def from_dict(cls, data: Dict[str, Any]) -> "MarketplaceSource":
"""Deserialize from JSON dict."""
source_type = data.get("source_type", "github")
if source_type == "url":
return cls(
name=data["name"],
source_type="url",
url=data.get("url", ""),
)
if source_type != "github":
raise ValueError(f"Unsupported marketplace source_type: {source_type!r}")
return cls(
name=data["name"],
owner=data["owner"],
repo=data["repo"],
owner=data.get("owner", ""),
repo=data.get("repo", ""),
host=data.get("host", "github.com"),
branch=data.get("branch", "main"),
path=data.get("path", "marketplace.json"),
source_type="github",
)
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MarketplaceSource.from_dict() allows a URL source to deserialize with a missing/empty url (it falls back to ""). That creates an invalid registered source and can cause cache-key collisions (sha256("")) and confusing runtime errors later. Prefer failing fast by requiring a non-empty url for source_type == "url" and raising ValueError if it's missing.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. from_dict() now raises ValueError if source_type="url" with empty/missing url. Added 2 tests. See commit fa8b7a4.

Comment on lines +367 to +384
skill_type = entry.get("type", "")
url = entry.get("url", "")
digest = entry.get("digest", "")
if not _is_valid_digest(digest):
logger.debug(
"Skipping Agent Skills entry %r with invalid digest %r in '%s'",
name,
digest,
source_name,
)
continue
description = entry.get("description", "")
plugins.append(
MarketplacePlugin(
name=name,
source={"type": skill_type, "url": url, "digest": digest},
description=description,
source_marketplace=source_name,
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_agent_skills_index() does not validate that type, url, and description are strings (or that type is one of the supported artifact types like "skill-md"/"archive"). If these fields are missing or non-strings, invalid sources can flow into MarketplacePlugin and fail later in resolution/install. Add type checks and skip (or warn+skip) entries with unsupported/invalid type or non-string url/description.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added validation: type must be a string in ("skill-md", "archive"), url must be a non-empty string, description defaults to empty string if non-string. Invalid entries are skipped with a warning log. Added 7 tests. See commit fa8b7a4.

Comment on lines +1 to +6
"""Tests for archive download and extraction (Step 7 — gap #14).

Covers: _check_archive_member safety checks, _detect_archive_format, _extract_tar_gz,
_extract_zip, and the download_and_extract_archive public API.
"""

Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test file includes Unicode em dashes ("—") in docstrings/comments. Repo encoding policy requires printable ASCII in all source files to avoid Windows encoding issues. Replace em dashes with ASCII (e.g., "-" or "--").

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. All em-dashes replaced with ASCII -- across the file. Also replaced Unicode arrows and section signs in other test files. See commits fa8b7a4 and 516370a.


Covers: _cache_key() URL sources, _fetch_url_direct(), fetch_marketplace()
URL branch, format auto-detection, cache read/write, stale-while-revalidate.
GitHub paths are not touched — regression tests confirm they still work.
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test file includes Unicode em dashes ("—") in docstrings/comments. Repo encoding policy requires printable ASCII in all source files to avoid Windows encoding issues. Replace em dashes with ASCII (e.g., "-" or "--").

Suggested change
GitHub paths are not touched regression tests confirm they still work.
GitHub paths are not touched - regression tests confirm they still work.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. All em-dashes and Unicode arrows replaced with ASCII equivalents. See commits fa8b7a4 and 516370a.

Comment on lines +90 to +116
)
assert mock_add.called
source = mock_add.call_args[0][0]
assert source.source_type == "url"

def test_http_url_rejected(self, runner):
"""Plain http:// (non-TLS) must be rejected — RFC requires HTTPS."""
from apm_cli.commands.marketplace import marketplace

result = runner.invoke(marketplace, ["add", "http://example.com"])
assert result.exit_code != 0
assert "https" in result.output.lower()

@pytest.mark.parametrize("url", [
"HTTPS://EXAMPLE.COM",
"Https://example.com",
"HTTP://example.com",
])
def test_mixed_case_scheme_detected_as_url(self, runner, url):
"""URL detection must be case-insensitive per RFC 3986 §3.1."""
from apm_cli.commands.marketplace import marketplace

result = runner.invoke(marketplace, ["add", url])
# Should NOT produce "Invalid format ... OWNER/REPO" —
# it should enter the URL path (may fail on http or succeed on https)
assert "OWNER/REPO" not in result.output

Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test file includes Unicode em dashes ("—") in docstrings/comments. Repo encoding policy requires printable ASCII in all source files to avoid Windows encoding issues. Replace em dashes with ASCII (e.g., "-" or "--").

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. All em-dashes, Unicode arrows, and section signs replaced with ASCII. See commits fa8b7a4 and 516370a.

Comment on lines +1 to +6
"""Tests for URL-based marketplace source and Agent Skills index parser.

Covers MarketplaceSource URL fields, serialization round-trips, and the
parse_agent_skills_index() parser including schema enforcement, skill name
validation, and source type handling.
"""
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test file includes Unicode em dashes ("—") in docstrings/comments. Repo encoding policy requires printable ASCII in all source files to avoid Windows encoding issues. Replace em dashes with ASCII (e.g., "-" or "--").

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. All em-dashes replaced with ASCII --. See commit fa8b7a4.

Comment on lines +1 to +7
"""Tests for URL-source resolver behaviour.

Covers:
- resolve_plugin_source() with skill-md and archive Agent Skills types
- _resolve_url_source() with non-GitHub HTTPS URLs
- resolve_marketplace_plugin() end-to-end with a URL marketplace source
"""
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test file includes Unicode em dashes ("—") in docstrings/comments. Repo encoding policy requires printable ASCII in all source files to avoid Windows encoding issues. Replace em dashes with ASCII (e.g., "-" or "--").

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. All em-dashes replaced with ASCII --. See commit fa8b7a4.

Copy link
Copy Markdown
Collaborator

@sergio-sisternes-epam sergio-sisternes-epam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on this feature, @Vicente-Pastor! The implementation quality is high -- strong security posture (HTTPS enforcement at two layers, post-redirect validation, SHA-256 digest integrity), thorough test suite (174 new test functions), and clean backward compatibility. The Agent Skills RFC v0.2.0 parser with strict $schema validation and skill-name rules is well done.

A few items need addressing before we can approve:

Must-fix:

  1. Non-ASCII em-dash characters in source files (see inline comments) -- the project requires all source code to stay within printable ASCII (U+0020-U+007E) to avoid charmap codec errors on Windows cp1252 terminals.
  2. Missing CHANGELOG.md entry -- this is a significant user-facing feature (apm marketplace add <URL>). Please add an entry under ## [Unreleased] > Added.
  3. Missing documentation updates -- docs/src/content/docs/ marketplace pages and packages/apm-guide/.apm/skills/apm-usage/commands.md need to reflect URL-based marketplace support.

Should-fix:
4. archive.py path validation uses custom _check_archive_member() instead of the centralized path_security.py guards (see inline).
5. No response size limit on _fetch_url_direct() -- a malicious server could return a multi-GB JSON payload (see inline).
6. The TestCacheKey tests deleted from test_marketplace_client.py (GitHub-path cache key tests) should be preserved alongside the new URL cache key tests -- those cover the existing GitHub/GHE cache key behavior.

Note: This PR will have merge conflicts with #677 (marketplace versioning) which touches the same files (models.py, client.py, resolver.py). Recommend rebasing after #677 lands.

Also, the PR body mentions "marketplace browse -- handles URL sources in display" but there is no diff for the browse command. Could you verify whether browse actually works for URL sources or update the PR description?

Copy link
Copy Markdown
Collaborator

@sergio-sisternes-epam sergio-sisternes-epam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline comments for the items referenced in the main review above.

Comment on lines +15 to +31
# Agent Skills Discovery RFC v0.2.0 — the only schema version we accept.
_AGENT_SKILLS_SCHEMA = "https://schemas.agentskills.io/discovery/0.2.0/schema.json"

# RFC skill-name rule: 1-64 chars, lowercase alphanumeric + hyphens,
# no leading/trailing/consecutive hyphens.
_SKILL_NAME_RE = re.compile(r"^[a-z0-9]([a-z0-9-]{0,62}[a-z0-9])?$")


@dataclass(frozen=True)
class MarketplaceSource:
"""A registered marketplace repository.

Stored in ``~/.apm/marketplaces.json``.

Two source types are supported:
- ``"github"`` (default) — a GitHub-hosted marketplace.json index.
- ``"url"`` — an arbitrary HTTPS Agent Skills discovery endpoint.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[must-fix] Em-dashes (U+2014) appear in several lines in this range:

  • Line 15: -- the only schema version
  • Line 30: (default) -- a GitHub-hosted
  • Line 31: -- an arbitrary HTTPS

Replace all em-dashes with -- throughout. The encoding convention applies to all .py source files, including comments and docstrings. Please also check the rest of the PR for other occurrences (e.g. client.py docstrings).

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. All em-dashes replaced with -- throughout models.py (lines 15, 30, 31 and others). Also swept all other PR files for em-dashes and non-ASCII chars (arrows, section signs). See commits fa8b7a4 and 516370a.

"""Raised when an archive cannot be downloaded or extracted safely."""


def _check_archive_member(member_path: str) -> None:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[should-fix] The project convention requires using the centralized path security guards from src/apm_cli/utils/path_security.py (validate_path_segments() and ensure_path_within()) for any filesystem path built from external data. This custom _check_archive_member() is functionally equivalent but creates maintenance drift.

Consider importing and reusing the centralized utilities, or if the archive use case needs additional checks (null bytes, Windows UNC), extending path_security.py so all integrators benefit.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Refactored archive.py to import and use validate_path_segments() and ensure_path_within() from path_security.py. Archive-specific guards (null bytes, Windows UNC/drive-letter paths) are kept as additional checks on top of the centralized ones. See commit fa8b7a4.

if resp.status_code == 404:
raise MarketplaceFetchError(url, "404 Not Found")
resp.raise_for_status()
raw = resp.content
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[should-fix] resp.content loads the entire response into memory with no size guard. A malicious or misconfigured server could return a multi-GB JSON payload. Consider adding a Content-Length check before reading, e.g.:

_MAX_INDEX_BYTES = 10 * 1024 * 1024  # 10 MB

content_length = resp.headers.get("Content-Length")
if content_length and int(content_length) > _MAX_INDEX_BYTES:
    raise MarketplaceFetchError(url, f"Index exceeds size limit ({content_length} bytes)")

Archives already have the 512 MB decompression bomb guard -- the index fetch deserves a similar safety net (10 MB would be generous for any realistic skill index).

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added _MAX_INDEX_BYTES = 10 * 1024 * 1024 (10 MB) constant. Checks both Content-Length header before reading and actual body size after reading. Added 3 tests. See commit fa8b7a4.

for s in sources:
table.add_row(s.name, f"{s.owner}/{s.repo}", s.branch, s.path)
if s.is_url_source:
table.add_row(s.name, s.url, "—", "—")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[must-fix] This uses Unicode em-dash (U+2014). The project encoding convention requires all source files stay within printable ASCII (U+0020-U+007E) to avoid charmap codec errors on Windows cp1252 terminals.

Replace with "--" or "n/a".

See: STATUS_SYMBOLS in src/apm_cli/utils/console.py for the project's ASCII-only pattern.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Replaced Unicode em-dash with ASCII - in the Rich table output. See commit fa8b7a4.

Vicente-Pastor and others added 2 commits April 13, 2026 17:14
- C01/C02: Fix CodeQL URL substring sanitization in test assertions
- C03/C11-C16/C19: Replace all Unicode em-dashes with ASCII in PR files
- C04/C05: Case-insensitive HTTPS scheme validation (urlparse-based)
- C06: Reject empty URL in MarketplaceSource.from_dict()
- C07: Strict field validation in parse_agent_skills_index()
- C08: HTTPS enforcement + post-redirect check in archive download
- C09: Reject non-regular tar members (device files, FIFOs, symlinks)
- C10/C17: Centralize path security using path_security.py guards
- C18: 10 MB response size limit on _fetch_url_direct()
- C20: Add CHANGELOG.md entries for URL marketplace feature
- C21: Update marketplace docs and command reference for URL support
- C22: Restore deleted TestCacheKey tests
- C23: Verified browse command works for URL sources

387 tests passing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Replace non-ASCII arrows (U+2192) and section sign (U+00A7) with ASCII
- Update cli-commands.md with URL marketplace add syntax and examples
- Update marketplace group description to mention URL sources
- Add PR number (microsoft#691) alongside issue number (microsoft#676) in CHANGELOG

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Vicente-Pastor
Copy link
Copy Markdown
Author

Addressing review body items

@sergio-sisternes-epam Thanks for the thorough review! All items have been addressed:

Inline comments (C01-C19)

All 19 inline review comments have been fixed and replied to individually. Summary:

  • Em-dashes: All non-ASCII characters replaced with ASCII equivalents across all PR files (also caught Unicode arrows and section signs)
  • Security: HTTPS enforcement + post-redirect validation in archive.py, centralized path_security.py integration, 10MB response size limit, tar member type checks, field validation in parser
  • Correctness: Case-insensitive scheme validation (urlparse-based), empty URL rejection in from_dict(), CodeQL fixes

Review body items

  1. CHANGELOG.md -- Added 6 entries under [Unreleased] > Added with ([FEATURE] Support URL-based marketplace registration for Agent Skills discovery indexes #676, Feature/676 url based marketplace #691) references
  2. Documentation -- Updated docs/src/content/docs/guides/marketplaces.md (URL registration section), docs/src/content/docs/reference/cli-commands.md (URL syntax/examples), and packages/apm-guide/.apm/skills/apm-usage/commands.md
  3. Deleted TestCacheKey tests -- Restored all 3 tests (test_github_default_unchanged, test_non_default_host_includes_host, test_different_hosts_different_keys)
  4. Browse command -- Verified it works for URL sources (uses get_marketplace_by_name -> fetch_marketplace which handles URL sources transparently)
  5. Merge conflicts with feat: marketplace-based version management (#514) #677 -- Noted. Will rebase after feat: marketplace-based version management (#514) #677 merges

Test status

387 tests passing (up from 384 before review fixes). All new tests follow AAA pattern with descriptive names.

See commits fa8b7a4 and 516370a.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Support URL-based marketplace registration for Agent Skills discovery indexes

4 participants