Skip to content

Latest commit

 

History

History
135 lines (91 loc) · 28.7 KB

File metadata and controls

135 lines (91 loc) · 28.7 KB

Security Findings

SF-015: Malformed GIT_CONFIG_COUNT environment variable causes unhandled ValueError in _build_github_auth_env

  • Severity: Low
  • Location: agr/git.py:_build_github_auth_env
  • Description: _build_github_auth_env() reads the GIT_CONFIG_COUNT environment variable and parses it with int(os.environ.get("GIT_CONFIG_COUNT", "0")). If GIT_CONFIG_COUNT is set to a non-numeric string (e.g. by a CI step that misconfigures the environment, or by another tool), this raises an uncaught ValueError. The exception is not caught by _run_git() (which only catches OSError) nor by any upstream caller, so it surfaces to users as a raw Python traceback rather than a friendly AgrError. In CI environments with imperfect variable management this causes a confusing, hard-to-diagnose failure. Additionally, an excessively large numeric value (e.g. GIT_CONFIG_COUNT=999999999) would produce valid env var names like GIT_CONFIG_KEY_999999999, causing git to scan for 10⁹ config entries at startup — a minor DoS on the git subprocess. Git itself silently ignores a malformed GIT_CONFIG_COUNT; agr should do the same.
  • Resolution: Wrapped the int() call in a try/except ValueError that falls back to 0, matching git's own behaviour. Added 1 test: test_ignores_malformed_git_config_count — verifies that a non-integer GIT_CONFIG_COUNT ("notanumber") is ignored and the auth env is still populated correctly at index 0.
  • Status: Fixed (2026-04-17)

SF-014: git sparse-checkout set missing -- separator allows option injection from repo-supplied paths

  • Severity: Low
  • Location: agr/git.py:checkout_sparse_paths
  • Description: checkout_sparse_paths built the git sparse-checkout set command by appending path arguments (derived from git ls-tree output of the remote repository) without inserting a -- end-of-options separator. Without --, git parses every subsequent argument for option flags; a repository that contains a file or directory whose path starts with - (e.g., -C, --cone, --no-cone) would cause git to interpret that path as a command-line option rather than a sparse-checkout pattern. A malicious remote repository could craft such a path to alter git's sparse-checkout behaviour — for example, --no-cone would silently switch sparse-checkout from cone mode to non-cone mode, potentially causing all files in the repo to be checked out instead of only the requested skill directory, expanding the attack surface for subsequent install steps (including symlink-following, SF-001 mitigated separately). While GitHub does not allow filenames beginning with - in repository paths, git hosting services that do permit them, or on-premise gitea/Gitea instances used as custom sources, could serve such paths. The impact is bounded by what git option names are valid for sparse-checkout set, but the absence of -- is an unsafe pattern that violates the principle of treating data separately from commands.
  • Resolution: Fixed. Added "--" as the argument immediately after "set" in the _git_cmd call inside checkout_sparse_paths, so the final command is git -C <repo> sparse-checkout set -- <paths...>. This ensures all subsequent arguments are treated as paths regardless of their content. Added 4 tests: verifying -- is present and immediately follows set; paths appear after --; a --prefixed path is placed after -- rather than being parsed as an option; and the empty-paths guard still raises AgrError.
  • Status: Fixed (2026-04-17)

SF-013: SDK list_skills bypassed _validate_component checks on repo handle components

  • Severity: Low
  • Location: agr/sdk/hub.py:list_skills
  • Description: list_skills(repo_handle) split the repo handle on / and used the resulting components directly in the GitHub API URL (_github_tree_url) without passing them through _validate_component. This bypassed all the security checks introduced by SF-003, SF-008, SF-009, SF-010, SF-011, and SF-012: control characters, whitespace, YAML flow/indicator/block-scalar characters, path-traversal components (.., .), and the reserved -- separator were all accepted. In contrast, skill_info() — the other public SDK function — correctly called parse_remote_handle() which routes through _validate_component. The impact in the SDK context is limited because (1) the base URL is hardcoded to https://api.github.com, preventing host-based SSRF, and (2) httpx rejects URLs with embedded control characters at the HTTP layer. However, crafted inputs with # (URL fragment delimiter) could silently redirect the API request to a different GitHub API endpoint — e.g., list_skills("owner#evil/repo") would request https://api.github.com/repos/owner (fragment truncates the path) instead of raising an error. The inconsistency also means developer code relying on list_skills for validated input identity (e.g., caching by handle) could be surprised by inputs that skill_info would reject.
  • Resolution: Fixed. Replaced the raw split("/") + direct-use pattern in list_skills with calls to parse_remote_handle using a dummy skill-name suffix (/placeholder) so that _validate_component runs on the owner and repo components before they are interpolated into any URL. For the 1-part case ("owner") the call is parse_remote_handle("owner/placeholder"); for the 2-part case ("owner/repo") it is parse_remote_handle("owner/repo/placeholder"). The validated username and repo fields from the resulting ParsedHandle are then used instead of the raw split parts. Added 10 tests covering control chars, whitespace, YAML special chars, path traversal in owner and repo positions, and positive regression tests confirming valid handles still proceed to the (mocked) network call.
  • Status: Fixed (2026-04-17)

SF-012: YAML block-scalar and quoted-scalar characters in handle components corrupt SKILL.md frontmatter

  • Severity: Low
  • Location: agr/handle.py:_validate_component, agr/skill.py:update_skill_md_name
  • Description: _validate_component blocked whitespace/control chars (SF-008), YAML flow-collection delimiters {}[] (SF-010), and YAML indicator characters #*&! (SF-011), but did not block YAML block-scalar indicators | and > or quoted-scalar start characters ' and ". When these appear in a skill directory name, update_skill_md_name writes them directly into the SKILL.md frontmatter as name: <value>. In YAML block style, these characters corrupt the name field: (1) |skill — starts a block literal scalar with skill as the block header; because s is not a valid chomping/indentation indicator, strict YAML parsers raise a parse error; (2) >skill — same mechanism for block folded scalars; (3) 'skill — opens a single-quoted scalar that is never closed, causing the YAML parser to consume all subsequent frontmatter lines (including description:) as part of the name value, suppressing those fields from agents; (4) "skill — same swallowing mechanism for double-quoted scalars. The ' and " cases are particularly impactful because they do not cause an immediate parse error — instead they silently hide the description and other frontmatter fields that agents rely on for skill discovery. The attack is most effective through transitive package dependencies where the user never reviews the raw handle. None of these characters are valid in GitHub usernames or repository names; they can appear in skill directory names in a Linux-hosted git repository.
  • Resolution: Fixed. Added _YAML_BLOCK_SCALAR_CHARS = frozenset("|>'\\"") constant and _validate_no_yaml_block_scalar_chars() helper in handle.py. Wired into _validate_component after the existing indicator-char check, so all remote handle parsing branches apply the new check. Added 7 tests covering | and > in skill name, username, and repo positions; ' and " in skill name; and | in the one-part default_owner path.
  • Status: Fixed (2026-04-17)

SF-001: Symlink-following in skill installation allows local file exfiltration

  • Severity: High
  • Location: agr/_install_common.py:copy_resource_to_destination, agr/sdk/cache.py:cache_skill
  • Description: shutil.copytree() was called with the default symlinks=False, which dereferences symlinks and copies the content of the files they point to. A malicious remote skill repository could include symlinks targeting sensitive local files (e.g. ~/.ssh/id_rsa, .env, ~/.aws/credentials). When a user installs such a skill via agr add or agrx, the symlinks are dereferenced during the copy, placing the actual file contents into the installed skill directory. Since skills are read by AI agents as context, the sensitive content could then be exposed in agent responses.
  • Resolution: Fixed. Added _ignore_symlinks helper that filters out symlink entries during shutil.copytree. Applied to both copy_resource_to_destination (main install path) and cache_skill (SDK cache path). Added tests confirming that file symlinks and directory symlinks are both skipped during installation.
  • Status: Fixed (2026-04-12)

SF-002: Format string injection in source URL template allows Python attribute access

  • Severity: Medium
  • Location: agr/source.py:SourceConfig.build_repo_url, agr/config.py:_parse_source_entry, agr/commands/config_cmd.py
  • Description: SourceConfig.build_repo_url() used Python's str.format() to interpolate {owner} and {repo} into the source URL template. Because str.format() supports dotted attribute access (e.g. {owner.__class__.__mro__}), a malicious agr.toml in a cloned repository could craft a [[source]] URL template that leaks Python runtime internals. When a user runs agr sync or agr add against such a project, the format string is evaluated and the resolved attributes are embedded in the git clone URL — potentially exfiltrating runtime metadata to an attacker-controlled server. Additionally, source URLs accepted arbitrary schemes (e.g. ftp://, data:), which could be abused for protocol-level attacks.
  • Resolution: Fixed. Replaced str.format() with literal str.replace("{owner}", owner).replace("{repo}", repo) in build_repo_url(), which prevents attribute access, index access, and conversion specifiers. Added URL scheme validation in both _parse_source_entry() (TOML loading) and the config add sources CLI command, restricting URLs to https://, http://, ssh://, git://, file://, and absolute paths. Added tests confirming that format string attribute access and conversion specifiers are not evaluated, and that unsupported URL schemes are rejected.
  • Status: Fixed (2026-04-12)

SF-003: Path traversal in remote handle components allows writes outside skills directory

  • Severity: High
  • Location: agr/handle.py:parse_handle
  • Description: Remote handle components (username, repo, skill name) were validated against the reserved -- separator but not against path traversal sequences . and ... A handle like user/repo/.. would parse successfully with name="..", causing the install destination to resolve to skills_dir / ".." — the parent of the skills directory. In the worst case (agr sync --force or agr upgrade), shutil.rmtree() would delete the parent directory before shutil.copytree() replaces it with skill contents. This is especially dangerous through the package expansion path (package.py:expand_packages): a malicious remote package could include a sub-dependency with a traversal handle in its agr.toml, and the user would never see the raw handle — they'd only have a top-level type = "package" dependency. The same issue applied to . as a component, which would resolve the destination to the skills directory itself. Local handles were already protected (. produces an empty name caught by not name, .. is explicitly rejected), but the remote handle branch had no such checks.
  • Resolution: Fixed. Added _validate_no_path_traversal() helper in handle.py that rejects . and .. as handle components. Applied to all remote handle parsing branches (1-part with default_owner, 2-part user/skill, 3-part user/repo/skill) alongside the existing _validate_no_separator() check. Added 8 tests covering traversal in each component position and both . and .. variants.
  • Status: Fixed (2026-04-12)

SF-004: Lockfile commit field accepts arbitrary git refs, defeating --frozen immutability

  • Severity: Medium
  • Location: agr/git.py:fetch_and_checkout_commit, agr/lockfile.py:LockedEntry.from_dict
  • Description: The lockfile's commit field was loaded from disk as an arbitrary string with no format validation. In --frozen mode, fetch_and_checkout_commit() passed this string directly to git fetch --depth=1 origin <commit> and git checkout <commit>. While agr always writes valid 40-character hex SHAs when generating the lockfile, a tampered lockfile (e.g. via a malicious pull request that modifies agr.lock) could substitute a branch name (main), tag (v1.0), or other git ref for a pinned commit SHA. When another developer then runs agr sync --frozen, git would resolve the ref to whatever commit it points to at that moment — potentially a different (malicious) commit than the one originally pinned. This defeats the core purpose of --frozen mode: ensuring reproducible, immutable installs from exact commit SHAs. The attack is particularly effective because lockfile diffs are often large and reviewers tend to skim them.
  • Resolution: Fixed. Added validate_commit_sha() function in git.py with a regex check (^[0-9a-f]{40}$) that rejects anything other than a full 40-character lowercase hex SHA. Called at the top of fetch_and_checkout_commit() before any git operations. Added 11 tests covering valid SHAs, branch names, tags, short SHAs, uppercase hex, empty strings, oversized strings, and ref paths.
  • Status: Fixed (2026-04-12)

SF-005: GitHub token exposed in process command-line arguments via URL embedding

  • Severity: Medium
  • Location: agr/git.py:_apply_github_token, agr/git.py:downloaded_repo
  • Description: _apply_github_token() embedded the GitHub token directly into the git clone URL (e.g. https://TOKEN:x-oauth-basic@github.com/owner/repo.git). This URL was passed as a command-line argument to git clone and git ls-remote via subprocess.run(). On Linux, command-line arguments are world-readable via /proc/PID/cmdline; on macOS, they are visible to all users via ps aux. Any user or process on the same machine could observe the token during the (brief) git operation window. This is especially concerning on shared CI/CD runners, multi-user development servers, and environments with process monitoring. Although the token exposure window is short (the lifetime of the git subprocess), automated process scanners or audit logging could capture it persistently.
  • Resolution: Fixed. Added _build_github_auth_env() function that passes the GitHub token via GIT_CONFIG_COUNT/GIT_CONFIG_KEY_N/GIT_CONFIG_VALUE_N environment variables (git 2.31+). These env vars configure http.https://github.com/.extraheader with an AUTHORIZATION: bearer header, scoped to github.com only. The _run_git() function now automatically merges the auth env into every git subprocess call. Environment variables are only readable by the owning user (unlike cmdline which is world-readable), and the URL-scoped http.extraheader ensures the token is never sent to non-GitHub hosts. The function also correctly appends to any existing GIT_CONFIG_COUNT entries. Added 7 tests covering token presence in env values, absence from keys, env count appending, GH_TOKEN fallback, empty token handling, and github.com scoping.
  • Status: Fixed (2026-04-12)

SF-006: Remote packages can inject local path dependencies to read arbitrary directories

  • Severity: High
  • Location: agr/package.py:expand_packages
  • Description: expand_packages() rejected local packages (type = "package" with a path field) but silently accepted local skill/ralph sub-dependencies from remote packages. A malicious remote package could include {path = "../../.env", type = "skill"} in its agr.toml. During agr sync or agr add, the package expansion would add this local path dependency to the flat list. When later installed, the path resolves relative to the user's repo_root, causing the contents of arbitrary directories to be copied into the skills directory where AI agents read them. The attack is especially effective because package sub-dependencies are expanded transparently — the user only sees a top-level type = "package" entry in their agr.toml and never reviews the transitive local paths. A nested package chain (package A depends on package B which contains the malicious local dep) further obscures the attack.
  • Resolution: Fixed. Added a check at the top of the sub-dependency loop in expand_packages() that rejects any local path dependency (sub_dep.is_local) found in a remote package's sub-manifest, raising ConfigError with a message naming both the offending path and the parent package. This applies to all dependency types (skills, ralphs, and nested packages). Added 5 tests covering: local skill paths, local ralph paths, traversal paths, nested package chains with local deps, and error message content verification.
  • Status: Fixed (2026-04-12)

SF-007: Tampered lockfile can inject local-path transitive deps in --frozen mode, bypassing SF-006

  • Severity: Medium
  • Location: agr/lockfile.py:LockedEntry.from_dict, agr/commands/sync.py:_sync_from_lockfile
  • Description: SF-006 prevents local paths from appearing as transitive dependencies at package-expansion time: expand_packages() converts any local sub-dep in a remote package's agr.toml to a same-repo remote handle via _remote_dep_from_repo_path. However, --frozen/--locked sync skips package expansion entirely and trusts the lockfile. LockedEntry.from_dict accepted any combination of fields, including an entry with both a path and a parent (package parent). _locked_transitive_deps_for_packages then returned the tampered entry, _dep_from_locked_entry wrapped it into a Dependency(path=..., type="skill"), and _sync_dep_from_lockfile installed it as a local dep — handle.resolve_local_path(repo_root) resolves the attacker-chosen path (absolute or with .. traversal) and install_local_skill copies its contents into the skills directory where AI agents read them. The attacker path (e.g. ./.cache/agr/skills/..., the user's own skills/foo/, or any directory containing a SKILL.md) only needs to contain a valid SKILL.md for the install to proceed. Because a legitimate agr sync never produces path + parent entries (SF-006 converts them to remote before lockfile write), this combination is only reachable via lockfile tampering — exactly the threat model SF-004 addressed for commit SHAs. The attack is effective in PR-review scenarios where lockfile diffs tend to be skimmed.
  • Resolution: Fixed. Added a validation block in LockedEntry.from_dict that raises ConfigError whenever an entry has both path and a parent/parents value, with a message indicating possible corruption or tampering. This stops the tampered entry at lockfile-load time so every downstream code path (including --frozen sync) is protected. Added 4 tests covering: rejection with scalar parent, rejection with parents list, that direct local entries (no parent) still load, and that remote transitive entries (handle + parent) still load.
  • Status: Fixed (2026-04-17)

SF-008: Handle components accept whitespace and control chars, enabling SKILL.md frontmatter injection

  • Severity: Medium
  • Location: agr/handle.py:_validate_component, agr/skill.py:update_skill_md_name
  • Description: Remote handle components (username, repo, skill name) were validated against the reserved -- separator and against ./.. path traversal, but not against whitespace or ASCII control characters. update_skill_md_name — invoked as the post_copy callback in copy_resource_to_destination — writes the handle's skill name into the installed SKILL.md's YAML frontmatter as name: {new_name}. When the name contains a newline (e.g. skill\ndescription: injected-prompt\n), the newline closes the name: line and the attacker-supplied YAML keys are appended verbatim to the frontmatter. This gives a malicious remote package a covert channel to inject a forged description: (or other frontmatter fields) that agents may rely on when picking / summarising skills — the user only sees a top-level type = "package" dep and never reviews the transitive handle string. The same names also end up as literal directory names on disk (containing newlines) and as the installed_name field in .agr.json. Legitimate GitHub usernames, repo names, and skill names never contain whitespace or control characters, so rejecting them is safe.
  • Resolution: Fixed. Added _validate_no_control_chars() helper in handle.py that rejects any component containing a whitespace character (str.isspace(), which includes space, tab, newline, carriage return, vertical tab, form feed) or an ASCII control byte (ord < 0x20 or 0x7F). Wired into _validate_component alongside the existing separator and traversal checks, so every remote parsing branch (1-part with default_owner, 2-part user/skill, 3-part user/repo/skill) now applies the check. Local handles continue to use _validate_no_separator only — the frontmatter-injection attack is blocked for remote/transitive handles (the actual threat vector). Added 10 tests covering newline / tab / space / carriage return / null byte / DEL characters in each component position plus a positive test that legitimate hyphenated names still parse.
  • Status: Fixed (2026-04-17)

SF-010: YAML flow-structure characters in handle components corrupt SKILL.md frontmatter value type

  • Severity: Low
  • Location: agr/handle.py:_validate_component, agr/skill.py:update_skill_md_name
  • Description: _validate_component rejected --, ./.., whitespace, and control characters, but allowed YAML flow-collection delimiters: {, }, [, ]. A git repository can contain a skill directory named with these characters (e.g., {skill} or [item]). When agr installs such a skill, update_skill_md_name writes the handle's skill name directly into the installed SKILL.md frontmatter as name: {skill} or name: [item]. In YAML, {skill} is an invalid flow mapping (the skill key has no value, causing a parse error in strict parsers), while [item] is a valid flow sequence — meaning the name field's type becomes a list instead of a string. This corrupts the frontmatter in a way that can: (1) cause YAML parse errors in strict agents, (2) produce TypeError crashes in agents that assume name is always a string, and (3) mislead skill-routing logic that matches against the name field. While these characters cannot inject new top-level YAML keys (line breaks required — blocked by SF-008) or enable YAML deserialization exploits (complex payloads require spaces — also blocked by SF-008), they do corrupt the existing name field's type. The attack is particularly effective through transitive package dependencies: a malicious remote package can include a sub-dependency whose directory name contains { or [, and the user never sees the raw handle. Legitimate GitHub usernames, repository names, and skill directory names never contain these characters, so rejecting them is safe.
  • Resolution: Fixed. Added _validate_no_yaml_flow_chars() helper in handle.py that rejects any component containing {, }, [, or ]. Wired into _validate_component alongside the existing checks, so all remote handle parsing branches apply the new check. Added 9 tests covering brace/bracket characters in each component position (username, repo, skill name, one-part with default_owner) and a positive test confirming underscore/hyphen names still parse.
  • Status: Fixed (2026-04-17)

SF-009: default_repo config field accepts control chars and ./.., embedding them in git URLs

  • Severity: Low
  • Location: agr/config.py:_validate_config_identifier
  • Description: _validate_config_identifier validated default_owner and default_repo from agr.toml against / and the reserved -- separator, but not against internal whitespace, ASCII control characters, or the literal ./.. path-traversal values. Trailing/leading whitespace was stripped, but characters embedded in the middle of the value were preserved verbatim. While default_owner has a backstop in parse_handle._validate_component (called when 1-part handles are resolved), default_repo flows directly into the git clone URL via iter_repo_candidateshandle.get_github_reposource.build_repo_url(owner, repo) without ever passing through _validate_component. A malicious project's agr.toml with default_repo = "skills\nHost: evil.com" (or similar control-char-bearing value) would produce a git clone URL containing newlines, which (depending on the libcurl version git was built against) could either be rejected with a confusing error, URL-encoded, or — in older curl builds without strict CTL filtering — be transmitted as part of the HTTP request line. Setting default_repo = ".." produces URLs like github.com/owner/...git which curl/GitHub then normalize away. The realistic exploit potential is low because (a) the handle's username always comes from the explicit handle (not default_repo), so the URL host/owner can't be redirected, (b) the existing / rejection blocks the dangerous URL-path-traversal redirect attack, and (c) modern curl rejects URLs with embedded control characters. Still, validating at the application layer matches the defense-in-depth pattern established by SF-008 and gives a clearer error at config load time instead of an opaque curl/git failure later.
  • Resolution: Fixed. Extended _validate_config_identifier with two additional checks applied after the existing strip/empty///-- validation: (1) per-character iteration that rejects any whitespace character (str.isspace() — includes space, tab, newline, CR, vertical tab, form feed) or ASCII control byte (ord < 0x20 or 0x7F); (2) literal-value rejection of . and .. to block path-traversal-like segments in the constructed git URL. Internal dots in repo names (e.g. website.github.io) remain valid because GitHub itself allows them. Added 9 tests in test_config.py covering newline / internal space / tab in default_repo and default_owner, the ./.. rejection for both keys, plus a positive test confirming default_repo = "foo.bar" still loads.
  • Status: Fixed (2026-04-17)

SF-011: YAML indicator characters in handle components corrupt or null-out SKILL.md name field

  • Severity: Low
  • Location: agr/handle.py:_validate_component, agr/skill.py:update_skill_md_name
  • Description: _validate_component blocked whitespace/control chars (SF-008) and YAML flow-collection delimiters {}[] (SF-010), but did not block YAML indicator characters #, *, &, and !. When these appear in a skill directory name, update_skill_md_name writes them directly into the SKILL.md frontmatter as name: <value>. In YAML block style, these characters alter value semantics: (1) # preceded by whitespace is a comment indicator — name: #foo sets name to null, silently discarding the skill name; (2) * is an alias indicator — name: *anchor triggers an alias dereference; if the anchor is undefined (the normal case), most YAML parsers raise a parse error; (3) & is an anchor indicator — name: &anchor sets an anchor on a null scalar, so name becomes null; (4) ! is a tag indicator — name: !!null or name: !tag val changes the inferred type, potentially producing a non-string name field. The most impactful case is #, which silently nulls the name without any parse error, confusing agent skill-routing logic that relies on the name string. None of these characters are valid in GitHub usernames, repository names, or conventional skill directory names, so rejecting them is safe and has no user-visible impact on legitimate handles. The attack is most effective through transitive package dependencies where the user never sees the raw handle.
  • Resolution: Fixed. Added _validate_no_yaml_indicator_chars() helper in handle.py that rejects any component containing #, *, &, or !. Wired into _validate_component after the existing flow-char check, so all remote handle parsing branches (1-part with default_owner, 2-part user/skill, 3-part user/repo/skill) apply the new check. Added 7 tests covering # in skill name, username, and repo positions; *, &, and ! in skill name; and # in the one-part default_owner path.
  • Status: Fixed (2026-04-17)