feat: add cursor native tools#1
Conversation
…nd update ignore configurations
…vice modules for auth, tools, models, and provider logic
…shell tools, and expand registration integration tests
…across file and search tools
…hile sanitizing reasoning efforts
…ing, and glob search functionality
# Conflicts: # package.json
|
Warning Review limit reached
More reviews will be available in 25 minutes. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis PR refactors the Grok CLI extension from a monolithic implementation into a modular architecture. It introduces OAuth 2.0+PKCE authentication, splits payload sanitization and tools into dedicated modules, centralizes provider registration with quota caching and model resolution, and adds comprehensive Cursor-style tool shims for file operations, searching, and shell execution. ChangesOAuth, Error Types, and Model Catalog Foundation
Payload Sanitization and Tool Infrastructure
Provider Registration and Integration
Configuration and Documentation Updates
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 11
🧹 Nitpick comments (2)
README.md (1)
76-76: 💤 Low valueConsider expanding the token bypass documentation.
The description "(no auto-refresh)" is accurate but users may not understand the implications. Consider noting that this bypasses the OAuth flow entirely and requires manual token management when it expires.
📝 Suggested documentation enhancement
-| `GROK_CLI_OAUTH_TOKEN` | — | Direct token bypass (no auto-refresh) | +| `GROK_CLI_OAUTH_TOKEN` | — | Direct token bypass (skips OAuth flow; no auto-refresh; requires manual token management) |🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` at line 76, Update the README entry for the environment variable GROK_CLI_OAUTH_TOKEN to clarify that it bypasses the OAuth flow entirely and that tokens must be managed manually: expand the description to state that supplying GROK_CLI_OAUTH_TOKEN prevents automatic refresh/renewal, requires users to provide a valid access token obtained externally, and they must replace or rotate the token when it expires to restore CLI functionality.src/provider/toolScope.ts (1)
3-13: ⚡ Quick winKeep the Grok tool list in one production source.
Lines 3-13 need to stay in lockstep with the names registered elsewhere. Make this the canonical export consumed by registration/scoping code, or import the registration list here, so adding or renaming a tool can't leave scope management out of sync.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/provider/toolScope.ts` around lines 3 - 13, GROK_TOOL_NAMES is currently a standalone array that can drift from registered tool names; make this the single source of truth by either exporting the registration list from the module that registers tools and replacing this array with an import of that exported list, or move/export GROK_TOOL_NAMES from this file into the registration/scoping module and update all consumers to import it; locate usages referencing GROK_TOOL_NAMES and the registration logic (the tool registration function or module) and update imports so only one exported symbol controls tool names to keep registration and scoping in sync.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/auth/oauth.ts`:
- Around line 131-136: The JSON parsing of the discovery response (the await
response.json() assigned to payload) can throw a SyntaxError for HTML/proxy
bodies and must be wrapped — catch errors from response.json() and rethrow them
as XaiOAuthError with the DISCOVERY_FAILED classification so callers retain the
discovery-failure semantics; update the block around payload = (await
response.json())... to try/catch, on failure throw new
XaiOAuthError('DISCOVERY_FAILED', '...') including the original error
message/details, and keep using
validateEndpoint(String(payload.authorization_endpoint ?? ''),
'authorization_endpoint') and validateEndpoint(String(payload.token_endpoint ??
''), 'token_endpoint') as before.
- Around line 217-240: The current oauth callback server flow never surfaces the
dedicated error codes CALLBACK_BIND_FAILED and CALLBACK_TIMEOUT; modify the
listen()/server startup and the waitForCallback implementation so that if both
listen(CALLBACK_PORT) and listen(0) fail you return/resolve using the
CALLBACK_BIND_FAILED result (instead of letting the raw error leak) and change
the timeout branch in waitForCallback to resolve with CALLBACK_TIMEOUT (instead
of the generic 'timeout'/'AUTHORIZATION_FAILED' mapping upstream). Update the
logic around the listen calls and the Promise.race in the waitForCallback
returned object (referencing listen, CALLBACK_PORT, CALLBACK_HOST,
CALLBACK_PATH, server, callbackPromise, waitForCallback, CALLBACK_BIND_FAILED,
CALLBACK_TIMEOUT) to produce those specific CallbackResult error codes so
upstream can distinguish bind failures from timeouts.
- Around line 252-272: The token POSTs (token exchange and refresh) lack
timeouts and don't convert transport/JSON errors to XaiOAuthError; wrap both
fetch calls (the one using tokenEndpoint with body including
grant_type/authorization_code and the refresh flow around lines 384-406) in
try/catch, use an AbortController with a short configurable timeout to abort
stalled requests, and on any fetch/network/response.json() error throw
XaiOAuthError with the appropriate XaiErrorCode (use
XaiErrorCode.TOKEN_EXCHANGE_FAILED for the authorization_code exchange and
XaiErrorCode.REFRESH_FAILED for refresh), preserving useful context
(status/text/inner error) in the error message; also ensure non-OK responses are
still handled as before but include the aborted/parse errors in the same
XaiOAuthError shape.
In `@src/models/catalog.ts`:
- Around line 45-52: The composer entries declare thinkingLevelMap values that
indicate no supported reasoning levels, but supportsReasoningEffort() currently
returns true for all grok-composer* ids causing sanitizePayload() to retain
reasoning.effort; update supportsReasoningEffort() to look up the model entry
(from the same catalog where thinkingLevelMap is defined) and return true only
if the model's reasoning flag is true AND the thinkingLevelMap contains at least
one valid level (i.e., a value that is neither null nor the sentinel 'none');
use the catalog entry key and the thinkingLevelMap symbol to implement this
check so sanitizePayload() will correctly strip reasoning.effort for
grok-composer models.
In `@src/payload/sanitize.ts`:
- Around line 278-282: The current code only deletes next.response_format when
it creates next.text, leaving response_format present if next.text already
exists; update the logic in sanitize.ts so that if next.response_format is
present you set next.text = { format: next.response_format } only when next.text
is absent (preserving existing next.text) but in all cases remove
next.response_format afterward—i.e., always delete next.response_format after
handling it while avoiding overwriting an existing next.text.
- Around line 248-265: The current loop only extracts leading
'system'/'developer' messages into instructionParts but leaves any such messages
later in input; instead iterate over the entire input array, collect and remove
every element whose role is 'system' or 'developer' (use textFromContent((item
as Record<string, unknown>).content).trim() to extract text) into
instructionParts, then merge into next.instructions as before (respecting
existing value and joining with '\n\n'); update the input array in-place or
replace it with the filtered array so no 'system'/'developer' entries remain
before sending to xAI.
In `@src/tools/files.ts`:
- Line 141: The code currently computes targetPath via resolve(ctx.cwd,
params.path) (see targetPath, ctx.cwd, params.path, resolve) but accepts
absolute/../ inputs and can escape the workspace; add a single guard function
(e.g., ensurePathInWorkspace or canonicalizeWithinWorkspace) that: 1) joins and
resolves the candidate path, 2) calls fs.realpath on the candidate and on
ctx.cwd to canonicalize symlinks, 3) verifies the real candidate path starts
with the real workspace root (reject otherwise), and 4) returns the
canonicalized path; replace the direct resolve(...) usage in all file-tool
locations that set targetPath (the occurrences around
targetPath/ctx.cwd/params.path and the other listed sites) to call this guard
and throw a clear error when outside the workspace.
In `@src/tools/rendering.ts`:
- Around line 125-160: fileError and toolError currently discard failure state
in the returned details, causing collapsed renderers to treat errors as
empty-success results; update both functions (fileError and toolError) to
include an explicit error indicator in the details object (e.g., add an "error"
string/message or a boolean "failed" flag) so ToolResult<T>.details carries the
failure info (ensure the generic T is extended/augmented or the functions cast
to include the new field) and update the Grep special-case to still set that
error indicator when returning "No matches found" only for genuine no-match vs.
actual failures.
In `@src/tools/search.ts`:
- Around line 192-202: The fallback using execFileAsync + find builds matcher
from globToRegExp(normalizePath(params.pattern)) but always tests against
normalizePath(relative(ctx.cwd, file)), which makes basename-only patterns like
"*.ts" fail for nested files; update the filter in the search fallback so that
if params.pattern contains a path separator (i.e., includes '/' after
normalization) you test matcher.test(normalizePath(relative(ctx.cwd, file))) but
if it is a basename-only pattern you test matcher.test(path.basename(file)) (use
path.basename or equivalent), keeping use of matcher, globToRegExp,
normalizePath, params.pattern and execFileAsync/find logic intact.
In `@src/tools/shell.ts`:
- Around line 47-52: The shell runner currently hardcodes execFileAsync('bash',
['-c', params.command], ...) which will fail on systems without bash; update the
code in the function that invokes execFileAsync to detect the platform and
available shells (use process.platform to choose 'bash' on POSIX where present,
fallback to 'sh' for generic POSIX, and use 'cmd.exe' / 'powershell.exe' on
Windows as appropriate), or attempt to stat/which the chosen shell and if not
found return a clear, typed error like "unsupported platform or shell not found"
instead of calling execFileAsync; reference the execFileAsync call,
params.command, MAX_OUTPUT_BYTES, timeout, and signal when implementing the
platform-aware runner and error path.
In `@tests/tools/search.test.ts`:
- Around line 151-155: The test mutates process.env.PATH and calls
vi.resetModules() before performing the dynamic import and collectTools call,
which can leak a modified PATH if the import fails; wrap the dynamic import of
'../../src/tools/search.js' (and the call to collectTools and vi.resetModules())
inside a try block and restore the original PATH in a finally block so
process.env.PATH is always reset, i.e., save const oldPath = process.env.PATH
before changing it, move process.env.PATH = bin and vi.resetModules() into the
try, perform the import and collectTools(registerSearchTools) there, and in
finally set process.env.PATH = oldPath to guarantee restoration even on error.
---
Nitpick comments:
In `@README.md`:
- Line 76: Update the README entry for the environment variable
GROK_CLI_OAUTH_TOKEN to clarify that it bypasses the OAuth flow entirely and
that tokens must be managed manually: expand the description to state that
supplying GROK_CLI_OAUTH_TOKEN prevents automatic refresh/renewal, requires
users to provide a valid access token obtained externally, and they must replace
or rotate the token when it expires to restore CLI functionality.
In `@src/provider/toolScope.ts`:
- Around line 3-13: GROK_TOOL_NAMES is currently a standalone array that can
drift from registered tool names; make this the single source of truth by either
exporting the registration list from the module that registers tools and
replacing this array with an import of that exported list, or move/export
GROK_TOOL_NAMES from this file into the registration/scoping module and update
all consumers to import it; locate usages referencing GROK_TOOL_NAMES and the
registration logic (the tool registration function or module) and update imports
so only one exported symbol controls tool names to keep registration and scoping
in sync.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0723d8fe-2c5b-4b30-b124-7b80d5c19c0f
⛔ Files ignored due to path filters (1)
bun.lockis excluded by!**/*.lock
📒 Files selected for processing (47)
.gitignore.ignore.release-tools/cache.json.release-tools/config.tsREADME.mdbiome.jsonknip.jsonpackage.jsonsrc/auth/oauth.tssrc/errors.tssrc/index.tssrc/models.tssrc/models/catalog.tssrc/oauth.tssrc/payload/sanitize.tssrc/provider/quota.tssrc/provider/register.tssrc/provider/status.tssrc/provider/stream.tssrc/provider/toolScope.tssrc/sanitize.tssrc/shared/errors.tssrc/tools/files.tssrc/tools/register.tssrc/tools/rendering.tssrc/tools/search.tssrc/tools/shell.tstests/auth/oauth.test.tstests/errors.test.tstests/index.test.tstests/models.test.tstests/models/catalog.test.tstests/oauth.test.tstests/package.test.tstests/payload/sanitize.test.tstests/provider/package.test.tstests/provider/register.test.tstests/sanitize.test.tstests/shared/errors.test.tstests/tools/files.test.tstests/tools/register.test.tstests/tools/rendering.test.tstests/tools/search.test.tstests/tools/shell.test.tstests/tools/toolTestHelpers.tstsconfig.jsonvitest.config.ts
💤 Files with no reviewable changes (10)
- tests/index.test.ts
- tests/package.test.ts
- src/errors.ts
- src/sanitize.ts
- tests/sanitize.test.ts
- tests/errors.test.ts
- tests/models.test.ts
- src/oauth.ts
- src/models.ts
- tests/oauth.test.ts
|
@greptileai review |
Greptile SummaryThis PR adds Cursor-native tool shims (file, search, shell tools) for Grok CLI models running inside the pi extension host, along with OAuth authentication, rate-limit quota caching, and payload sanitization for xAI's Responses API quirks. Several previously-reported issues (Glob's
Confidence Score: 4/5Safe to merge for macOS/Linux users; Windows users will find the Grep tool non-functional without ripgrep installed. The core tool shims, OAuth flow, workspace boundary enforcement, and payload sanitization are all well-constructed. One functional gap remains: src/tools/rendering.ts — the Important Files Changed
Sequence DiagramsequenceDiagram
participant LLM
participant pi as pi Extension Host
participant GrokTool as Grok Tool Shim
participant FS as Filesystem
participant API as cli-chat-proxy.grok.com
LLM->>pi: Tool call (e.g. Read, Write, Grep, Shell)
pi->>GrokTool: prepareArguments (normalize Cursor arg shapes)
GrokTool->>GrokTool: canonicalizeWithinWorkspace (enforce workspace boundary)
GrokTool->>FS: File / search / shell operation
FS-->>GrokTool: result
GrokTool-->>pi: ToolResult (content + details)
pi-->>LLM: tool_result
LLM->>pi: Next message (may include images)
pi->>pi: "before_provider_request -> sanitizePayload"
pi->>pi: Normalize image_url to input_image + base64
pi->>pi: Strip reasoning items, move system to instructions
pi->>API: "POST /v1/responses (with x-grok-* headers)"
API-->>pi: SSE stream + rate-limit headers
pi->>pi: "captureRateLimit -> persist quota cache"
pi-->>LLM: assistant message
Reviews (2): Last reviewed commit: "fix: make tool fallbacks and image paths..." | Re-trigger Greptile |
|
@greptileai review |
| export async function execWithRgFallback( | ||
| rgArgs: string[], | ||
| grepArgs: string[], | ||
| options: { cwd: string; signal?: AbortSignal }, | ||
| ): Promise<string> { | ||
| if (await hasRipgrep()) { | ||
| const result = await execFileAsync('rg', rgArgs, { | ||
| cwd: options.cwd, | ||
| maxBuffer: MAX_OUTPUT_BYTES, | ||
| signal: options.signal, | ||
| }); | ||
| return result.stdout; | ||
| } | ||
| const result = await execFileAsync('grep', grepArgs, { | ||
| cwd: options.cwd, | ||
| maxBuffer: MAX_OUTPUT_BYTES, | ||
| signal: options.signal, | ||
| }); | ||
| return result.stdout; | ||
| } |
There was a problem hiding this comment.
Grep fallback uses system
grep, broken on Windows without ripgrep
execWithRgFallback falls back to execFileAsync('grep', ...) when ripgrep is unavailable. grep is not a native Windows binary, so on Windows without ripgrep installed the Grep tool will always fail at runtime. This is the same class of problem the Glob fallback already fixed in this PR (switching from Unix find to a portable listFilesRecursive Node.js walk). The Grep fallback needs a similarly portable alternative — for example, using listFilesRecursive to enumerate candidate files and then applying a JS regex over their contents.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Summary by CodeRabbit
New Features
Documentation
Tests