feat: add web search tool for grok models#2
Conversation
Register WebSearch when pi-web-access is optional-installed, delegate to its web_search tool, and reconcile active tools so grok-cli uses WebSearch instead of web_search. Bind the delegate on session_start.
- webSearch.ts: 47% → 100% (renderCall/renderResult branches, tool_call interceptor) - sanitize.ts: 91% → 96% (.jpg, file:// protocol, unsupported ext, string output parts) - rendering.ts: 91% → 95% (globToRegExp wildcards, grep fallback path include) - search.ts: 91% → 93% (sort tiebreaker, grep fallback with path include) - shell.ts: added timeout kill test - webSearchDelegate.ts: added bind/ensure/getLoadError tests
Greptile SummaryThis PR adds optional
Confidence Score: 4/5Safe to merge for most environments; the WebSearch feature works correctly when pi-web-access is installed, and nothing changes for users without it. The core delegation pattern, tool scoping, and interceptor logic are sound and thoroughly tested. The main concern is in webSearchDelegate.ts: the loader path dist/core/extensions/loader.js is an undocumented internal of @earendil-works/pi-coding-agent, and resolvePiCodingAgentRoot() derives the package root with a single dirname + .. step that assumes the entry point is exactly one directory deep. A package restructure could silently make WebSearch non-functional for the session without a clear error pointing at the bad path. This is not a regression risk for existing functionality, but it is a maintenance hazard for the new feature. src/tools/webSearchDelegate.ts — specifically the loader path resolution and the permanent-cache-on-failure behavior of ensureWebSearchDelegate Important Files Changed
Sequence DiagramsequenceDiagram
participant Host as pi host
participant Register as registerGrokCli
participant Delegate as webSearchDelegate
participant PiWebAccess as pi-web-access (jiti)
Host->>Register: extension load
Register->>Delegate: isPiWebAccessInstalled()?
alt installed
Register->>Register: registerWebSearchTool(pi)
Note over Register: WebSearch tool + tool_call interceptor registered
end
Host->>Register: session_start event
Register->>Delegate: isPiWebAccessInstalled()?
alt installed
Register->>Delegate: bindLivePiWebAccess(pi)
Note over Delegate: reset webSearchExecute + loadPromise
Register->>Delegate: ensureWebSearchDelegate(pi)
Delegate->>PiWebAccess: importPiExtensionLoader() → internal loader.js
Delegate->>PiWebAccess: importPiWebAccessFactory(entry) via jiti
PiWebAccess-->>Delegate: factory(fakeApi) → extension.tools.get("web_search")
Delegate->>Delegate: store webSearchExecute (bound execute fn)
end
Host->>Register: tool_call: web_search (grok-cli model)
Register-->>Host: block: true (interceptor in webSearch.ts)
Host->>Register: WebSearch tool execute
Register->>Delegate: ensureWebSearchDelegate(pi) [no-op if loaded]
Register->>Delegate: getWebSearchDelegate()
Delegate-->>Register: webSearchExecute fn
Register->>PiWebAccess: delegate(toolCallId, params, signal, onUpdate, ctx)
PiWebAccess-->>Register: AgentToolResult
Register-->>Host: result
Reviews (1): Last reviewed commit: "chore: improve test coverage from 80% to..." | Re-trigger Greptile |
| function resolvePiCodingAgentRoot() { | ||
| const mainEntry = fileURLToPath(import.meta.resolve('@earendil-works/pi-coding-agent')); | ||
| return join(dirname(mainEntry), '..'); | ||
| } | ||
|
|
||
| async function importPiExtensionLoader() { | ||
| return import(join(resolvePiCodingAgentRoot(), 'dist/core/extensions/loader.js')) as Promise<{ | ||
| createExtensionRuntime: () => Record<string, unknown>; | ||
| loadExtensionFromFactory: ( | ||
| factory: (api: ExtensionAPI) => void | Promise<void>, | ||
| cwd: string, | ||
| eventBus: ReturnType<typeof createEventBus>, | ||
| runtime: Record<string, unknown>, | ||
| extensionPath?: string, | ||
| ) => Promise<{ tools: Map<string, { definition: { execute: WebSearchExecute } }> }>; | ||
| }>; | ||
| } |
There was a problem hiding this comment.
Fragile dependency on internal package path
resolvePiCodingAgentRoot() derives the package root by going one level up from the main entry (dirname(mainEntry) + ".."``), then importPiExtensionLoaderappends the hardcoded internal pathdist/core/extensions/loader.js. Both steps are coupled to undocumented internals of @earendil-works/pi-coding-agent. If the package's main entry moves deeper than one level inside dist/(e.g.dist/esm/index.js), resolvePiCodingAgentRoot()would return/distinstead of, making the constructed loader path wrong. If the internal file is ever renamed or relocated in a patch release, the dynamic import()would throw, settinglastLoadErrorand leavingWebSearch` permanently broken for the session with no diagnostic about the path that failed.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| export async function ensureWebSearchDelegate(pi?: ExtensionAPI) { | ||
| if (!isPiWebAccessInstalled()) return; | ||
|
|
||
| const livePi = pi ?? boundLivePi; | ||
| if (!livePi) return; | ||
|
|
||
| if (webSearchExecute) return; | ||
| if (loadPromise) return loadPromise; | ||
|
|
||
| loadPromise = (async () => { | ||
| lastLoadError = undefined; | ||
| try { | ||
| await captureWebSearchFromLivePi(livePi); | ||
| } catch (err) { | ||
| lastLoadError = err instanceof Error ? err.message : String(err); | ||
| webSearchExecute = undefined; | ||
| } | ||
| })(); | ||
|
|
||
| return loadPromise; | ||
| } |
There was a problem hiding this comment.
Failed load is permanently cached within a session
When captureWebSearchFromLivePi throws, the inner IIFE catches the error, sets lastLoadError, and resolves loadPromise. On every subsequent call to ensureWebSearchDelegate, the if (loadPromise) return loadPromise guard fires immediately — returning the already-settled promise — so no reload is attempted for the rest of the session. The user will see the missingDelegateMessage on every WebSearch invocation with no actionable suggestion beyond restarting. Consider at least logging the path that was attempted inside the catch block to make the failure diagnosable without a full session restart.
|
Warning Review limit reached
More reviews will be available in 23 minutes and 36 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
📝 WalkthroughWalkthroughThis PR adds a WebSearch tool that conditionally depends on the optional ChangesWebSearch tool with pi-web-access integration
🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsStopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
tests/provider/toolScope.test.ts (1)
22-49: ⚡ Quick winAdd a provider round-trip regression test.
These cases never assert
openai -> grok-cli -> openai, so they miss the current loss ofweb_searchafter suppression. One round-trip test here would lock down the intended restoration behavior and catch this regression immediately.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/provider/toolScope.test.ts` around lines 22 - 49, The tests for syncGrokTools/syncForGrokCli are missing a provider round-trip asserting that switching from openai to grok-cli and back to openai preserves/restores web_search; add a new test that (1) starts with an openai active toolset that includes 'web_search', (2) applies syncGrokTools or syncForGrokCli to simulate switching to grok-cli (verifying suppression of 'web_search' and addition of 'WebSearch' as applicable), then (3) switches back to openai and asserts that 'web_search' is restored and grok shims ('WebSearch') are removed. Target the existing helpers syncGrokTools and syncForGrokCli in the test to validate the full round-trip behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/provider/toolScope.ts`:
- Around line 17-28: The current logic that computes nextTools when switching
providers drops suppressed tools permanently because the non-grok branch simply
assigns baseTools; update the tool-switch logic to restore previously active
non-Grok tools by preserving suppressed entries and re-applying them when
provider !== 'grok-cli'. Specifically, record the set of suppressed tool names
(GROK_SUPPRESSED_TOOL_NAMES) or the filtered-out tools when computing nextTools
for provider === 'grok-cli' (use the same filtering code that references
baseTools and GROK_SUPPRESSED_TOOL_NAMES), store that set (e.g., in a
preservedSuppressedTools variable or a savedActiveTools structure) and on the
else branch rebuild nextTools from baseTools plus any preserved non-Grok active
tools (or simply restore the savedActiveTools) so that functions/variables like
nextTools, provider, baseTools, GROK_SUPPRESSED_TOOL_NAMES, and
grokToolsToActivate() are used to both suppress and later restore the
appropriate tools.
In `@src/tools/webSearch.ts`:
- Around line 93-97: In execute(), normalize and validate the incoming params
before calling the delegate: create a normalizedParams copy of params, and if it
contains a queries array (or a single query string), trim each string and filter
out empty/whitespace-only entries (e.g., queries =
queries.map(s=>s.trim()).filter(Boolean)); if the resulting queries list is
empty set queries to [] (or remove the field) so the delegate never receives
blank queries; then pass normalizedParams to delegate(toolCallId,
normalizedParams, ...). Update the execute function (and keep using
ensureWebSearchDelegate, getWebSearchDelegate, missingDelegateMessage as-is) so
rendered UI and delegated input are consistent.
In `@src/tools/webSearchDelegate.ts`:
- Around line 135-139: bindLivePiWebAccess currently clears globals but allows
stale async capture to overwrite webSearchExecute and leaves loadPromise set on
failure so future loads never retry; add a generation/token (e.g., increase a
numeric bindGeneration) on bindLivePiWebAccess and store the current generation
in captureWebSearchFromLivePi/ensureWebSearchDelegate; before assigning
webSearchExecute or loadPromise, verify the generation still matches the latest
boundGeneration (and that boundLivePi is the same), and on any load failure
ensure loadPromise is cleared (set to undefined) so subsequent
ensureWebSearchDelegate() calls can retry.
In `@tests/tools/shell.test.ts`:
- Around line 105-116: The test currently can pass without exercising timeout
logic; update the test in tests/tools/shell.test.ts to run a more portable
long-running command (e.g., use Node to sleep: command: 'node -e
"setTimeout(()=>{},10000)"') when calling
executeTool(collectTools(registerShellTool).get('Shell'), ...), and replace the
loose assertions at firstText(result) and result.details.exitCode with an
explicit timeout/killed assertion: expect(result.details.signal).toBeDefined()
(or expect(result.details.signal).toMatch(/TERM|KILL/)) or assert exitCode is
null and signal is set, so the test verifies the process was killed by the
timeout rather than failing immediately. Ensure you still check
result.details.command === 'node -e "setTimeout(()=>{},10000)"' to confirm the
invoked command.
In `@tests/tools/webSearchDelegate.test.ts`:
- Around line 52-56: The test is mocking the filesystem-backed probe
isPiWebAccessInstalled; instead, modify tests/tools/webSearchDelegate.test.ts to
avoid spying on internal logic by either (a) creating the real filesystem state
the probe expects (e.g., create/remove the pi-web-access marker/path in a temp
dir) before calling ensureWebSearchDelegate(), or (b) refactor so the filesystem
probe is injected (e.g., pass a probe function or a small wrapper module) and
stub that boundary in the test; reference ensureWebSearchDelegate and
isPiWebAccessInstalled to locate the code and update the test setup/teardown to
actually reflect installed/uninstalled state rather than using
vi.spyOn(...).mockReturnValue.
---
Nitpick comments:
In `@tests/provider/toolScope.test.ts`:
- Around line 22-49: The tests for syncGrokTools/syncForGrokCli are missing a
provider round-trip asserting that switching from openai to grok-cli and back to
openai preserves/restores web_search; add a new test that (1) starts with an
openai active toolset that includes 'web_search', (2) applies syncGrokTools or
syncForGrokCli to simulate switching to grok-cli (verifying suppression of
'web_search' and addition of 'WebSearch' as applicable), then (3) switches back
to openai and asserts that 'web_search' is restored and grok shims ('WebSearch')
are removed. Target the existing helpers syncGrokTools and syncForGrokCli in the
test to validate the full round-trip behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 193fdeac-81da-41f6-aab6-00b41d923ba9
📒 Files selected for processing (18)
README.mdpackage.jsonsrc/provider/register.tssrc/provider/toolScope.tssrc/tools/register.tssrc/tools/webSearch.tssrc/tools/webSearchDelegate.tstests/payload/sanitize.test.tstests/provider/package.test.tstests/provider/register.test.tstests/provider/toolScope.test.tstests/tools/register.test.tstests/tools/rendering.test.tstests/tools/search.test.tstests/tools/shell.test.tstests/tools/toolTestHelpers.tstests/tools/webSearch.test.tstests/tools/webSearchDelegate.test.ts
| async execute(toolCallId, params, signal, onUpdate, ctx) { | ||
| await ensureWebSearchDelegate(pi); | ||
| const delegate = getWebSearchDelegate(); | ||
| if (!delegate) return missingDelegateMessage(); | ||
| return delegate(toolCallId, params as Record<string, unknown>, signal, onUpdate, ctx); |
There was a problem hiding this comment.
Validate and normalize queries before delegating.
renderCall() trims/filter blanks, but execute() forwards the raw params object unchanged. That means {} still reaches the delegate even though the UI shows (no query), and { queries: [' ', 'foo'] } renders as one query while sending two downstream.
Suggested fix
async execute(toolCallId, params, signal, onUpdate, ctx) {
+ const normalizedQueries = queryListFromArgs(params as Record<string, unknown>);
+ if (normalizedQueries.length === 0) {
+ return {
+ content: [
+ {
+ type: 'text' as const,
+ text: 'WebSearch requires at least one non-empty query.',
+ },
+ ],
+ details: { error: 'missing query' },
+ };
+ }
+
+ const normalizedParams =
+ normalizedQueries.length === 1
+ ? { ...params, query: normalizedQueries[0] }
+ : { ...params, queries: normalizedQueries };
+
await ensureWebSearchDelegate(pi);
const delegate = getWebSearchDelegate();
if (!delegate) return missingDelegateMessage();
- return delegate(toolCallId, params as Record<string, unknown>, signal, onUpdate, ctx);
+ return delegate(
+ toolCallId,
+ normalizedParams as Record<string, unknown>,
+ signal,
+ onUpdate,
+ ctx,
+ );
},📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| async execute(toolCallId, params, signal, onUpdate, ctx) { | |
| await ensureWebSearchDelegate(pi); | |
| const delegate = getWebSearchDelegate(); | |
| if (!delegate) return missingDelegateMessage(); | |
| return delegate(toolCallId, params as Record<string, unknown>, signal, onUpdate, ctx); | |
| async execute(toolCallId, params, signal, onUpdate, ctx) { | |
| const normalizedQueries = queryListFromArgs(params as Record<string, unknown>); | |
| if (normalizedQueries.length === 0) { | |
| return { | |
| content: [ | |
| { | |
| type: 'text' as const, | |
| text: 'WebSearch requires at least one non-empty query.', | |
| }, | |
| ], | |
| details: { error: 'missing query' }, | |
| }; | |
| } | |
| const normalizedParams = | |
| normalizedQueries.length === 1 | |
| ? { ...params, query: normalizedQueries[0] } | |
| : { ...params, queries: normalizedQueries }; | |
| await ensureWebSearchDelegate(pi); | |
| const delegate = getWebSearchDelegate(); | |
| if (!delegate) return missingDelegateMessage(); | |
| return delegate( | |
| toolCallId, | |
| normalizedParams as Record<string, unknown>, | |
| signal, | |
| onUpdate, | |
| ctx, | |
| ); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/tools/webSearch.ts` around lines 93 - 97, In execute(), normalize and
validate the incoming params before calling the delegate: create a
normalizedParams copy of params, and if it contains a queries array (or a single
query string), trim each string and filter out empty/whitespace-only entries
(e.g., queries = queries.map(s=>s.trim()).filter(Boolean)); if the resulting
queries list is empty set queries to [] (or remove the field) so the delegate
never receives blank queries; then pass normalizedParams to delegate(toolCallId,
normalizedParams, ...). Update the execute function (and keep using
ensureWebSearchDelegate, getWebSearchDelegate, missingDelegateMessage as-is) so
rendered UI and delegated input are consistent.
Summary by CodeRabbit
New Features
pi-web-accessis installed.Chores
pi-tuiandpi-web-accessnow marked as optional.