Skip to content

feat: add web search and web fetch support#118

Merged
alexeykazakov merged 3 commits into
codeready-toolchain:masterfrom
alexeykazakov:feat/web-search-support
May 13, 2026
Merged

feat: add web search and web fetch support#118
alexeykazakov merged 3 commits into
codeready-toolchain:masterfrom
alexeykazakov:feat/web-search-support

Conversation

@alexeykazakov
Copy link
Copy Markdown
Contributor

@alexeykazakov alexeykazakov commented May 13, 2026

  • Add WebSearchSpec and WebFetchSpec to Claw CRD with CEL validation requiring secretRef for API-keyed providers (brave, tavily)
  • New claw_web_search.go: provider registry, validation, ConfigMap injection (operator.json), and proxy env var mounting
  • Proxy injects search API keys transparently; gateway sees only a placeholder — supports brave (api_key), tavily (bearer), duckduckgo (passthrough), and gemini (reuses google credential)
  • Wire validation, ConfigMap injection, proxy config routes, secret version stamping, and secret watch into the reconciler
  • Comprehensive tests covering validation, ConfigMap injection, proxy routes, deployment env vars, secret stamping, and full reconcile
  • Update PLATFORM.md skill with examples and provider-setup.md docs

Summary by CodeRabbit

  • New Features

    • Configurable web search (Brave, Tavily, DuckDuckGo, Gemini) and web fetch (URL retrieval) with provider-specific credential handling and proxy integration.
    • Web fetch defaults to enabled and is gated by the proxy allowlist; proxy is auto-configured and can receive injected credentials when required.
    • New CRD fields to declare webSearch and webFetch settings; status exposes WebSearchConfigured condition.
  • Documentation

    • Provider setup guide and platform docs updated with examples, credential guidance, and usage notes.

- Add WebSearchSpec and WebFetchSpec to Claw CRD with CEL validation
  requiring secretRef for API-keyed providers (brave, tavily)
- New claw_web_search.go: provider registry, validation, ConfigMap
  injection (operator.json), and proxy env var mounting
- Proxy injects search API keys transparently; gateway sees only a
  placeholder — supports brave (api_key), tavily (bearer), duckduckgo
  (passthrough), and gemini (reuses google credential)
- Wire validation, ConfigMap injection, proxy config routes, secret
  version stamping, and secret watch into the reconciler
- Comprehensive tests covering validation, ConfigMap injection, proxy
  routes, deployment env vars, secret stamping, and full reconcile
- Update PLATFORM.md skill with examples and provider-setup.md docs

Signed-off-by: Alexey Kazakov <alkazako@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 5f1ce4a8-159b-4c5d-8282-87886b8070f4

📥 Commits

Reviewing files that changed from the base of the PR and between 8ef1504 and 027fcee.

📒 Files selected for processing (1)
  • internal/controller/claw_web_search_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/controller/claw_web_search_test.go
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Unit Tests
  • GitHub Check: E2E Tests

Walkthrough

Adds Web Search and Web Fetch to Claw: new CRD types and schema, controller validation/status (WebSearchConfigured), operator.json and proxy wiring for search/fetch, secret-version stamping, reconcile integration, tests, and docs updates.

Changes

Web Search & Web Fetch feature

Layer / File(s) Summary
API types, deepcopy, and CRD schema
api/v1alpha1/claw_types.go, api/v1alpha1/zz_generated.deepcopy.go, config/crd/bases/claw.sandbox.redhat.com_claws.yaml
Add WebSearchSpec (provider, SecretRef, Config) and WebFetchSpec (enabled); extend ClawSpec with webSearch and webFetch; add deepcopy methods and CRD schema/validation (including webFetch.enabled default and provider/secret rules).
Validation, provider metadata, and status conditions
internal/controller/claw_web_search.go, internal/controller/claw_web_search_test.go
Introduce known provider metadata (injector types, domains, credential requirements); implement validateWebSearchConfig and set WebSearchConfigured / influence Ready; unit and integration tests assert validation outcomes for providers and secret cases.
Operator ConfigMap (operator.json) injection & webFetch wiring
internal/controller/claw_web_search.go, internal/assets/manifests/claw/configmap.yaml, internal/controller/claw_web_search_test.go
Mutate data.operator.json to set tools.web.search and plugin entries, merge user config JSON into plugin config, and enable tools.web.fetch.enabled when configured; tests validate generated JSON and merged config behavior.
Proxy route generation and injector wiring
internal/controller/claw_proxy.go, internal/controller/claw_proxy_test.go, internal/controller/claw_web_search_test.go
Extend generateProxyConfig to accept WebSearchSpec; add webSearchRoute to produce provider-specific proxy routes (api_key header, bearer, passthrough, or none); tests cover route generation and interaction with existing exact-match routes.
Proxy deployment secret wiring & secret stamping
internal/controller/claw_web_search.go, internal/controller/claw_proxy.go, internal/controller/claw_proxy_test.go, internal/controller/claw_web_search_test.go
configureProxyForWebSearch injects secret-backed env var into proxy container for keyed providers; stampSecretVersionAnnotation records web-search secret ResourceVersion so pod template annotations change on secret updates; tests assert env var presence and annotation stamping.
Reconcile integration and secret references
internal/controller/claw_resource_controller.go, internal/controller/claw_channels_test.go
Reconcile validates Spec.WebSearch, wires injection into enrichConfigAndNetworkPolicy/configureDeployments/applyProxyResources, passes WebSearch to generateProxyConfig, and treats Spec.WebSearch.SecretRef as a secret trigger; tests updated to match new generateProxyConfig signature.
Tests and documentation
internal/controller/claw_web_search_test.go, internal/controller/*_test.go, docs/provider-setup.md, internal/assets/manifests/claw/configmap.yaml
Add comprehensive tests for validation, config injection, proxy behavior, and reconcile outcomes; update docs and ConfigMap PLATFORM content with provider usage, examples, and web-fetch guidance.
sequenceDiagram
  participant User as Operator (kubectl/app)
  participant Controller as Claw Controller
  participant ConfigMap as operator.json (ConfigMap)
  participant Proxy as MITM Proxy Deployment
  participant Secret as K8s Secret
  participant ProxyConfig as Proxy Config / Route

  User->>Controller: Create/Update Claw CR (spec.webSearch / webFetch)
  Controller->>Secret: Read referenced secret (if provider needs key)
  Controller->>Controller: validateWebSearchConfig -> set WebSearchConfigured condition
  Controller->>ConfigMap: inject webSearch/webFetch into operator.json
  Controller->>Proxy: configure proxy deployment (env var secretKeyRef if required)
  Controller->>ProxyConfig: generateProxyConfig(..., webSearch)
  Proxy->>ProxyConfig: add search route / header injector (provider-specific)
  Secret-->>Proxy: mounted as env (via secretKeyRef) [rgba(0,128,255,0.5)]
  Controller->>Controller: stamp secret ResourceVersion into pod template annotations
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • xcoulon
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 39.13% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title 'feat: add web search and web fetch support' directly matches the main changes—adding WebSearchSpec, WebFetchSpec, web search provider support, and web fetch configuration throughout the codebase.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added documentation Improvements or additions to documentation feature New feature or request test Work that adds, fixes, or maintains automated tests or coverage (unit, integration, e2e, flakiness) labels May 13, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/controller/claw_proxy.go (1)

126-127: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prevent duplicate same-domain routes when web search is enabled.

At Line 220, the web-search route is appended unconditionally. If that domain is already introduced by credentials or MCP passthrough, the final config can contain conflicting routes for the same domain (none vs bearer/api_key), and matching precedence becomes ambiguous.

💡 Suggested fix
 func generateProxyConfig(
 	credentials []resolvedCredential,
 	mcpServers map[string]clawv1alpha1.McpServerSpec,
 	webSearch *clawv1alpha1.WebSearchSpec,
 ) ([]byte, error) {
 	var exact, suffix []proxyRoute

 	coveredDomains := make(map[string]bool)
 	for _, rc := range credentials {
 		coveredDomains[strings.ToLower(rc.Domain)] = true
 	}

 	for _, bp := range builtinPassthroughDomains {
 		if !coveredDomains[bp.Domain] {
 			coveredDomains[bp.Domain] = true
 			exact = append(exact, proxyRoute{Domain: bp.Domain, Injector: "none", AllowedPaths: bp.AllowedPaths})
 		}
 	}
+
+	// Reserve/add web search route before MCP passthrough extraction to avoid
+	// duplicate same-domain routes with conflicting injectors.
+	if wsRoute, ok := webSearchRoute(webSearch); ok {
+		wsDomain := strings.ToLower(wsRoute.Domain)
+		if !domainCovered(wsDomain, coveredDomains) {
+			coveredDomains[wsDomain] = true
+			if strings.HasPrefix(wsRoute.Domain, ".") {
+				suffix = append(suffix, wsRoute)
+			} else {
+				exact = append(exact, wsRoute)
+			}
+		}
+	}

 	exact = append(exact, mcpPassthroughRoutes(mcpServers, coveredDomains)...)
@@
-	if route, ok := webSearchRoute(webSearch); ok {
-		exact = append(exact, route)
-	}

Also applies to: 220-222

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/controller/claw_proxy.go` around lines 126 - 127, The code appends
the web-search route unconditionally which can create duplicate/conflicting
routes for the same domain already introduced by credentials or
mcpPassthroughRoutes; update the logic around where exact is appended
(referencing exact, mcpPassthroughRoutes(mcpServers, coveredDomains),
mcpServers, coveredDomains and the web-search append) to first build a set of
domains present in exact (and from mcpPassthroughRoutes) and only append the
web-search route if its domain is not already in that set (or alternatively
perform a dedupe pass over exact by domain before returning), ensuring no
duplicate same-domain routes are produced.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@internal/controller/claw_proxy.go`:
- Around line 126-127: The code appends the web-search route unconditionally
which can create duplicate/conflicting routes for the same domain already
introduced by credentials or mcpPassthroughRoutes; update the logic around where
exact is appended (referencing exact, mcpPassthroughRoutes(mcpServers,
coveredDomains), mcpServers, coveredDomains and the web-search append) to first
build a set of domains present in exact (and from mcpPassthroughRoutes) and only
append the web-search route if its domain is not already in that set (or
alternatively perform a dedupe pass over exact by domain before returning),
ensuring no duplicate same-domain routes are produced.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: af3b2101-e5ad-46cd-bea0-f1435f88c92c

📥 Commits

Reviewing files that changed from the base of the PR and between 78d13df and f3a511b.

📒 Files selected for processing (11)
  • api/v1alpha1/claw_types.go
  • api/v1alpha1/zz_generated.deepcopy.go
  • config/crd/bases/claw.sandbox.redhat.com_claws.yaml
  • docs/provider-setup.md
  • internal/assets/manifests/claw/configmap.yaml
  • internal/controller/claw_channels_test.go
  • internal/controller/claw_proxy.go
  • internal/controller/claw_proxy_test.go
  • internal/controller/claw_resource_controller.go
  • internal/controller/claw_web_search.go
  • internal/controller/claw_web_search_test.go
📜 Review details
🧰 Additional context used
📓 Path-based instructions (6)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • docs/provider-setup.md
  • api/v1alpha1/claw_types.go
  • config/crd/bases/claw.sandbox.redhat.com_claws.yaml
  • internal/assets/manifests/claw/configmap.yaml
  • internal/controller/claw_channels_test.go
  • internal/controller/claw_web_search_test.go
  • api/v1alpha1/zz_generated.deepcopy.go
  • internal/controller/claw_proxy.go
  • internal/controller/claw_resource_controller.go
  • internal/controller/claw_proxy_test.go
  • internal/controller/claw_web_search.go
api/v1alpha1/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

After modifying API types in api/v1alpha1/, always run both make manifests (regenerate CRD YAML) and make generate (regenerate zz_generated.deepcopy.go)

Files:

  • api/v1alpha1/claw_types.go
  • api/v1alpha1/zz_generated.deepcopy.go
**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Include license header on all Go source files using template from hack/boilerplate.go.txt

Files:

  • api/v1alpha1/claw_types.go
  • internal/controller/claw_channels_test.go
  • internal/controller/claw_web_search_test.go
  • api/v1alpha1/zz_generated.deepcopy.go
  • internal/controller/claw_proxy.go
  • internal/controller/claw_resource_controller.go
  • internal/controller/claw_proxy_test.go
  • internal/controller/claw_web_search.go
internal/assets/manifests/**/*.yaml

📄 CodeRabbit inference engine (CLAUDE.md)

internal/assets/manifests/**/*.yaml: Configure pod security with non-root user (uid 65532), restricted seccomp profile, and all capabilities dropped
Set readOnlyRootFilesystem: true on proxy and wait-for-proxy containers (not on init-config or gateway containers which need writable paths for Node.js and AI tools)

Files:

  • internal/assets/manifests/claw/configmap.yaml
**/*_test.go

📄 CodeRabbit inference engine (CLAUDE.md)

**/*_test.go: Use require for fatal setup errors and assert for value comparisons in tests
Write tests using t.Run() subtests, t.Cleanup(), and table-driven test patterns
Use the waitFor(t, timeout, interval, condition, message) helper for async assertions with 10s timeout and 250ms poll interval

Files:

  • internal/controller/claw_channels_test.go
  • internal/controller/claw_web_search_test.go
  • internal/controller/claw_proxy_test.go
internal/controller/**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Set owner references on all created resources via controllerutil.SetControllerReference

Files:

  • internal/controller/claw_channels_test.go
  • internal/controller/claw_web_search_test.go
  • internal/controller/claw_proxy.go
  • internal/controller/claw_resource_controller.go
  • internal/controller/claw_proxy_test.go
  • internal/controller/claw_web_search.go
🪛 LanguageTool
docs/provider-setup.md

[grammar] ~1004-~1004: Ensure spelling is correct
Context: ...ed by credentials, search providers, or builtin passthroughs are reachable. To allow ad...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🔇 Additional comments (10)
internal/controller/claw_web_search.go (4)

1-58: LGTM!

License header is present, and the provider registry design cleanly separates API-keyed providers (brave, tavily), passthrough providers (duckduckgo), and LLM-credential-backed providers (gemini).


65-153: LGTM!

The validation flow correctly handles all provider types, validates secret existence and key presence, and maintains proper status conditions.


155-282: LGTM!

The ConfigMap injection correctly handles the different provider types:

  • API-keyed providers get placeholder apiKey in plugin entries
  • User-provided config is properly merged
  • Gemini uses the "google" plugin entry as expected

284-348: LGTM!

The proxy configuration correctly mounts the search API key secret as an environment variable for API-keyed providers only. Error handling properly covers missing deployment and container cases.

internal/controller/claw_resource_controller.go (3)

433-440: LGTM!

Web search validation is correctly positioned after credential resolution (required for gemini provider's google credential check) and before proxy resource generation. The error handling follows the established pattern.


637-639: LGTM!

Web search integration is properly wired into:

  • ConfigMap enrichment for operator.json
  • Proxy deployment configuration for env vars
  • Proxy config generation for routes

Also applies to: 667-669, 695-695


1346-1350: LGTM!

Secret reference tracking for web search enables the controller to reconcile when the web search secret changes, maintaining consistency with how credential and MCP secrets are handled.

internal/controller/claw_web_search_test.go (3)

37-230: LGTM!

Comprehensive validation tests covering:

  • All supported providers (brave, tavily, duckduckgo, gemini)
  • Secret validation (missing secret, wrong key)
  • LLM credential cross-reference (gemini → google)
  • Unknown provider rejection

Test patterns correctly use require for setup and assert for comparisons.


232-392: LGTM!

ConfigMap injection tests thoroughly verify operator.json structure for all provider types, including plugin entry configuration and the no-op case.


394-916: LGTM!

Excellent test coverage including:

  • Proxy route generation for all providers
  • Environment variable injection
  • Secret version stamping
  • Full reconciliation flows
  • Edge cases (nil webSearch, nil secretRef)

- Move web search route insertion before MCP passthrough extraction
  and gate it with domainCovered() to skip when a credential or
  suffix-match already covers the domain
- Add tests for exact-match and suffix-match deduplication

Signed-off-by: Alexey Kazakov <alkazako@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/controller/claw_web_search_test.go`:
- Around line 321-324: The test currently asserts that the entire "plugins"
block is absent which is brittle; instead check whether the "duckduckgo" entry
is missing. Update the assertion around the config variable: if plugins exist
(hasPlugins := config["plugins"]), cast or type-assert plugins to the map type
(e.g., pluginsMap := config["plugins"].(map[string]interface{}) or the
appropriate type) and replace assert.False(t, hasPlugins) with an assertion that
"duckduckgo" is not present (e.g., assert.NotContains(t, pluginsMap,
"duckduckgo") or assert.False(t, _, ok) for pluginsMap["duckduckgo"]). Reference
the variables config and the "plugins" key and the plugin name "duckduckgo" when
making the change.
- Around line 291-295: Replace unchecked type assertions on config paths with
require-guarded checks so test failures are reported instead of panics: for each
step (config -> plugins -> entries -> tavilyEntry -> tavilyConfig -> webSearch)
first require the key exists/is not nil and then perform a type assertion using
the two-value form (v, ok := ... .(map[string]any)) and call require.True(t, ok,
"message") or require.IsType/require.NotNil as appropriate; update occurrences
that set plugins, entries, tavilyEntry, tavilyConfig, and webSearch to follow
this pattern.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: bd6fb5f7-41c0-4669-ae77-0b12d4f9f4f3

📥 Commits

Reviewing files that changed from the base of the PR and between f3a511b and 8ef1504.

📒 Files selected for processing (2)
  • internal/controller/claw_proxy.go
  • internal/controller/claw_web_search_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/controller/claw_proxy.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: E2E Tests
  • GitHub Check: Unit Tests
🧰 Additional context used
📓 Path-based instructions (4)
**/*_test.go

📄 CodeRabbit inference engine (CLAUDE.md)

**/*_test.go: Use require for fatal setup errors and assert for value comparisons in tests
Write tests using t.Run() subtests, t.Cleanup(), and table-driven test patterns
Use the waitFor(t, timeout, interval, condition, message) helper for async assertions with 10s timeout and 250ms poll interval

Files:

  • internal/controller/claw_web_search_test.go
internal/controller/**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Set owner references on all created resources via controllerutil.SetControllerReference

Files:

  • internal/controller/claw_web_search_test.go
**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Include license header on all Go source files using template from hack/boilerplate.go.txt

Files:

  • internal/controller/claw_web_search_test.go
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • internal/controller/claw_web_search_test.go
🔇 Additional comments (1)
internal/controller/claw_web_search_test.go (1)

37-968: Coverage is strong across validation, injection, proxy wiring, and reconcile flows.

Nice breadth here, especially provider-specific behavior and route dedup scenarios.

Comment thread internal/controller/claw_web_search_test.go Outdated
Comment thread internal/controller/claw_web_search_test.go Outdated
- Guard type assertions in tavily and gemini subtests with require
  to produce clear failures instead of panics
- Replace brittle "no plugins block" assertion in duckduckgo test
  with a targeted check that duckduckgo has no plugin entry

Signed-off-by: Alexey Kazakov <alkazako@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@alexeykazakov alexeykazakov merged commit 0f979fe into codeready-toolchain:master May 13, 2026
4 checks passed
@alexeykazakov alexeykazakov deleted the feat/web-search-support branch May 13, 2026 01:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation feature New feature or request test Work that adds, fixes, or maintains automated tests or coverage (unit, integration, e2e, flakiness)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant