Skip to content

feat: support browser-level CDP websocket fallback#408

Open
coolxll wants to merge 10 commits intojackwener:mainfrom
coolxll:feat/chrome-browser-ws-fallback
Open

feat: support browser-level CDP websocket fallback#408
coolxll wants to merge 10 commits intojackwener:mainfrom
coolxll:feat/chrome-browser-ws-fallback

Conversation

@coolxll
Copy link
Copy Markdown

@coolxll coolxll commented Mar 25, 2026

Summary / 摘要

This PR improves Chrome CDP connectivity for newer remote debugging flows and adds a practical direct-CDP mode for browser-backed commands.

这个 PR 一方面增强了 opencli 对新版 Chrome 远程调试流程的兼容性,另一方面为浏览器型命令补齐了可实际使用的直连 CDP 模式。

What This PR Includes / 本 PR 包含的内容

  1. Browser-level websocket fallback when classic /json discovery is unavailable.

  2. Explicit --browser-cdp mode for browser-backed commands to bypass the daemon / extension path.

  3. Global OPENCLI_BROWSER_CDP=1 default to enable the same behavior without repeating the flag.

  4. Safer auto mode that opens a fresh page by default, instead of accidentally attaching to an unrelated existing tab.

  5. 当经典 /json 发现路径不可用时,回退到浏览器级 websocket。

  6. 为浏览器型命令增加显式 --browser-cdp 模式,直接绕过 daemon / extension 链路。

  7. 增加全局 OPENCLI_BROWSER_CDP=1,无需每次重复传 flag。

  8. 改进 auto 模式,默认新开干净页面,避免误附着到不相关的现有标签页。

Official Reference / 官方背书

This change is aligned with Chrome's newer DevTools / MCP direction documented by the Chrome team:
https://developer.chrome.com/blog/chrome-devtools-mcp-debug-your-browser-session?hl=zh-cn

The official post shows that newer Chrome debugging sessions can rely on browser-session level connectivity rather than the older /json discovery path alone.

这个改动和 Chrome 官方介绍的新 DevTools / MCP 方向是一致的:
https://developer.chrome.com/blog/chrome-devtools-mcp-debug-your-browser-session?hl=zh-cn

官方文章说明,较新的 Chrome 调试会话可以围绕浏览器会话级连接来工作,而不再只依赖旧的 /json 发现路径。

Why / 背景与原因

In newer Chrome remote debugging setups, the debugging port can be active while classic endpoints like /json return 404.
In that situation, opencli's previous CDP implementation could not connect to ordinary Chrome pages even though the browser debugging session was actually available.

Also, for normal Chrome website automation, users may want an explicit or global command mode that clearly routes through browser CDP and does not depend on whether the daemon / extension path is working.
During local testing, auto-attaching to an existing browser tab could also land on the wrong page context, which made website commands fail with unrelated page errors.

在较新的 Chrome 远程调试模式下,调试端口可能已经开启,但经典的 /json 接口会返回 404
这种情况下,opencli 之前的 CDP 实现即使面对真实可用的浏览器调试会话,也无法连接普通 Chrome 页面。

另外,在普通 Chrome 网站自动化链路里,用户既需要显式模式,也需要全局默认模式,明确要求走浏览器 CDP,而不是依赖 daemon / extension 是否可用。
同时在本地测试中,auto 模式如果附着到已有标签页,也可能落到错误页面上下文,导致网站命令报出与目标站点无关的页面错误。

Changes / 修改内容

  • add fallback logic from http://.../json to browser-level websocket discovery via DevToolsActivePort

  • support selecting inspectable targets from Target.getTargets results even when they do not include webSocketDebuggerUrl

  • attach to the chosen page target via Target.attachToTarget(flatten: true) and route subsequent commands through the returned session id

  • add explicit --browser-cdp support for built-in browser commands and browser-backed registry commands

  • add global OPENCLI_BROWSER_CDP=1 support so browser-backed commands can default to direct CDP mode

  • add --no-browser-cdp so one command can opt out even when the global default is enabled

  • when browser CDP auto-discovery is used without an explicit target hint, open a fresh blank page first instead of attaching to a random existing tab

  • keep the existing explicit OPENCLI_CDP_ENDPOINT path unchanged when users already configure it

  • add unit tests for browser websocket detection, auto-discovery parsing, runtime override behavior, and CLI option registration

  • 增加从 http://.../json 回退到 DevToolsActivePort 浏览器级 websocket 发现的逻辑

  • 支持从 Target.getTargets 返回的目标中选择可附着页面,即使这些目标没有 webSocketDebuggerUrl

  • 通过 Target.attachToTarget(flatten: true) 附着到目标页面,并将后续命令路由到对应 session id

  • 为内置浏览器命令和 browser-backed registry commands 增加显式 --browser-cdp

  • 增加全局 OPENCLI_BROWSER_CDP=1,让浏览器命令默认走 direct CDP 模式

  • 增加 --no-browser-cdp,即使全局默认开启也可以单次关闭

  • 当 auto-discovery 且没有显式 target hint 时,默认先新开空白页,而不是附着到随机现有标签页

  • 对已经显式设置 OPENCLI_CDP_ENDPOINT 的场景保持原有行为不变

  • 增加浏览器 websocket 检测、自动发现解析、runtime 覆盖逻辑和 CLI 选项注册的单元测试

Verification / 验证

  • npm run typecheck

  • npm test

  • npm run build

  • local smoke test with Chrome on 127.0.0.1:9222

  • verified that opencli ... --browser-cdp reaches browser command execution without falling back to the daemon / extension error path

  • verified that OPENCLI_BROWSER_CDP=1 enables the same path globally

  • verified a real local command: opencli linux-do categories --browser-cdp

  • verified the same command also works with only OPENCLI_BROWSER_CDP=1

  • npm run typecheck

  • npm test

  • npm run build

  • 在本地 127.0.0.1:9222 Chrome 会话上完成 smoke test

  • 已验证 opencli ... --browser-cdp 可以进入浏览器命令执行流程,不再掉回 daemon / extension 报错链路

  • 已验证 OPENCLI_BROWSER_CDP=1 可以全局开启同一路径

  • 已验证本地真实命令:opencli linux-do categories --browser-cdp

  • 也已验证只设置 OPENCLI_BROWSER_CDP=1 时同一命令可正常运行

Screenshot / 截图

PR #408 verification screenshot

Example / 示例

opencli cascade https://news.ycombinator.com --browser-cdp
opencli explore https://linux.do --browser-cdp

export OPENCLI_BROWSER_CDP=1
opencli linux-do categories
opencli linux-do categories --no-browser-cdp

@coolxll coolxll force-pushed the feat/chrome-browser-ws-fallback branch from 27ee008 to 3644d37 Compare March 25, 2026 05:34
@Astro-Han
Copy link
Copy Markdown
Contributor

Good direction — browser-level CDP fallback aligns with Chrome's newer debugging model. Found some issues in the current implementation:

Critical

Created tabs are never cleaned up

attachToBrowserTarget creates an about:blank page via Target.createTarget, but close() only disconnects the WebSocket without calling Target.closeTarget. Each command run leaks one blank tab in Chrome.

--browser-cdp breaks desktop CDP commands

The option is registered on all browser: true commands, including desktop adapters (chatwise, cursor, antigravity). These commands rely on ensureRequiredEnv() to fail fast when OPENCLI_CDP_ENDPOINT is missing. With --browser-cdp, the endpoint is silently set to auto, bypassing this check and connecting to an unrelated Chrome browser instead of the target desktop app.

Suggestion: only register --browser-cdp on commands that don't already require OPENCLI_CDP_ENDPOINT via requiredEnv.

Warnings

Target.getTargets doesn't filter by target type — can attach to service_worker, worker, or browser targets where Page.navigate/Runtime.evaluate will fail. Should filter to type === 'page'.

Fallback chain skips /json/version — the standard way to get webSocketDebuggerUrl at the browser level. Going directly from /json failure to local DevToolsActivePort files means remote Chrome, Docker, and custom profile setups won't benefit from the fallback.

/json errors silently swallowed — the catch {} block hides connection timeouts, DNS failures, etc. The final error message doesn't explain which step failed. At minimum, log the original error under --verbose.

--browser-cdp registration repeated 5 times — identical option + withBrowserEnvOverrides wrapping in explore, generate, record, cascade, and antigravity. Should be extracted into a shared helper.

Host whitelist missing IPv6 loopback['127.0.0.1', 'localhost'] doesn't include [::1] (relevant for WSL2/Linux).

@coolxll
Copy link
Copy Markdown
Author

coolxll commented Mar 26, 2026

Follow-up update:

  • Consolidated the command-level CDP override work from feat: add command-level CDP overrides / 增加命令级 CDP 覆盖 #406 into this branch, so the two PRs no longer conflict.
  • Addressed the review items raised here: blank tab cleanup, browser-cdp scoping for desktop adapters, page-only target selection, /json/version fallback, verbose logging for discovery failures, shared CLI option registration, IPv6 loopback support, and early --cdp-endpoint validation.
  • Merged the latest upstream/main into this branch to resolve the GitHub merge conflict.

Verification completed on the consolidated implementation before sync:

  • npm run typecheck
  • npm test
  • npm run build

PR #406 has been closed as superseded by this PR.

后续更新:

  • 已将 feat: add command-level CDP overrides / 增加命令级 CDP 覆盖 #406 里的命令级 CDP override 改动并入当前分支,因此两个 PR 不再彼此冲突。
  • 已处理这里 review 提到的问题:空白标签页清理、桌面适配器的 browser-cdp 范围限制、仅附着 page 类型 target、补上 /json/version fallback、在 --verbose 下保留 discovery 失败信息、提取共享 CLI 选项注册逻辑、支持 IPv6 loopback,以及提前校验 --cdp-endpoint。
  • 已将最新 upstream/main 合入当前分支,并解决 GitHub 上显示的合并冲突。

合并前验证:

  • npm run typecheck
  • npm test
  • npm run build

PR #406 已关闭,并由当前 PR 接续。

@coolxll
Copy link
Copy Markdown
Author

coolxll commented Mar 26, 2026

@Astro-Han followed up on your feedback here:

  • consolidated the command-level CDP override work from feat: add command-level CDP overrides / 增加命令级 CDP 覆盖 #406 into this PR so the two branches no longer conflict
  • fixed blank-tab cleanup on browser-level attach
  • scoped --browser-cdp away from desktop/localhost UI adapters
  • filtered browser target attachment to type === "page"
  • added /json/version fallback before DevToolsActivePort fallback
  • kept discovery failures visible under --verbose
  • extracted shared browser env option registration
  • added IPv6 loopback support
  • added early --cdp-endpoint validation
  • merged latest upstream/main into this branch and resolved the merge conflict

Verification on the consolidated implementation before sync:

  • npm run typecheck
  • npm test
  • npm run build

If you have time for a re-check, this branch is ready for another pass.

@Astro-Han
Copy link
Copy Markdown
Contributor

Thanks for the thorough follow-up — 6 out of 8 fixes verified cleanly. The blank tab cleanup, page-only targeting, /json/version fallback, verbose logging, shared helper extraction, and IPv6 support all look good.

Two remaining items I noticed, both around desktop adapter scoping:

supportsBrowserCdp not in manifest

The field is correctly derived at registration time via cli(), but build-manifest.ts doesn't seem to include it. In production mode, commands loaded from manifest would have supportsBrowserCdp === undefined, which commanderAdapter.ts treats as true (!== false). Desktop adapters might still get --browser-cdp in production builds.

Default derivation might miss non-localhost desktop apps

The condition checks opts.domain === 'localhost', but doubao-app uses domain: 'doubao-app'. Checking whether requiredEnv includes OPENCLI_CDP_ENDPOINT might be a more reliable way to identify desktop adapters.

Everything else looks solid. Nice work consolidating #406 into this branch.

@coolxll
Copy link
Copy Markdown
Author

coolxll commented Mar 26, 2026

Thanks for the follow-up review. I fixed both remaining desktop-scoping issues in this branch.

  • added supportsBrowserCdp to the compiled manifest and preserved it when loading commands from manifest-backed discovery, so production builds no longer default desktop adapters back to allowing --browser-cdp
  • tightened the default derivation for browser CDP support so local desktop/UI targets are treated as unsupported by default, including non-localhost app-style domains such as doubao-app; real website UI adapters such as x.com remain allowed

Also added regression coverage for:

  • registry default derivation for local desktop UI vs remote website UI
  • manifest compilation of supportsBrowserCdp
  • manifest-backed discovery preserving the flag

Verification after the follow-up fix:

  • npm test -- src/registry.test.ts src/build-manifest.test.ts src/engine.test.ts src/commanderAdapter.test.ts
  • npm run typecheck
  • npm run build

Please take another look when you have time.

感谢跟进 review。我已经把这两个剩余的 desktop scope 问题修掉了。

  • 已将 supportsBrowserCdp 写入编译产出的 manifest,并在 manifest 模式的命令加载路径中保留该字段,因此生产构建下不会再把桌面适配器默认回退成允许 --browser-cdp
  • 已收紧 browser CDP 支持的默认推导逻辑:本地桌面/UI 目标现在默认视为不支持,包括 doubao-app 这类非 localhost 的 app-style 域名;像 x.com 这样的真实网站 UI 适配器仍然保留支持

另外补了回归测试,覆盖:

  • registry 对本地 desktop UI 和远程 website UI 的默认推导
  • supportsBrowserCdp 的 manifest 编译
  • manifest 加载路径对该字段的保留

修复后的验证:

  • npm test -- src/registry.test.ts src/build-manifest.test.ts src/engine.test.ts src/commanderAdapter.test.ts
  • npm run typecheck
  • npm run build

有空的话麻烦再帮忙看一轮。

@Astro-Han
Copy link
Copy Markdown
Contributor

修复确认了,manifest 序列化和推导逻辑都没问题,回归测试也补上了。辛苦!

@coolxll
Copy link
Copy Markdown
Author

coolxll commented Mar 26, 2026

Follow-up fix for the latest review items:

  • rewrote loopback page websocket URLs from /json onto the original proxied endpoint origin as well, so proxied HTTP(S) CDP setups no longer reconnect to the caller's own localhost before reaching the /json/version browser fallback
  • guarded Page.loadEventFired waiting immediately when navigation starts, so if Page.navigate itself rejects (for example target closed / transport dropped), it no longer leaves a late unhandled rejection behind

Verification:

  • npm test -- src/browser/cdp.test.ts
  • npm run typecheck

最新一轮 review 的两点也已修复:

  • /json 返回的 loopback page websocket URL 现在也会改写到原始代理 endpoint 的 host 上,避免反向代理 CDP 场景先错误连回调用方本机 localhost
  • 在导航开始时就立即给 Page.loadEventFired 的等待 promise 挂上保护;如果 Page.navigate 自身直接失败(例如 target 关闭或传输中断),不会再留下一个稍后才爆出的未处理拒绝

验证:

  • npm test -- src/browser/cdp.test.ts
  • npm run typecheck

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants