Commit c93bc5b

release: bump version to v2026.3.26
- summarize changes from 2e81fa1
- sync plugin/package/doc version references
- add changelog updates for v2026.3.26

Made-with: Cursor
1 parent 2e81fa1 commit c93bc5b

6 files changed

Lines changed: 185 additions & 18 deletions

README.md

Lines changed: 60 additions & 4 deletions
@@ -6,7 +6,7 @@ OpenClaw CloudPhone is a plugin that gives AI agents device management and UI au
 
 With natural language instructions, an agent can list devices, power them on or off, capture screenshots, tap, swipe, type text, and perform other UI actions without writing manual scripts.
 
-Starting from `v1.0.7`, the package also ships with a built-in skill, `basic-skill`, which helps agents combine these tools in a more reliable way.
+Starting from `v1.1.0`, the package ships with built-in skills (including `basic-skill`) that help agents combine these tools in a more reliable way.
 
 ## Quick Start
 
@@ -75,7 +75,7 @@ Once the plugin is loaded successfully, the agent can use all CloudPhone tools.
 
 This repository is first and foremost an **OpenClaw plugin**. Its job is to expose the CloudPhone OpenAPI as tools that an agent can call.
 
-Starting from `v1.0.7`, the package also includes an **OpenClaw skill**:
+Starting from `v1.1.0`, the package includes **OpenClaw skills**:
 
 - Plugin: defines **what the agent can do** by providing `cloudphone_*` tools
 - Skill: defines **how the agent should do it reliably** by teaching call order, recovery steps, and safer workflows
@@ -164,6 +164,56 @@ After the plugin is installed, the agent automatically gets the following capabi
 | `cloudphone_snapshot` | Capture a screenshot or UI tree snapshot from the device |
 | `cloudphone_render_image` | Render a screenshot URL as an image directly in chat |
 
+## planActionTool (`cloudphone_plan_action`)
+
+`planActionTool` maps to `cloudphone_plan_action`. It lets the agent call an AutoGLM model to analyze the current screenshot and goal, then return a structured next-action plan for CloudPhone UI automation.
+
+Typical scenarios:
+- uncertain next step on a dynamic UI
+- deciding tap/swipe/input intent before execution
+- recovering when repeated direct actions fail
+
+### Prerequisites
+
+Configure these plugin fields before using `cloudphone_plan_action`:
+- required: `autoglmBaseUrl`, `autoglmApiKey`, `autoglmModel`
+- optional: `autoglmMaxTokens` (default `3000`), `autoglmLang` (default `cn`)
+
+Example (`plugins.entries.cloudphone.config`):
+
+```json
+{
+  "autoglmBaseUrl": "https://open.bigmodel.cn/api/paas/v4",
+  "autoglmApiKey": "your-api-key",
+  "autoglmModel": "autoglm-phone",
+  "autoglmMaxTokens": 3000,
+  "autoglmLang": "cn"
+}
+```
+
+### Parameters and minimal example
+
+Core input:
+- `device_id`: target cloud phone device ID
+- `goal`: natural language task goal
+
+Minimal example:
+
+```text
+device_id: "your-device-id"
+goal: "Open WeChat and enter the search page"
+```
+
+Expected output:
+- model reasoning summary for the current screen
+- a suggested next action that can be executed with `cloudphone_*` tools
+
+### Notes
+
+- If required `autoglm*` fields are missing, the tool returns a config error.
+- Recommended flow: `cloudphone_snapshot` -> `cloudphone_plan_action` -> execute with `cloudphone_tap`/`cloudphone_swipe`/`cloudphone_input_text` -> verify with a new snapshot.
+- Keep each goal focused on one immediate UI objective for better planning quality.
+
 ## Usage Examples
 
 After installation and configuration, you can control cloud phones through natural language prompts.
@@ -283,7 +333,7 @@ Make sure `plugins.entries.cloudphone.enabled` is set to `true` in `openclaw.jso
 
 **Q: The tools work, but the agent is not very stable when operating a cloud phone UI.**
 
-Starting from `v1.0.7`, the package ships with the `basic-skill` skill. It teaches the agent to use the tools in a short loop: observe -> act -> verify -> observe again. Make sure you installed a recent version and restarted the Gateway so the latest skill was loaded.
+Starting from `v1.1.0`, the package ships with built-in skills such as `basic-skill`. They teach the agent to use the tools in a short loop: observe -> act -> verify -> observe again. Make sure you installed a recent version and restarted the Gateway so the latest skills were loaded.
 
 **Q: A tool call fails with a request error or timeout.**
 
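The "observe -> act -> verify -> observe again" loop that `basic-skill` teaches can be sketched as a plain TypeScript skeleton. The `observe`/`act`/`verify` callbacks stand in for `cloudphone_snapshot`, the action tools, and a follow-up snapshot check; nothing here is the plugin's real API:

```typescript
// Hypothetical skeleton of the observe -> act -> verify loop taught by basic-skill.
// The callbacks stand in for cloudphone_* tool calls and are assumptions for illustration.
async function observeActVerify<S>(
  observe: () => Promise<S>,          // e.g. cloudphone_snapshot
  act: (state: S) => Promise<void>,   // e.g. cloudphone_tap / swipe / input_text
  verify: (state: S) => boolean,      // goal check against the fresh observation
  maxRounds = 5
): Promise<boolean> {
  for (let round = 0; round < maxRounds; round++) {
    const state = await observe();    // observe
    if (verify(state)) return true;   // verify before acting again
    await act(state);                 // act
  }
  return false; // give up after maxRounds so the agent can re-plan
}
```

Bounding the rounds keeps a stuck agent from repeating the same failing action indefinitely.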
@@ -301,7 +351,13 @@ The agent should call `cloudphone_render_image` automatically to turn that URL i
 
 ## Changelog
 
-Current version: **v1.1.0**
+Current version: **v2026.3.26**
+
+### v2026.3.26
+
+- Added verbose step-by-step logs for `cloudphone_plan_action` to improve debugging and failure tracing
+- Expanded planActionTool documentation with prerequisites, usage flow, and safety notes in both the English and Chinese READMEs
+- Synced built-in skills wording and release docs to align with the current v1.1.0+ behavior
 
 ### v1.1.0
 
README.zh-CN.md

Lines changed: 60 additions & 4 deletions
@@ -6,7 +6,7 @@ OpenClaw cloud phone plugin that gives AI agents device management and UI au
 
 Through natural-language conversation, you can query cloud phones, power them on or off, take screenshots, tap, swipe, type, and more, without writing scripts by hand.
 
-Starting from `v1.0.7`, the plugin also ships the built-in skill `basic-skill`, which teaches the agent to combine these tools more reliably.
+Starting from `v1.1.0`, the plugin ships built-in skills (including `basic-skill`) that teach the agent to combine these tools more reliably.
 
 ## Quick Start
 
@@ -75,7 +75,7 @@ openclaw gateway restart
 
 This repository is first and foremost an **OpenClaw plugin**; its job is to expose the CloudPhone OpenAPI as tools the agent can call.
 
-Starting from `v1.0.7`, the repository also ships one **OpenClaw skill** with the package:
+Starting from `v1.1.0`, the repository ships **OpenClaw skills** with the package:
 
 - Plugin: answers "what can be done" by providing `cloudphone_*` tools
 - Skill: answers "how to do it more reliably" by teaching the agent when to call tools, how to order the calls, and how to recover after failures
@@ -164,6 +164,56 @@ skills/basic-skill/
 | `cloudphone_snapshot` | Capture a device screenshot or UI tree snapshot |
 | `cloudphone_render_image` | Render a screenshot URL as an image shown directly in chat |
 
+## planActionTool (`cloudphone_plan_action`)
+
+`planActionTool` corresponds to the tool name `cloudphone_plan_action`. It calls an AutoGLM model with the current screenshot and task goal to produce a structured plan for the next action, helping CloudPhone UI automation make more stable decisions.
+
+Typical scenarios:
+- the page state is complex and the next action is unclear
+- deciding what to tap/swipe/type before executing
+- making a recovery decision after several failed direct actions
+
+### Prerequisites
+
+Before using `cloudphone_plan_action`, set in the plugin config:
+- required: `autoglmBaseUrl`, `autoglmApiKey`, `autoglmModel`
+- optional: `autoglmMaxTokens` (default `3000`), `autoglmLang` (default `cn`)
+
+Example (`plugins.entries.cloudphone.config`):
+
+```json
+{
+  "autoglmBaseUrl": "https://open.bigmodel.cn/api/paas/v4",
+  "autoglmApiKey": "your-api-key",
+  "autoglmModel": "autoglm-phone",
+  "autoglmMaxTokens": 3000,
+  "autoglmLang": "cn"
+}
+```
+
+### Parameters and minimal example
+
+Core input:
+- `device_id`: target cloud phone device ID
+- `goal`: natural-language task goal
+
+Minimal example:
+
+```text
+device_id: "your-device-id"
+goal: "Open WeChat and enter the search page"
+```
+
+Expected output:
+- an analysis summary of the current page
+- a suggested next action that can be executed with `cloudphone_*` tools
+
+### Notes
+
+- If required `autoglm*` config is missing, the tool returns a config error.
+- Recommended chain: `cloudphone_snapshot` -> `cloudphone_plan_action` -> execute with `cloudphone_tap`/`cloudphone_swipe`/`cloudphone_input_text` -> take another screenshot to verify.
+- Keep each `goal` focused on one short objective to improve planning quality and stability.
+
 ## Usage Examples
 
 After installation and configuration, you can control the cloud phone directly through natural-language conversation.
@@ -283,7 +333,7 @@ image_url : string - HTTPS image URL (required)
 
 **Q: The tools work, but the agent is not very stable when operating the cloud phone UI?**
 
-Starting from `v1.0.7`, the plugin ships the `basic-skill` skill with the package. It teaches the agent to use the tools in the "observe -> act -> verify -> observe again" loop. Make sure a recent version is installed and the Gateway has been restarted so the latest skill is loaded.
+Starting from `v1.1.0`, the plugin ships built-in skills (such as `basic-skill`) with the package. They teach the agent to use the tools in the "observe -> act -> verify -> observe again" loop. Make sure a recent version is installed and the Gateway has been restarted so the latest skills are loaded.
 
 **Q: A tool call fails with a request error or timeout?**
 
@@ -301,7 +351,13 @@ The agent should automatically call `cloudphone_render_image` to turn that URL into a displayable
 
 ## Changelog
 
-Current version: **v1.1.0**
+Current version: **v2026.3.26**
+
+### v2026.3.26
+
+- Added detailed step-by-step logs for `cloudphone_plan_action` to speed up debugging and failure triage
+- Improved the planActionTool documentation with prerequisites, call flow, and notes (synced across the English and Chinese READMEs)
+- Synced the built-in skills wording and release docs with the current v1.1.0+ behavior
 
 ### v1.1.0
 
openclaw.plugin.json

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 {
   "id": "cloudphone",
   "name": "CloudPhone Plugin",
-  "version": "1.1.0",
+  "version": "2026.3.26",
   "description": "OpenClaw CloudPhone plugin that exposes CloudPhone OpenAPI capabilities for user info, device management, and UI automation as agent tools.",
   "configSchema": {
     "type": "object",

package-lock.json

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "@suqiai/cloudphone",
-  "version": "1.1.0",
+  "version": "2026.3.26",
   "license": "MIT",
   "description": "OpenClaw CloudPhone plugin that gives AI agents cloud device management and UI automation capabilities through natural language, including device queries, power actions, screenshots, taps, swipes, and text input.",
   "main": "dist/index.js",
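This bump moves both manifests from semver `1.1.0` to the date-based `2026.3.26`. One practical consequence worth noting: under the numeric dotted-segment comparison that semver-style tooling applies to `MAJOR.MINOR.PATCH`, the date acts as a very large major version, so the switch is effectively one-way (every later date sorts after any previous semver release). A minimal sketch of that comparison:

```typescript
// Illustrative dotted-version comparison (semver-style numeric segments, no
// pre-release handling): returns -1, 0, or 1. Shows why "2026.3.26" sorts
// after every 1.x release.
function compareDotted(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const d = (pa[i] ?? 0) - (pb[i] ?? 0); // missing segments count as 0
    if (d !== 0) return Math.sign(d);
  }
  return 0;
}
```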

src/tools.ts

Lines changed: 61 additions & 6 deletions
@@ -88,6 +88,11 @@ function getApiErrorMessage(body: Record<string, unknown>): string {
 
 const LOG_PREFIX = "[cloudphone]";
 
+function summarizeTextForLog(value: string, limit = 120): string {
+  if (!value) return "";
+  return value.length > limit ? `${value.slice(0, limit)}…` : value;
+}
+
 /** Safe for logs: origin + pathname only (no query — pre-signed URLs must not be logged in full). */
 function safeUrlForLog(url: string): string {
   try {
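The new `summarizeTextForLog` helper caps log payloads at a fixed length, appending an ellipsis only when truncation actually happened. The standalone copy below mirrors the diff so its behavior can be exercised in isolation:

```typescript
// Standalone copy of the helper added in this commit: truncate long values for
// logs; values at or under the limit pass through unchanged.
function summarizeTextForLog(value: string, limit = 120): string {
  if (!value) return "";
  return value.length > limit ? `${value.slice(0, limit)}…` : value;
}
```

Note the ellipsis is a single character (U+2026), so a truncated result is `limit + 1` characters long.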
@@ -1311,11 +1316,26 @@ const planActionTool: ToolDefinition = {
   },
   optional: true,
   execute: async (_id, params) => {
+    const traceId = `planAction:${Date.now()}:${Math.random().toString(36).slice(2, 8)}`;
+    const startedAll = Date.now();
     const autoglmBaseUrl = runtimeConfig.autoglmBaseUrl;
     const autoglmApiKey = runtimeConfig.autoglmApiKey;
     const autoglmModel = runtimeConfig.autoglmModel;
+    const deviceId = String(params.device_id ?? "");
+    const task = String(params.task ?? "");
+    const context = params.context ? String(params.context) : undefined;
+    const maxTokens = Number(runtimeConfig.autoglmMaxTokens ?? 3000);
+    const lang = String(runtimeConfig.autoglmLang ?? "cn");
+
+    console.log(
+      `${LOG_PREFIX} planAction start trace=${traceId} device_id=${deviceId || "(empty)"} task_len=${task.length} context_len=${context?.length ?? 0} lang=${lang} max_tokens=${maxTokens}`
+    );
+    console.log(
+      `${LOG_PREFIX} planAction config trace=${traceId} has_base_url=${!!autoglmBaseUrl} has_api_key=${!!autoglmApiKey} has_model=${!!autoglmModel}`
+    );
 
     if (!autoglmBaseUrl || !autoglmApiKey || !autoglmModel) {
+      console.error(`${LOG_PREFIX} planAction config missing trace=${traceId}`);
       return toJsonText({
         ok: false,
         message:
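The `traceId` introduced above combines a millisecond timestamp with a short random base-36 suffix so that concurrent `planAction` calls can be told apart in the logs. A standalone copy of the expression, wrapped in `makeTraceId` (an illustrative name, not part of the commit):

```typescript
// Standalone copy of the trace-id expression from the diff, wrapped in a
// hypothetical helper. Math.random().toString(36).slice(2, 8) skips the "0."
// prefix and keeps up to six base-36 characters as a cheap disambiguator.
function makeTraceId(): string {
  return `planAction:${Date.now()}:${Math.random().toString(36).slice(2, 8)}`;
}
```

The id is only for log correlation; it carries no security guarantees and may occasionally be shorter than six characters.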
@@ -1327,45 +1347,65 @@
       });
     }
 
-    const deviceId = String(params.device_id);
-    const task = String(params.task ?? "");
-    const context = params.context ? String(params.context) : undefined;
-    const maxTokens = Number(runtimeConfig.autoglmMaxTokens ?? 3000);
-    const lang = String(runtimeConfig.autoglmLang ?? "cn");
-
     // 1. Take snapshot
+    const startedSnapshot = Date.now();
+    console.log(`${LOG_PREFIX} planAction step1 snapshot start trace=${traceId}`);
     const snapshotResult = await apiRequest("POST", "/devices/snapshot", { device_id: deviceId }, 15000);
+    console.log(
+      `${LOG_PREFIX} planAction step1 snapshot done trace=${traceId} elapsed=${Date.now() - startedSnapshot}ms content_items=${snapshotResult.content.length}`
+    );
     const first = snapshotResult.content[0];
     if (!first || first.type !== "text") {
+      console.error(`${LOG_PREFIX} planAction step1 snapshot invalid_content trace=${traceId}`);
       return toJsonText({ ok: false, message: "Snapshot did not return text content" });
     }
 
     let snapshotData: Record<string, unknown>;
     try {
       snapshotData = JSON.parse(first.text);
     } catch {
+      console.error(`${LOG_PREFIX} planAction step1 snapshot parse_failed trace=${traceId}`);
       return toJsonText({ ok: false, message: "Failed to parse snapshot response" });
     }
 
     if (snapshotData.ok === false) {
+      console.error(
+        `${LOG_PREFIX} planAction step1 snapshot failed trace=${traceId} message=${summarizeTextForLog(String(snapshotData.message ?? ""))}`
+      );
       return toJsonText({ ok: false, message: String(snapshotData.message ?? "Snapshot failed") });
     }
 
     const screenshotUrl = String(snapshotData.screenshot_url ?? "");
     if (!screenshotUrl) {
+      console.error(`${LOG_PREFIX} planAction step1 snapshot missing_url trace=${traceId}`);
       return toJsonText({ ok: false, message: "Snapshot did not return a screenshot_url" });
     }
+    console.log(
+      `${LOG_PREFIX} planAction step1 snapshot success trace=${traceId} screenshot=${safeUrlForLog(screenshotUrl)}`
+    );
 
     // 2. Fetch image as base64
+    const startedImgFetch = Date.now();
+    console.log(`${LOG_PREFIX} planAction step2 image_fetch start trace=${traceId}`);
     const imgResult = await fetchImageAsBase64(screenshotUrl);
     if ("error" in imgResult) {
+      console.error(
+        `${LOG_PREFIX} planAction step2 image_fetch failed trace=${traceId} elapsed=${Date.now() - startedImgFetch}ms error=${summarizeTextForLog(imgResult.error)}`
+      );
       return toJsonText({ ok: false, message: `Image fetch error: ${imgResult.error}` });
     }
+    console.log(
+      `${LOG_PREFIX} planAction step2 image_fetch success trace=${traceId} elapsed=${Date.now() - startedImgFetch}ms mime=${imgResult.mimeType} base64_len=${imgResult.base64.length} width=${imgResult.width ?? "?"} height=${imgResult.height ?? "?"}`
+    );
 
     // 3. Call autoglm model for action decision
     let thinking: string;
     let actionStr: string;
     let rawContent: string;
+    const startedAutoglm = Date.now();
+    console.log(
+      `${LOG_PREFIX} planAction step3 autoglm start trace=${traceId} base_url=${safeUrlForLog(autoglmBaseUrl)} model=${autoglmModel} task_preview=${summarizeTextForLog(task, 80)} context_preview=${summarizeTextForLog(context ?? "", 80)}`
+    );
     try {
       ({ thinking, actionStr, rawContent } = await callAutoglmForAction(
         imgResult.base64,
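Every step in the diff follows the same logging pattern: record `Date.now()` before the call, then log `elapsed=${Date.now() - started}ms` together with the outcome. That shared shape could be factored into a small wrapper; the commit deliberately inlines the logs instead (each step logs step-specific fields), so the helper below is a hypothetical generalization, not part of the change:

```typescript
// Hypothetical generalization of the per-step timing logs added in this commit.
// The commit inlines console.log/console.error per step; this shows the shared shape.
const LOG_PREFIX = "[cloudphone]";

async function timedStep<T>(traceId: string, step: string, fn: () => Promise<T>): Promise<T> {
  const started = Date.now();
  console.log(`${LOG_PREFIX} planAction ${step} start trace=${traceId}`);
  try {
    const result = await fn();
    console.log(`${LOG_PREFIX} planAction ${step} done trace=${traceId} elapsed=${Date.now() - started}ms`);
    return result;
  } catch (err) {
    console.error(`${LOG_PREFIX} planAction ${step} failed trace=${traceId} elapsed=${Date.now() - started}ms`);
    throw err; // preserve the caller's error handling
  }
}
```

Inlining, as the commit does, keeps each log line free to carry step-specific fields (`content_items`, `mime`, `task_preview`, ...) at the cost of some repetition.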
@@ -1378,15 +1418,26 @@
         maxTokens,
         lang
       ));
+      console.log(
+        `${LOG_PREFIX} planAction step3 autoglm success trace=${traceId} elapsed=${Date.now() - startedAutoglm}ms thinking_len=${thinking.length} action_len=${actionStr.length} raw_len=${rawContent.length}`
+      );
     } catch (err) {
       const errMsg = err instanceof Error ? err.message : String(err);
+      console.error(
+        `${LOG_PREFIX} planAction step3 autoglm failed trace=${traceId} elapsed=${Date.now() - startedAutoglm}ms error=${summarizeTextForLog(errMsg)}`
+      );
       return toJsonText({ ok: false, message: `AutoGLM call failed: ${errMsg}` });
     }
 
     // 4. Parse action string into structured object
+    const startedParse = Date.now();
     const action = parseAutoglmAction(actionStr);
+    console.log(
+      `${LOG_PREFIX} planAction step4 parse_action trace=${traceId} elapsed=${Date.now() - startedParse}ms action_type=${action.type} has_element=${!!action.element} has_start=${!!action.start} has_end=${!!action.end}`
+    );
 
     // 5. Look up resolution and convert normalized 0-999 coords to device pixels
+    const startedConvert = Date.now();
     const resolution = await getDeviceResolutionByDeviceId(deviceId);
 
     if (resolution) {
@@ -1409,6 +1460,9 @@
         ];
       }
     }
+    console.log(
+      `${LOG_PREFIX} planAction step5 convert_coords trace=${traceId} elapsed=${Date.now() - startedConvert}ms resolution=${resolution ? `${resolution.width}x${resolution.height}` : "unknown"} coord_system=${resolution ? "pixel" : "normalized"}`
+    );
 
     const out: Record<string, unknown> = {
       ok: true,
@@ -1424,6 +1478,7 @@
       out.resolution_height = resolution.height;
     }
 
+    console.log(`${LOG_PREFIX} planAction done trace=${traceId} elapsed=${Date.now() - startedAll}ms`);
     return toJsonText(out);
   },
 };
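Step 5 converts AutoGLM's normalized 0-999 coordinates to device pixels when a resolution is known; otherwise the normalized values pass through, which is what the new `coord_system=normalized` log field records. The exact conversion arithmetic is outside this hunk, so the proportional mapping below is an assumption for illustration:

```typescript
// Sketch of the normalized -> pixel conversion behind step 5. The 0-999 grid
// and the pass-through fallback come from the diff; the exact rounding is assumed.
function normalizedToPixel(
  coord: [number, number],
  resolution: { width: number; height: number } | null
): { point: [number, number]; coordSystem: "pixel" | "normalized" } {
  if (!resolution) {
    // No resolution available: keep the model's normalized coordinates.
    return { point: coord, coordSystem: "normalized" };
  }
  const [nx, ny] = coord;
  return {
    point: [
      Math.round((nx / 999) * resolution.width),   // 999 maps to the right/bottom edge
      Math.round((ny / 999) * resolution.height),
    ],
    coordSystem: "pixel",
  };
}
```

Returning the coordinate system alongside the point mirrors the diff's log line, letting callers know whether a tap target is already in device pixels.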
