README.md: 60 additions & 4 deletions
@@ -6,7 +6,7 @@ OpenClaw CloudPhone is a plugin that gives AI agents device management and UI au
 
 With natural language instructions, an agent can list devices, power them on or off, capture screenshots, tap, swipe, type text, and perform other UI actions without writing manual scripts.
 
-Starting from `v1.0.7`, the package also ships with a built-in skill, `basic-skill`, which helps agents combine these tools in a more reliable way.
+Starting from `v1.1.0`, the package ships with built-in skills (including `basic-skill`) that help agents combine these tools in a more reliable way.
 
 ## Quick Start
 
@@ -75,7 +75,7 @@ Once the plugin is loaded successfully, the agent can use all CloudPhone tools.
 
 This repository is first and foremost an **OpenClaw plugin**. Its job is to expose the CloudPhone OpenAPI as tools that an agent can call.
 
-Starting from `v1.0.7`, the package also includes an **OpenClaw skill**:
+Starting from `v1.1.0`, the package includes **OpenClaw skills**:
 
 - Plugin: defines **what the agent can do** by providing `cloudphone_*` tools
 - Skill: defines **how the agent should do it reliably** by teaching call order, recovery steps, and safer workflows
@@ -164,6 +164,56 @@ After the plugin is installed, the agent automatically gets the following capabi
 |`cloudphone_snapshot`| Capture a screenshot or UI tree snapshot from the device |
 |`cloudphone_render_image`| Render a screenshot URL as an image directly in chat |
 
+## planActionTool (`cloudphone_plan_action`)
+
+`planActionTool` maps to `cloudphone_plan_action`. It lets the agent call an AutoGLM model to analyze the current screenshot and goal, then return a structured next-action plan for CloudPhone UI automation.
+
+Typical scenarios:
+- uncertain next step on a dynamic UI
+- deciding tap/swipe/input intent before execution
+- recovering when repeated direct actions fail
+
+### Prerequisites
+
+Configure these plugin fields before using `cloudphone_plan_action`:
+- a suggested next action that can be executed with `cloudphone_*` tools
+
+### Notes
+
+- If required `autoglm*` fields are missing, the tool returns a config error.
+- Recommended flow: `cloudphone_snapshot` -> `cloudphone_plan_action` -> execute with `cloudphone_tap`/`cloudphone_swipe`/`cloudphone_input_text` -> verify with new snapshot.
+- Keep each goal focused to one immediate UI objective for better planning quality.
+
 ## Usage Examples
 
 After installation and configuration, you can control cloud phones through natural language prompts.
@@ -283,7 +333,7 @@ Make sure `plugins.entries.cloudphone.enabled` is set to `true` in `openclaw.jso
 
 **Q: The tools work, but the agent is not very stable when operating a cloud phone UI.**
 
-Starting from `v1.0.7`, the package ships with the `basic-skill` skill. It teaches the agent to use the tools in a short loop: observe -> act -> verify -> observe again. Make sure you installed a recent version and restarted the Gateway so the latest skill was loaded.
+Starting from `v1.1.0`, the package ships with built-in skills such as `basic-skill`. They teach the agent to use the tools in a short loop: observe -> act -> verify -> observe again. Make sure you installed a recent version and restarted the Gateway so the latest skills were loaded.
 
 **Q: A tool call fails with a request error or timeout.**
 
@@ -301,7 +351,13 @@ The agent should call `cloudphone_render_image` automatically to turn that URL i
 
 ## Changelog
 
-Current version: **v1.1.0**
+Current version: **v2026.3.26**
+
+### v2026.3.26
+
+- Added verbose step-by-step logs for cloudphone_plan_action to improve debugging and failure tracing
+- Expanded planActionTool documentation with prerequisites, usage flow, and safety notes in both English and Chinese README
+- Synced built-in skills wording and release docs to align with the current v1.1.0+ behavior
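
The recommended flow in the Notes added to README.md (`cloudphone_snapshot` -> `cloudphone_plan_action` -> execute -> verify with a new snapshot) can be sketched roughly as below. This is only an illustration under stated assumptions: `ToolCall`, `PlannedAction`, `isPlannedAction`, `runOneStep`, and `stubCall` are hypothetical names, and the `goal`, `x`, and `y` arguments and the plan shape are assumed, since the diff does not show the tool schemas. Only the `cloudphone_*` tool names come from the README.

```typescript
// Hypothetical tool invoker: the real agent runtime's calling convention
// is not shown in this diff.
type ToolCall = (tool: string, args: Record<string, unknown>) => Promise<unknown>;

// Execution tools named in the README's recommended flow.
const EXECUTION_TOOLS = ["cloudphone_tap", "cloudphone_swipe", "cloudphone_input_text"];

// Assumed plan shape; the actual cloudphone_plan_action output may differ.
interface PlannedAction {
  tool: string;
  args: Record<string, unknown>;
}

// Accept a plan only if it names one of the execution tools and carries an args object.
function isPlannedAction(value: unknown): value is PlannedAction {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { tool?: unknown; args?: unknown };
  return (
    typeof v.tool === "string" &&
    EXECUTION_TOOLS.indexOf(v.tool) !== -1 &&
    typeof v.args === "object" &&
    v.args !== null
  );
}

// One observe -> plan -> act -> verify step for a single, focused goal.
// Returns the ordered list of tool names that were called.
async function runOneStep(call: ToolCall, goal: string): Promise<string[]> {
  const trace: string[] = [];
  const record = async (tool: string, args: Record<string, unknown>): Promise<unknown> => {
    trace.push(tool);
    return call(tool, args);
  };

  await record("cloudphone_snapshot", {});                       // observe
  const plan = await record("cloudphone_plan_action", { goal }); // plan (argument name assumed)
  if (!isPlannedAction(plan)) {
    throw new Error("plan did not name an executable cloudphone_* tool");
  }
  await record(plan.tool, plan.args);                            // act
  await record("cloudphone_snapshot", {});                       // verify with a new snapshot
  return trace;
}

// Stub invoker for a dry run; it never contacts a real device.
const stubCall: ToolCall = async (tool) =>
  tool === "cloudphone_plan_action"
    ? { tool: "cloudphone_tap", args: { x: 100, y: 200 } }
    : { ok: true };
```

Calling `runOneStep(stubCall, "open the Settings app")` resolves to the four tool names in flow order. Validating the plan before executing it keeps a malformed plan from turning into an arbitrary tool call.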
openclaw.plugin.json: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 {
 "id": "cloudphone",
 "name": "CloudPhone Plugin",
-"version": "1.1.0",
+"version": "2026.3.26",
 "description": "OpenClaw CloudPhone plugin that exposes CloudPhone OpenAPI capabilities for user info, device management, and UI automation as agent tools.",
package.json: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
 "name": "@suqiai/cloudphone",
-"version": "1.1.0",
+"version": "2026.3.26",
 "license": "MIT",
 "description": "OpenClaw CloudPhone plugin that gives AI agents cloud device management and UI automation capabilities through natural language, including device queries, power actions, screenshots, taps, swipes, and text input.",
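
Both manifests move from `1.1.0` to the date-style `2026.3.26`. Any tool that orders versions by comparing numeric components will treat the new value as newer than every `1.x` release, which is worth keeping in mind when reading "v1.1.0+" in the changelog. A minimal sketch of that ordering, assuming plain three-component numeric versions (`compareVersions` is a hypothetical helper, not a full semver implementation and not part of this package):

```typescript
// Compare two dotted numeric versions component by component.
// Returns 1 if a is newer, -1 if b is newer, 0 if they are equal.
// Assumption: at most three numeric components, no pre-release or build suffixes.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    // Treat a missing component as 0, so "1.1" compares equal to "1.1.0".
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return Math.sign(diff);
  }
  return 0;
}

// Versions that appear in this diff, sorted oldest to newest.
const ordered = ["2026.3.26", "1.0.7", "1.1.0"].sort(compareVersions);
```

Here `ordered` ends up as `["1.0.7", "1.1.0", "2026.3.26"]`, so the new version satisfies a "1.1.0 or later" check under this numeric ordering.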