- Default upgrade coverage on macOS should now include: fresh snapshot -> site installer pinned to the latest stable tag -> `openclaw update --channel dev` on the guest. Treat this as part of the default Tahoe regression plan, not an optional side quest.
+- `parallels-macos-smoke.sh --mode upgrade` should run that release-to-dev lane by default. Keep the older host-tgz upgrade path only when the caller explicitly passes `--target-package-spec`.
+- Because the default upgrade lane no longer needs a host tgz, skip `npm pack` + host HTTP server startup for `--mode upgrade` unless `--target-package-spec` is set. Keep the pack/server path for `fresh` and `both`.
- If that release-to-dev lane fails with `reason=preflight-no-good-commit` and repeated `sh: pnpm: command not found` tails from `preflight build`, treat it as an updater regression first. The fix belongs in the git/dev updater bootstrap path, not in Parallels retry logic.
+- Until the public stable train includes that updater bootstrap fix, the macOS release-to-dev lane may seed a temporary guest-local `pnpm` shim immediately before `openclaw update --channel dev`. Keep that workaround scoped to the smoke harness and remove it once the latest stable no longer needs it.
- Default to the snapshot closest to `macOS 26.3.1 latest`.
- On Peter's Tahoe VM, `fresh-latest-march-2026` can hang in `prlctl snapshot-switch`; if restore times out there, rerun with `--snapshot-hint 'macOS 26.3.1 latest'` before blaming auth or the harness.
- `parallels-macos-smoke.sh` now retries `snapshot-switch` once after force-stopping a stuck running/suspended guest. If Tahoe still times out after that recovery path, then treat it as a real Parallels/host issue and rerun manually.
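
The lane rules in the list above can be read as two small decisions: when the harness still needs the host tgz, and how to triage a release-to-dev failure. The following is a minimal TypeScript sketch of that reading, not the harness's actual shell code. The helper names `needsHostTgz` and `classifyUpgradeFailure` and the `FailureTriage` labels are hypothetical; only the mode names, the `--target-package-spec` flag, and the log strings come from the notes.

```typescript
// Illustrative sketch only: the real decisions live in parallels-macos-smoke.sh.
// needsHostTgz and classifyUpgradeFailure are hypothetical helper names.
type SmokeMode = "fresh" | "upgrade" | "both";

// True when the run still needs `npm pack` plus the host HTTP server.
function needsHostTgz(mode: SmokeMode, targetPackageSpec?: string): boolean {
  if (mode !== "upgrade") {
    // `fresh` and `both` keep the pack/server path.
    return true;
  }
  // The default upgrade lane is release-to-dev, so the host tgz is only
  // needed when the caller explicitly passes --target-package-spec.
  return targetPackageSpec !== undefined && targetPackageSpec !== "";
}

type FailureTriage = "updater-regression" | "investigate-further";

// Applies the triage rule from the notes: this reason plus a pnpm
// "command not found" tail points at the updater bootstrap path first.
function classifyUpgradeFailure(logText: string): FailureTriage {
  const noGoodCommit = logText.indexOf("reason=preflight-no-good-commit") !== -1;
  const pnpmMissing = logText.indexOf("sh: pnpm: command not found") !== -1;
  return noGoodCommit && pnpmMissing ? "updater-regression" : "investigate-further";
}
```

Under these assumptions, `needsHostTgz("upgrade")` is false while `needsHostTgz("both")` stays true, which matches the rule that only the default upgrade lane skips the pack/server step.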
description: Run, watch, debug, and extend OpenClaw QA testing with qa-lab and qa-channel. Use when Codex needs to execute the repo-backed QA suite, inspect live QA artifacts, debug failing scenarios, add new QA scenarios, or explain the OpenClaw QA workflow. Prefer the live OpenAI lane with regular openai/gpt-5.4 in fast mode; do not use gpt-5.4-pro or gpt-5.4-mini unless the user explicitly overrides that policy.
+---
+
+# OpenClaw QA Testing
+
+Use this skill for `qa-lab` / `qa-channel` work. Repo-local QA only.
+
+## Read first
+
+- `docs/concepts/qa-e2e-automation.md`
+- `docs/help/testing.md`
+- `docs/channels/qa-channel.md`
+- `qa/QA_KICKOFF_TASK.md`
+- `qa/seed-scenarios.json`
+- `extensions/qa-lab/src/suite.ts`
+
+## Model policy
+
+- Live OpenAI lane: `openai/gpt-5.4`
+- Fast mode: on
+- Do not use:
+  - `openai/gpt-5.4-pro`
+  - `openai/gpt-5.4-mini`
+- Only change model policy if the user explicitly asks.
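
The model policy above could be enforced with a small guard before a live run starts. This is a hedged sketch, assuming the policy is checked in code at all; `defaultLiveRunConfig`, `isModelAllowed`, and the `LiveRunConfig` shape are hypothetical names, and only the model ids and the fast-mode default come from the list.

```typescript
// Model ids come from the policy list above; the helper names are hypothetical.
const LIVE_OPENAI_MODEL = "openai/gpt-5.4";
const BLOCKED_WITHOUT_OVERRIDE: string[] = ["openai/gpt-5.4-pro", "openai/gpt-5.4-mini"];

interface LiveRunConfig {
  model: string;
  fastMode: boolean;
}

// Returns the default live-lane configuration: regular gpt-5.4, fast mode on.
function defaultLiveRunConfig(): LiveRunConfig {
  return { model: LIVE_OPENAI_MODEL, fastMode: true };
}

// A blocked model is acceptable only when the user explicitly asked for it.
function isModelAllowed(model: string, userExplicitlyOverrode: boolean): boolean {
  if (userExplicitlyOverrode) {
    return true;
  }
  return BLOCKED_WITHOUT_OVERRIDE.indexOf(model) === -1;
}
```

The guard deliberately does not reject other providers' models, because the policy only names the two OpenAI variants to avoid.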
+
+## Default workflow
+
+1. Read the seed plan and current suite implementation.
short_description: "Run and debug qa-lab and qa-channel scenarios"
+default_prompt: "Use $openclaw-qa-testing to run or extend the OpenClaw QA suite with qa-lab and qa-channel, using regular openai/gpt-5.4 in fast mode for live OpenAI runs."
- Agents/cache: stabilize cache-relevant system prompt fingerprints by normalizing equivalent structured prompt whitespace, line endings, hook-added system context, and runtime capability ordering so semantically unchanged prompts reuse KV/cache more reliably. Thanks @vincentkoc.
- Agents/tool prompts: remove the duplicate in-band tool inventory from agent system prompts so tool-calling models rely on the structured tool definitions as the single source of truth, improving prompt stability and reducing stale tool guidance.
- Tools/video generation: add bundled xAI (`grok-imagine-video`) and Alibaba Model Studio Wan video providers, plus live-test/default model wiring for both.
+- Tools/video generation: add a bundled Runway video provider (`runway/gen4.5`) with native async task polling, local image/video reference support via data URIs, provider docs, and live-test wiring.
+- Agents/video generation: register `video_generate` runs in the task ledger with task/run ids and lifecycle updates so long-running generations can be tracked more reliably.
+- Agents/video generation: make session-backed `video_generate` runs detach into background tasks, wake the same agent session on completion, and have the agent post the finished video back into the original channel as a follow-up reply.
+- Agents/video generation: add active-task prompt hints plus a hard duplicate guard so session-backed `video_generate` returns task status for in-flight jobs instead of spawning the same video request twice, and expose `action=status` for explicit lookup.
- Providers/CLI: remove bundled CLI text-provider backends and the `agents.defaults.cliBackends` surface, while keeping ACP harness sessions and Gemini media understanding on the native bundled providers.
- Matrix/exec approvals: clarify unavailable-approval replies so Matrix no longer claims chat approvals are unsupported when native exec approvals are merely unconfigured. (#61424) Thanks @gumadeiras.
- Docs/IRC: replace public IRC hostname examples with `irc.example.com` and recommend private servers for bot coordination while listing common public networks for intentional use.
- Memory/dreaming: write dreaming trail content to top-level `DREAMS.md` instead of daily memory notes, update `/dreaming` help text to point there, and keep `DREAMS.md` available for explicit reads without pulling it into default recall. Thanks @davemorin.
+- Plugins/Lobster: run bundled Lobster workflows in process instead of spawning the external CLI, reducing transport overhead and unblocking native runtime integration. (#61523) Thanks @mbelinky.
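
The Agents/cache entry above describes normalizing prompts so that equivalent inputs produce the same cache fingerprint. The changelog does not show how this is implemented, so the following is only a sketch of the general idea. `PromptInput`, `normalizePromptText`, and `cacheFingerprint` are hypothetical names, and the specific rules (unify line endings, drop trailing whitespace, collapse runs of blank lines, sort capabilities) are assumptions about what "equivalent" might mean.

```typescript
// Hypothetical input shape; the real prompt structure is not shown in the changelog.
interface PromptInput {
  systemPrompt: string;
  capabilities: string[];
}

// Assumed normalization rules: unify line endings, strip trailing whitespace
// on each line, collapse three or more newlines into one blank line, trim ends.
function normalizePromptText(text: string): string {
  return text
    .replace(/\r\n?/g, "\n")
    .split("\n")
    .map((line) => line.replace(/[ \t]+$/, ""))
    .join("\n")
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}

// Builds a canonical string key; capability order is sorted so that runtime
// ordering differences do not change the fingerprint.
function cacheFingerprint(input: PromptInput): string {
  return JSON.stringify({
    systemPrompt: normalizePromptText(input.systemPrompt),
    capabilities: input.capabilities.slice().sort(),
  });
}
```

A real implementation would likely hash the canonical string; the sketch returns it directly to keep the comparison easy to inspect.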

### Fixes

@@ -52,6 +57,7 @@ Docs: https://docs.openclaw.ai
- Discord/reply tags: strip leaked `[[reply_to_current]]` control tags from preview text and honor explicit reply-tag threading during final delivery, so Discord replies stay attached to the triggering message instead of printing reply metadata into chat.
- Discord/replies: replace the unshipped `replyToOnlyWhenBatched` flag with `replyToMode: "batched"` so native reply references only attach on debounced multi-message turns while explicit reply tags still work.
- Discord/image generation: include the real generated `MEDIA:` paths in tool output, avoid duplicate plain-output media requeueing, and persist volatile workspace-generated media into durable outbound media before final reply delivery so generated image replies stop pointing at missing local files.
+- Tools/image generation: ignore unsupported provider geometry overrides such as OpenAI `aspectRatio` hints, report the dropped overrides in tool output, and keep compatible provider fallbacks working instead of failing early.
- Slack: route live DM replies back to the concrete inbound DM channel while keeping persisted routing metadata user-scoped, so normal assistant replies stop disappearing when pairing and system messages still arrive. (#59030) Thanks @afurm.
- WhatsApp: restore `channels.whatsapp.blockStreaming` and reset watchdog timeouts after reconnect so quiet chats stop falling into reconnect loops. (#60007, #60069) Thanks @MonkeyLeeT and @mcaxtr.
- Android/Talk Mode: cancel in-flight `talk.speak` playback when speech is explicitly stopped, and restore spoken replies on both node-scoped and gateway-backed sessions by keeping reply routing and embedded transport overrides aligned with the current playback path. (#60306, #61164, #61214)
- Exec approvals: remove heuristic command-obfuscation gating from host exec so gateway and node runs rely on explicit policy, allowlist, and strict inline-eval rules only.
- Agents/tool results: cap live tool-result persistence and overflow-recovery truncation at 40k characters so oversized tool output stays bounded without discarding recent context entirely.
- Discord/video replies: split text-plus-video deliveries into a text reply followed by a media-only send, and let live provider auth checks honor manifest-declared API key env vars like `MODELSTUDIO_API_KEY`.
+- Providers/fal video: switch long-running fal video generation to the queue-backed submit/status/result flow, and accept `FAL_API_KEY` as a compatibility alias for the canonical `FAL_KEY`.
- Config/All Settings: keep the raw config view intact when sensitive fields are blank instead of corrupting or dropping the rendered snapshot. (#28214) Thanks @solodmd.
- Plugin SDK/facades: back-fill bundled plugin facade sentinels before plugin-id tracking re-enters config loading, so CLI/provider startup no longer crashes with `shouldNormalizeGoogleProviderConfig is not a function` or other empty-facade reads during bundled plugin re-entry. Thanks @adam91holt.
- Plugins/facades: back-fill facade sentinels before tracked-plugin resolution re-enters config loading, so facade exports stay defined during circular provider normalization. (#61180) Thanks @adam91holt.
- Discord/image generation: include the real generated `MEDIA:` paths in tool output and avoid duplicate plain-output media requeueing so Discord image replies stop pointing at missing local files.
- Slack: route live DM replies back to the concrete inbound DM channel while keeping persisted routing metadata user-scoped, so normal assistant replies stop disappearing when pairing and system messages still arrive. (#59030) Thanks @afurm.
- Discord/reply tags: strip leaked `[[reply_to_current]]` control tags from preview text and honor explicit reply-tag threading during final delivery, so Discord replies stay attached to the triggering message instead of printing reply metadata into chat.
+- CLI/update: block `openclaw update --channel dev` with a clearer explainer when the git checkout has edited local files, instead of failing later once commit-switching work starts.
- Telegram: fix current-model checks in the model picker, HTML-format non-default `/model` confirmations, explicit topic replies, persisted reaction ownership across restarts, caption-media placeholder and `file_id` preservation on download failure, and upgraded-install inbound image reads. (#60384, #60042, #59634, #59207, #59948, #59971) Thanks @sfuminya, @GitZhangChi, @dashhuang, @samzong, @v1p0r, and @neeravmakwana.
- Telegram: restore DM voice-note preflight transcription so direct-message audio stops arriving as raw `<media:audio>` placeholders. (#61008) Thanks @manueltarouca.
- Telegram/reasoning: only create a Telegram reasoning preview lane when the session is explicitly `reasoning:stream`, so hidden `<think>` traces from streamed replies stop surfacing as chat previews on normal sessions. Thanks @vincentkoc.
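
Two entries in the list above describe simple, testable rules: the 40k-character cap on tool results and the `FAL_API_KEY` alias for `FAL_KEY`. The sketch below illustrates both under stated assumptions. It reads "40k" as 40,000 characters, keeps the head of the text and appends a marker, and prefers the canonical variable over the alias. The names `capToolResult`, `resolveFalKey`, and the marker text are hypothetical, and the real truncation strategy may differ.

```typescript
// Assumption: "40k characters" means 40,000; the changelog does not say.
const MAX_TOOL_RESULT_CHARS = 40000;
// Hypothetical marker text appended to truncated output.
const TRUNCATION_MARKER = "\n[truncated]";

// Bounds a tool result so the returned string never exceeds maxChars.
function capToolResult(text: string, maxChars: number = MAX_TOOL_RESULT_CHARS): string {
  if (text.length <= maxChars) {
    return text;
  }
  if (maxChars <= TRUNCATION_MARKER.length) {
    // Not enough room for the marker; fall back to a plain cut.
    return text.slice(0, maxChars);
  }
  return text.slice(0, maxChars - TRUNCATION_MARKER.length) + TRUNCATION_MARKER;
}

type Env = { [name: string]: string | undefined };

// Assumption: the canonical FAL_KEY wins when both variables are set.
function resolveFalKey(env: Env): string | undefined {
  const canonical = env["FAL_KEY"];
  if (canonical !== undefined && canonical !== "") {
    return canonical;
  }
  const alias = env["FAL_API_KEY"];
  return alias !== undefined && alias !== "" ? alias : undefined;
}
```

Passing the environment as a plain object keeps the lookup easy to test without touching the real process environment.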
- Plugins/OpenAI: tune the OpenAI prompt overlay for live-chat cadence so GPT replies stay shorter, more human, and less wall-of-text by default.
- Providers/compat: stop forcing OpenAI-only defaults on proxy and custom OpenAI-compatible routes, preserve native vendor-specific reasoning/tool/streaming behavior across Anthropic-compatible, Moonshot, Mistral, ModelStudio, OpenRouter, xAI, and Z.ai endpoints, and route GitHub Copilot Claude models through Anthropic Messages instead of OpenAI Responses.
- Providers/GitHub Copilot: send IDE identity headers on runtime model requests and GitHub token exchange so IDE-authenticated Copilot runs stop failing with missing `Editor-Version`. (#60641) Thanks @VACInc and @vincentkoc.
+- Discord/native commands: authorize slash commands and autocomplete in explicitly allowlisted guild channels when `commands.allowFrom` is unset, while still keeping `commands.allowFrom` authoritative when it is configured and denying non-allowlisted channels.
- Providers/OpenRouter failover: classify `403 “Key limit exceeded”` spending-limit responses as billing so model fallback continues instead of stopping on generic auth. (#59892) Thanks @rockcent.
- Providers/Anthropic: keep `claude-cli/*` auth on live Claude CLI credentials at runtime, avoid persisting stale bearer-token profiles, and suppress macOS Keychain prompts during non-interactive Claude CLI setup. (#61234) Thanks @darkamenosa.
- Providers/Anthropic: when Claude CLI auth becomes the default, write a real `claude-cli` auth profile so local and gateway agent runs can use Claude CLI immediately without missing-API-key failures. Thanks @vincentkoc.
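
The Providers/OpenRouter failover entry above amounts to a classification rule: a 403 whose message says the key limit was exceeded is a billing problem, not a generic auth failure. A minimal sketch of such a classifier follows. The function name, the `FailoverReason` labels other than `billing` and `auth`, the case-insensitive match, and the treatment of 401 as auth are all assumptions for illustration; the actual failover code is not shown in the changelog.

```typescript
// "billing" and "auth" mirror the wording of the changelog entry; "other" is a
// hypothetical catch-all for everything the entry does not discuss.
type FailoverReason = "billing" | "auth" | "other";

function classifyOpenRouterFailure(status: number, message: string): FailoverReason {
  // Spending-limit responses arrive as 403 with a "Key limit exceeded" message.
  // Matching case-insensitively is an assumption.
  if (status === 403 && message.toLowerCase().indexOf("key limit exceeded") !== -1) {
    return "billing";
  }
  // Assumption: remaining 401/403 responses are treated as generic auth failures.
  if (status === 401 || status === 403) {
    return "auth";
  }
  return "other";
}
```

Checking the spending-limit message before the generic 403 branch is what keeps such responses from being reported as plain auth errors.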
| Subagent orchestration | `subagent` | Spawning a subagent via `sessions_spawn` | `done_only` |
| Cron jobs (all types) | `cron` | Every cron execution (main-session and isolated) | `silent` |
| CLI operations | `cli` | `openclaw agent` commands that run through the gateway | `silent` |
+| Agent media jobs | `cli` | Session-backed `video_generate` runs | `silent` |

Main-session cron tasks use `silent` notify policy by default: they create records for tracking but do not generate notifications. Isolated cron tasks also default to `silent` but are more visible because they run in their own session.

+Session-backed `video_generate` runs also use `silent` notify policy. They still create task records, but completion is handed back to the original agent session as an internal wake so the agent can write the follow-up message and attach the finished video itself.
+
+While a session-backed `video_generate` task is still active, the tool also acts as a guardrail: repeated `video_generate` calls in that same session return the active task status instead of starting a second concurrent generation. Use `action: "status"` when you want an explicit progress/status lookup from the agent side.
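
The duplicate guard described above can be pictured as a per-session lookup that runs before any new generation starts. The sketch below is an in-memory illustration only, not the task ledger's implementation. `VideoTaskGuard`, the `VideoTask` fields, and the result shapes are hypothetical; the only detail taken from the docs is that a repeated call returns the active task's status and that `action: "status"` performs an explicit lookup.

```typescript
// Hypothetical task record; the real ledger tracks more lifecycle detail.
interface VideoTask {
  taskId: string;
  prompt: string;
  state: "running" | "done";
}

interface VideoGenerateArgs {
  action?: "status";
  prompt?: string;
}

type VideoGenerateResult =
  | { kind: "started"; task: VideoTask }
  | { kind: "status"; task: VideoTask | null };

class VideoTaskGuard {
  private active: { [sessionId: string]: VideoTask | undefined } = {};
  private nextId = 1;

  call(sessionId: string, args: VideoGenerateArgs): VideoGenerateResult {
    const current = this.active[sessionId];
    const inFlight = current !== undefined && current.state === "running" ? current : null;
    // Explicit lookup, or a repeated request while a job is still running:
    // report status instead of starting a second generation.
    if (args.action === "status" || inFlight !== null) {
      return { kind: "status", task: inFlight };
    }
    const task: VideoTask = {
      taskId: "task-" + this.nextId++,
      prompt: args.prompt ?? "",
      state: "running",
    };
    this.active[sessionId] = task;
    return { kind: "started", task };
  }

  // Marks the session's job finished so a later call may start a new one.
  complete(sessionId: string): void {
    const current = this.active[sessionId];
    if (current !== undefined) {
      current.state = "done";
    }
  }
}
```

Keying the guard by session means two different sessions can still generate videos concurrently, while a single session cannot spawn the same request twice.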
+
**What does not create tasks:**

- Heartbeat turns — main-session; see [Heartbeat](/gateway/heartbeat)
docs/cli/memory.md (29 additions, 0 deletions)
@@ -34,6 +34,8 @@ openclaw memory status --deep --index
openclaw memory status --deep --index --verbose
openclaw memory status --agent main
openclaw memory index --agent main --verbose
+openclaw memory promote-explain "router vlan"
+openclaw memory rem-harness --json
```

## Options
@@ -90,6 +92,33 @@ Full options:
- `--include-promoted`: include already promoted candidates in output.
- `--json`: print JSON output.

+`memory promote-explain`:
+
+Explain why a specific candidate would or would not promote, with a full score breakdown.
+
+```bash
+openclaw memory promote-explain "<selector>"
+```
+
+- `<selector>`: candidate key, path fragment, or snippet fragment to match.
+- `--agent <id>`: scope to a single agent (default: the default agent).
+- `--include-promoted`: include already promoted candidates.
+- `--json`: print JSON output.
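
The `<selector>` argument above is described as matching a candidate key, a path fragment, or a snippet fragment. One plausible reading of that rule is sketched below. This is not the CLI's implementation: the `PromotionCandidate` shape, the `matchesSelector` name, the exact-match rule for keys, and the case-insensitive comparison are all assumptions made for illustration.

```typescript
// Hypothetical candidate shape; the real record likely carries scores as well.
interface PromotionCandidate {
  key: string;
  path: string;
  snippet: string;
}

// Assumed matching rule: exact key, or substring of path, or substring of snippet.
function matchesSelector(candidate: PromotionCandidate, selector: string): boolean {
  const needle = selector.trim().toLowerCase();
  if (needle === "") {
    return false;
  }
  return (
    candidate.key.toLowerCase() === needle ||
    candidate.path.toLowerCase().indexOf(needle) !== -1 ||
    candidate.snippet.toLowerCase().indexOf(needle) !== -1
  );
}
```

With this reading, the `"router vlan"` selector from the usage examples would match any candidate whose snippet mentions that phrase.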
+
+`memory rem-harness`:
+
+Preview REM reflections, candidate truths, and deep promotion output without writing anything. Useful for staging and debugging the REM phase before it runs for real.