Rule: a HUMAN runs every test by hand — NOT an AI. AI does the unit analysis + the surface-parity report only; the doing is manual.
Scope here = the VSCode extension. After it all passes, see the final section.
Run on devnet with a Helius key set. Each checkbox = one thing to actually click/type and confirm.
§0 — Marketplace & on-chain (individual actions)
§0b — Full round-trips (buy→use→commit, and make→use→commit)
§B — Wallet, cloud & session sync
§C — RPC onboarding (efficiency UX)
§1 — RPC / indexer / gateway: are all calls efficient & routed right?
§2 — Cross-device continuity (GitHub-based, including skills)
(light) Chat core — already used daily; just sanity-check
Approval flow intentionally excluded here (in steady daily use).
➡️ After all the above passes (VSCode): CLI / mobile UX parity report
For each unit tested above, an AI produces a report (analysis only, no manual runs): does the same capability exist in CLI and mobile, and how is its UX surfaced there (or missing)? Output = a per-feature Ă— per-surface matrix with UX notes + gaps.
§0 — Marketplace & on-chain (individual actions)
◆ ON-CHAINbadge) / SKILL.md body / price in SOL (default 0.1) → mints, creator auto-owns 1publish_skillMCP tool (compare the result to ①)post_skill_comment/post_agent_commentvia MCPpublish_skillwork end to end§0b — Full round-trips (buy→use→commit, and make→use→commit)
§B — Wallet, cloud & session sync
mysessions) written once per new session§C — RPC onboarding (efficiency UX)
rpcStatusbadge reflects devnet/mainnet + key state§1 — RPC / indexer / gateway: are all calls efficient & routed right?
§2 — Cross-device continuity (GitHub-based, including skills)
(light) Chat core — already used daily; just sanity-check
➡️ After all the above passes (VSCode): CLI / mobile UX parity report
For each unit tested above, an AI produces a report (analysis only, no manual runs): does the same capability exist in CLI and mobile, and how is its UX surfaced there (or missing)? Output = a per-feature Ă— per-surface matrix with UX notes + gaps.