diff --git a/README.md b/README.md
index 12164fda..e09271c2 100644
--- a/README.md
+++ b/README.md
@@ -27,6 +27,7 @@ Development skills for AI coding agents. Plug into your favorite AI coding tool
 | `minimax-music-gen` | Generate vocal songs, instrumentals, and covers using MiniMax Music API. Two modes: Basic (one-liner in, song out) and Advanced Control (edit lyrics, refine prompt, plan structure). Supports lyrics generation, style vocabulary, streaming playback, and iterative feedback. | Official |
 | `buddy-sings` | Let your Claude Code pet (/buddy) sing a personalized song. Interprets the pet's name and personality into a unique cached vocal identity, auto-gathers context (conversation, memory, git history) for themed lyrics, and generates music via minimax-music-gen. | Official |
 | `minimax-music-playlist` | Generate personalized playlists by analyzing your music taste. Builds a taste profile (genre, mood, language, vocal preferences), plans a themed tracklist, generates songs with album cover art, and refines the profile from feedback. | Official |
+| `mac-desktop-autopilot` | macOS local desktop automation: screenshot capture, mouse/keyboard control via pyautogui, and file upload to CDN for phone sharing. Triggers: 截图、截屏、截个图、控制电脑、发文件到手机。 | Community |
 
 ## Installation
 
diff --git a/README_zh.md b/README_zh.md
index b4b7f5f4..66afe7a9 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -27,6 +27,7 @@
 | `minimax-music-gen` | 使用 MiniMax Music API 生成人声歌曲、纯音乐和翻唱。支持基础模式(一句话生成)和强控制模式(编辑歌词、调整 prompt、规划曲式)。内置歌词生成、风格词表、流式播放和迭代反馈。 | Official |
 | `buddy-sings` | 让你的 Claude Code 宠物(/buddy)唱一首专属歌曲。根据宠物名字和个性生成独特声线并缓存,自动采集上下文(对话、记忆、git 历史)生成主题歌词,调用 minimax-music-gen 完成创作。 | Official |
 | `minimax-music-playlist` | 分析用户音乐品味生成个性化歌单。构建音乐画像(曲风、情绪、语言、声线偏好),规划主题曲目,生成歌曲与专辑封面,根据反馈持续优化画像。 | Official |
+| `mac-desktop-autopilot` | macOS 本地桌面自动化:截图、鼠标键盘控制(pyautogui)、文件上传到 CDN 分享到手机。触发词:截图、截屏、控制电脑、发文件到手机。 | Community |
 
 ## 安装
 
diff --git a/skills/mac-desktop-autopilot/SKILL.md b/skills/mac-desktop-autopilot/SKILL.md
new file mode 100644
index 00000000..418a6f78
--- /dev/null
+++ b/skills/mac-desktop-autopilot/SKILL.md
@@ -0,0 +1,127 @@
+---
+name: mac-desktop-autopilot
+description: |
+  macOS local desktop automation: screenshot capture, mouse/keyboard control, and file upload to CDN for phone sharing.
+  Use when: user asks to take a screenshot, capture screen, control the desktop, click a button, open an app, see what's on screen,
+  or send a file from computer to phone. Trigger phrases: 截图、截屏、截个图、控制电脑、操控电脑、发文件到手机、点一下、帮我打开xxx、看看屏幕现在是什么。
+  Do NOT use for: web browser automation (use Playwright MCP instead) or remote desktop (SSH/VNC).
+metadata:
+  version: "1.0"
+  category: productivity
+  license: MIT
+  platform: macOS
+  required_permissions:
+    - Screen Recording
+    - Accessibility
+  dependencies:
+    - pyautogui
+    - screencapture (built-in macOS)
+  sources:
+    - macOS built-in screencapture
+    - pyautogui (pip)
+---
+
+# macOS Desktop Autopilot
+
+Local macOS desktop automation via `screencapture` (screenshot) + `pyautogui` (mouse/keyboard) + Matrix CDN (file sharing).
+
+## Prerequisites
+
+1. **Screen Recording permission**: System Settings → Privacy & Security → Screen Recording → authorize Terminal or MiniMax Agent
+2. **Accessibility permission**: System Settings → Privacy & Security → Accessibility → authorize Terminal or MiniMax Agent
+3. **pyautogui installed**: `pip3 install pyautogui --default-timeout=120`
+
+## Procedure
+
+### Screenshot
+
+```bash
+screencapture /tmp/screenshot.png
+```
+
+After capturing, use the `describe_images` tool to read and describe the screenshot. For phone sharing, upload the file to the CDN.
+
+### Mouse Operations
+
+```python
+import pyautogui, time
+pyautogui.FAILSAFE = True  # move the cursor to a screen corner to abort
+
+# Move to coordinate (pixels, origin = top-left); x, y, x2, y2 are placeholders
+pyautogui.moveTo(x, y, duration=0.3)
+
+# Left-click
+pyautogui.click(x, y)
+
+# Double-click
+pyautogui.doubleClick(x, y)
+
+# Right-click
+pyautogui.rightClick(x, y)
+
+# Drag from the current position to (x2, y2)
+pyautogui.dragTo(x2, y2, duration=0.5)
+
+# Scroll
+pyautogui.scroll(-3)  # scroll down 3 units
+```
+
+> Why FAILSAFE: moving the cursor to any screen corner triggers an emergency stop to prevent runaway automation.
+
+### Keyboard Operations
+
+```python
+# Type text (the cursor must already be in a text field)
+pyautogui.typewrite("Hello", interval=0.05)
+
+# Press a single key
+pyautogui.press("enter")
+pyautogui.press("escape")
+
+# Key combinations go through hotkey(); press() takes one key name per press
+pyautogui.hotkey("cmd", "s")  # save
+pyautogui.hotkey("cmd", "c")  # copy
+pyautogui.hotkey("cmd", "v")  # paste
+pyautogui.hotkey("cmd", "a")  # select all
+```
+
+### Send File to Phone
+
+1. Upload file via `matrix_upload_to_cdn` → get CDN URL
+2. Share the URL with the user → download on phone
+
+### OCR / Read Screen Content
+
+```bash
+screencapture /tmp/ocr_screen.png
+```
+Then analyze the capture with the `describe_images` tool to extract text.
+
+## Output Contract
+
+- **Screenshot**: display image directly or share CDN link
+- **Mouse/keyboard**: take a confirmation screenshot immediately after the action
+- **File upload**: return CDN download link
+
+## Failure Handling
+
+| Error | Cause | Fix |
+|-------|-------|-----|
+| "could not create image" | Missing Screen Recording permission | User authorizes in System Settings → Privacy & Security → Screen Recording |
+| pyautogui runaway | `pyautogui.FAILSAFE` disabled | Set `pyautogui.FAILSAFE = True`, then move the cursor to a screen corner to stop |
+| Permission denied | Missing Accessibility permission | User authorizes in System Settings → Privacy & Security → Accessibility |
+| pyautogui not installed | pip install failed | Retry `pip3 install pyautogui --default-timeout=120` |
+
+## Examples
+
+**Screenshot to phone**
+User: "截个图发给我"
+→ `screencapture /tmp/page.png` → `matrix_upload_to_cdn(/tmp/page.png)` → return download link
+
+**Open an app**
+User: "帮我打开微信"
+→ Take screenshot → OCR locate Dock → click WeChat icon
+
+**What's on screen**
+User: "现在屏幕是什么"
+→ Screenshot → `describe_images` analyze → describe screen content
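The Output Contract asks for a confirmation screenshot immediately after every mouse or keyboard action, but the skill does not show how the two steps are tied together. The following is a minimal sketch of one way to do it, not part of the PR. The helper names `screencapture_cmd`, `in_bounds`, and `click_and_confirm` are hypothetical, and so is the choice to pass the click function in as a parameter (it keeps the helper importable on machines without `pyautogui`). The `-x` flag is `screencapture`'s standard "do not play sounds" option.

```python
import subprocess
from typing import Callable, List, Tuple


def screencapture_cmd(path: str, silent: bool = True) -> List[str]:
    """Build argv for macOS's built-in screencapture (-x mutes the shutter sound)."""
    cmd = ["screencapture"]
    if silent:
        cmd.append("-x")
    cmd.append(path)
    return cmd


def in_bounds(x: int, y: int, screen_size: Tuple[int, int]) -> bool:
    """True if (x, y) lies on a screen of (width, height); origin is top-left."""
    width, height = screen_size
    return 0 <= x < width and 0 <= y < height


def click_and_confirm(
    x: int,
    y: int,
    shot_path: str,
    click: Callable[[int, int], None],
    screen_size: Tuple[int, int],
    run: Callable[..., object] = subprocess.run,
) -> str:
    """Click at (x, y), then capture a confirmation screenshot and return its path."""
    if not in_bounds(x, y, screen_size):
        raise ValueError(
            f"({x}, {y}) is outside the {screen_size[0]}x{screen_size[1]} screen"
        )
    click(x, y)  # hypothetical injection point, e.g. pyautogui.click
    run(screencapture_cmd(shot_path), check=True)  # raises if the capture fails
    return shot_path
```

On a Mac with both permissions granted, this could presumably be called as `click_and_confirm(400, 300, "/tmp/confirm.png", click=pyautogui.click, screen_size=pyautogui.size())`, with the returned path handed to `describe_images` or `matrix_upload_to_cdn`. Rejecting out-of-range coordinates up front is a design choice made here to fail before the click rather than after it.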