Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ Development skills for AI coding agents. Plug into your favorite AI coding tool
| `minimax-music-gen` | Generate vocal songs, instrumentals, and covers using MiniMax Music API. Two modes: Basic (one-liner in, song out) and Advanced Control (edit lyrics, refine prompt, plan structure). Supports lyrics generation, style vocabulary, streaming playback, and iterative feedback. | Official |
| `buddy-sings` | Let your Claude Code pet (/buddy) sing a personalized song. Interprets the pet's name and personality into a unique cached vocal identity, auto-gathers context (conversation, memory, git history) for themed lyrics, and generates music via minimax-music-gen. | Official |
| `minimax-music-playlist` | Generate personalized playlists by analyzing your music taste. Builds a taste profile (genre, mood, language, vocal preferences), plans a themed tracklist, generates songs with album cover art, and refines the profile from feedback. | Official |
| `mac-desktop-autopilot` | macOS local desktop automation: screenshot capture, mouse/keyboard control via pyautogui, and file upload to CDN for phone sharing. Triggers: 截图、截屏、截个图、控制电脑、发文件到手机。 | Community |

## Installation

Expand Down
1 change: 1 addition & 0 deletions README_zh.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
| `minimax-music-gen` | 使用 MiniMax Music API 生成人声歌曲、纯音乐和翻唱。支持基础模式(一句话生成)和强控制模式(编辑歌词、调整 prompt、规划曲式)。内置歌词生成、风格词表、流式播放和迭代反馈。 | Official |
| `buddy-sings` | 让你的 Claude Code 宠物(/buddy)唱一首专属歌曲。根据宠物名字和个性生成独特声线并缓存,自动采集上下文(对话、记忆、git 历史)生成主题歌词,调用 minimax-music-gen 完成创作。 | Official |
| `minimax-music-playlist` | 分析用户音乐品味生成个性化歌单。构建音乐画像(曲风、情绪、语言、声线偏好),规划主题曲目,生成歌曲与专辑封面,根据反馈持续优化画像。 | Official |
| `mac-desktop-autopilot` | macOS 本地桌面自动化:截图、鼠标键盘控制(pyautogui)、文件上传到 CDN 分享到手机。触发词:截图、截屏、控制电脑、发文件到手机。 | Community |

## 安装

Expand Down
127 changes: 127 additions & 0 deletions skills/mac-desktop-autopilot/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,127 @@
---
name: mac-desktop-autopilot
description: |
macOS local desktop automation: screenshot capture, mouse/keyboard control, and file upload to CDN for phone sharing.
Use when: user asks to take a screenshot, capture screen, control the desktop, click a button, open an app, see what's on screen,
or send a file from computer to phone. Trigger phrases: 截图、截屏、截个图、控制电脑、操控电脑、发文件到手机、点一下、帮我打开xxx、看看屏幕现在是什么。
Do NOT use for: web browser automation (use Playwright MCP instead) or remote desktop (SSH/VNC).
metadata:
version: "1.0"
category: productivity
license: MIT
platform: macOS
required_permissions:
- Screen Recording
- Accessibility
dependencies:
- pyautogui
- screencapture (built-in macOS)
sources:
- macOS built-in screencapture
- pyautogui (pip)
---

# macOS Desktop Autopilot

Local macOS desktop automation via `screencapture` (screenshot) + `pyautogui` (mouse/keyboard) + Matrix CDN (file sharing).

## Prerequisites

1. **Screen Recording permission**: System Settings → Privacy & Security → Screen Recording → authorize Terminal or MiniMax Agent
2. **Accessibility permission**: System Settings → Privacy & Security → Accessibility → authorize Terminal or MiniMax Agent
3. **pyautogui installed**: `pip3 install pyautogui --default-timeout=120`

## Procedure

### Screenshot

```bash
screencapture /tmp/screenshot.png
```

After capturing, use `describe_images` tool to read and describe the screenshot. For phone sharing, upload to CDN.

### Mouse Operations

```python
import pyautogui, time
pyautogui.FAILSAFE = True # move to corner to abort

# Move to coordinate (pixels, origin = top-left)
pyautogui.moveTo(x, y, duration=0.3)

# Left-click
pyautogui.click(x, y)

# Double-click
pyautogui.doubleClick(x, y)

# Right-click
pyautogui.rightClick(x, y)

# Drag
pyautogui.dragTo(x2, y2, duration=0.5)

# Scroll
pyautogui.scroll(-3) # scroll down 3 units
```

> Why FAILSAFE: moving the cursor to any screen corner triggers an emergency stop to prevent runaway automation.

### Keyboard Operations

```python
# Type text (cursor must already be in text field)
pyautogui.typewrite("Hello", interval=0.05)

# Press a key
pyautogui.press("enter")
pyautogui.press("escape")
pyautogui.press("cmd", "s") # Cmd+S

# Combo keys
pyautogui.hotkey("cmd", "c") # copy
pyautogui.hotkey("cmd", "v") # paste
pyautogui.hotkey("cmd", "a") # select all
```

### Send File to Phone

1. Upload file via `matrix_upload_to_cdn` → get CDN URL
2. Share the URL with user → download on phone

### OCR / Read Screen Content

```bash
screencapture /tmp/ocr_screen.png
```
Then analyze with `describe_images` tool to extract text.

## Output Contract

- **Screenshot**: display image directly or share CDN link
- **Mouse/keyboard**: take a confirmation screenshot immediately after action
- **File upload**: return CDN download link

## Failure Handling

| Error | Cause | Fix |
|-------|-------|-----|
| "could not create image" | Missing Screen Recording permission | User authorizes in System Settings |
| pyautogui runaway | FAILSAFE not enabled | Move cursor to screen corner to stop |
| Permission denied | Missing Accessibility permission | User authorizes in System Settings → Privacy → Accessibility |
| pyautogui not installed | pip install failed | Retry `pip3 install pyautogui --default-timeout=120` |

## Examples

**Screenshot to phone**
User: "截个图发给我"
→ `screencapture /tmp/page.png` → `matrix_upload_to_cdn(/tmp/page.png)` → return download link

**Open an app**
User: "帮我打开微信"
→ Take screenshot → OCR locate Dock → click WeChat icon

**What's on screen**
User: "现在屏幕是什么"
→ Screenshot → `describe_images` analyze → describe screen content