Skip to content

primatrix/skills

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

298 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Primatrix Skills Plugin Marketplace

A Claude Code plugin marketplace containing reusable Agent Skills. Also compatible with Codex and Gemini CLI.

What is a Skill?

A Skill is a set of structured instructions (defined in a SKILL.md file) that teaches an AI coding agent how to perform a specific workflow. When installed, the agent loads relevant skills based on your request, giving it domain-specific knowledge and step-by-step procedures.

Available Skills

Skill Description
exec-remote Execute Python scripts on remote GPU/TPU clusters via SkyPilot
beaver Beaver issue tracker: project setup, issue creation, and PR workflows for GitHub Projects V2
session-recorder Records the complete session content to a daily work directory
lint-fix Check and fix lint issues for changed Python files
xprof-profiling-analysis Analyze TPU/XLA profiling data (xprof, xplane, trace) for performance optimization

exec-remote

Execute Python scripts on remote GPU/TPU clusters via SkyPilot.

Use when: the user asks to run code on GPU, TPU, or any remote cluster.

This plugin contains three skills with a parent-child relationship:

exec-remote          ← Entry point: run scripts on a provisioned cluster
├── deploy-cluster   ← Deploy a SkyPilot-managed TPU cluster on GKE
└── apply-resource   ← Provision/manage the underlying GKE TPU cluster via xpk

exec-remote is the top-level skill. When a cluster doesn't exist yet, it delegates to deploy-cluster, which in turn delegates to apply-resource to create the GKE infrastructure.

Skill Description
exec-remote Executes Python scripts, tests, or benchmarks on a provisioned remote cluster (GPU or TPU). Entry point — delegates to the sub-skills as needed.
deploy-cluster Deploys a SkyPilot-managed TPU cluster on GKE. Generates ~/.sky/config.yaml, fetches GKE credentials, and runs sky launch.
apply-resource Manages GKE TPU clusters using xpk. Creates, deletes, and lists TPU Nodepool resources. Multi-user safe — always queries GKE in real-time.

Capabilities:

  • Provision GPU clusters (H100, A100, L4, etc.) or TPU clusters (v4, v6e, etc.) on GCP
  • Execute Python scripts and pytest tests on remote instances
  • Automatically sync local working directory to the remote cluster
  • Manage full cluster lifecycle (create GKE cluster → deploy SkyPilot → execute → teardown)

Prerequisites:

  • SkyPilot installed and configured
  • xpk installed (for GKE TPU cluster management)
  • Google Cloud SDK with gcloud auth login completed
  • kubectl with gke-gcloud-auth-plugin
  • uv for dependency management

GKE TPU Getting Started

For running code on TPU via GKE (the full apply-resource → deploy-cluster → exec-remote pipeline), additional tools are required:

Tool Install Verify
xpk Install guide xpk --help
kubectl gcloud components install kubectl kubectl version --client

After installing the plugin (see Installation), open your AI agent in your project directory and paste the prompt below.

Replace YOUR_REPO_PATH/sglang-jax/benchmark/moe/bench_ep_moe.py with your actual script path.

COPY THIS PROMPT

[Context] I'm working on a JAX-based ML project with pyproject.toml that has a tpu extra dependency group. No remote cluster exists yet — .cluster_name_tpu is absent. The exec-remote plugin provides a three-stage pipeline: apply-resource (creates GKE cluster via xpk cluster create-pathways --spot) → deploy-cluster (deploys SkyPilot on GKE via its scripts/deploy.py) → exec-remote (runs code via sky exec). The GCP project is tpu-service-473302. The deploy script writes .cluster_name_tpu as the integration point between stages.

[Objective] Run Full CI Tests parallel on a TPU cluster by provisioning the full GKE infrastructure from scratch, following the complete apply-resource → deploy-cluster → exec-remote pipeline.

[Style] Step-by-step automated execution. Collect all cluster parameters from me once upfront (cluster name, TPU type, number of slices, GCP zone), then carry them through every subsequent step — never re-ask. Auto-calculate --num-nodes from TPU type (total_chips / 4, e.g. v6e-8 = 2 nodes, v6e-4 = 1 node). After xpk creates the GKE cluster, poll gcloud container clusters list until status is RUNNING — do NOT proceed while RECONCILING or PROVISIONING (deploying SkyPilot in these states causes SSL errors).

[Tone] Proactive — execute each pipeline step automatically and report progress. Only pause to collect user input during initial parameter gathering.

[Audience] ML engineer who wants to run training code on cloud TPU without manually managing infrastructure.

[Response] After each stage (GKE creation, SkyPilot deployment, code execution), briefly report the result and verify success before proceeding. Use sky exec with --extra tpu for dependencies and --workdir . to sync the local directory. If any step fails, diagnose the error and suggest a fix before continuing.

The agent will ask you for cluster parameters (cluster name, TPU type, number of slices, GCP zone) once, then execute the full pipeline automatically:

apply-resource   →  Creates GKE cluster with TPU nodepool via xpk
                     Polls until cluster status is RUNNING
deploy-cluster   →  Configures and launches SkyPilot on GKE
                     Writes .cluster_name_tpu
exec-remote      →  Syncs local directory, runs your script
                     with correct --num-nodes and --extra tpu

beaver

Beaver issue tracker for GitHub Projects V2 — project setup, issue creation, and PR workflows.

Use when: the user wants to create a GitHub Project V2 with structured issue tracking, create Beaver-tracked issues (Goal/Task/SubTask), or open PRs linked to Beaver issues.

This plugin contains one command and two skills:

create-beaver-project  ← Command: set up a new GitHub Project V2 with Beaver config
beaver-pr              ← Skill: commit, push, and open a PR linked to a Beaver issue
create-beaver-issue    ← Skill: create a Beaver-tracked GitHub Issue with Project V2 fields
Skill / Command Description
create-beaver-project Creates a GitHub Project V2 with custom fields (Level, Status, Progress), a beaver-config README block, and initializes the issue repo with issue types, labels, and milestones.
create-beaver-issue Creates a Beaver-tracked GitHub Issue (Goal/Task/SubTask) with automatic Project V2 field setup, tracking rules, and parent issue linking.
beaver-pr Commits changes, pushes the branch, and opens a GitHub PR with optional Beaver issue association.

Capabilities:

  • Create GitHub Project V2 with standardized custom fields (Level, Status, Progress)
  • Create structured issues with three hierarchy levels: Goal, Task, SubTask
  • Automatic Project V2 field setup and tracking rule configuration
  • PR workflow with optional Beaver issue linking
  • Issue body templates with Chinese (中文) convention for 目标/验收标准 sections
  • Milestone management with weekly milestone auto-generation

Prerequisites:

  • GitHub CLI (gh) installed and authenticated
  • Token scopes: project and admin:org (for issue types)
  • Organization-level GitHub Projects V2 access

lint-fix

Check and fix lint issues for changed Python files. Supports single commit, commit range, and unstaged/staged working tree changes.

Use when: the user wants to verify or fix lint compliance for specific changes.

Capabilities:

  • Lint files changed in a single commit or a range of commits
  • Lint unstaged or staged working tree changes
  • Auto-fix issues using isort, ruff, black, and codespell
  • Manually fix remaining issues that cannot be auto-resolved
  • Stage fixes for user review before committing

Supported Linters:

Prerequisites:

  • A git repository with Python files
  • Linter tools installed (isort, ruff, black, codespell)

xprof-profiling-analysis

Comprehensive methodology for analyzing TPU/XLA profiling data: from trace parsing to performance bottleneck identification and optimization.

Use when: analyzing TPU/XLA profiling data (xprof, trace.json.gz, op_stats, xplane), understanding HLO op performance, backward pass structure (gmm/tgmm), MFU calculation for MoE models, communication bottleneck identification, or comparing GPU vs TPU training performance.

Capabilities:

  • Parse and analyze trace.json.gz and xplane.pb profiling data
  • Classify HLO ops (matmul, attention, communication, custom kernels, etc.)
  • Identify forward/backward pass structure using tf_op fields
  • Calculate MFU for dense and MoE models
  • Analyze communication patterns (AllGather, ReduceScatter, AllReduce, AllToAll)
  • Diagnose communication overlap issues via async-done stall analysis
  • Map XLA flags to performance symptoms
  • Detect truncated traces (1M event limit) and fall back to xplane.pb

Covers:

  • Trace event structure and TPU device identification
  • HLO operator classification rules
  • GMM/TGMM backward pass structure for MoE models
  • MFU formula and MoE-specific considerations
  • Communication primitive to parallelism strategy mapping
  • XLA flags reference (Continuation Fusion, SparseCore offload, DP overlap)
  • Roofline and system bound analysis
  • Common pitfalls and anti-patterns

Prerequisites:

  • TPU profiling data (trace.json.gz, xplane.pb, or op_stats_v2.pb)
  • Python with TensorFlow/JAX for xplane.pb parsing

Installation

Claude Code

Install plugins via the plugin marketplace:

# add this repository as a community marketplace
/plugin marketplace add primatrix/skills

# install plugins from the marketplace
/plugin install exec-remote@primatrix-skills
/plugin install beaver@primatrix-skills
/plugin install lint-fix@primatrix-skills
/plugin install xprof-profiling-analysis@primatrix-skills

# project scope (default is user scope)
/plugin install exec-remote@primatrix-skills --scope project
/plugin install beaver@primatrix-skills --scope project

Codex

Install skills via skills.sh:

npx skills add primatrix/skills@exec-remote -a codex
npx skills add primatrix/skills@beaver -a codex
npx skills add primatrix/skills@lint-fix -a codex
npx skills add primatrix/skills@xprof-profiling-analysis -a codex

Gemini CLI

Install skills via the built-in skill commands:

# install from GitHub (user scope by default: ~/.gemini/skills)
gemini skills install https://github.com/primatrix/skills.git --path plugins/exec-remote/skills/exec-remote
gemini skills install https://github.com/primatrix/skills.git --path plugins/beaver/skills/beaver-pr
gemini skills install https://github.com/primatrix/skills.git --path plugins/beaver/skills/create-beaver-issue
gemini skills install https://github.com/primatrix/skills.git --path plugins/lint-fix/skills/lint-fix
gemini skills install https://github.com/primatrix/skills.git --path plugins/xprof-profiling-analysis/skills/xprof-profiling-analysis

# project/workspace scope (.gemini/skills in current project)
gemini skills install https://github.com/primatrix/skills.git --path plugins/exec-remote/skills/exec-remote --scope workspace
gemini skills install https://github.com/primatrix/skills.git --path plugins/beaver/skills/beaver-pr --scope workspace

Cross-platform (skills.sh)

skills.sh is a universal package manager that works across Claude Code, Codex, and Gemini CLI:

npx skills add primatrix/skills@exec-remote
npx skills add primatrix/skills@beaver
npx skills add primatrix/skills@lint-fix
npx skills add primatrix/skills@xprof-profiling-analysis

# install to a specific agent
npx skills add primatrix/skills -a claude-code
npx skills add primatrix/skills -a codex

Verify installation

# Claude Code — start a session and run:
/exec-remote
/beaver-pr
/create-beaver-issue
/lint-fix

# Codex — start a session and run:
/skills

# Gemini CLI
gemini skills list

Repository Structure

This repository is a Claude Code plugin marketplace. Each plugin wraps one or more skills.

.claude-plugin/
└── marketplace.json          # Marketplace definition (plugin registry)
plugins/
├── <plugin-name>/
│   ├── .claude-plugin/
│   │   └── plugin.json       # Plugin manifest (name, description, version)
│   └── skills/
│       └── <skill-name>/
│           ├── SKILL.md      # Skill definition (frontmatter + instructions)
│           └── scripts/      # Optional supporting scripts

The SKILL.md file contains:

  • YAML frontmattername, description, and optional metadata
  • Markdown body — detailed instructions the agent follows when the skill is activated

Local Development

When developing or modifying skills locally, you need a way to test changes before pushing to GitHub. The plugin marketplace installs from the remote repository, so local edits won't take effect by default.

The solution is to symlink the plugin cache to your local working directory:

Setup

# 1. Clone the repository
git clone https://github.com/primatrix/skills.git
cd skills

# 2. Install the plugin from the marketplace first (this sets up the registry)
#    In a Claude Code session:
/plugin marketplace add primatrix/skills
/plugin install exec-remote@primatrix-skills

# 3. Replace the cache with a symlink to your local directory
rm -rf ~/.claude/plugins/cache/primatrix-skills/exec-remote/1.0.0
ln -s /path/to/skills/plugins/exec-remote \
      ~/.claude/plugins/cache/primatrix-skills/exec-remote/1.0.0

# 4. Restart Claude Code session, then verify
/plugin

Repeat step 3 for each plugin you want to develop locally:

rm -rf ~/.claude/plugins/cache/primatrix-skills/<plugin-name>/1.0.0
ln -s /path/to/skills/plugins/<plugin-name> \
      ~/.claude/plugins/cache/primatrix-skills/<plugin-name>/1.0.0

Workflow

Once the symlink is in place, local edits are reflected immediately (after restarting the session):

Local edit → Restart Claude Code → /exec-remote → Verify

Caveats

  • Do not run /plugin install again — it will overwrite the symlink with a fresh copy from GitHub.
  • Restart the session after making changes — plugins are loaded at session startup.
  • plugin.json only needs name, description, and version. Do not add a skills field — Claude Code auto-discovers skills from the skills/ subdirectory.

Contributing

To add a new plugin:

  1. Create a new directory under plugins/ with your plugin name
  2. Add .claude-plugin/plugin.json with name, description, and version
  3. Add skills under skills/<skill-name>/SKILL.md with proper frontmatter and instructions
  4. Include any supporting scripts or templates in the skill directory
  5. Register the plugin in .claude-plugin/marketplace.json
  6. Submit a pull request

License

Apache License 2.0


session-recorder

Records the complete session's content and logs it to a daily work directory with a dynamic filename.

Use when: the user wants to record their full progress for documentation purposes.

Capabilities:

  • Record complete, unedited session history (questions and answers)
  • Automatically exclude cancelled operations
  • Generate dynamic log files named {$cli-name}-session.md
  • Organize logs by date in a YYYY-MM-DD directory structure

Prerequisites:

  • Python 3.x installed
  • Base directory for logs set to /Users/leos/python/daily_work

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors