Production-ready DevOps & SRE skills for Claude Code and Codex CLI.
Find idle and underutilized cloud resources, safely isolate and observe them, and decommission the real zombies — reducing cloud spend without breaking production.
flowchart LR
A[Discover] --> B[Filter] --> C[Score] --> D[Deep Scan]
D --> E[Report] --> F[Isolate] --> G[Observe]
G --> H[Decide] --> I[Backup] --> J[Cleanup]
Multi-cloud Idle Detection — Scan 10+ resource types across any cloud provider. Per-type signals: CPU/network/login for compute, connections/QPS for databases, IOPS/throughput for storage, replica status for Kubernetes workloads. No vendor lock-in — works with any cloud API or plain SSH.
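As a rough illustration of the per-type signals described above, the sketch below flags a resource as an idle candidate only when every signal for its type is at or below a threshold. The `Resource` class, the `IDLE_RULES` table, the metric names, and every threshold value are hypothetical and chosen for illustration; this README does not state the plugin's actual cut-offs or data model.

```python
from dataclasses import dataclass

# Hypothetical thresholds, for illustration only; the plugin's real
# cut-offs are not documented in this README.
IDLE_RULES = {
    "compute": {"cpu_pct": 5.0, "network_kbps": 10.0, "logins_30d": 0},
    "database": {"connections": 1, "qps": 0.1},
    "storage": {"iops": 1.0, "throughput_kbps": 1.0},
}


@dataclass
class Resource:
    name: str
    kind: str      # "compute", "database", or "storage" in this sketch
    metrics: dict  # metric name -> observed value


def idle_signals(resource):
    """Return the metric names that are at or below their idle threshold."""
    rules = IDLE_RULES.get(resource.kind, {})
    return [
        metric
        for metric, limit in rules.items()
        if metric in resource.metrics and resource.metrics[metric] <= limit
    ]


def is_idle_candidate(resource):
    """A resource is a candidate only if every signal for its kind is idle."""
    rules = IDLE_RULES.get(resource.kind)
    if not rules:
        return False  # unknown type: never flag automatically
    return len(idle_signals(resource)) == len(rules)
```

Requiring all signals to agree is a deliberately conservative choice in this sketch: a database with zero QPS but dozens of open connections is not flagged.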
Deep Technical Profiling — SSH into candidate resources to capture real-time traffic topology graphs, running services, cron jobs, disk usage, and business ownership from cloud tags, CMDB, and audit logs.
Safe Isolation & Auto-Rollback — Type-specific isolation methods (iptables, security groups, scale-to-zero). Environment-appropriate observation periods with anomaly detection. Automatic rollback on critical alerts.
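The isolate, observe, and roll-back flow can be pictured with a minimal sketch like the one below. It assumes the caller supplies an `isolate` callback, a `rollback` callback, and a stream of alert severities collected during the observation period; the function name `observe_with_rollback`, the severity strings, and the return values are all hypothetical, not the plugin's interface.

```python
from typing import Callable, Iterable


def observe_with_rollback(
    isolate: Callable[[], None],
    rollback: Callable[[], None],
    alerts: Iterable[str],
    critical: frozenset = frozenset({"critical"}),
) -> str:
    """Isolate a resource, then read alert severities for the observation
    period and undo the isolation on the first critical one."""
    isolate()
    for severity in alerts:
        if severity in critical:
            rollback()
            return "rolled_back"
    # The alert stream ended without a critical alert.
    return "observation_passed"
```

For example, passing `["info", "critical"]` as `alerts` would call `isolate` once, then `rollback` once, and return `"rolled_back"`.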
Pre-deletion Safety — Per-resource-type backups with encryption for sensitive data. Seven-point re-evaluation before any destructive action. Human approval required at every critical gate.
Interactive Reports & Cost Tracking — Self-contained HTML reports with per-resource service topology graphs, process tables, storage usage, and cost savings analysis. Full audit trail for every action.
# Claude Code
claude plugin install ./plugins/ico
# Or via marketplace
claude plugin marketplace add https://github.com/KnoxOps/open-devops-skills
claude plugin install ico@open-devops-skills
# Codex CLI
codex plugin marketplace add https://github.com/KnoxOps/open-devops-skills
codex plugin add ico@open-devops-skills

Once installed, ask Claude Code or Codex:
/ico:orchestrator Scan my cloud account for idle and underutilized resources
/ico:orchestrator Deep scan 3 hosts and show traffic topology
/ico:orchestrator Which resources can I safely downsize or decommission?
We plan to grow this into a full SRE skill library. Vote or contribute to shape what comes next.
| Area | Ideas |
|---|---|
| Cost Optimization | Reserved Instance analysis, Savings Plan audit, spot instance recommendations |
| Kubernetes | Pod troubleshooting, OOM diagnosis, cluster right-sizing, orphaned resource detection |
| Security | Secret sprawl detection, IAM permission audit, exposed resource scanning |
| Monitoring | Coverage gap analysis, alert noise reduction, dashboard health scoring |
Skills are organized as plugins under plugins/<namespace>/. Each plugin has unified Claude Code and Codex manifests plus a set of runbook-defined skills.
- Fork the repo
- Create `plugins/<your-plugin>/` with `.claude-plugin/plugin.json` and `.codex-plugin/plugin.json`
- Define skills as `agent-runbooks/<skill>/runbook.yaml`
- Run `python scripts/gen-skill plugins/<your-plugin>/agent-runbooks/<skill>/runbook.yaml --lang en`
- Add your plugin to `.claude-plugin/marketplace.json`
- Open a PR
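Before opening a PR, it may help to check that a new plugin directory has the files named in the contribution steps above. The helper below is a hypothetical pre-flight check, not a script shipped with the repository; it only tests that the two manifests and at least one `runbook.yaml` exist, and does not validate their contents.

```python
from pathlib import Path

# Manifest paths taken from the contribution steps; the check itself is
# an illustrative assumption, not part of the repository's tooling.
REQUIRED_MANIFESTS = (
    ".claude-plugin/plugin.json",
    ".codex-plugin/plugin.json",
)


def missing_plugin_files(plugin_dir):
    """Return the expected files that are absent from a plugin directory."""
    root = Path(plugin_dir)
    missing = [rel for rel in REQUIRED_MANIFESTS if not (root / rel).is_file()]
    # At least one skill runbook is expected under agent-runbooks/<skill>/.
    if not list(root.glob("agent-runbooks/*/runbook.yaml")):
        missing.append("agent-runbooks/<skill>/runbook.yaml")
    return missing
```

Calling `missing_plugin_files("plugins/<your-plugin>")` with the real plugin path returns an empty list when the layout is complete.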
MIT