AI Agent Map is a practical, visual-first guide for comparing mainstream AI agents, agent platforms, runtimes, and orchestration tools.
The goal is simple: help readers get to a sensible shortlist faster.
- The agent landscape is crowded.
- Many resources explain ideas, but not fit, anti-fit, or operating cost.
- People usually need a comparison layer, not another pile of links.
This repo stays focused on selection: what a system is good at, where it breaks down, and what kind of operator cost comes with it.
Popularity is not fit.
This table tracks the projects with the largest star gains in the latest weekly GitHub snapshot. Rank follows the 7-day gain; total star counts were checked when this repo was updated.
Last updated: 2026-05-23 · Snapshot window: 2026-05-16 → 2026-05-22 (7-day gain, approximate) · Star counts: checked at update time
Project names link to the upstream GitHub repo. When this map has a written profile, it is linked separately in the "Map status" column.
| Rank | Project | Current stars | Snapshot gain | Map status | How to read it |
|---|---|---|---|---|---|
| #1 | mattpocock/skills | 100.9k | +7,600 | Watchlist (Skills Wave) | Matt Pocock's curated .claude/skills directory crossed 100k this week — still the largest week-over-week gainer among already-tracked entries |
| #2 (new) | anthropics/skills | 139.3k | ~+7,000 | Watchlist (Skills Wave canonical) | Anthropic's own reference .claude/skills repo — the upstream source of the wave; tracked as content asset, not agent surface |
| #3 | openhuman | 25.7k | +6,300 | In scope · profile (new) | Second straight week of strong growth — open-source desktop life-integration agent (Rust + Tauri, GPL-3.0); positioning now reads clearly enough to write up |
| #4 (new) | colbymchenry/codegraph | 16.4k | ~+6,000 | In scope · profile (new) | Pre-indexed code knowledge graph + MCP server targeting Claude Code, Cursor, Codex CLI, opencode, Hermes Agent — first agent-context-infrastructure entry |
| #5 | Hermes Agent | 163.1k | +5,800 | In scope · profile | Fifth consecutive week of growth — still the in-scope absolute leader (now past 163k) |
| #6 (new) | HKUDS/CLI-Anything | 39.5k | ~+5,500 | In scope · profile (new) | HKUDS lab's "make all software agent-native" framework — auto-generates Click-based CLIs from arbitrary source code so agents can drive non-API apps |
| #7 | Superpowers | 202.8k | +5,100 | In scope · profile | Crossed 200k — the agentic skills framework keeps growing alongside the broader wave |
| #8 (new) | K-Dense-AI/scientific-agent-skills | 25.2k | ~+4,500 | Watchlist (Skills Wave) | Ready-to-use Agent Skills for research / science / engineering / finance / writing — another curated collection in the wave |
| #9 (new) | Imbad0202/academic-research-skills | 19.0k | ~+4,500 | Watchlist (Skills Wave) | Curated academic research skills for Claude Code — research → write → review → revise → finalize pipeline |
| #10 (new) | humanlayer/12-factor-agents | 21.7k | ~+3,500 | Out of scope | "12-factor" principles doc for production LLM-powered software — methodology asset, not an agent surface, but a meaningful market signal |
- Heat is useful for discovery, not for selection by itself.
- This week added three new in-scope profiles — each one opens a route the map had not yet covered: openhuman (life-integration desktop agent), codegraph (agent context infrastructure), CLI-Anything (agent-native software bridge).
- The `.claude/skills` wave that started last week has intensified. Six of the top ten are skills collections, frameworks, or methodology docs (`mattpocock/skills`, `anthropics/skills`, `Superpowers`, `scientific-agent-skills`, `academic-research-skills`, `12-factor-agents`). Policy unchanged: curated collections and principles docs are tracked as Skills Wave entries, not profiled, since the framework end is already covered through Superpowers.
- `humanlayer/12-factor-agents` is intentionally out of scope: principles and methodology docs are valuable signals, but this map only profiles runnable agent surfaces and runtime infrastructure.
- agentmemory 16.4k (+2.8k), DeepSeek-TUI 33.5k (+1.2k), Ruflo 54.2k (+1.2k), Pi 52.9k (+1.4k), addyosmani/agent-skills 44.9k (+1.3k), TradingAgents 78.6k (+1.4k), Codex CLI 84.7k (+1.0k), and financial-services 26.7k (+1.0k) all continued to grow at a modest pace, but dropped out of (or never made) the top 10 as the new wave absorbed most of the trending oxygen.
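As a rough illustration of why heat and size are different signals, the sketch below sorts a few rows from the table by 7-day gain and then compares each gain with the star count before the window. The `Entry` type, the helper names, and the relative-gain metric are assumptions made for this example, not part of the map's methodology; the figures are copied from the table and are approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    project: str
    stars_k: float  # total stars at update time, in thousands
    gain: int       # approximate 7-day star gain


# A few rows copied from the trending table.
SNAPSHOT = [
    Entry("Hermes Agent", 163.1, 5800),
    Entry("Superpowers", 202.8, 5100),
    Entry("mattpocock/skills", 100.9, 7600),
    Entry("openhuman", 25.7, 6300),
]


def rank_by_gain(entries):
    """Order entries the way the table does: largest 7-day gain first."""
    return sorted(entries, key=lambda e: e.gain, reverse=True)


def relative_gain(entry):
    """Gain as a fraction of the star count before the snapshot window."""
    stars_before = entry.stars_k * 1000 - entry.gain
    return entry.gain / stars_before


for e in rank_by_gain(SNAPSHOT):
    print(f"{e.project:20s} +{e.gain:>5d}  ({relative_gain(e):.1%} of prior total)")
```

On these figures, openhuman's weekly gain is roughly a third of its earlier total while Superpowers gained under 3%, even though their absolute gains are of similar size. Neither number says anything about fit.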
The wave that broke into trending in mid-May has intensified. Across the May 12 → 22 snapshots, six of the top ten trending repositories are now skills frameworks, curated skills collections, or methodology docs built on Anthropic's skill pattern:
| Repo | Stars | Shape |
|---|---|---|
| anthropics/skills | 139.3k | Anthropic's own canonical Agent Skills reference repository — the upstream source of the pattern |
| Superpowers | 202.8k | A complete skills framework + methodology, with plugin integrations into Claude Code, Codex, Cursor, GitHub Copilot, Gemini, OpenCode, and Factory Droid |
| mattpocock/skills | 100.9k | A curated personal .claude/skills directory from Matt Pocock — now past 100k stars |
| addyosmani/agent-skills | 44.9k | A production-grade engineering skills set for AI coding agents from Addy Osmani |
| K-Dense-AI/scientific-agent-skills | 25.2k | Ready-to-use Agent Skills for research / science / engineering / analysis / finance / writing |
| Imbad0202/academic-research-skills | 19.0k | Curated academic research pipeline (research → write → review → revise → finalize) for Claude Code |
What it means for selection:
- The `.claude/skills` directory pattern Anthropic introduced has crossed from curiosity into shared infrastructure: engineers are now publishing their personal skill libraries the way they used to publish dotfiles, and Anthropic's own reference repo is the canonical anchor.
- For people choosing a coding agent, the underlying agent matters less than it did six months ago. The skill layer on top is doing more of the work, and the available skill collections now span domains (engineering, science, academic research, finance, productivity).
- This map treats the "agentic skills framework" as a route in its own right via the Superpowers profile. Curated skill collections (`anthropics/skills`, `mattpocock/skills`, `addyosmani/agent-skills`, `scientific-agent-skills`, `academic-research-skills`) are tracked as Skills Wave watchlist entries rather than profiled, because they are content assets rather than agent surfaces.
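To make the "content asset" point concrete, here is a minimal sketch of how a skills directory could be inventoried. It assumes each skill lives in its own folder containing a `SKILL.md` file whose leading `---` block carries `name` and `description` fields, which is the layout the published Agent Skills format appears to use; verify against the upstream repo before relying on it. The function names and the demo skill are hypothetical, and the parser only handles flat `key: value` lines, not full YAML.

```python
import tempfile
from pathlib import Path


def read_frontmatter(text):
    """Parse flat `key: value` lines between the leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def list_skills(root):
    """Map skill name to description for every <root>/<folder>/SKILL.md."""
    skills = {}
    for skill_file in sorted(Path(root).glob("*/SKILL.md")):
        meta = read_frontmatter(skill_file.read_text(encoding="utf-8"))
        # Fall back to the folder name when no name field is present.
        name = meta.get("name", skill_file.parent.name)
        skills[name] = meta.get("description", "")
    return skills


# Demo against a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "tdd"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(
        "---\nname: tdd\ndescription: Write a failing test first\n---\n\n# Steps\n",
        encoding="utf-8",
    )
    found = list_skills(tmp)

print(found)
```

Nothing here executes a skill. A collection is a set of files that an agent reads, which is why this map tracks such repos as watchlist entries rather than agent surfaces.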
On April 23, 2026, OpenAI released GPT-5.5, positioned as "a new class of intelligence for real work and powering agents." This is not a GitHub-trending project but a model release that reshapes the capability ceiling across multiple agent surfaces already tracked in this repo.
| What changed | Impact on this map |
|---|---|
| 82.7% on Terminal-Bench 2.0 | Highest agentic coding benchmark at launch — raises the bar for Codex and all OpenAI-API-based agents |
| 1M token context window | Long-context tasks that were impractical before become viable for API-based agent builders |
| 2x price vs GPT-5.4 | Cost-sensitive teams must re-benchmark per-task economics |
| SWE-Bench Pro at 58.6% | Still trails Claude Opus 4.7 (64.3%) — model choice depends on workload |
GPT-5.5 does not replace Codex as a product entry. It is the model layer underneath. See the GPT-5.5 profile for the full breakdown.
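One way to approach the "re-benchmark per-task economics" row is to compare cost per solved task rather than price per token. The sketch below is a back-of-envelope model under stated assumptions: prices are in relative units (GPT-5.4 = 1.0, GPT-5.5 = 2.0, taken from the "2x price" row), the token counts and success rates are hypothetical placeholders rather than measurements, and failed attempts are assumed to be retried independently until one succeeds.

```python
def cost_per_solved_task(price_per_mtok, tokens_per_attempt, success_rate):
    """Expected spend to get one solved task, retrying until success.

    With independent attempts, the expected number of attempts is
    1 / success_rate, so expected cost is cost_per_attempt / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    cost_per_attempt = price_per_mtok * tokens_per_attempt / 1_000_000
    return cost_per_attempt / success_rate


# Hypothetical workload numbers; replace with your own measurements.
old = cost_per_solved_task(price_per_mtok=1.0, tokens_per_attempt=200_000, success_rate=0.5)
new = cost_per_solved_task(price_per_mtok=2.0, tokens_per_attempt=150_000, success_rate=0.8)

print(f"older model: {old:.3f} units per solved task")
print(f"newer model: {new:.3f} units per solved task")
```

With these placeholder inputs the pricier model comes out slightly cheaper per solved task, because it needs fewer tokens and fewer retries. With identical token use and success rate it would cost exactly twice as much, so the answer depends on the workload.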
A week before the GPT-5.5 release, OpenAI shipped "Codex for (almost) everything," a major capability expansion on the Codex product surface itself.
| What changed | Why it matters |
|---|---|
| Background Computer Use | Codex can see, click, and type with its own cursor across any macOS app — even ones without an API |
| Parallel multi-agent execution | Multiple Codex agents can run on the same Mac in parallel without interfering with foreground work |
| 90+ new plugins | Atlassian Rovo, CircleCI, CodeRabbit, GitLab Issues, Microsoft Suite, and more |
| In-app browser + proactive suggestions | Direct iteration on frontend designs and proactive proposals from project context and memory |
| 3M weekly active developers | Reported in April 2026, nearly 2x early-March 2026 |
Combined with the GPT-5.5 backbone, this is the most consequential agent product update in the snapshot window.
A "harness" is the minimal scaffolding around an LLM that turns it into an agent — the loop, the tool surface, the permission model, the skills hook. These are projects you can fork, audit, and own end-to-end, rather than vendor products you adopt as-is.
| Project | Stars | License | Sweet spot | Footprint |
|---|---|---|---|---|
| Pi | 52.9k | MIT (TS) | Terminal-first coding harness with broad LLM provider coverage | Small core + opt-in skills/extensions |
| OpenHands | 74.5k | Open source | Full open-source SWE agent (CLI + GUI + cloud option) | Heaviest — closer to a product |
| SWE-agent | 19.3k | MIT (Py) | Research reference behind SWE-bench, single-YAML config | Medium; upstream moving focus to mini-swe-agent |
| mini-swe-agent | 4.5k | MIT (Py) | ~100-line successor; SWE-bench Verified >74% | Tiny — readable in one sitting |
| OpenHarness | 13.0k | MIT (Py) | 10-subsystem open harness with anthropics/skills + MCP + 43 tools | Medium; production-shaped, sibling to CLI-Anything |
How to read this row:
- Pick by footprint, not by stars. The right harness is the one whose surface area you are willing to maintain.
- If you want the smallest credible base to fork: mini-swe-agent.
- If you want a production-shaped open runtime to self-host: OpenHarness.
- If you want to publish SWE-bench numbers: SWE-agent is the canonical reference; mini-swe-agent is the working successor.
- If you want a terminal-first day-to-day coding harness: Pi.
- If you want a more complete SWE agent product that is still open source: OpenHands.
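For readers new to the term, the four parts of a harness named above (loop, tool surface, permission model, skills hook) can be sketched in a few dozen lines. This is an illustrative toy, not the design of any project in the table: the `Harness` class, the text protocol (`<tool> <argument>` or `final: <answer>`), and the scripted stand-in for the model are all assumptions made so the example runs without an API key.

```python
from dataclasses import dataclass, field
from typing import Callable

# A tool is a plain function from a string argument to a string result.
Tool = Callable[[str], str]


@dataclass
class Harness:
    model: Callable[[list], str]  # stand-in for an LLM call (hypothetical)
    tools: dict                   # tool surface: name -> Tool
    allowed: set                  # permission model: tool names that may run
    skills: list = field(default_factory=list)  # skills hook: text added to context
    max_steps: int = 5

    def run(self, task):
        transcript = [*self.skills, f"task: {task}"]
        for _ in range(self.max_steps):  # the loop
            reply = self.model(transcript)
            if reply.startswith("final:"):
                return reply.removeprefix("final:").strip()
            name, _, arg = reply.partition(" ")
            if name not in self.tools:
                transcript.append(f"error: unknown tool {name}")
            elif name not in self.allowed:
                transcript.append(f"denied: {name}")
            else:
                transcript.append(f"result: {self.tools[name](arg)}")
        return "stopped: step limit reached"


def scripted_model(transcript):
    """Fake model: tries a forbidden tool first, then an allowed one."""
    last = transcript[-1]
    if last.startswith("task:"):
        return "shell rm -rf build"
    if last.startswith("denied:"):
        return "upper hello"
    if last.startswith("result:"):
        return "final: " + last.removeprefix("result:").strip()
    return "final: gave up"


harness = Harness(
    model=scripted_model,
    tools={"upper": str.upper, "shell": lambda cmd: "not executed in this sketch"},
    allowed={"upper"},
    skills=["skill: prefer read-only tools"],
)
answer = harness.run("demo")
print(answer)
```

Each field maps to surface area you would maintain after forking a real harness, which is the sense in which footprint matters more than stars.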
| Route | Representative projects | Typical user |
|---|---|---|
| Direct execution | Claude Code, Aider, Codex, DeepSeek-TUI, Devin, Jules | Someone who wants to hand a concrete coding task to an agent |
| Agent harness framework | Pi, OpenHands, SWE-agent, mini-swe-agent, OpenHarness | Someone who wants to own the agent loop, tool surface, and permissions instead of inheriting a vendor's product |
| Frontier agentic model | GPT-5.5 | Someone choosing which model to wire into their own agent system or evaluating the capability ceiling of OpenAI-based surfaces |
| Agentic skills framework | Superpowers | Someone who wants a methodology + composable skills layer that plugs into Claude Code, Codex, Cursor, and similar agents |
| Workflow / orchestration layer | oh-my-claudecode, oh-my-codex, Ruflo | Someone who already likes Claude Code or Codex and wants stronger orchestration on top (Ruflo extends this to multi-machine federation and 100+ specialized agents) |
| Editor-centric AI workflow | Cursor, Windsurf, Continue | Someone who wants the editor itself to stay central |
| Review-first automation | Cline, GitHub Copilot, Froge Code | Someone who wants review and human control to stay central |
| Managed background path | Claude Managed Agents | Someone who needs scheduled, cloud, or detached Anthropic workflows |
| General-purpose autonomous agent | AutoGPT, Agent Zero, BabyAGI, Julep, GenericAgent, ml-intern | Someone who wants autonomous, general-purpose task execution (or, in ml-intern's case, autonomous ML engineering) |
| Build-your-own system | LangChain, LangGraph, CrewAI, LlamaIndex, Haystack, Semantic Kernel, DSPy, Pydantic AI | Teams building their own agent platform instead of buying one |
| Runtime and tools | n8n, MemGPT, Open Interpreter, LiteLLM, Flowise, CodeGraph, CLI-Anything | Teams that need workflow automation, code execution, LLM gateways, agent context infrastructure, agent-driven CLIs, or visual builders |
| Self-hosted / local runtime | AI Edge Gallery, Goose, Hermes Agent, OpenClaw, Mercury Agent, OpenHuman | Users who need on-device privacy, long-running agents, local control, channels, devices, or personal-data life integration |
| Project | Route | One-line positioning |
|---|---|---|
| Aider | Direct execution | Terminal-first AI pair programmer close to git |
| Claude Code | Direct execution | Local and IDE-first coding agent |
| Claude Managed Agents | Managed background path | Anthropic managed / cloud execution mapping |
| Codex | Direct execution | Async coding delegation in isolated cloud environments |
| oh-my-claudecode | Workflow layer | Teams-first orchestration layer on top of Claude Code |
| oh-my-codex | Workflow layer | Stronger workflow, teams, and persistent state around Codex CLI |
| Cursor | Editor-centric platform | AI editor spanning local coding, cloud agents, and integrations |
| GitHub Copilot | Platform | Multi-surface agent platform across VS Code and GitHub |
| Cline | Review-first execution | Approval-first editor-native coding agent |
| Windsurf | AI-native IDE | Cascade-centered AI IDE |
| OpenHands | Open-source execution | Open-source software engineering agent |
| Devin | Managed execution | End-to-end managed software engineering execution |
| Jules | Managed cloud execution | GitHub-connected coding delegation with PR handoff |
| AI Edge Gallery | On-device local runtime | Mobile-first local assistant sandbox with agent skills |
| Goose | Open-source local platform | Extensible local agent across desktop, CLI, and API |
| Hermes Agent | Multi-agent / self-hosted | Long-lived self-hosted environment with memory and skills |
| OpenClaw | Runtime | Local-first multi-channel runtime layer |
| LangChain | Platform | High-level framework for building custom agents quickly |
| LangGraph | Platform | Low-level framework for durable stateful workflows |
| Continue | Editor-centric | Open-source IDE extension with full model freedom |
| GPT-5.5 | Frontier agentic model | OpenAI's agentic model powering Codex, ChatGPT, and API agent builders |
| AutoGPT | Autonomous agent platform | Visual agent builder with workflows, marketplace, and multi-model support |
| CrewAI | Multi-agent framework | Role-based agent collaboration with fast prototyping |
| LlamaIndex | Data-first framework | RAG and agentic applications over documents and data |
| n8n | Workflow automation | Visual workflow platform with native AI agent nodes and 400+ integrations |
| MemGPT | Stateful agent platform | Persistent memory agents that learn across sessions (now Letta) |
| Agent Zero | Autonomous agent | Self-building autonomous agent with dynamic tool creation |
| BabyAGI | Experimental | Pioneering autonomous agent experiment — educational, not production |
| Julep | Workflow engine | Temporal-backed durable workflow engine for stateful AI agents |
| Haystack | Framework | Production-oriented RAG and agent framework by deepset |
| Semantic Kernel | Framework | Microsoft's AI orchestration SDK for .NET, Python, and Java |
| DSPy | Framework | Programmatic prompt optimization — programming, not prompting, LMs |
| Open Interpreter | Runtime | Natural language to local code execution, no sandbox |
| LiteLLM | Infrastructure | Unified API gateway for 100+ LLM providers |
| Pydantic AI | Framework | Type-safe Python agent framework with structured outputs |
| Flowise | Visual builder | Drag-and-drop LLM app and agent builder on top of LangChain |
| Froge Code | Review-first automation | Provisionally mapped to Automagik Genie |
| Mercury Agent | Self-hosted multi-channel | Permission-hardened agent for CLI and Telegram with token budgets |
| Pi | Direct execution | Minimal terminal coding-agent harness with multi-provider LLM support |
| ml-intern | Domain-specific autonomous agent | Hugging Face's autonomous ML engineer — research, code, and ship ML using HF tooling |
| GenericAgent | Self-evolving autonomous agent | Small-seed agent that grows a personal skill tree on every task |
| Superpowers | Agentic skills framework | Methodology and composable skills layer that plugs into Claude Code, Codex, Cursor, and other agents |
| DeepSeek-TUI | Direct execution | DeepSeek-native terminal coding agent |
| Ruflo | Workflow / orchestration layer | Multi-agent orchestration platform for Claude with federation across machines, neural memory, and 100+ specialized agents |
| OpenHuman | Self-hosted / local runtime | Desktop life-integration agent with 118+ connectors, local Memory Tree, and Ollama support |
| CodeGraph | Runtime and tools | Pre-indexed code knowledge graph + MCP server for Claude Code, Cursor, Codex CLI, opencode, and Hermes Agent |
| CLI-Anything | Runtime and tools | Auto-generates Click-based CLIs for arbitrary software so agents can drive non-API apps |
| SWE-agent | Agent harness framework | Princeton + Stanford's original SWE-bench harness with single-YAML configuration |
| mini-swe-agent | Agent harness framework | The ~100-line Python successor to SWE-agent that still scores >74% on SWE-bench Verified |
| OpenHarness | Agent harness framework | HKUDS's 10-subsystem open agent harness with 43+ tools, anthropics/skills, and MCP |
If you are still deciding where to begin, use one of these quick routes and then branch out.
| If you sound like this... | Follow this path | What it helps you answer |
|---|---|---|
| I want a day-to-day coding agent and need to choose terminal vs editor | Aider → Claude Code → Cursor → Cline → coding automation guide | Terminal-first local loop vs editor-led flow vs approval-first control |
| I already like Claude Code or Codex but want stronger orchestration | Claude Code → oh-my-claudecode → Codex → oh-my-codex → mainstream matrix | When the base agent is enough and when a workflow layer actually adds value |
| I want to understand GPT-5.5's impact on the agent landscape | GPT-5.5 → Codex → Claude Code → mainstream matrix | How a frontier model release shifts the capability ceiling and what it means for product choice |
| I want a dedicated AI IDE instead of stitching tools together | Cursor → Windsurf → GitHub Copilot → mainstream matrix | Dedicated AI editor vs ecosystem platform |
| I want to hand off tickets and check back later | Codex → Jules → Devin → Claude Managed Agents → mainstream matrix | Async cloud delegation vs managed background automation |
| I need something open-source or self-hosted | Aider → OpenHands → Goose → Hermes Agent → capabilities | Terminal control, open-source execution, and local runtime ownership |
| I am building an internal agent stack, not buying a product | LangChain → LangGraph → capabilities → mainstream matrix | Framework vs runtime vs product boundaries |
Star counts and 7-day gains are point-in-time GitHub snapshots taken when the repo is updated; numbers shift quickly between weekly refreshes and small rounding differences are expected. Project descriptions, vendors, and capability summaries reflect public information at the time of writing and may change as projects evolve, get acquired, or pivot. This map is selection guidance — not endorsement, financial advice, or a production-readiness guarantee. Verify against each project's own docs before committing to a choice.