A personal curated list of research papers, frameworks, and resources focused on agentic coding, where AI agents use language models to carry out software engineering tasks. As a developer of GLM, I try to include papers that I find valuable and insightful, but I'm sure there are many great works I've missed. Contributions and suggestions are very welcome!
Keywords: Agentic Coding, Large Language Models (LLMs), Software Engineering Agents
Please open an issue or submit a pull request if you'd like to add papers or resources that are missing.
- Large Language Model-Based Agents for Software Engineering: A Survey, arXiv 2024, [Paper]
- SWE-bench: Can Language Models Resolve Real-World GitHub Issues?, ICLR 2024, [Paper]
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?, arXiv 2025, [Paper]
- Terminal-Bench: A Benchmark for AI Agents in Terminal Environments, 2025, [Project]
- SWE-bench Goes Live!, arXiv 2025, [Paper]
- SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?, ICLR 2025, [Paper]
- Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving, arXiv 2025, [Paper]
- SWE-Compass: Towards Unified Evaluation of Agentic Coding Abilities for Large Language Models, arXiv 2025, [Paper]
- Qwen3-Coder: Agentic Coding in the World, 2025.08, [Blog]
- Kimi K2: Open Agentic Intelligence, 2025.09.05, [Paper]
- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models, 2025.08, [Paper]
- GLM-4.6, 2025.10, [Blog]
- OpenAI Codex, 2025.09, [Blog]
- Kimi-Dev: Agentless Training as Skill Prior for SWE-Agents, 2025.09, [Paper]
- Claude Sonnet 4.5, [Model Card]
- Claude Opus 4.5, [Model Card]
- Claude Sonnet 4.6, 2026.02, [Model Card]
- Claude Opus 4.6, 2026.02, [Model Card]
- Kimi K2.5: Visual Agentic Intelligence, 2026.02, [Paper]
- GLM-5: from Vibe Coding to Agentic Engineering, 2026.02, [Paper]
- Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem, arXiv 2025, [Paper]
- On Data Engineering for Scaling LLM Terminal Capabilities, arXiv 2026, [Paper]
- Training Software Engineering Agents and Verifiers with SWE-Gym, ICML 2025, [Paper]
- SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents, arXiv 2025, [Paper]
- R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents, COLM 2025, [Paper]
- SWE-Mirror: Scaling Issue-Resolving Datasets by Mirroring Issues Across Repositories, arXiv 2025, [Paper]
- SWE-QA: Can Language Models Answer Repository-level Code Questions?, arXiv 2025, [Paper]
- LocAgent: Graph-Guided LLM Agents for Code Localization, ACL 2025, [Paper]
- Improving Code Localization with Repository Memory, arXiv 2025, [Paper]
- SweRank: Software Issue Localization with Code Ranking, arXiv 2025, [Paper]
- EDIT-Bench: Evaluating LLM Abilities to Perform Real-World Instructed Code Edits, arXiv 2025, [Paper]
- Executable Code Actions Elicit Better LLM Agents, ICML 2024, [Paper]
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering, NeurIPS 2024, [Paper]
- Agentless: Demystifying LLM-based Software Engineering Agents, Proceedings of the ACM on Software Engineering, 2024, [Paper]
- OpenHands: An Open Platform for AI Software Developers as Generalist Agents, ICLR 2025, [Paper]
- Moatless Tools, 2024, [Project]
- Mini-swe-agent, 2025, [Project]
- Cursor, [Project]
- GitHub Copilot, [Project]
- TONGYI Lingma, [Project]
- Tencent CodeBuddy, [Project]
- Trae, [Project]
Last updated: Mar 2026