This repository aims to be a comprehensive collection of materials (papers, code, datasets, demos) about SAM2 (Segment Anything in Images and Videos), Meta AI's vision foundation model.
SAM2 extends the original SAM from still images to video, handling both with a single promptable model. This curated list covers the rapidly expanding ecosystem of SAM2 applications across diverse domains, from medical imaging to robotics and from 3D reconstruction to video generation.
🔥 Why SAM2?
SAM2 has revolutionized segmentation tasks by:
- Providing unified image and video segmentation capabilities
- Enabling zero-shot generalization across domains
- Offering efficient real-time processing
- Supporting diverse prompting mechanisms (points, boxes, masks)
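As a rough illustration of the point and box prompts mentioned above, here is a minimal sketch. `build_prompts` and `segment` are hypothetical helper names, not part of SAM2. The `segment` function assumes the official `sam2` package, `torch`, and `numpy` are installed, and that `SAM2ImagePredictor.from_pretrained`, `set_image`, and `predict` behave as described in the official repository README; the checkpoint id `facebook/sam2-hiera-large` is only an example. Mask prompts are left out for brevity.

```python
from typing import Optional, Sequence


def build_prompts(
    points: Optional[Sequence[Sequence[float]]] = None,
    labels: Optional[Sequence[int]] = None,
    box: Optional[Sequence[float]] = None,
) -> dict:
    """Validate point/box prompts and pack them as keyword arguments (hypothetical helper)."""
    kwargs = {}
    if points is not None:
        if labels is None or len(points) != len(labels):
            raise ValueError("each point needs exactly one label")
        if any(len(p) != 2 for p in points):
            raise ValueError("points must be (x, y) pairs")
        if any(lab not in (0, 1) for lab in labels):
            raise ValueError("labels must be 1 (foreground) or 0 (background)")
        kwargs["point_coords"] = [list(p) for p in points]
        kwargs["point_labels"] = list(labels)
    elif labels is not None:
        raise ValueError("labels given without points")
    if box is not None:
        if len(box) != 4:
            raise ValueError("box must be (x0, y0, x1, y1)")
        x0, y0, x1, y1 = box
        if x1 <= x0 or y1 <= y0:
            raise ValueError("box corners must satisfy x0 < x1 and y0 < y1")
        kwargs["box"] = list(box)
    if not kwargs:
        raise ValueError("provide at least one prompt")
    return kwargs


def segment(image, prompts: dict, model_id: str = "facebook/sam2-hiera-large"):
    """Run one prompted prediction on an RGB image (assumes the sam2 package is installed)."""
    # Heavy imports are deferred so build_prompts stays usable without them.
    import numpy as np
    import torch
    from sam2.sam2_image_predictor import SAM2ImagePredictor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    predictor = SAM2ImagePredictor.from_pretrained(model_id, device=device)
    arrays = {name: np.asarray(value) for name, value in prompts.items()}
    with torch.inference_mode():
        predictor.set_image(image)
        masks, scores, _ = predictor.predict(multimask_output=False, **arrays)
    return masks, scores
```

A call such as `segment(image, build_prompts(points=[[100, 150]], labels=[1]))` would request a single mask around the clicked pixel. Keeping validation separate from inference makes it possible to check prompts without loading a model.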
📈 Repository Stats: Currently tracking 500+ papers and projects across 15+ domains
🤝 Contributing: We continuously improve this collection. Feel free to submit PRs for missed works, corrections, or new categories!
This repo also includes slides alongside papers and code about SAM2 (Segment Anything in Images and Videos), a vision foundation model released by Meta AI. We are continuously improving the project. PRs for missing works (papers, repos) are welcome.
- SAM2 [🔗 Code | 🖥️ Demo | 📖 Explanation]
- SAM [🔗 Code | 🖥️ Demo | 📖 Explanation]
| Domain | Papers | Key Highlights |
|---|---|---|
| 🏥 Medical | 80+ | Surgery, 3D medical imaging, pathology |
| 🎬 Video Processing | 70+ | Object tracking, temporal consistency |
| 🤖 Robotics | 50+ | Manipulation, navigation, human-robot interaction |
| 🛰️ Remote Sensing | 20+ | Satellite imagery, environmental monitoring |
| 🎨 Generation/Editing | 35+ | Video synthesis, image editing, creative tools |
| 🧊 3D Processing | 45+ | Point clouds, mesh processing, reconstruction |
| 🎯 Core Segmentation | 60+ | Novel applications and improvements |
| Total | 500+ | Across 15+ domains |
💡 Quick Navigation: Use `Ctrl+F` to search for specific keywords, or use the Table of Contents links for domain-specific papers.
- 📚 Surveys & Reviews
- 🎯 Traditional Segmentation Tasks
- 🏥 Medical Domain
- 🎭 Camouflaged Object Detection (COD)
- 🔊 Audio-visual Segmentation (AVS)
- 🛰️ Remote Sensing
- 🧊 3D Processing & Point Clouds
- 📊 Graph Learning
- 🎨 Image/Video Generation & Editing
- 🗺️ SLAM & Visual Odometry
- 💡 Light Field Segmentation
- 🤖 Robotics
- ⚡ Efficiency & Edge Computing
- 📖 Training & Learning
- 📊 Evaluation & Benchmarking
- 🛡️ Robustness & Security
- 🌟 Unique Applications
Comprehensive overviews and systematic analyses of SAM2 applications across various domains
Core image and video segmentation applications, including novel architectures and domain-specific adaptations
Temporal segmentation, object tracking, and video understanding applications
Multi-modal approaches combining audio and visual information for segmentation
Scene graph generation and graph-based reasoning with SAM2
| Release | Title | Code |
|---|---|---|
| 2025.03 | Universal Scene Graph Generation | 🌐Project page |
Healthcare applications including surgery, diagnostics, and biomedical research
Detecting and segmenting objects that blend with their surroundings
Satellite imagery analysis, environmental monitoring, and geospatial applications
Three-dimensional data processing, reconstruction, and analysis
Creative applications including content generation, editing, and artistic tools
Navigation, mapping, and localization applications
Advanced imaging techniques and multi-dimensional visual processing
| Release | Title | Code |
|---|---|---|
| 2024.11 | Segment Anything in Light Fields for Real-Time Applications via Constrained Prompting | 🔗 Code |
Autonomous systems, manipulation, navigation, and human-robot interaction
Efficiency optimizations, model compression, and deployment on resource-constrained devices
Resources for model training, datasets, and learning frameworks
Curated datasets and benchmarks for SAM2 training and evaluation
Tools and methods for data synthesis, augmentation, and preprocessing
Supporting tools and frameworks for model training and fine-tuning
Benchmarking, evaluation metrics, and comparative studies
Security, adversarial robustness, and reliability studies
| Release | Title | Code |
|---|---|---|
| 2025.04 | Robust SAM: On the Adversarial Robustness of Vision Foundation Models | NA |
Novel and creative applications that don't fit traditional categories
We welcome contributions from the community! Here's how you can help improve this repository:
- Submit a Pull Request with new papers/projects
- Report Issues for broken links or incorrect information
- Suggest New Categories for emerging SAM2 applications
- Improve Organization by suggesting better categorization
- Format: Follow the existing table format with Release | Title | Code columns
- Quality: Include only peer-reviewed papers or significant projects
- Recency: Focus on 2024+ publications (SAM2 era)
- Completeness: Provide accurate metadata and working links when available
- Categories: Place papers in the most appropriate domain category
| YYYY.MM | [Paper Title](paper_link) | [🔗 Code](code_link) / [🌐Project page](project_link) / NA |
- 🔗 `[🔗 Code]` - Source code repositories
- 🌐 `[🌐Project page]` - Official project websites
- 📊 `[📊Data]` - Datasets and benchmarks
- 🖥️ `[🖥️Demo]` - Interactive demonstrations
- 📖 `[📖Repo]` - Documentation repositories
- 🕒 `🕒Soon` - Coming soon
- `NA` - Not available
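Before opening a PR, a new entry can be checked against the row format described above. The following is a minimal sketch, not a tool shipped with this repository: `check_row` and both regular expressions are hypothetical, and they encode one reading of the format (a `YYYY.MM` date, a linked title, then one or more links, `NA`, or `🕒Soon` separated by ` / `).

```python
import re
from typing import List

# Assumed row shape: | YYYY.MM | [Title](link) | code cell |
ROW_RE = re.compile(
    r"^\|\s*(?P<date>\d{4}\.\d{2})\s*"
    r"\|\s*\[(?P<title>[^\]]+)\]\((?P<link>[^)\s]+)\)\s*"
    r"\|\s*(?P<code>.+?)\s*\|$"
)
# A single Markdown link such as [🔗 Code](https://...)
LINK_RE = re.compile(r"^\[[^\]]+\]\([^)\s]+\)$")


def check_row(row: str) -> List[str]:
    """Return a list of problems found in one table row (empty if none)."""
    match = ROW_RE.match(row.strip())
    if not match:
        return ["row does not match '| YYYY.MM | [Title](link) | code |'"]
    problems = []
    month = int(match.group("date")[5:])
    if not 1 <= month <= 12:
        problems.append("month out of range")
    for item in match.group("code").split(" / "):
        item = item.strip()
        if item in ("NA", "🕒Soon"):
            continue
        if not LINK_RE.match(item):
            problems.append(f"unrecognized code entry: {item!r}")
    return problems


if __name__ == "__main__":
    example = "| 2024.11 | [Some Paper](https://example.com/paper) | [🔗 Code](https://example.com/code) |"
    print(check_row(example) or "row looks fine")
```

The check is intentionally strict about structure and says nothing about whether links resolve; broken links are better caught by a separate link checker.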
- Total Papers: 500+ and growing
- Domains Covered: 15+ major application areas
- Time Range: 2024-2025 (SAM2 era)
- Update Frequency: Weekly additions
- Community: Open for contributions
Special thanks to:
- Meta AI for releasing SAM2 and advancing the field
- Research Community for the incredible pace of SAM2 adoption and innovation
- Contributors who help maintain and improve this collection
- Reviewers who ensure quality and accuracy
This collection is maintained under the MIT License. Individual papers and projects retain their original licenses.
Last updated: August 2025 | Maintained by the Community