diff --git a/index.html b/index.html
index fbc5691..7ceb546 100644
--- a/index.html
+++ b/index.html
@@ -273,6 +273,78 @@

Smart Event Detection for Highlight Clips

+ + + + + +
+ +
+

Object detection, segmentation, tracking, vision-language models

+

Open-Vocabulary Object Tracking with Grounding DINO, SAM 2 and CLIP

+

+ We present an open-vocabulary object tracking system that enables users to search, segment, and track arbitrary objects in images and videos using natural language queries. +

+ Our pipeline combines Grounding DINO for text-conditioned object detection, CLIP for semantic verification, and SAM 2 for segmentation and temporal tracking. +

+ The system supports interactive querying through a Gradio web interface and shows how vision foundation models can be combined into a single visual understanding pipeline. +

+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +