Merged
72 changes: 72 additions & 0 deletions index.html
@@ -273,6 +273,78 @@ <h3>Smart Event Detection for Highlight Clips</h3>
</label>
</div>
</article>





<article class="project-card">
<div class="teaser" role="img" aria-label="Open-vocabulary tracking project.">
<img src="assets/group_X.png" alt="Two segmented puppies in a park" style="position:absolute; inset:0; width:100%; height:100%; object-fit:cover; z-index:2;">
<span class="teaser-label" style="z-index:3;">Group X</span>
</div>
<div class="project-content">
<p class="project-meta">Object detection, segmentation, tracking, vision-language models</p>
<h3>Open-Vocabulary Object Tracking with Grounding DINO, SAM 2 and CLIP</h3>
<p class="project-abstract">
We present an open-vocabulary object tracking system that lets users find, segment, and track arbitrary objects in images and videos using natural-language queries.
<br><br>
Our pipeline combines Grounding DINO for text-conditioned object detection, CLIP for semantic verification, and SAM 2 for segmentation and temporal tracking.
<br><br>
The system supports interactive querying through a Gradio web interface and demonstrates how modern vision foundation models can be integrated into a unified visual understanding pipeline.
</p>
<label class="project-toggle-label">
<input class="project-toggle" type="checkbox" aria-label="Toggle full project pitch">
<span class="project-toggle-more">Read more</span>
<span class="project-toggle-less">Show less</span>
</label>
</div>
</article>














































<article class="project-card">
<div class="teaser" role="img" aria-label="Image retrieval with CLIP.">