Nawfal-AI · May 30, 2026 · Jun 1, 2026
diff --git a/index.html b/index.html
@@ -301,51 +301,6 @@ <h3>Open-Vocabulary Object Tracking with Grounding DINO, SAM 2 and CLIP</h3>
             </div>
           </article>
 
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
           <article class="project-card">
             <div class="teaser" role="img" aria-label="Image retrieval with CLIP.">
               <img src="assets/group_Q.png" alt="Image retrieval preview" style="position:absolute; inset:0; width:100%; height:100%; object-fit:cover; z-index:2;">
@@ -435,7 +390,27 @@ <h3>From Raw Footage to Recipe: Extracting Cooking Steps from Egocentric Video</
               </label>
             </div>
           </article>
-
+
+          <article class="project-card">
+            <div class="teaser" role="img" aria-label="AI image captioning system turning video frames into short action labels.">
+              <img src="assets/group_V.png" alt="Group V image captioning preview" style="position:absolute; inset:0; width:100%; height:100%; object-fit:cover; z-index:2;">
+              <span class="teaser-label" style="z-index:3;">Group V</span>
+            </div>
+            <div class="project-content">
+              <p class="project-meta">Video understanding, vision-language models, action captioning</p>
+              <h3>Action/Event-Focused Captioning: A Three-Model Comparison</h3>
+              <p class="project-abstract">
+                This project explores how pretrained image-captioning models can be adapted to produce short, action-focused captions for video activity timelines. Instead of generating long descriptive captions, we fine-tune BLIP, ViT-GPT2, and Microsoft GIT on COCO action captions so that each model outputs compact labels such as “person walking” or “coffee being poured.”
+                <br><br>
+                For video inference, frames are sampled over time, captioned by the fine-tuned models, and de-duplicated into a simple activity timeline. The project compares the original and fine-tuned version of each model using BLEU-1, BLEU-2, METEOR, and ROUGE-L, and analyzes whether architecture choice still matters once all three models are adapted to the same action-captioning task.
+              </p>
+              <label class="project-toggle-label">
+                <input class="project-toggle" type="checkbox" aria-label="Toggle full project pitch">
+                <span class="project-toggle-more">Read more</span>
+                <span class="project-toggle-less">Show less</span>
+              </label>
+            </div>
+          </article>
 
 
           <article class="project-card add-project-card">