diff --git a/examples/voice_agents/README.md b/examples/voice_agents/README.md
index aa401505d1..a1d37bcc46 100644
--- a/examples/voice_agents/README.md
+++ b/examples/voice_agents/README.md
@@ -1,78 +1,201 @@
-# Voice Agents Examples
+# Intelligent Interruption Handling for LiveKit Voice Agent
 
-This directory contains a comprehensive collection of voice-based agent examples demonstrating various capabilities and integrations with the LiveKit Agents framework.
+## Overview
 
-## 📋 Table of Contents
+This document explains the modifications made to `basic_agent.py` to implement intelligent interruption handling that distinguishes between **filler words** (acknowledgments like "yeah", "okay") and **command words** (interruptions like "stop", "wait").
 
-### 🚀 Getting Started
+---
+## Student Details
+- **Name:** Sirjan Singh
+- **College Roll Number:** 23UCS715
+- **Demo Video Link:** [Drive Link](https://drive.google.com/drive/folders/1LXnojdfCtswc14PxWH60ZqynbLN03F3J?usp=sharing)
+  
+---
 
-- [`basic_agent.py`](./basic_agent.py) - A fundamental voice agent with metrics collection
+## The Challenge
 
-### 🛠️ Tool Integration & Function Calling
+In a natural voice conversation, users often say acknowledgment words like "yeah", "okay", or "hmm" while the agent is speaking. These are **backchannel responses** that mean "I'm listening, continue" — not "stop talking."
 
-- [`annotated_tool_args.py`](./annotated_tool_args.py) - Using Python type annotations for tool arguments
-- [`dynamic_tool_creation.py`](./dynamic_tool_creation.py) - Creating and registering tools dynamically at runtime
-- [`raw_function_description.py`](./raw_function_description.py) - Using raw JSON schema definitions for tool descriptions
-- [`silent_function_call.py`](./silent_function_call.py) - Executing function calls without verbal responses to user
-- [`long_running_function.py`](./long_running_function.py) - Handling long running function calls with interruption support
+However, LiveKit's default Voice Activity Detection (VAD) treats ALL user speech as potential interruptions, causing the agent to stop mid-sentence when hearing these fillers.
 
-### ⚡ Real-time Models
+**Requirements:**
+1. **When agent is speaking + user says filler** → Agent continues uninterrupted
+2. **When agent is speaking + user says command** → Agent stops immediately  
+3. **When agent is silent** → All user speech is valid input
+4. **Mixed input** → Commands always take priority over fillers (e.g., "yeah wait" is a command)
 
-- [`weather_agent.py`](./weather_agent.py) - OpenAI Realtime API with function calls for weather information
-- [`realtime_video_agent.py`](./realtime_video_agent.py) - Google Gemini with multimodal video and voice capabilities
-- [`realtime_joke_teller.py`](./realtime_joke_teller.py) - Amazon Nova Sonic real-time model with function calls
-- [`realtime_load_chat_history.py`](./realtime_load_chat_history.py) - Loading previous chat history into real-time models
-- [`realtime_turn_detector.py`](./realtime_turn_detector.py) - Using LiveKit's turn detection with real-time models
-- [`realtime_with_tts.py`](./realtime_with_tts.py) - Combining external TTS providers with real-time models
+---
 
-### 🎯 Pipeline Nodes & Hooks
+## The Core Problem: Timing
 
-- [`fast-preresponse.py`](./fast-preresponse.py) - Generating quick responses using the `on_user_turn_completed` node
-- [`flush_llm_node.py`](./flush_llm_node.py) - Flushing partial LLM output to TTS in `llm_node`
-- [`structured_output.py`](./structured_output.py) - Structured data and JSON outputs from agent responses
-- [`speedup_output_audio.py`](./speedup_output_audio.py) - Dynamically adjusting agent audio playback speed
-- [`timed_agent_transcript.py`](./timed_agent_transcript.py) - Reading timestamped transcripts from `transcription_node`
-- [`inactive_user.py`](./inactive_user.py) - Handling inactive users with the `user_state_changed` event hook
-- [`resume_interrupted_agent.py`](./resume_interrupted_agent.py) - Resuming agent speech after false interruption detection
-- [`toggle_io.py`](./toggle_io.py) - Dynamically toggling audio input/output during conversations
+The fundamental challenge is that **VAD interrupts BEFORE transcripts arrive**:
 
-### 🤖 Multi-agent & AgentTask Use Cases
+```
+Time 0.0s: User starts saying "yeah"
+Time 0.3s: VAD detects speech → Interrupts agent
+Time 0.5s: User finishes saying "yeah"  
+Time 0.8s: Transcript arrives → "Yeah."
+```
 
-- [`restaurant_agent.py`](./restaurant_agent.py) - Multi-agent system for restaurant ordering and reservation management
-- [`multi_agent.py`](./multi_agent.py) - Collaborative storytelling with multiple specialized agents
-- [`email_example.py`](./email_example.py) - Using AgentTask to collect and validate email addresses
+By the time we know it was a filler word, the agent has already stopped!
 
-### 🔗 MCP & External Integrations
+---
 
-- [`web_search.py`](./web_search.py) - Integrating web search capabilities into voice agents
-- [`langgraph_agent.py`](./langgraph_agent.py) - LangGraph integration
-- [`mcp/`](./mcp/) - Model Context Protocol (MCP) integration examples
-  - [`mcp-agent.py`](./mcp/mcp-agent.py) - MCP agent integration
-  - [`server.py`](./mcp/server.py) - MCP server example
-- [`zapier_mcp_integration.py`](./zapier_mcp_integration.py) - Automating workflows with Zapier through MCP
+## The Solution: Hybrid Approach
 
-### 💾 RAG & Knowledge Management
+We use a **three-layer defense system**:
 
-- [`llamaindex-rag/`](./llamaindex-rag/) - Complete RAG implementation with LlamaIndex
-  - [`chat_engine.py`](./llamaindex-rag/chat_engine.py) - Chat engine integration
-  - [`query_engine.py`](./llamaindex-rag/query_engine.py) - Query engine used in a function tool
-  - [`retrieval.py`](./llamaindex-rag/retrieval.py) - Document retrieval
+### Layer 1: Medium VAD Thresholds
+```python
+min_interruption_duration=0.6,  # Requires 0.6 seconds of speech
+min_interruption_words=2,        # Requires at least 2 words
+```
 
-### 🎵 Specialized Use Cases
+**Purpose:** Filters out very quick, single-word fillers ("yeah!", "okay!")
 
-- [`background_audio.py`](./background_audio.py) - Playing background audio or ambient sounds during conversations
-- [`push_to_talk.py`](./push_to_talk.py) - Push-to-talk interaction
-- [`tts_text_pacing.py`](./tts_text_pacing.py) - Pacing control for TTS requests
-- [`speaker_id_multi_speaker.py`](./speaker_id_multi_speaker.py) - Multi-speaker identification
+**Tradeoff:** Longer fillers (1.5s "okaaaay") can still slip through
 
-### 📊 Tracing & Error Handling
+---
 
-- [`langfuse_trace.py`](./langfuse_trace.py) - LangFuse integration for conversation tracing
-- [`error_callback.py`](./error_callback.py) - Error handling callback
-- [`session_close_callback.py`](./session_close_callback.py) - Session lifecycle management
+### Layer 2: Automatic Resume on False Interruptions
+```python
+resume_false_interruption=True,
+false_interruption_timeout=1.0,
+```
 
-## 📖 Additional Resources
+**Purpose:** If VAD interrupts the agent, LiveKit waits 1 second for more user speech. If nothing substantial comes, it automatically resumes the agent's speech.
 
-- [LiveKit Agents Documentation](https://docs.livekit.io/agents/)
-- [Agents Starter Example](https://github.com/livekit-examples/agent-starter-python)
-- [More Agents Examples](https://github.com/livekit-examples/python-agents-examples)
+**How it helps:** When a slow filler ("okaaaay") interrupts the agent, this mechanism resumes the agent's speech automatically once the 1-second timeout elapses.
+
+---
+
+### Layer 3: Transcript-Based Classification (The Brain)
+The most important layer — our custom logic that analyzes transcripts. This layer enforces strict priority: **Commands > Real Input > Fillers**.
+
+#### Key Logic Flow:
+```python
+@session.on("user_input_transcribed")
+def on_user_input_transcribed(ev):
+    text = normalize_text(ev.transcript)
+    
+    # 1. CHECK COMMANDS FIRST (Priority!)
+    if contains_command(text):
+        if agent.is_speaking:
+            session.interrupt()  # Force stop if VAD missed it
+        return  # Let LLM process the command
+        
+    # 2. CHECK FILLERS SECOND
+    if is_filler_input(text):
+        # Suppress from LLM so agent doesn't respond to "yeah"
+        try_clear_user_turn(session) 
+        return
+        
+    # 3. REAL INPUT (Questions, conversation)
+    # Process normally
+```
+
+This handles three cases:
+
+#### Case 1: Agent Was Just Interrupted by VAD
+- **Command:** Valid interruption, let LLM respond.
+- **Filler:** False alarm! `resume_false_interruption` will auto-resume speech. We call `clear_user_turn()` so the LLM doesn't hear "yeah".
+- **Real Input:** Valid interruption.
+
+#### Case 2: Agent Is Currently Speaking (VAD Hasn't Triggered Yet)
+- **Command:** Force immediate interrupt (`session.interrupt()`).
+- **Filler:** Ignore completely (`clear_user_turn()`).
+- **Real Input:** Allow interrupt (`session.interrupt()`).
+
+#### Case 3: Agent Is Idle
+- **Command/Real Input:** Process normally.
+- **Filler:** Suppress (don't wake up LLM for just "okay").
+
+---
+
+## Key Code Changes (Refactored)
+
+### 1. Robust Word Lists
+
+**Command Detection** (Stop Phrases & Prefixes):
+```python
+# Single words
+STOP_WORDS = {"wait", "stop", "finish", "hold", "pause", "halt", ...}
+
+# Multi-word phrases (normalized)
+STOP_PHRASES = {"holdon", "waitasecond", "stopit", "waitaminute", ...}
+
+# Prefixes that can precede commands
+COMMAND_PREFIXES = {"no", "but", "and", "okay", "ok", "yeah", "yes", "hey", "please"}
+```
+*Now catches:* `"no wait"`, `"hold on"`, `"wait a second"`, `"yeah stop"`
+
+**Filler Words** (Strict filtering):
+```python
+FILLER_WORDS = {
+    "uhhuh", "okay", "alright", "mhm", "yeah", "yep", "yup",
+    "hmm", "right", "uh", "um", "ah", "cool", "great", "no", "nah", ...
+    # Removed generic words like "i", "see", "all" to avoid false positives
+}
+```
+
+### 2. Detection Functions
+
+**`contains_command(transcript)`**:
+- Checks for multi-word phrases (`"hold on"`).
+- Checks for prefixes (`"no wait"`).
+- Checks priority positions (first 3 words).
+
+**`is_filler_input(transcript)`**:
+- **CRITICAL:** Calls `contains_command()` first! If it's a command, it is NOT a filler.
+- Only matches if input is *purely* filler words/phrases.
+
+### 3. Transcript Suppression
+We use a helper to prevent the LLM from responding to fillers:
+```python
+def try_clear_user_turn(session):
+    if hasattr(session, 'clear_user_turn'):
+        session.clear_user_turn()
+```
+
+---
+
+## How It All Works Together (Examples)
+
+### Scenario 1: User says "yeah" (0.3s, quick acknowledgment)
+1. ✅ **VAD Layer:** Too short (< 0.6s) → No interrupt
+2. ✅ **Transcript Layer:** `is_filler_input` = True. `try_clear_user_turn()` called.
+3. ✅ **Result:** Agent continues speaking. LLM sees nothing.
+
+### Scenario 2: User says "okaaaay" (1.5s, slow filler)
+1. ❌ **VAD Layer:** Long enough (> 0.6s) → Interrupts agent
+2. ✅ **Resume Layer:** Waits 1s, decides it's a false interrupt → Resumes
+3. ✅ **Transcript Layer:** `is_filler_input` = True. Suppresses transcript.
+4. ✅ **Result:** Brief pause (1s), then agent resumes.
+
+### Scenario 3: User says "no wait" (Quick command)
+1. ❌ **VAD Layer:** Might be too short or missed.
+2. ✅ **Transcript Layer:** `contains_command` = True (catches "no" + "wait").
+3. ✅ **Action:** `session.interrupt()` forced immediately.
+4. ✅ **Result:** Agent stops. LLM processes "no wait".
+
+### Scenario 4: User says "I have a question"
+1. ✅ **Transcript Layer:** Not a command, not a filler.
+2. ✅ **Action:** Real input. Interrupts agent.
+3. ✅ **Result:** Standard conversation flow.
+
+---
+
+## Files Modified
+
+- **`basic_agent.py`** — Main implementation with all intelligent interruption logic.
+
+## Dependencies
+
+No additional dependencies required. Uses standard Python `re` and LiveKit Agents SDK.
+
+---
+
+## Future Improvements
+
+1. **Semantic Analysis:** Use a small NLU model or LLM to determine if "right" means "correct" (answer) or "continue" (filler).
+2. **Prosody Analysis:** Differentiate "stop?" (question) from "STOP!" (command) based on pitch/volume.
diff --git a/examples/voice_agents/basic_agent.py b/examples/voice_agents/basic_agent.py
index f064dab5d7..8bd2d34160 100644
--- a/examples/voice_agents/basic_agent.py
+++ b/examples/voice_agents/basic_agent.py
@@ -1,67 +1,201 @@
-import logging
+"""
+HYBRID INTERRUPTION HANDLING STRATEGY
 
-from dotenv import load_dotenv
+Challenge:
+- Slow filler words (e.g., a 1.5s "okay") should NOT trigger an interruption.
+- Quick commands (e.g., a 0.5s "stop") MUST trigger an immediate interruption.
+- Pure duration-based filtering is insufficient as it cannot distinguish these cases reliably.
+
+Implementation Strategy:
+- Configure VAD with MEDIUM sensitivity: Catches most valid speech but may allow some fillers.
+- Auto-Resume on Fillers: If a filler triggers an interruption, LiveKit's resume_false_interruption restarts the agent while the transcript handler suppresses the filler.
+- Force Interrupt on Commands: If a quick command is missed by VAD, the transcript handler will enforce an interrupt.
 
+Outcome:
+- Quick "stop" (0.5s): Ignored by VAD (too short) → Transcript Handler detects command and interrupts. ✅
+- Slow "okay" (1.5s): Triggered by VAD → Transcript Handler identifies filler and suppresses it; auto-resume restarts speech. ✅
+- Quick "okay" (0.3s): Ignored by VAD → Transcript Handler identifies filler and suppresses it. ✅
+"""
+
+import logging
+import re
+from dotenv import load_dotenv
 from livekit.agents import (
-    Agent,
-    AgentServer,
-    AgentSession,
-    JobContext,
-    JobProcess,
-    MetricsCollectedEvent,
-    RunContext,
-    cli,
-    metrics,
-    room_io,
+    Agent, AgentServer, AgentSession, JobContext, JobProcess, cli
 )
-from livekit.agents.llm import function_tool
 from livekit.plugins import silero
 from livekit.plugins.turn_detector.multilingual import MultilingualModel
 
-# uncomment to enable Krisp background voice/noise cancellation
-# from livekit.plugins import noise_cancellation
+logger = logging.getLogger("intelligent-kelly")
+logger.setLevel(logging.INFO)
+load_dotenv()
 
-logger = logging.getLogger("basic-agent")
+# =============================================================================
+# CONFIGURATION - Command and Filler Detection
+# =============================================================================
+
+# Single words that mean "stop" as a command
+STOP_WORDS = {"wait", "stop", "finish", "hold", "pause", "halt", "enough", "quiet"}
+
+# Multi-word command phrases (normalized, no spaces)
+STOP_PHRASES = {
+    "holdon", "holdonthat", "waitasec", "waitasecond", "waitaminute",
+    "stopit", "stopthat", "stopnow", "pausethat", "onemoment"
+}
+
+# Words that can precede a stop word to form a command
+COMMAND_PREFIXES = {"no", "but", "and", "okay", "ok", "yeah", "yes", "hey", "please"}
+
+# Filler/acknowledgment words; some ("no", "yes", "right") can also be real answers, so they only count when nothing else is said
+FILLER_WORDS = {
+    "uhhuh", "okay", "alright", "mhm", "yeah", "yep", "yup",
+    "hmm", "right", "uh", "um", "ah", "ok", "k", "sure", "yes",
+    "interesting", "really", "wow", "ohh", "ooh", "aha", "mhmm",
+    "gotcha", "nice", "oh", "no", "nah", "nope", "cool", "great"
+}
+
+# Multi-word filler phrases (normalized with spaces for matching)
+FILLER_PHRASES = {
+    "all right", "got it", "i see", "uh huh", "oh okay", "oh ok",
+    "oh really", "oh wow", "oh nice", "sounds good", "makes sense",
+    "i understand", "mm hmm"
+}
+
+
+def normalize_text(transcript: str) -> str:
+    """Normalize transcript for consistent matching."""
+    clean = transcript.lower().strip()
+    clean = re.sub(r'[^\w\s]', '', clean)  # Remove punctuation
+    clean = re.sub(r'\s+', ' ', clean)      # Collapse whitespace
+    return clean.strip()
+
+
+def contains_command(transcript: str) -> bool:
+    """
+    Check if transcript contains an explicit stop command.
+    MUST be checked BEFORE is_filler_input() to avoid false negatives.
+    """
+    text = normalize_text(transcript)
+    words = text.split()
+    
+    if not words:
+        return False
+    
+    # Check for exact stop phrase match (e.g., "hold on")
+    text_no_spaces = text.replace(" ", "")
+    if text_no_spaces in STOP_PHRASES:
+        return True
+    
+    # Check for stop phrase at start (e.g., "hold on a second please")
+    for phrase in STOP_PHRASES:
+        if text_no_spaces.startswith(phrase):
+            return True
+    
+    # Direct command: first word is a stop word (e.g., "stop", "wait")
+    if words[0] in STOP_WORDS:
+        return True
+    
+    # Stop word near the start: "yeah wait", "okay stop", "no hold on", "but wait"
+    # Check the first 3 words for a stop word (the words before it are not checked here)
+    for i in range(min(3, len(words))):
+        if words[i] in STOP_WORDS:
+            # If stop word is in first 3 positions, it's likely a command
+            # Unless it's a long sentence where stop word is incidental
+            if len(words) <= 5:
+                return True
+            # For longer sentences, only count if stop word is in first 2 positions
+            if i < 2:
+                return True
+    
+    # Pattern: prefix + stop word anywhere in first 4 words
+    # e.g., "okay wait a second", "no hold on please"
+    if len(words) >= 2:
+        for i in range(min(3, len(words) - 1)):
+            if words[i] in COMMAND_PREFIXES and words[i + 1] in STOP_WORDS:
+                return True
+    
+    return False
+
+
+def is_filler_input(transcript: str) -> bool:
+    """
+    Check if transcript is purely a filler acknowledgment.
+    Only returns True if it's DEFINITELY a filler (no command content).
+    """
+    text = normalize_text(transcript)
+    
+    # CRITICAL: Command always takes priority - check first!
+    if contains_command(transcript):
+        return False
+    
+    # Empty after normalization: nothing meaningful was said, treat as filler
+    if not text:
+        return True
+    
+    # Exact filler phrase match
+    if text in FILLER_PHRASES:
+        return True
+    
+    # Single word in filler set
+    words = text.split()
+    if len(words) == 1 and words[0] in FILLER_WORDS:
+        return True
+    
+    # All words are fillers (e.g., "yeah yeah", "okay um", "oh really")
+    if len(words) <= 3 and all(word in FILLER_WORDS for word in words):
+        return True
+    
+    # Compound filler check (e.g., "uh huh" -> "uhhuh")
+    text_no_spaces = text.replace(" ", "")
+    if text_no_spaces in FILLER_WORDS:
+        return True
+    
+    return False
 
-load_dotenv()
 
+# =============================================================================
+# AGENT DEFINITION
+# =============================================================================
 
-class MyAgent(Agent):
+class IntelligentAgent(Agent):
     def __init__(self) -> None:
         super().__init__(
-            instructions="Your name is Kelly. You would interact with users via voice."
-            "with that in mind keep your responses concise and to the point."
-            "do not use emojis, asterisks, markdown, or other special characters in your responses."
-            "You are curious and friendly, and have a sense of humor."
-            "you will speak english to the user",
+            instructions=(
+                "Your name is Kelly. Keep responses concise and witty. "
+                "When users say things like 'yeah' or 'okay' while you're speaking, "
+                "it means they're listening - keep going! "
+                "Only stop if they explicitly say 'wait', 'stop', or 'hold on'."
+            ),
         )
+        # Simplified state: track whether the agent is currently speaking
+        self._is_speaking = False
+        # Track if VAD just interrupted (waiting for transcript to classify)
+        self._interrupted_by_vad = False
+    
+    @property
+    def is_speaking(self) -> bool:
+        return self._is_speaking
+    
+    @is_speaking.setter
+    def is_speaking(self, value: bool) -> None:
+        self._is_speaking = value
+    
+    @property
+    def interrupted_by_vad(self) -> bool:
+        return self._interrupted_by_vad
+    
+    @interrupted_by_vad.setter
+    def interrupted_by_vad(self, value: bool) -> None:
+        self._interrupted_by_vad = value
 
     async def on_enter(self):
-        # when the agent is added to the session, it'll generate a reply
-        # according to its instructions
-        self.session.generate_reply()
-
-    # all functions annotated with @function_tool will be passed to the LLM when this
-    # agent is active
-    @function_tool
-    async def lookup_weather(
-        self, context: RunContext, location: str, latitude: str, longitude: str
-    ):
-        """Called when the user asks for weather related information.
-        Ensure the user's location (city or region) is provided.
-        When given a location, please estimate the latitude and longitude of the location and
-        do not ask the user for them.
-
-        Args:
-            location: The location they are asking for
-            latitude: The latitude of the location, do not ask user for it
-            longitude: The longitude of the location, do not ask user for it
-        """
+        # Wait for user to speak first (no preemptive greeting)
+        pass
 
-        logger.info(f"Looking up weather for {location}")
-
-        return "sunny with a temperature of 70 degrees."
 
+# =============================================================================
+# SERVER SETUP
+# =============================================================================
 
 server = AgentServer()
 
@@ -73,60 +207,163 @@ def prewarm(proc: JobProcess):
 server.setup_fnc = prewarm
 
 
+def try_clear_user_turn(session: AgentSession) -> bool:
+    """Safely attempt to clear user turn to suppress LLM processing."""
+    if hasattr(session, 'clear_user_turn'):
+        try:
+            session.clear_user_turn()
+            return True
+        except Exception as e:
+            logger.debug(f"clear_user_turn failed: {e}")
+    return False
+
+
 @server.rtc_session()
 async def entrypoint(ctx: JobContext):
-    # each log entry will include these fields
-    ctx.log_context_fields = {
-        "room": ctx.room.name,
-    }
     session = AgentSession(
-        # Speech-to-text (STT) is your agent's ears, turning the user's speech into text that the LLM can understand
-        # See all available models at https://docs.livekit.io/agents/models/stt/
         stt="deepgram/nova-3",
-        # A Large Language Model (LLM) is your agent's brain, processing user input and generating a response
-        # See all available models at https://docs.livekit.io/agents/models/llm/
-        llm="openai/gpt-4.1-mini",
-        # Text-to-speech (TTS) is your agent's voice, turning the LLM's text into speech that the user can hear
-        # See all available models as well as voice selections at https://docs.livekit.io/agents/models/tts/
+        llm="openai/gpt-4o-mini",
         tts="cartesia/sonic-2:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
-        # VAD and turn detection are used to determine when the user is speaking and when the agent should respond
-        # See more at https://docs.livekit.io/agents/build/turns
-        turn_detection=MultilingualModel(),
         vad=ctx.proc.userdata["vad"],
-        # allow the LLM to generate a response while waiting for the end of turn
-        # See more at https://docs.livekit.io/agents/build/audio/#preemptive-generation
-        preemptive_generation=True,
-        # sometimes background noise could interrupt the agent session, these are considered false positive interruptions
-        # when it's detected, you may resume the agent's speech
-        resume_false_interruption=True,
+        turn_detection=MultilingualModel(),
+        
+        # === HYBRID STRATEGY ===
+        # Medium threshold: filters out most short fillers; short commands it also filters are caught by the transcript handler
+        allow_interruptions=True,
+        min_interruption_duration=0.6,   # speech must last 0.6s to interrupt; many one-word commands are shorter
+        min_interruption_words=2,         # Require 2+ words
+        
+        # Enable auto-resume for false positives (LiveKit handles this)
         false_interruption_timeout=1.0,
+        resume_false_interruption=True,
+        
+        preemptive_generation=False,
+        min_endpointing_delay=0.5,
+        max_endpointing_delay=2.5,
     )
-
-    # log metrics as they are emitted, and total usage after session is over
-    usage_collector = metrics.UsageCollector()
-
-    @session.on("metrics_collected")
-    def _on_metrics_collected(ev: MetricsCollectedEvent):
-        metrics.log_metrics(ev.metrics)
-        usage_collector.collect(ev.metrics)
-
-    async def log_usage():
-        summary = usage_collector.get_summary()
-        logger.info(f"Usage: {summary}")
-
-    # shutdown callbacks are triggered when the session is over
-    ctx.add_shutdown_callback(log_usage)
-
-    await session.start(
-        agent=MyAgent(),
-        room=ctx.room,
-        room_options=room_io.RoomOptions(
-            audio_input=room_io.AudioInputOptions(
-                # uncomment to enable the Krisp BVC noise cancellation
-                # noise_cancellation=noise_cancellation.BVC(),
-            ),
-        ),
-    )
+    
+    agent = IntelligentAgent()
+    
+    logger.info("=" * 70)
+    logger.info("🚀 HYBRID INTELLIGENT INTERRUPTION HANDLER v2")
+    logger.info("   Strategy: VAD(0.6s, 2words) + Transcript Classification")
+    logger.info("=" * 70)
+    
+    # -------------------------------------------------------------------------
+    # EVENT: Agent speech created (treated here as the start of speaking)
+    # -------------------------------------------------------------------------
+    @session.on("speech_created")
+    def on_speech_created(ev):
+        agent.is_speaking = True
+        agent.interrupted_by_vad = False
+        logger.info("🎤 Agent started speaking")
+    
+    # -------------------------------------------------------------------------
+    # EVENT: Agent state changes
+    # -------------------------------------------------------------------------
+    @session.on("agent_state_changed")
+    def on_agent_state_changed(ev):
+        logger.debug(f"🎭 Agent: {ev.old_state} → {ev.new_state}")
+        
+        # Heuristic for VAD interruption: speaking → listening (this transition may also occur when speech ends normally)
+        if ev.old_state == "speaking" and ev.new_state == "listening":
+            if agent.is_speaking:
+                agent.interrupted_by_vad = True
+                logger.info("⚠️ VAD interrupted - waiting for transcript...")
+        
+        # Update speaking state
+        if ev.new_state in ("listening", "thinking"):
+            agent.is_speaking = False
+        elif ev.new_state == "speaking":
+            agent.is_speaking = True
+    
+    # -------------------------------------------------------------------------
+    # EVENT: User state changes (for logging only)
+    # -------------------------------------------------------------------------
+    @session.on("user_state_changed")
+    def on_user_state_changed(ev):
+        logger.debug(f"👤 User: {ev.old_state} → {ev.new_state}")
+    
+    # -------------------------------------------------------------------------
+    # EVENT: Transcript received - MAIN LOGIC
+    # -------------------------------------------------------------------------
+    @session.on("user_input_transcribed")
+    def on_user_input_transcribed(ev):
+        # Only process final transcripts
+        if not ev.is_final or not ev.transcript:
+            return
+        
+        text = normalize_text(ev.transcript)
+        if not text:
+            return
+        
+        # Classify the input
+        has_command = contains_command(text)
+        is_filler = is_filler_input(text)
+        
+        logger.info(
+            f"📝 '{text}' | speaking={agent.is_speaking} | "
+            f"vad_interrupted={agent.interrupted_by_vad} | "
+            f"cmd={has_command} | filler={is_filler}"
+        )
+        
+        # =================================================================
+        # CASE 1: VAD just interrupted - classify and decide
+        # =================================================================
+        if agent.interrupted_by_vad:
+            agent.interrupted_by_vad = False  # Reset flag
+            
+            if has_command:
+                # Real command - interruption was correct, let LLM process
+                logger.info(f"🛑 COMMAND after VAD: '{text}' - valid interrupt")
+                return  # Allow normal LLM processing
+            
+            if is_filler:
+                # False positive - LiveKit's resume_false_interruption handles resume
+                # Suppress transcript from LLM
+                logger.info(f"🔄 FILLER after VAD: '{text}' - suppressing")
+                try_clear_user_turn(session)
+                return
+            
+            # Real input (not command, not filler) - valid interruption
+            logger.info(f"✅ REAL INPUT after VAD: '{text}'")
+            return  # Allow normal LLM processing
+        
+        # =================================================================
+        # CASE 2: Agent is currently speaking (no VAD interrupt yet)
+        # =================================================================
+        if agent.is_speaking:
+            if has_command:
+                # Force interrupt on command that VAD missed
+                logger.info(f"🛑 COMMAND while speaking: '{text}' - forcing interrupt")
+                session.interrupt()
+                return  # Allow LLM to process the command
+            
+            if is_filler:
+                # Ignore filler - don't interrupt, don't pass to LLM
+                logger.info(f"🔇 FILLER while speaking: '{text}' - ignored")
+                try_clear_user_turn(session)
+                return
+            
+            # Real input - interrupt and let LLM process
+            logger.info(f"💬 INPUT while speaking: '{text}' - interrupting")
+            session.interrupt()
+            return
+        
+        # =================================================================
+        # CASE 3: Agent is idle (not speaking)
+        # =================================================================
+        if is_filler:
+            # Suppress lone fillers when idle
+            logger.info(f"🍃 FILLER while idle: '{text}' - suppressed")
+            try_clear_user_turn(session)
+            return
+        
+        # Normal input - let LLM process
+        logger.info(f"✅ INPUT while idle: '{text}'")
+        # Allow normal processing
+    
+    await session.start(agent=agent, room=ctx.room)
 
 
 if __name__ == "__main__":