A Python application that uses OpenCV to detect a hand through the webcam and adjusts the system volume based on the distance between two specific landmarks: the tips of the thumb and index finger.
The farther the fingers move from each other, the higher the volume — the closer they are, the lower the volume.
A simple but effective demonstration of gesture-based interaction and computer vision.
- Real-time hand tracking using OpenCV
- Landmark detection to identify thumb and index finger positions
- Distance calculation used to control system volume
- Smooth and intuitive gesture-based interaction
- Works with any standard webcam
- No buttons or GUI — fully gesture-controlled
- Python
- OpenCV – webcam capture and on-screen feedback
- MediaPipe Hands – real-time hand landmark detection
- NumPy – distance and volume-range calculations
- System volume control library (e.g., pycaw on Windows, or an equivalent for your OS)
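On Windows, the volume side of the stack typically goes through pycaw's `IAudioEndpointVolume` interface, which works in decibels rather than percentages. A minimal sketch, assuming the standard pycaw API (the dB limits in `fraction_to_db` are placeholders; the real range comes from `GetVolumeRange()` at runtime, and `set_system_volume` is a hypothetical helper name):

```python
import numpy as np


def fraction_to_db(frac, db_min=-65.25, db_max=0.0):
    """Map a 0.0-1.0 volume fraction into the endpoint's dB range.

    The default dB limits are placeholders; on a real device they come
    from IAudioEndpointVolume.GetVolumeRange().
    """
    return float(np.interp(frac, [0.0, 1.0], [db_min, db_max]))


def set_system_volume(frac):
    """Set the Windows master volume to a 0.0-1.0 fraction via pycaw."""
    # Windows-only imports kept local so fraction_to_db stays portable.
    from ctypes import POINTER, cast
    from comtypes import CLSCTX_ALL
    from pycaw.pycaw import AudioUtilities, IAudioEndpointVolume

    interface = AudioUtilities.GetSpeakers().Activate(
        IAudioEndpointVolume._iid_, CLSCTX_ALL, None)
    volume = cast(interface, POINTER(IAudioEndpointVolume))
    db_min, db_max, _ = volume.GetVolumeRange()
    volume.SetMasterVolumeLevel(fraction_to_db(frac, db_min, db_max), None)
```

On macOS or Linux the same `fraction_to_db` mapping applies, but the final call would go through `osascript` or `amixer`/PulseAudio instead of pycaw.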
- The webcam feed is read frame-by-frame using OpenCV.
- A hand-tracking model detects 21 hand landmarks in real time.
- Two specific landmarks are extracted:
- Thumb tip
- Index finger tip
- The Euclidean distance between these two landmarks is calculated.
- That distance is mapped to a volume range, typically:
- Small distance → low volume
- Large distance → high volume
- The program sends volume-change commands to the operating system.
- Visual feedback (a line or circle) may be drawn on the video feed to show detection.
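The steps above can be sketched roughly as follows. This is a minimal outline, assuming MediaPipe's 21-landmark hand model (index 4 is the thumb tip, index 8 is the index fingertip) and illustrative distance bounds of 30-250 pixels; the capture loop in `run()` needs a webcam plus the `opencv-python` and `mediapipe` packages:

```python
import math

THUMB_TIP, INDEX_TIP = 4, 8  # MediaPipe hand-landmark indices


def fingertip_distance(landmarks, width, height):
    """Euclidean pixel distance between thumb tip and index fingertip.

    `landmarks` is a sequence of objects with normalized .x/.y fields,
    as returned by MediaPipe Hands.
    """
    t, i = landmarks[THUMB_TIP], landmarks[INDEX_TIP]
    return math.hypot((t.x - i.x) * width, (t.y - i.y) * height)


def distance_to_volume(dist, d_min=30.0, d_max=250.0):
    """Map a pixel distance to a 0.0-1.0 volume fraction, clamped.

    The 30/250-pixel bounds are illustrative and depend on camera
    distance and resolution.
    """
    frac = (dist - d_min) / (d_max - d_min)
    return min(1.0, max(0.0, frac))


def run():
    # Imports kept local so the pure helpers above work without them.
    import cv2
    import mediapipe as mp

    hands = mp.solutions.hands.Hands(max_num_hands=1)
    cap = cv2.VideoCapture(0)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        h, w = frame.shape[:2]
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0].landmark
            vol = distance_to_volume(fingertip_distance(lm, w, h))
            # ...send `vol` to the OS volume API (e.g., pycaw on Windows)...
            cv2.putText(frame, f"vol {vol:.2f}", (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("volume control", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```

Clamping the mapped fraction to [0, 1] matters in practice: fingertip distances routinely fall outside the chosen bounds, and without clamping the volume command would over- or undershoot.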
This project demonstrates concepts from:
- Real-time computer vision
- Gesture recognition
- Hand landmark processing
- Mapping physical movement to system actions
- Interactive and intuitive user experience design
- Add additional gestures (mute, pause, skip)
- Integrate with media apps (Spotify API, VLC, YouTube)
- Add smoothing filters to make volume transitions even more natural
- Display a UI bar with the current volume level
- Multi-hand controls (left hand = volume, right hand = playback)
Diogo Regadas
GitHub Profile