Don't just say it. Show it.
Gestura bridges the gap between patient pain and clinical understanding using real-time Computer Vision, 3D visualization, and Generative AI.
Telehealth and traditional medical intake systems suffer from a critical communication gap, especially for vulnerable populations:
- Language barriers prevent accurate symptom descriptions
- Anatomical ambiguity ("My arm hurts") lacks clinical precision
- Loss of physical interaction in remote care removes intuitive pointing and localization
- Clinical burden forces doctors to spend time documenting instead of diagnosing
These challenges disproportionately affect:
- Non-native speakers
- Elderly and pediatric patients
- People with speech, hearing, or cognitive impairments
- Anxious patients in high-stress medical settings
Gestura is a gesture-based medical screening platform that allows patients to communicate pain non-verbally.
Patients simply stand in front of a camera, point to areas of concern on their own body, and use natural gestures to confirm selections. Gestura translates these gestures into structured, clinically meaningful data, enhanced with AI-generated medical summaries.
- **Non-Verbal Input:** The patient stands in front of a webcam while MediaPipe tracks full-body and hand landmarks in real time.
- **Gesture-to-Anatomy Mapping:** Pointing gestures are mapped to anatomical regions on a 3D digital twin.
- **Confirmation via Pinch Gesture:** A pinch gesture locks and saves a body part. Multiple areas can be saved in one session.
- **Multilingual Guidance:** Voice prompts (ElevenLabs) guide the patient through the process in their native language.
- **AI Medical Synthesis:** Saved data is processed by Google Gemini to generate structured, clinician-ready reports.
- **Clinical Handoff:** Doctors receive a clear pain map and an AI-generated summary, improving the speed and clarity of care.
- Full-body pose tracking via MediaPipe Holistic
- High-precision fingertip tracking
- Scale-aware pinch detection (works at varying distances)
- Each body part can only be saved once (no duplicates)
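Scale-aware pinch detection can be sketched by normalizing the thumb-index gap against a hand-size reference, so one threshold works at varying distances from the camera. This is a minimal sketch: the landmark indices follow MediaPipe's hand model, but the threshold value and function names are illustrative assumptions.

```python
import math

# MediaPipe hand landmark indices: 4 = thumb tip, 8 = index tip,
# 0 = wrist, 9 = middle-finger MCP.
THUMB_TIP, INDEX_TIP, WRIST, MIDDLE_MCP = 4, 8, 0, 9

def _dist(a, b):
    # 2D distance in normalized image coordinates.
    return math.dist(a[:2], b[:2])

def is_pinching(landmarks, ratio_threshold=0.35):
    """Scale-aware pinch: the thumb-index gap is divided by a hand-size
    reference (wrist to middle MCP), so the ratio is stable whether the
    hand is near or far from the camera."""
    gap = _dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP])
    hand_size = _dist(landmarks[WRIST], landmarks[MIDDLE_MCP])
    if hand_size == 0:
        return False
    return gap / hand_size < ratio_threshold
```

A fixed pixel threshold would fire too easily up close and never at a distance; the ratio sidesteps that.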
- Color-coded pointer states:
- Blue: pointing
- Yellow: locking in progress
- Red: saved / confirmed
- Hover highlights on anatomical regions
- "Saved" indicators when revisiting selected areas
- Save multiple body parts per session
- Undo last saved part
- Clear all saved parts
- Persistent visual confirmation
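The session rules above (save each part once, undo the last save, clear all) amount to a small insertion-ordered store. A minimal sketch, with illustrative class and method names:

```python
class PainSession:
    """Session-scoped store of confirmed body parts."""

    def __init__(self):
        self._parts = []  # insertion order preserved so undo() pops the latest

    def save(self, part):
        # Each body part can only be saved once (no duplicates).
        if part in self._parts:
            return False
        self._parts.append(part)
        return True

    def undo(self):
        # Remove and return the most recently saved part, if any.
        return self._parts.pop() if self._parts else None

    def clear(self):
        self._parts.clear()

    @property
    def parts(self):
        return list(self._parts)
```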
- Body parts map to a neutral 3D human model
- Provides spatial clarity for clinicians
- Designed for future annotation and heatmap overlays
- Powered by ElevenLabs
- Supports English, French, Spanish, Mandarin, and Japanese
- Improves accessibility for illiterate or visually impaired users
- Powered by Google Gemini
- Converts raw gesture data into:
- Structured clinical summaries
- SOAP-style notes
- Non-diagnostic triage insights
- Supports patient-language to doctor-language translation
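The synthesis step reduces to turning the saved regions into a constrained prompt and calling the model. The prompt wording and function names below are assumptions for illustration; the model call itself follows the documented `google-generativeai` API.

```python
import os

def build_report_prompt(saved_parts, patient_language="English"):
    """Turn raw gesture selections into a prompt asking for a structured,
    non-diagnostic, SOAP-style summary."""
    regions = ", ".join(saved_parts)
    return (
        "You are assisting with medical intake. The patient indicated pain "
        f"in: {regions}. Patient language: {patient_language}.\n"
        "Produce: (1) a structured clinical summary, (2) SOAP-style notes, "
        "(3) non-diagnostic triage insights. Do not diagnose."
    )

def generate_report(saved_parts):
    # Requires GEMINI_API_KEY in the environment (see setup below).
    import google.generativeai as genai
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel(
        os.environ.get("GEMINI_MODEL", "gemini-1.5-flash"))
    return model.generate_content(build_report_prompt(saved_parts)).text
```

Keeping the "do not diagnose" constraint in the prompt matters: the output is intake support, not a clinical decision.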
- Python + Flask
- Threaded computer vision processing
- REST APIs for CV state, UI, and AI services
- MediaPipe Holistic
- Pose landmarks → body region mapping
- Hand landmarks → pointer & pinch detection
- Custom geometric regions for:
- Limbs
- Torso
- Head and neck
- Optimized for real-time performance (≈15–20 FPS)
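The landmark-to-region mapping can be illustrated with circular regions derived from pose landmarks. The region names, radii, and circle approximation here are assumptions; the project uses its own custom geometry for limbs, torso, and head/neck.

```python
import math

def build_regions(pose):
    """pose: dict of landmark name -> (x, y) in normalized image coords.
    Returns region name -> (center, radius) circles derived from the pose."""
    def mid(a, b):
        return ((pose[a][0] + pose[b][0]) / 2, (pose[a][1] + pose[b][1]) / 2)
    return {
        "chest": (mid("left_shoulder", "right_shoulder"), 0.10),
        "abdomen": (mid("left_hip", "right_hip"), 0.10),
        "left_shoulder": (pose["left_shoulder"], 0.06),
        "right_shoulder": (pose["right_shoulder"], 0.06),
    }

def region_at(point, regions):
    """Return the first region whose circle contains the fingertip, or None."""
    for name, (center, radius) in regions.items():
        if math.dist(point, center) <= radius:
            return name
    return None
```

Because the circles are rebuilt from the live pose each frame, the mapping follows the patient as they move.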
- HTML5 + Tailwind CSS
- Live MJPEG video streaming
- Interactive gesture toolbar
- Three.js-powered 3D model rendering
- Google Gemini for report generation
- ElevenLabs for multilingual voice synthesis
- Session-based issue storage
- The base model (`male_lopoly.glb`) is loaded using the Three.js `GLTFLoader`
- Original textures are overridden with a neutral `MeshPhysicalMaterial`
- Privacy-first design
- High contrast for pain indicators
- Anatomical nodes are represented as programmatically generated spheres
- Each node has predefined `(x, y, z)` coordinates
- Spheres:
- Glow blue when hovered
- Turn red when locked
- Change color based on pain severity in clinician view
- 2D hand landmark coordinates are projected into screen space
- Overlap with 3D node projections registers a "touch"
- Enables touchless interaction with medical data
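A sketch of that overlap test, assuming MediaPipe's normalized landmark coordinates are scaled into the same pixel space as the projected node; the radius and function names are illustrative.

```python
def landmark_to_screen(lm, width, height):
    """MediaPipe landmarks are normalized to [0, 1]; scale into pixel space."""
    return (lm[0] * width, lm[1] * height)

def touches(finger_px, node_px, radius_px=30):
    """A 'touch' registers when the fingertip falls inside the node's
    projected screen-space radius (squared distances avoid a sqrt)."""
    dx = finger_px[0] - node_px[0]
    dy = finger_px[1] - node_px[1]
    return dx * dx + dy * dy <= radius_px * radius_px
```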
- Backend: Python 3.10+, Flask
- Frontend: HTML5, Tailwind CSS, Three.js
- Computer Vision: MediaPipe, OpenCV
- AI: Google Gemini API
- Audio: ElevenLabs API
- Data Processing: NumPy
- Python 3.10+
- Webcam
- Google Gemini API key
- ElevenLabs API key
```bash
python -m venv .venv
source .venv/bin/activate   # macOS/Linux
# .venv\Scripts\activate    # Windows
pip install -r requirements.txt
```

Create a `.env` file:

```
GEMINI_API_KEY=your_key
GEMINI_MODEL=gemini-1.5-flash
ELEVENLABS_API_KEY=your_key
DEBUG=True
```

Run the app:

```bash
python app.py
```

Access at: http://localhost:5050
```
HackHive2026/
├── gestura_flask/
│   ├── app.py
│   ├── templates/
│   └── static/
│       ├── male_boning.glb
│       └── male_lopoly.glb
├── cv_adrian/
│   ├── body/
│   ├── interaction/
│   ├── paint/
│   ├── UI/
│   └── vision/
├── .env              # create this
├── .env.example
├── requirements.txt
└── README.md
```
Add screenshots or GIFs here.
- Temporal pain tracking
- Mobile device support
- EHR / FHIR integration
- PDF and EMR export
- Clinical usability testing
- AL Muqshith Shifan – Frontend & Full Stack
- Kevin Christopher Chua – Frontend & 3D Visualization
- Adrian Fudge – Computer Vision & Backend
- Alex – AI Integration & Debugging
Built with ❤️ and ☕ for HackHive 2026. Applying AI to make healthcare more human.