Lightweight Intent-Focused Engine for Guidance in Unmanned Autonomous Rescue Deployments
LIFEGUARD enables operators to naturally and effectively pass Command Intent to autonomous units, bridging voice commands and machine action for efficient, reliable, and intuitive search and rescue deployments.
LIFEGUARD is a Python-based system that interprets spoken natural language commands, allowing a human operator (Commander) to pass intent and instructions directly to autonomous units (drones/robots). It bridges human intent and autonomous action for field and rescue operations, with robust multi-agent support and advanced command understanding. The system is designed for field-deployable, edge-AI applications where internet connectivity is unreliable or non-existent, prioritizing operational robustness and self-sufficiency.
- Speech-to-Text (STT): Converts live spoken commands to text using the robust, offline Vosk engine.
- Natural Language Understanding (NLU): Extracts intent and entities (GPS, landmarks, targets) using spaCy with custom entity recognizers and a sophisticated spoken number conversion module.
- Multi-Agent MAVLink Integration: Supports multiple MAVLink-compatible vehicles (drones/robots) with seamless agent selection by voice command.
- Waypoint & Grid Search: Translates high-level intent into actionable commands, including waypoint navigation and automatic generation of grid search patterns.
- Audio Preprocessing: Employs noise reduction and band-pass filtering for reliable command recognition in noisy field environments.
- Interactive Confirmation: A critical safety feature that prompts the operator for confirmation before dispatching commands to autonomous units.
- Push-to-Talk (PTT): Via GUI button (press & hold). (Legacy spacebar handler replaced in current GUI launcher.)
- Decoupled, Message-Passing Architecture: Audio Input/PTT, STT, NLU, MAVLink I/O, and TTS Output run in dedicated threads and communicate via thread-safe queues, improving reliability, thread safety, and responsiveness for mission-critical operations.
- Cross-Platform: Works on both Windows and Linux operating systems.
The LIFEGUARD system uses a decoupled, multi-threaded message-passing architecture. Subsystems (Audio Input/PTT, STT, NLU, MAVLink I/O, TTS Output) run in dedicated threads and communicate via thread-safe queues. This design improves reliability, thread safety, and responsiveness for mission-critical operations.
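The queue-based design above can be sketched in a few lines. This is an illustrative toy (the worker names and message shapes are not LIFEGUARD's actual classes): each stage runs in its own thread, consumes from an upstream `queue.Queue`, and publishes downstream, with a `None` sentinel propagating shutdown.

```python
# Minimal sketch of decoupled subsystems communicating via thread-safe queues.
# Names and message formats are illustrative, not LIFEGUARD's real internals.
import queue
import threading

audio_q: "queue.Queue" = queue.Queue()  # stand-in for captured audio chunks
text_q: "queue.Queue" = queue.Queue()   # transcribed text

def stt_worker() -> None:
    """Consume 'audio', emit 'text'; a None sentinel shuts the stage down."""
    while True:
        chunk = audio_q.get()
        if chunk is None:
            text_q.put(None)           # propagate shutdown downstream
            break
        text_q.put(chunk.upper())      # stand-in for real transcription

def nlu_worker(results: list) -> None:
    """Consume 'text', emit parsed intents into a results list."""
    while True:
        text = text_q.get()
        if text is None:
            break
        results.append({"intent": "DEMO", "text": text})

results: list = []
threads = [threading.Thread(target=stt_worker),
           threading.Thread(target=nlu_worker, args=(results,))]
for t in threads:
    t.start()
for chunk in ["fly to waypoint", None]:  # None = end of input
    audio_q.put(chunk)
for t in threads:
    t.join()
print(results)
```

Because each stage blocks only on its own queue, a slow stage (e.g., STT) never stalls audio capture, and stages can be restarted or swapped independently.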
- Push-to-Talk: The operator presses and holds the GUI "Push to Talk" button. Audio is captured only while held.
- Audio Preprocessing: The captured audio is processed for noise reduction and bandpass filtering to isolate speech frequencies.
- Speech Recognition: The cleaned audio is transcribed into text in real-time by the offline Vosk STT engine.
- Intent Extraction: spaCy NLU parses the transcribed text to identify intent and extract entities, including spoken number conversion.
- Agent Selection: Voice commands can select the active MAVLink agent (drone/robot). The selected agent is retained for subsequent commands.
- Command Translation: Valid intents are mapped to MAVLink commands (e.g., grid search mission upload).
- User Confirmation: The system verbalizes its interpretation and asks for confirmation before executing safety-critical actions.
- Action Execution: Upon confirmation, commands are sent to the selected autonomous unit for execution.
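The intent-extraction step above can be approximated with plain pattern matching. The real system uses spaCy with custom entity recognizers plus `word2number` for spoken numerals; the regex-only function below is a simplified, hypothetical stand-in that assumes numbers are already transcribed as digits.

```python
# Simplified sketch of the entity-extraction step. LIFEGUARD actually uses
# spaCy with custom recognizers; this regex version only illustrates the idea.
import re

def extract_lat_lon(text: str):
    """Pull latitude/longitude values out of a transcribed command."""
    lat = re.search(r"latitude\s+(-?\d+(?:\.\d+)?)", text)
    lon = re.search(r"longitude\s+(-?\d+(?:\.\d+)?)", text)
    if lat and lon:
        return float(lat.group(1)), float(lon.group(1))
    return None  # entities missing -> no actionable intent

print(extract_lat_lon("fly to latitude 38.99 longitude -76.48"))
```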
The system is built on a foundation of powerful, open-source libraries chosen for their performance and suitability for offline, real-time applications.
| Component/Library | Role in LIFEGUARD |
|---|---|
| Vosk | Offline Speech-to-Text (STT) Engine |
| spaCy | Natural Language Understanding (NLU) & Entity Extraction |
| pymavlink | MAVLink Protocol Communication with Autonomous Units |
| noisereduce | Real-time Audio Noise Reduction |
| pynput | (Legacy) keyboard event handling (may be unused with GUI button) |
| customtkinter / Pillow | GUI framework and image assets |
| Threading/Queues | Decoupled subsystem communication and message routing |
| PyAudio | Microphone Audio Input Stream |
| scipy, numpy | Signal Processing & Numerical Operations |
| pyttsx3, word2number | Text-to-Speech (for confirmation) & Number Conversion |
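The scipy/numpy signal-processing role in the table can be illustrated with a band-pass stage like the one used before transcription. The cutoffs (roughly the 300–3400 Hz speech band) and filter order here are illustrative assumptions, not LIFEGUARD's actual tuning.

```python
# Sketch of the audio pre-processing stage: a Butterworth band-pass keeps
# roughly the speech band before STT. Cutoffs/order are illustrative only.
import numpy as np
from scipy.signal import butter, sosfilt

def bandpass_speech(samples: np.ndarray, rate: int = 16000) -> np.ndarray:
    sos = butter(4, [300, 3400], btype="bandpass", fs=rate, output="sos")
    return sosfilt(sos, samples)

# A 50 Hz mains hum sits well below the pass band and is strongly attenuated:
t = np.arange(16000) / 16000.0
hum = np.sin(2 * np.pi * 50 * t)
filtered = bandpass_speech(hum)
print(np.abs(filtered[8000:]).max())  # near zero once the transient decays
```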
The NLU model is trained to recognize a specific grammar of commands tailored for search and rescue operations.
| Voice Command Example | Extracted Intent | Key Entities Extracted |
|---|---|---|
| "Search a 100 meter grid at latitude 38.99 longitude -76.48." | REQUEST_GRID_SEARCH | grid_size_meters=100, latitude, longitude |
| "Search at latitude 38.99 longitude -76.48 for a person in a life ring." | COMBINED_SEARCH_AND_TARGET or REQUEST_SEARCH_AT_LOCATION | latitude, longitude, target_description_full |
| "Fly to latitude 38.99 longitude -76.48." | REQUEST_FLY_TO | latitude, longitude |
| "Select drone two." | SELECT_AGENT | selected_agent_id=agent2 |
| "Set altitude to fifty meters." | SET_AGENT_ALTITUDE | altitude_meters=50 |
| "Yes." / "No." (after prompt) | (Handled by confirmation mode) | (No separate intent label) |
Note: Implemented intents now include SELECT_AGENT, REQUEST_GRID_SEARCH, REQUEST_SEARCH_AT_LOCATION, REQUEST_FLY_TO, COMBINED_SEARCH_AND_TARGET, SET_AGENT_ALTITUDE, and PROVIDE_TARGET_DESCRIPTION (fallback when only a target description is given). Yes/No confirmations are handled by a coordinator confirmation mode (no explicit confirmation intent constant).
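A REQUEST_GRID_SEARCH intent ultimately becomes a set of waypoints. The following lawnmower-pattern generator is a hypothetical sketch (function name, row spacing, and the flat-earth metre-to-degree conversion are illustrative assumptions, not the project's actual mission builder):

```python
# Hypothetical sketch: turn a grid-search intent into lawnmower waypoints.
# Uses a flat-earth approximation; fine for grids of a few hundred metres.
import math

def grid_waypoints(lat: float, lon: float, size_m: float, spacing_m: float = 25.0):
    """Generate a simple lawnmower pattern centred on (lat, lon)."""
    deg_lat = size_m / 111_320.0                                  # metres -> deg latitude
    deg_lon = size_m / (111_320.0 * math.cos(math.radians(lat)))  # shrink with latitude
    rows = max(2, int(size_m / spacing_m))
    wps = []
    for i in range(rows):
        row_lat = lat - deg_lat / 2 + deg_lat * i / (rows - 1)
        ends = [lon - deg_lon / 2, lon + deg_lon / 2]
        if i % 2:                     # alternate sweep direction each row
            ends.reverse()
        wps += [(row_lat, e) for e in ends]
    return wps

wps = grid_waypoints(38.99, -76.48, 100)
print(len(wps))  # two waypoints (row ends) per row
```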
- Python 3.9+ (matches `pyproject.toml`)
- Vosk 0.3.45
- spaCy 3.7.5 with the `en_core_web_sm` model
- PyAudio (microphone input)
- noisereduce
- pymavlink
- numpy <2.0
- scipy <1.14
- pynput (optional / legacy)
- pyttsx3
- word2number
- customtkinter (GUI)
- tkintermapview (map display & offline tile caching)
- Pillow (logo/icon / image assets)
- (Recommended) Create and Activate a Virtual Environment

  ```bash
  python3 -m venv lifeguard_bot
  ```

  On Linux/macOS:

  ```bash
  source lifeguard_bot/bin/activate
  ```

  On Windows:

  ```bash
  lifeguard_bot\Scripts\activate
  ```

- Clone the Repository

  ```bash
  git clone https://github.com/John-Sarge/LIFEGUARD.git
  cd LIFEGUARD
  ```

- Install Dependencies

  The provided script installs the core Python packages (those listed in `pyproject.toml`):

  ```bash
  # Make the script executable on Linux/macOS
  chmod +x setup.sh
  ./setup.sh
  ```

  If the GUI reports missing modules (e.g., `ModuleNotFoundError: customtkinter` or `tkintermapview`), install them manually:

  ```bash
  pip install customtkinter tkintermapview Pillow
  ```
- (Linux Only) Install Audio Dependencies

  PyAudio on Linux requires the PortAudio development libraries:

  ```bash
  sudo apt-get update && sudo apt-get install portaudio19-dev python3-pyaudio
  ```
- Download Vosk Model

  - Download a Vosk English model (e.g., `vosk-model-small-en-us-0.15`) from the Vosk website.
  - Unzip the model and place the contents into the `vosk_models/` directory. The final path should look like `vosk_models/vosk-model-small-en-us-0.15/`.

- Connect MAVLink Vehicle(s)

  - Connect your MAVLink-compatible vehicle(s) via USB, telemetry radio, or network.
  - Alternatively, start a simulator such as SITL or Gazebo.
  - Edit `config.json` (auto-created on first run) or use the GUI Settings dialog (Agents tab) to adjust connection strings.
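For orientation, an agent-connection section of `config.json` might look like the fragment below. The key names here are illustrative guesses only; the authoritative schema is whatever the auto-created file and `settings.py` define, so use the generated file as your template.

```json
{
  "agents": {
    "agent1": { "connection": "udp:127.0.0.1:14550" },
    "agent2": { "connection": "/dev/ttyACM0" }
  }
}
```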
- Run the Application

  GUI launcher (recommended for the current PTT workflow and map display; tiles are cached to `map_cache.db`):

  ```bash
  python run_gui.py
  ```

  Headless/CLI (legacy entry point):

  ```bash
  python -m lifeguard
  ```

  Ensure your `PYTHONPATH` includes `src/`, or install in editable mode (`pip install -e .`).
- Issue a Voice Command
- Press and hold the GUI Push to Talk button.
- Speak a command clearly (e.g., "Search a 100 meter grid at latitude 41.37 longitude -72.09.").
- Release the button.
- Confirm the Intent
- The system will transcribe your command, display its interpretation, and ask for verbal confirmation.
- Respond with "yes" to send the command or "no" to cancel.
- Switch Agents (if applicable)
- To direct a command to a different vehicle, first issue a selection command.
- Example: "Select drone two."
- The system will confirm the agent switch, and all subsequent commands will be sent to that agent until a new one is selected.
Key extension points:
- Commands & NLU: Update patterns/logic in `src/lifeguard/components/nlu.py`.
- Agents: Adjust via `config.json` (or GUI Settings); environment overrides are also supported via `settings.py`.
- Audio & STT: Tune pre-processing in `src/lifeguard/utils/audio_processing.py` and STT in `src/lifeguard/components/stt.py`.
- Vosk model not found: Ensure the model has been downloaded, unzipped, and placed in the `vosk_models/` directory. The path in the script must match the folder name.
- PyAudio errors on startup: Check that your microphone is connected and recognized by the operating system. On Linux, ensure the `portaudio19-dev` package is installed.
- MAVLink connection issues: Verify your connection strings in `AGENT_CONNECTION_CONFIGS`. Ensure the vehicle/simulator is running and that the specified port (e.g., `/dev/ttyACM0`, `udp:127.0.0.1:14550`) is correct and not in use by another application.
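When debugging connection issues, it can help to sanity-check the connection string's shape before handing it to pymavlink. This small helper is hypothetical (not part of LIFEGUARD) and only validates the common endpoint formats shown above:

```python
# Hypothetical helper: sanity-check a MAVLink connection string's format
# before passing it to pymavlink. Not part of LIFEGUARD itself.
import re

def looks_like_mavlink_endpoint(s: str) -> bool:
    patterns = [
        r"^(udp|udpout|tcp):[\w.\-]+:\d+$",  # e.g. udp:127.0.0.1:14550
        r"^/dev/tty\w+$",                    # e.g. /dev/ttyACM0 (Linux serial)
        r"^COM\d+$",                         # Windows serial port
    ]
    return any(re.match(p, s) for p in patterns)

print(looks_like_mavlink_endpoint("udp:127.0.0.1:14550"))  # True
print(looks_like_mavlink_endpoint("tcp:localhost"))        # False (missing port)
```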
A small simulator script, agent1_grid_sim.py, is included to exercise the LIFEGUARD workflow without a full field setup.
- Note: The agents were switched from mavproxy to mavlink-router (https://github.com/alireza787b/mavlink-anywhere) because the STATUSTEXT messages used to report targets were being silently consumed by the autopilot.
- What it does: Connects to a MAVLink endpoint (e.g., SITL) and, once a mission is running in AUTO, reports a simulated find by sending a STATUSTEXT message `FOUND:<lat>,<lon>` at a waypoint roughly in the middle of the mission.
- How it works: It infers mission readiness from STATUSTEXT, can pull the mission from the vehicle to get waypoint coordinates, waits for `MISSION_ITEM_REACHED`, and then emits the `FOUND` message.
- Why it's useful: LIFEGUARD listens for `FOUND:` messages to automatically dispatch another available agent to verify the reported location.
- Quick start: Adjust the connection string at the bottom of the desired simulator file (e.g., `agent1_grid_sim.py`) for your SITL/vehicle (e.g., `tcp:127.0.0.1:5763`) and run it in parallel while following the normal README flow. (Add `--handshake-test --duration 30` to verify STATUSTEXT message functionality on hardware.)
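The `FOUND:<lat>,<lon>` convention is plain string payload inside a STATUSTEXT message, so the receiving side reduces to simple parsing. A sketch of that parsing step (receiving the STATUSTEXT itself is pymavlink's job and is omitted here):

```python
# Sketch of parsing the simulator's FOUND:<lat>,<lon> STATUSTEXT payload.
# Transport (receiving the STATUSTEXT via pymavlink) is intentionally omitted.
def parse_found(statustext: str):
    """Return (lat, lon) for a FOUND message, or None for anything else."""
    if not statustext.startswith("FOUND:"):
        return None
    lat_s, lon_s = statustext[len("FOUND:"):].split(",", 1)
    return float(lat_s), float(lon_s)

print(parse_found("FOUND:38.99,-76.48"))  # (38.99, -76.48)
print(parse_found("MISSION COMPLETE"))    # None
```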
- Ensure you have `build`, `setuptools`, and `wheel` installed (`python -m build` requires the `build` package):

  ```bash
  pip install build setuptools wheel
  ```

- Build the package:

  ```bash
  python -m build
  ```

- Install the package locally:

  ```bash
  pip install dist/lifeguard-*.whl
  ```

- Ensure you have `twine` installed:

  ```bash
  pip install twine
  ```

- Upload the package:

  ```bash
  twine upload dist/*
  ```

Refer to the PyPI packaging guide for more details.
This project stands on the shoulders of giants and is made possible by the incredible work of the open-source community, especially the teams behind Vosk, spaCy, and pymavlink.

