An autonomous AI amateur radio operator that conducts real QSOs (radio conversations) over HF using the GPT-4o Realtime API and wfweb for radio audio and PTT control.
BEFORE USING THIS SOFTWARE, YOU MUST:
- READ THE DISCLAIMER — Contains critical safety and legal information
- Hold a valid amateur radio license — Required by law to operate with transmit capability
- Understand you are the control operator — You are legally responsible for all transmissions
- Review the LICENSE and NOTICE files
🔴 This software uses AI to control radio transmissions. You must actively monitor and supervise all operations at all times.
See DISCLAIMER.md for complete legal notices, liability disclaimers, and regulatory requirements.
AI-DX uses a fully integrated audio pipeline built on GPT-4o Realtime:
wfweb WebSocket (RX audio, 48 kHz)
│
▼
GPT-4o Realtime API ──── server-side VAD + STT + LLM + TTS
│ function calling for contact tracking
▼
TxBuffer (resample 24 kHz → 48 kHz, real-time pacing)
│
▼
wfweb WebSocket (TX audio + PTT)
- No local STT — transcription is handled server-side by GPT-4o Realtime
- No local TTS — synthesis is handled server-side by GPT-4o Realtime
- No local VAD — voice activity detection is server-side (server_vad mode)
- No Hamlib — PTT is controlled via wfweb's {"cmd":"setPTT","value":true/false}
- No PortAudio/sounddevice — all audio I/O goes through wfweb's WebSocket binary frames (production mode)
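The TxBuffer step in the pipeline (24 kHz model audio to 48 kHz radio audio) can be illustrated with a minimal sketch. The README does not say how AI-DX resamples, so the linear interpolation below is an assumption chosen for simplicity, and the function names `upsample_2x_pcm16` and `playback_seconds` are hypothetical. Mono little-endian PCM16 is also assumed.

```python
import struct


def upsample_2x_pcm16(pcm: bytes) -> bytes:
    """Double the sample rate of mono little-endian PCM16 by linear interpolation.

    Assumption: this is a stand-in for whatever resampler TxBuffer really uses.
    """
    count = len(pcm) // 2
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    out = []
    for i, cur in enumerate(samples):
        # Hold the final sample, since it has no right-hand neighbour.
        nxt = samples[i + 1] if i + 1 < count else cur
        out.append(cur)
        out.append((cur + nxt) // 2)  # midpoint between neighbouring samples
    return struct.pack(f"<{len(out)}h", *out)


def playback_seconds(pcm: bytes, sample_rate: int = 48_000) -> float:
    """Duration of a mono PCM16 chunk, useful for pacing sends in real time."""
    return (len(pcm) // 2) / sample_rate
```

A sender could sleep for roughly `playback_seconds(chunk)` between chunks so that audio reaches the radio no faster than it plays out.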
The model calls update_contact() in real time as it learns information during a QSO:
| Event | Tool call |
|---|---|
| Hears callsign | update_contact(callsign="VK2TDX") |
| Learns name | update_contact(name="John") |
| Learns QTH | update_contact(qth="New South Wales") |
| QSO ends | update_contact(closing=true) → logs to ADIF, returns to CQ |
No text parsing. No regex. No heuristics.
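As a rough illustration of the function-calling flow in the table above, the sketch below defines a tool schema and a handler that merges each call into a contact record. Only the tool name and the `callsign`, `name`, `qth`, and `closing` arguments come from this README; the schema wording, the `Contact` dataclass, and `apply_update` are hypothetical and may differ from the real code in radio_operator.py.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tool definition, written in the flat function-tool shape
# that Realtime sessions accept. Descriptions are illustrative only.
UPDATE_CONTACT_TOOL = {
    "type": "function",
    "name": "update_contact",
    "description": "Record details learned about the current contact.",
    "parameters": {
        "type": "object",
        "properties": {
            "callsign": {"type": "string"},
            "name": {"type": "string"},
            "qth": {"type": "string"},
            "closing": {"type": "boolean"},
        },
    },
}


@dataclass
class Contact:
    callsign: Optional[str] = None
    name: Optional[str] = None
    qth: Optional[str] = None
    closing: bool = False


def apply_update(contact: Contact, args: dict) -> Contact:
    """Merge one tool call's arguments into the contact; absent keys are left alone."""
    for key in ("callsign", "name", "qth"):
        value = args.get(key)
        if value:
            setattr(contact, key, value.strip())
    if args.get("closing") is True:
        contact.closing = True
    return contact
```

Because every call carries only the fields the model has just learned, the handler is a plain merge and needs no transcript parsing.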
- Python ≥ 3.10, < 3.14
- OpenAI API key with access to gpt-4o-realtime-preview
- wfweb running and accessible — production mode only (provides radio audio + PTT via WebSocket); not required for demo mode
# Clone and install dependencies
uv sync

All settings are configured via environment variables or a .env file in the project root.
OPENAI_API_KEY=sk-... # OpenAI API key (GPT-4o Realtime access required)
CALLSIGN=W1AW # Your amateur radio callsign
WFWEB_URL=wss://192.168.x.x:8080 # wfweb WebSocket URL — production only (self-signed SSL accepted)

YOUR_NAME=Hiram # Your name (spoken during QSOs)
LOCATION="Newington, CT" # Your QTH
ANTENNA="Dipole" # Antenna description
POWER="100W" # Power output
TRANSCEIVER="IC-7300" # Rig name

REALTIME_MODEL=gpt-4o-realtime-preview # Model (default: gpt-4o-realtime-preview)
REALTIME_VOICE=ash # Voice: alloy, ash, ballad, coral, echo, sage, shimmer, verse

Model choice: gpt-4o-realtime-preview is the recommended default. It has significantly better audio comprehension than gpt-realtime-1.5 — in particular it handles weak HF signals, phonetic alphabet, and partial callsigns more accurately. gpt-realtime-1.5 is available as a lower-cost fallback but noticeably underperforms in noisy radio conditions.
OPERATOR_STYLE=CALLING_CQ # CALLING_CQ | CONTESTING | MONITORING | SWL

| Style | Behaviour |
|---|---|
| CALLING_CQ | Calls CQ periodically, engages in casual QSOs |
| CONTESTING | Rapid serial-number exchanges, optimised for contest protocol |
| MONITORING | Listens for direct calls, never initiates. IDs every 5 minutes |
| SWL | Receive-only. No transmit under any circumstances |
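One way to make the receive-only guarantee of SWL hard to bypass is to parse OPERATOR_STYLE into an enum and consult a single property before any PTT request. The sketch below is illustrative only: `OperatorStyle`, `may_transmit`, and `load_style` are hypothetical names, and the CALLING_CQ default is an assumption rather than something this README states.

```python
import os
from enum import Enum


class OperatorStyle(Enum):
    CALLING_CQ = "CALLING_CQ"
    CONTESTING = "CONTESTING"
    MONITORING = "MONITORING"
    SWL = "SWL"

    @property
    def may_transmit(self) -> bool:
        # SWL is receive-only; every other style is allowed to key the radio.
        return self is not OperatorStyle.SWL


def load_style(env=None) -> OperatorStyle:
    """Read OPERATOR_STYLE from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Assumed default; the real application may choose differently.
    raw = env.get("OPERATOR_STYLE", "CALLING_CQ").strip().upper()
    try:
        return OperatorStyle(raw)
    except ValueError:
        valid = ", ".join(s.value for s in OperatorStyle)
        raise ValueError(f"Unknown OPERATOR_STYLE {raw!r}; expected one of: {valid}") from None
```

Rejecting unknown values at startup avoids silently falling back to a style that transmits.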
VAD_THRESHOLD=0.5 # Server VAD speech probability threshold (0.0–1.0)
VAD_SILENCE_DURATION=0.6 # Seconds of silence to end a turn
CQ_INTERVAL_SEC=30 # Seconds between CQ calls
CQ_RESTART_DELAY_SEC=5 # Delay before restarting CQ after a QSO
WFWEB_CONNECT_TIMEOUT=15 # wfweb connection timeout in seconds

LOG_LEVEL=INFO # DEBUG | INFO | WARNING | ERROR

OPENAI_API_KEY=sk-...
WFWEB_URL=wss://192.168.1.10:8080
CALLSIGN=W1AW
YOUR_NAME=Hiram
LOCATION="Newington, CT"
ANTENNA="Dipole, 40m"
POWER=100W
TRANSCEIVER="IC-7300"
OPERATOR_STYLE=CALLING_CQ
REALTIME_VOICE=ash
LOG_LEVEL=INFO

# Normal operation (wfweb radio connection required)
uv run python radio_operator.py
# Demo mode — no radio hardware needed; uses your mic and speakers
uv run python radio_operator.py --demo
uv run python radio_operator.py -d
# Play RX and TX audio locally through your computer's speakers (production mode)
uv run python radio_operator.py --monitor-audio
# Suppress the terminal UI (log to console instead)
uv run python radio_operator.py --no-ui

--demo runs the full operator without any radio hardware:
- Mic input → GPT-4o Realtime (server VAD detects when you speak)
- Speaker output → GPT-4o Realtime audio played locally
- Fake frequency — 14.225 MHz (20m USB) shown in the UI
- Fake TX meters — 50 W / SWR 1.3 shown while the model is transmitting
- S-meter driven by actual mic RMS — rises when you speak
- Isolated logs — timestamped logs/demo_YYYYMMDD_HHMMSS.log and .adi files so production logs are never touched
- UI — full Rich terminal UI with a ⬡ DEMO badge in the header
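The demo S-meter is described as following microphone RMS. A minimal sketch of that level measurement is shown below; how AI-DX then maps the level onto S-units is not documented here, so the sketch stops at dBFS. The function name `rms_dbfs` and the -90 dB floor are assumptions, as is the mono little-endian PCM16 input.

```python
import math
import struct


def rms_dbfs(pcm: bytes, floor_db: float = -90.0) -> float:
    """RMS level of a mono little-endian PCM16 chunk, in dB relative to full scale."""
    count = len(pcm) // 2
    if count == 0:
        return floor_db
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    mean_square = sum(s * s for s in samples) / count
    if mean_square == 0:
        return floor_db  # digital silence; avoid log10(0)
    rms = math.sqrt(mean_square)
    return max(floor_db, 20.0 * math.log10(rms / 32768.0))
```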
AI-DX includes an HF radio-themed terminal UI that updates at 10 FPS:
● W1AW Hiram · Newington, CT 14.225.000 MHz 00:42:15 RX:5 TX:12
┌──────────┬──────────────────────────────────────────────────────┬────────────┐
│ │ S 1 3 5 7 9 +20 +40 │ USB │
│ RX │ ████████████████████████░░░░░░░░░░░░ S7 │ │
│ │ ▶ SIGNAL DETECTED │ 14.225.000 │
│ │ │ MHz │
└──────────┴──────────────────────────────────────────────────────┴────────────┘
│ [14:32:01] ► TX CQ CQ CQ de W1AW W1AW, QRZ?
│ [14:32:05] ◄ RX W1AW this is VK2TDX, good afternoon from New South Wales...
│ [14:32:09] ► TX Good afternoon VK2TDX, you are 59 here in Newington, Connecticut...
└─ ◆ IN QSO ─ VK2TDX ─ John ─ New South Wales ─────────────────────────
- PTT indicator — RX / VOICE↑ / ON AIR
- S-meter — live signal strength (S0–S9+60 dB) from wfweb; in demo mode driven by mic RMS
- TX meters — when transmitting: power (watts) + SWR bars; in demo mode shows 50 W / 1.3 SWR
- Mode — USB / LSB / CW / AM / FM from wfweb
- Frequency — VFO frequency from wfweb (14.225 MHz fixed in demo mode)
- Communications log — full RX/TX transcripts, newest first
- QSO bar — current contact: state, callsign, name, QTH
- Demo badge — ⬡ DEMO shown in the header when running with --demo
In production, meter data (S-meter, power, SWR, mode) is streamed from wfweb status messages in near real time.
wfweb communicates over a browser-style WebSocket:
| Direction | Format | Purpose |
|---|---|---|
| Client → Server | {"cmd":"getStatus"} | Request rig info |
| Client → Server | {"cmd":"enableAudio","value":true} | Start RX audio stream |
| Client → Server | {"cmd":"setPTT","value":true/false} | PTT on/off |
| Server → Client | Binary frame 0x02 … | RX PCM16 audio (48 kHz) |
| Client → Server | Binary frame 0x03 … | TX PCM16 audio (48 kHz) |
| Server → Client | {"type":"status", "frequency":…, "mode":…, "sMeter":…, …} | Radio state |
| Server → Client | {"type":"meters", "sMeter":…, "powerMeter":…, "swrMeter":…} | Meter updates |
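The message shapes in the table can be handled with a small dispatcher. The sketch below builds the setPTT command and labels incoming messages; it is not the actual wfweb_client.py code, and `ptt_command` and `classify` are hypothetical names. The table elides the rest of the binary frame layout, so the sketch inspects only the leading marker byte and makes no assumption about where the PCM payload starts.

```python
import json

RX_AUDIO_MARKER = 0x02  # leading byte of server -> client audio frames, per the table


def ptt_command(on: bool) -> str:
    """Serialise the PTT command in the compact form shown in the table."""
    return json.dumps({"cmd": "setPTT", "value": on}, separators=(",", ":"))


def classify(message) -> str:
    """Return a short label for one incoming WebSocket message (bytes or str)."""
    if isinstance(message, (bytes, bytearray)):
        if message and message[0] == RX_AUDIO_MARKER:
            return "rx_audio"
        return "unknown_binary"
    try:
        payload = json.loads(message)
    except json.JSONDecodeError:
        return "unknown_text"
    if isinstance(payload, dict) and payload.get("type") in ("status", "meters"):
        return payload["type"]
    return "unknown_text"
```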
Meter value formats (as sent by wfweb):
- sMeter — dB relative to S9 (−54 = S0, 0 = S9, +60 = S9+60 dB)
- powerMeter — watts (0–100+)
- swrMeter — actual SWR ratio (1.0 = perfect, 2.0 = 2:1, etc.)
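The sMeter scale above spans 54 dB across nine S-units, which works out to 6 dB per unit. A hedged sketch of turning a raw value into a display label follows; `s_meter_label` is a hypothetical helper, and the rounding and clamping choices are assumptions rather than the behaviour of radio_ui.py.

```python
import math


def s_meter_label(db_rel_s9: float) -> str:
    """Convert dB relative to S9 into a label such as 'S7' or 'S9+20'."""
    if db_rel_s9 > 0:
        # Above S9 the reading is quoted in dB over S9, capped at +60.
        return f"S9+{min(round(db_rel_s9), 60)}"
    units = math.floor(9 + db_rel_s9 / 6.0)  # 6 dB per S-unit below S9
    return f"S{max(0, min(9, units))}"
```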
ai-dx/
├── radio_operator.py # Main application — QSO state machine, wfweb callbacks,
│ # GPT-4o Realtime session, function call handler, TxBuffer
├── ai/
│ └── realtime_client.py # GPT-4o Realtime WebSocket session (asyncio in background thread)
├── audio/
│ └── wfweb_client.py # wfweb browser WebSocket client (RX audio, TX audio, PTT)
├── core/
│ ├── config.py # AppConfig, AudioConfig, RadioConfig, WfwebConfig, RealtimeConfig
│ ├── adif_logger.py # ADIF QSO log writer
│ ├── band_utils.py # Frequency → band name mapping
│ ├── operator_profiles.py # System prompts per operator style
│ └── operator_profiles_base.py # Shared prompt building blocks
├── ui/
│ └── radio_ui.py # HF radio-themed Rich terminal UI (10 FPS)
├── logs/ # Runtime logs and contacts.adi ADIF log
└── test_tools/ # Manual test utilities
Start
│
▼
CALLING_CQ ──── send CQ every N seconds (ephemeral context, no memory)
│
│ model calls update_contact(callsign="VK2TDX")
▼
IN_QSO ──── full QSO exchange; update_contact() fills in name, QTH, notes
│
│ model calls update_contact(closing=true) on final goodbye
▼
QSO_ENDED ──── contact logged to ADIF, brief pause
│
└──► CALLING_CQ (loop)
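The loop in the diagram can be expressed as a small transition function driven by update_contact() arguments. This is a sketch under stated assumptions: the state names come from the diagram, while `QsoState`, `on_update_contact`, and `on_pause_elapsed` are hypothetical names, and the real state machine in radio_operator.py may track more than this.

```python
from enum import Enum, auto


class QsoState(Enum):
    CALLING_CQ = auto()
    IN_QSO = auto()
    QSO_ENDED = auto()


def on_update_contact(state: QsoState, args: dict) -> QsoState:
    """Advance the state in response to one update_contact() tool call."""
    if state is QsoState.CALLING_CQ and args.get("callsign"):
        return QsoState.IN_QSO  # first callsign heard starts a QSO
    if state is QsoState.IN_QSO and args.get("closing") is True:
        return QsoState.QSO_ENDED  # final goodbye; contact gets logged here
    return state  # name/QTH updates do not change the state


def on_pause_elapsed(state: QsoState) -> QsoState:
    """After the brief pause that follows logging, resume calling CQ."""
    return QsoState.CALLING_CQ if state is QsoState.QSO_ENDED else state
```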
Station skip detection uses string similarity to avoid looping conversations with the same station and to detect response hijacking.
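The README does not specify which similarity measure or threshold is used. As an illustration of the repeated-station case only, the sketch below compares callsigns with the standard library's `difflib.SequenceMatcher`; the helper names and the 0.8 threshold are assumptions.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two strings, in the range 0.0 to 1.0."""
    return SequenceMatcher(None, a.strip().upper(), b.strip().upper()).ratio()


def should_skip(callsign: str, recent: list, threshold: float = 0.8) -> bool:
    """True if the callsign closely matches any recently worked station."""
    return any(similarity(callsign, worked) >= threshold for worked in recent)
```

A fuzzy match rather than strict equality tolerates a single mis-heard character in the transcribed callsign.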
Contacts are logged in standard ADIF format. In production, the log file is logs/contacts.adi. In demo mode (--demo), a separate timestamped file logs/demo_YYYYMMDD_HHMMSS.adi is created per session so production logs are never affected.
Each record captures:
- QSO_DATE, TIME_ON — UTC date and time
- CALL — remote callsign
- FREQ — frequency in MHz (live from wfweb at time of logging)
- MODE — operating mode
- RST_SENT, RST_RCVD — signal reports
- NAME, QTH, NOTES — filled by the model via function calls during the QSO
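For reference, ADIF stores each field as a tag carrying the value's length, and ends a record with an end-of-record tag. The sketch below formats one record from the fields listed above; it is not the code in adif_logger.py, the helper names are hypothetical, and the sample values are made up for illustration.

```python
def adif_field(name: str, value: str) -> str:
    """Format one ADIF field as <NAME:length>value."""
    return f"<{name}:{len(value)}>{value}"


def adif_record(fields: dict) -> str:
    """Join the non-empty fields into a single record terminated by <EOR>."""
    parts = [
        adif_field(name, str(value))
        for name, value in fields.items()
        if value not in (None, "")
    ]
    return " ".join(parts) + " <EOR>"


# Illustrative values only.
example = adif_record({
    "CALL": "VK2TDX",
    "QSO_DATE": "20240101",
    "TIME_ON": "1432",
    "FREQ": "14.225",
    "NAME": "John",
    "NOTES": None,  # skipped because nothing was learned
})
```

Skipping empty fields keeps the log valid when the model never learns a name or QTH during a short contact.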
wfweb connection fails
- Verify WFWEB_URL points to the correct host/port
- wfweb uses a self-signed TLS certificate — this is handled automatically
- Check that wfweb is running and the radio is connected
Operator goes silent after hearing callsign
- Ensure your OpenAI key has GPT-4o Realtime access
- Check logs for WebSocket errors
Wrong frequency in ADIF log
- Frequency is fetched live from wfweb at QSO close time; verify wfweb reports the correct VFO frequency
S-meter / power / SWR not moving
- Confirm wfweb is sending status or meters messages (check DEBUG logs)
- Meter data is only shown when wfweb provides it; no local fallback
MIT License — see LICENSE.
Note: Previous versions of this project used Hamlib (LGPL v2.1). The current version does not link to Hamlib. See NOTICE for historical third-party license information.
Good DX and happy QSOs! 📻