
HTTP control plane

A LAN-scoped HTTP/1.1 server runs on port 80 once Wi-Fi connects. It exposes a small, fixed set of routes for live state, manual override, persistent config, and an embedded operator dashboard. Write routes (PUT, POST) are gated on a configurable bearer token. An empty token (the default) leaves the LAN open, matching the offline-first stance for Wi-Fi. See Auth for how to enable the gate; Security covers what is still out of scope.

The wire-format implementation is hand-rolled in crates/stackchan-firmware/src/net/http.rs — the surface is small enough that a hand-rolled request matcher beats pulling a full HTTP framework into the firmware target.

Routes

| Method | Path | Description |
| --- | --- | --- |
| GET | / | Operator dashboard (HTML + JS, embedded in firmware) |
| GET | /health | Uptime, firmware version, free heap |
| GET | /state | Snapshot JSON: emotion, head pose, battery, Wi-Fi, audio |
| GET | /sensors | Live IMU / ambient / audio-RMS / body-touch sample |
| GET | /sensor-history | Rolling 60-second window of one-per-second sensor snapshots, oldest first |
| GET | /hardware/status | Boot-time servo-power-rail health (servo-power only, not a full hardware rollup) |
| GET | /camera/snapshot | Most recent /sd/CAPTURE.565 frame as raw QVGA RGB565 BE |
| GET | /state/stream | Server-Sent Events stream of state changes |
| GET | /state/ws | WebSocket stream of state changes (RFC 6455) |
| GET | /head/offsets | Current yaw + tilt zero-point offsets |
| POST | /head/offsets | Update yaw + tilt zero-point offsets; persisted to SD |
| GET | /crash | Most recent panic log (404 if none recorded) |
| POST | /crash/clear | Delete /sd/CRASH.LOG (idempotent) |
| POST | /emotion | Set affect with hold timer |
| POST | /look-at | Aim head + eyes with hold timer |
| POST | /reset | Clear active emotion / look-at hold |
| POST | /speak | Play a baked phrase / chirp through the speaker |
| POST | /listen | Open a 3-second listen window (Ear decorator + ack chirp) |
| POST | /pair | Open ESP-NOW pairing window with hold timer |
| POST | /mood | Set the operator-selected energy baseline (runtime-only) |
| POST | /palette | Pick the avatar's colour palette (skin / eye / mouth) |
| POST | /behavior | Toggle one boolean flag in behavior (soliloquy / hourly chime / battery icon / toast overlay); persisted, reboot to apply |
| POST | /dance | Play a JSON keyframe script (head + emotion + decorator + LED) |
| POST | /motion | Play a named one-shot motion preset (greet/nod/shake/laugh) |
| POST | /face-geometry | Pick a face-geometry preset (eye + mouth baseline silhouette) |
| POST | /mcp | JSON-RPC 2.0 / MCP endpoint (initialize, tools/list, tools/call) |
| POST | /volume | Set output volume (0–100); persisted to SD |
| POST | /mute | Mute / un-mute output stage; persisted to SD |
| POST | /toast | Push a short toast band (warn / error); 3-second TTL |
| GET | /settings | Persisted config (PSK + token redacted) |
| PUT | /settings | Replace persisted config; atomic SD writeback |

Write routes (PUT, all POST) require Authorization: Bearer <token> when auth.token is configured. Reads are always unauthenticated.

Live state

$ curl http://stackchan.local/state
{"emotion":"Neutral","head_pose":{"pan_deg":0.00,"tilt_deg":0.00},
 "head_actual":{"pan_deg":0.10,"tilt_deg":-0.20},
 "battery":{"percent":78,"voltage_mv":3920},
 "wifi":{"connected":true,"ip":"192.168.1.42"},
 "audio":{"volume_pct":50,"muted":false}}

GET /state/stream opens a Server-Sent Events stream. The server emits an initial event on connect, then one event per change. Idle connections receive : heartbeat SSE comment lines every 15 s so proxies and NAT idle timers don't close the connection.

$ curl -N http://stackchan.local/state/stream
data: {"emotion":"Neutral", ...}

data: {"emotion":"Happy", ...}

: heartbeat

Producer-side throttling caps events at ~10 Hz even when the underlying render loop ticks at 30 Hz.

The HTTP layer accepts on a pool of worker tasks, so a long-lived SSE stream doesn't block other requests on the same port.
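A client reading /state/stream has to separate data: events from the : heartbeat comment lines. The following is a minimal Python sketch of that parsing, assuming each event carries one JSON object on its data: line as in the transcript above. The helper name iter_sse_events is hypothetical and not part of the project.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON snapshot per SSE event (hypothetical helper)."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            # SSE comment line, e.g. the 15 s heartbeat; it carries no payload.
            continue
        if line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]  # the SSE format strips one leading space
            data_parts.append(value)
            continue
        if line == "" and data_parts:
            # A blank line terminates the event.
            yield json.loads("\n".join(data_parts))
            data_parts = []
```

Any iterable of text lines works as input, so the same function can be fed from a decoded HTTP response or from a recorded transcript.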

GET /state/ws opens an RFC 6455 WebSocket carrying the same snapshot payload as /state/stream, framed as text frames instead of data: lines. Operators choose whichever transport their dashboard library speaks; both subscribe to the same publisher so payloads are bit-identical. A 15 s ping frame keeps proxies and NAT idle timers from closing the connection.

$ wscat -c ws://stackchan.local/state/ws
< {"emotion":"Neutral", ...}
< {"emotion":"Happy", ...}

Only server→client traffic is implemented today — the firmware accepts the handshake, sends the initial snapshot, then pushes text frames whenever the snapshot changes. Client-sent frames are ignored; binary frames, fragmentation, and per-frame masking are out of scope for this v1.
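When checking the handshake by hand, it helps to know how the Sec-WebSocket-Accept response header is derived from the client's Sec-WebSocket-Key. The sketch below follows the rule in RFC 6455; it is written in Python for illustration and is not a copy of the firmware's implementation. The function name websocket_accept is hypothetical.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client key."""
    # SHA-1 over key + GUID, then base64 of the raw 20-byte digest.
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

With the sample key from the RFC, dGhlIHNhbXBsZSBub25jZQ==, the function returns s3pPLMBiTxaQ9kYGzzhZRbK+xOo=.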

Hardware status

GET /hardware/status reports the outcome of the boot-time servo-power-rail enable. The firmware drives a PY32 co-processor pin HIGH to gate the servo rail MOSFET; if that I²C write fails (retried a few times at boot), the head is left unpowered while the rest of the device boots normally. Reach for this route when a head won't move — it distinguishes "rail never came up" from "servos present but stuck".

$ curl http://stackchan.local/hardware/status
{"servo_power":{"enabled":true,"attempts":1,"settled":true}}

| Field | Type | Notes |
| --- | --- | --- |
| enabled | bool | Whether the rail-enable write ultimately succeeded. |
| attempts | u8 | Enable attempts made (0 before boot reaches the step). |
| settled | bool | Whether the post-enable settle delay completed. |

enabled:false means the PY32 write never succeeded — ping_servo times out at boot and commanded poses produce no motion. A boot-time failure is also recorded to the event ring (GET /events, warn kind), but that ring is volatile and drains on reset, so the status here is the durable-until-reset read-out. Scope is servo-power only, not a full PMU / SD / sensor rollup.
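The three fields can be turned into a quick triage message. This is a hedged Python sketch: the function name diagnose_servo_power, the order of the checks, and the message wording are illustrative assumptions built only from the field notes above.

```python
import json


def diagnose_servo_power(body: str) -> str:
    """Map a GET /hardware/status body to a one-line hint (illustrative)."""
    status = json.loads(body)["servo_power"]
    if status["attempts"] == 0:
        # Per the field notes, 0 means boot has not reached the enable step.
        return "boot has not reached the rail-enable step yet"
    if not status["enabled"]:
        return "rail never came up: the enable write failed"
    if not status["settled"]:
        return "rail enabled, settle delay not finished"
    return "rail is up: look at the servos themselves"
```

For the sample response shown earlier in this section, the function reports that the rail is up, which points the investigation at the servos rather than the power gate.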

Manual control

POST /emotion, POST /look-at, and POST /reset write into the modifier pipeline through the RemoteCommandModifier in stackchan-core. Each command takes effect on the next render tick.

POST /emotion

$ curl -X POST http://stackchan.local/emotion \
       -H 'Content-Type: application/json' \
       -d '{"emotion":"happy","hold_ms":30000}'

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| emotion | string | yes | One of the wire strings in Emotion::wire_str (e.g. neutral, happy, sleepy, …) |
| hold_ms | u32 | no | Default 30 000. 0 is fire-and-forget. |

The hold blocks autonomous emotion drivers (touch, IR, ambient, battery, EmotionCycle) for hold_ms. Source recorded as OverrideSource::Remote.

POST /look-at

$ curl -X POST http://stackchan.local/look-at \
       -H 'Content-Type: application/json' \
       -d '{"pan_deg":12.0,"tilt_deg":-3.0,"hold_ms":30000}'
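
| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| pan_deg | f32 | yes | Same coordinate system as motor.head_pose |
| tilt_deg | f32 | yes | Same coordinate system as motor.head_pose |
| hold_ms | u32 | no | Default 30 000. |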

The handler writes mind.attention = Attention::Tracking { target }. HeadFromAttention and GazeFromAttention translate that into motor + eye motion. While the hold is active, fresh tracking observations from the camera don't stomp the operator's target.

POST /reset

$ curl -X POST http://stackchan.local/reset

Empty body. Clears any active emotion or look-at hold; autonomous behaviours resume on the next render tick.
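The three manual-control calls above can also be built from a script. This is a minimal Python sketch using only the standard library; build_post is a hypothetical helper, and the requests are constructed but not sent so the snippet runs without a device on the network.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://stackchan.local"  # mDNS name used in the curl examples


def build_post(path: str, payload: Optional[dict] = None,
               token: str = "") -> urllib.request.Request:
    """Build a POST request for a control route (hypothetical helper)."""
    headers = {}
    body = None
    if payload is not None:
        body = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token:
        # Only needed when auth.token is configured on the device.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(BASE_URL + path, data=body,
                                  headers=headers, method="POST")


emotion = build_post("/emotion", {"emotion": "happy", "hold_ms": 30000})
look_at = build_post("/look-at", {"pan_deg": 12.0, "tilt_deg": -3.0,
                                  "hold_ms": 30000})
reset = build_post("/reset")  # empty body
```

Passing any of these objects to urllib.request.urlopen sends the request once the device is reachable.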

POST /speak

$ curl -X POST http://stackchan.local/speak \
       -H 'Content-Type: application/json' \
       -d '{"phrase":"wake_chirp"}'

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| phrase | string | yes | Catalog entry; see vocabulary below |
| locale | string | no | "en" (default) / "ja"; only meaningful for verbal phrases |

Phrase vocabulary (matches stackchan_core::voice::PhraseId):

| Wire string | Kind | Notes |
| --- | --- | --- |
| wake_chirp | SFX | 100 ms / 1 kHz |
| pickup_chirp | SFX | Two-tone upward sweep |
| startle_chirp | SFX | Sharp 4 kHz tone |
| low_battery_chirp | SFX | Two-pulse 2 kHz alert |
| camera_mode_entered_chirp | SFX | Upward "doot-DEE" |
| camera_mode_exited_chirp | SFX | Downward "DEE-doot" |
| greeting | Verbal phrase | en + ja PCM assets |
| acknowledge_name | Verbal phrase | en + ja PCM assets |
| battery_low | Verbal phrase | en + ja PCM assets |

Fire-and-forget: the request returns 204 No Content once the utterance is queued; playback completes asynchronously. Operator-driven calls land at Priority::Normal. Modifier-internal call sites (low-battery alert, etc.) call the firmware's audio::try_dispatch_utterance directly with elevated priority and can preempt or evict an in-flight operator request.

POST /speak is gated by auth when a token is configured.

POST /pair

$ curl -X POST http://stackchan.local/pair \
       -H 'Content-Type: application/json' \
       -d '{"duration_ms":30000}'

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| duration_ms | integer | no | Pairing window length in ms; default 30 000 |

Opens an ESP-NOW pairing window for the given duration. While the window is open, the avatar shows the Decorator::Pairing overlay (concentric blue rings) and the firmware-side ESP-NOW receiver accepts new peer registrations. The window times out automatically; POST /reset closes it early.

POST /pair is gated by auth when a token is configured.

POST /volume

$ curl -X POST http://stackchan.local/volume \
       -H 'Content-Type: application/json' \
       -d '{"level":75}'

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| level | u8 | yes | Output volume as a percentage, 0..=100. |

Persistent — the new value is written atomically to /sd/STACKCHAN.RON (same writeback path as PUT /settings) and mirrored into the live audio.volume_pct field surfaced on GET /state. The firmware then signals the audio task, which calls Aw88298::set_volume_db with the linear-in-dB mapping (0% → -36 dB, 100% → 0 dB). Persist-then-apply ordering: a failed SD write leaves the amp at its current level instead of partially applying.
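The percent-to-decibel mapping can be sketched as follows. Only the two endpoints (0% at -36 dB, 100% at 0 dB) come from the text above; the straight-line interpolation between them is an assumption based on the phrase "linear-in-dB", and volume_pct_to_db is a hypothetical host-side helper, not the Aw88298 driver call.

```python
MIN_DB = -36.0  # level 0, per the mapping described above
MAX_DB = 0.0    # level 100


def volume_pct_to_db(level: int) -> float:
    """Map a 0..=100 volume level to an attenuation in dB (assumed linear)."""
    if not 0 <= level <= 100:
        # The route rejects out-of-range levels with 400.
        raise ValueError("level must be in 0..=100")
    return MIN_DB + (MAX_DB - MIN_DB) * level / 100.0
```

Under that assumption the dashboard's default of 50 lands at -18 dB and the curl example's 75 lands at -9 dB.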

POST /mute

$ curl -X POST http://stackchan.local/mute \
       -H 'Content-Type: application/json' \
       -d '{"muted":true}'

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| muted | bool | yes | true = mute output stage, false = un-mute. |

Persistent. Mute is independent of volume so unmuting restores the prior level — volume = 0 is "audible but very quiet"; muted = true is the actual-silence path.

POST /camera/mode

$ curl -X POST http://stackchan.local/camera/mode \
       -H 'Content-Type: application/json' \
       -d '{"enabled":true}'

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| enabled | bool | yes | true = LCD shows camera preview, false = avatar. |

Display-only — tracking still runs in either mode. Ephemeral (no SD writeback); a power-cycle returns to avatar. Mirrored on the BLE view service (8a1c0041-…).

POST /camera/capture

$ curl -X POST http://stackchan.local/camera/capture

Empty body. Signals the camera task to write the latest QVGA RGB565 frame to /sd/CAPTURE.565 (320 × 240 × 2 = 153 600 bytes, big-endian); each capture overwrites the previous file. Returns 202 Accepted immediately — the SD write happens asynchronously and stalls render for ~200 ms during the SPI burst. Mirrored on the BLE view service (8a1c0042-…).

Decode example (Python):

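import numpy as np

# Big-endian 16-bit pixels, 240 rows x 320 columns (QVGA).
raw = np.fromfile('CAPTURE.565', dtype='>u2').reshape(240, 320)
# Unpack the 5/6/5-bit channels and scale each to 0..255.
r = ((raw >> 11) & 0x1F) * 255 // 31
g = ((raw >>  5) & 0x3F) * 255 // 63
b = ((raw      ) & 0x1F) * 255 // 31
img = np.dstack([r, g, b]).astype('u1')  # shape (240, 320, 3), uint8 RGB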

Persistent config

The boot config lives at /sd/STACKCHAN.RON in the schema described by stackchan-net. The HTTP control plane round-trips it as JSON.

GET /settings

$ curl http://stackchan.local/settings
{"wifi":{"ssid":"my-net","psk":"***","country":"US"},
 "mdns":{"hostname":"stackchan"},
 "time":{"tz":"UTC","sntp_servers":["pool.ntp.org"]},
 "auth":{"token":"***"},
 "audio":{"volume_pct":50,"muted":false},
 "tracker":{"fov_h_deg":62.0,"fov_v_deg":49.0,"target_smoothing_alpha":1.0,
            "flip_x":false,"flip_y":false},
 "head":{"pan_trim_deg":0.0,"tilt_trim_deg":49.0},
 "appearance":{"palette":"","face_geometry":""}}

appearance pins a default look in the boot config: palette (default / dark / cute / dog) and face_geometry (default / chibi / wide / sleepy). Both default to an empty string, meaning "not pinned" — the firmware falls back to the neutral palette and default geometry. A non-empty value that isn't a known wire name is rejected on load. The appearance block is a first-boot seed only: it populates /sd/RUNTIME.RON when that file is absent, but once an operator changes the look at runtime via POST /palette / POST /face-geometry, RUNTIME.RON exists and wins on subsequent boots.

tracker carries the operator-tunable subset of the firmware tracker config: physical lens FOV (fov_h_deg / fov_v_deg), EMA smoothing on the published target (target_smoothing_alpha, 1.0 = pass-through, lower = more inertia), and centroid orientation flips for non-standard mountings. Algorithm tuning (P-gain, block thresholds, dead zones) stays compile-time. Changes take effect on next boot — same as mdns.hostname / time.*.
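The effect of target_smoothing_alpha can be illustrated with a textbook exponential moving average. This is a sketch of the general technique in Python, assuming the standard EMA form; the firmware's exact update rule is not shown in this document, and smooth_target is a hypothetical name.

```python
def smooth_target(previous: float, observed: float, alpha: float) -> float:
    """One EMA step: alpha = 1.0 passes the observation through unchanged."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    # Lower alpha keeps more of the previous value, i.e. more inertia.
    return alpha * observed + (1.0 - alpha) * previous
```

With alpha = 1.0 a jump from 0 to 10 degrees is published immediately; with alpha = 0.25 the first published value is 2.5 degrees and later steps close the remaining gap.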

head carries the per-unit zero-point trim seeded into the SCServo driver at boot: pan_trim_deg / tilt_trim_deg are added to every commanded pose at the driver edge to absorb the assembled module's servo encoder offset (the tilt encoder zero sits well below physical horizontal, ~49° on the reference unit). This is the durable per-unit calibration; it is distinct from the per-session POST /head/offsets correction, which is applied on top and persisted to /sd/RUNTIME.RON. Both default to the firmware's compile-time PAN_TRIM_DEG / TILT_TRIM_DEG when the head block is absent. Range: [-90.0, 90.0]. Changes take effect on next boot — same as tracker.*.

wifi.psk and a non-empty auth.token are redacted to ***. On PUT /settings, the server treats either sentinel as "keep current value" rather than overwriting with the literal placeholder — so a GET → mutate hostname → PUT round trip preserves the secrets. An empty "" value still means "clear" (open AP / auth disabled); only the literal "***" triggers the preserve substitution.
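The preserve-or-clear rule for the two secret fields reduces to one comparison. The Python sketch below restates the behaviour described above; merge_secret is a hypothetical name and the firmware's own implementation is not shown here.

```python
REDACTED = "***"  # literal placeholder returned by GET /settings


def merge_secret(incoming: str, persisted: str) -> str:
    """Resolve a secret field from a PUT /settings body (illustrative)."""
    if incoming == REDACTED:
        return persisted  # sentinel echoed back: keep the stored value
    return incoming       # "" clears the secret, anything else replaces it
```

Only the exact three-character sentinel triggers the preserve path, so an empty string still clears the field.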

PUT /settings

$ curl -X PUT http://stackchan.local/settings \
       -H 'Content-Type: application/json' \
       -H 'Authorization: Bearer s3cret' \
       -d '{"wifi":{"ssid":"new-net","psk":"realkey","country":"US"},
            "mdns":{"hostname":"stackchan"},
            "time":{"tz":"UTC","sntp_servers":["pool.ntp.org"]},
            "auth":{"token":"s3cret"},
            "audio":{"volume_pct":50,"muted":false},
            "tracker":{"fov_h_deg":62.0,"fov_v_deg":49.0,"target_smoothing_alpha":1.0,
                       "flip_x":false,"flip_y":false},
            "head":{"pan_trim_deg":0.0,"tilt_trim_deg":49.0},
            "appearance":{"palette":"","face_geometry":""}}'
{"reboot_required":true}

To disable auth on a device that previously had it enabled, send auth.token = "" (the explicit empty string, not the *** sentinel). The PUT itself must still carry the current Authorization header, because the gate keeps checking the boot-time token until the device reboots.

Operators using curl who only want to update one field can echo the rest of the body back from GET /settings unchanged — the *** sentinels for wifi.psk and auth.token will be merged against the persisted values rather than clobbering them.

Unlike the redacted secrets, the appearance and head blocks have no preserve-on-echo sentinel: an omitted block parses to its default (empty appearance / compile-time head trim) and overwrites the persisted value. Callers must round-trip both blocks from GET /settings or they reset the operator's pinned boot appearance and per-unit head trim. The dashboard handles this for you — the System page's Boot appearance panel edits appearance, and the Settings form carries appearance + head through every save.
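A script that changes one field should therefore start from the full GET /settings body. The Python sketch below shows that round trip on the sample body from GET /settings; with_hostname and the hostname stackchan-desk are hypothetical, and the network calls are left out so the snippet runs offline.

```python
import copy
import json

# Body as returned by GET /settings in the example above (secrets redacted).
FETCHED = json.loads("""
{"wifi":{"ssid":"my-net","psk":"***","country":"US"},
 "mdns":{"hostname":"stackchan"},
 "time":{"tz":"UTC","sntp_servers":["pool.ntp.org"]},
 "auth":{"token":"***"},
 "audio":{"volume_pct":50,"muted":false},
 "tracker":{"fov_h_deg":62.0,"fov_v_deg":49.0,"target_smoothing_alpha":1.0,
            "flip_x":false,"flip_y":false},
 "head":{"pan_trim_deg":0.0,"tilt_trim_deg":49.0},
 "appearance":{"palette":"","face_geometry":""}}
""")


def with_hostname(settings: dict, hostname: str) -> dict:
    """Return a full PUT /settings body that differs only in mdns.hostname."""
    updated = copy.deepcopy(settings)  # keeps every block, sentinels included
    updated["mdns"]["hostname"] = hostname
    return updated


put_body = with_hostname(FETCHED, "stackchan-desk")
```

Because the copy keeps the head and appearance blocks and the *** sentinels, sending json.dumps(put_body) as the PUT body changes the hostname without resetting trim, appearance, or secrets.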

Full-replace. The server validates the body via stackchan_net::validate (rejects empty SSID, invalid country code, invalid hostname, empty SNTP list) and then writes back atomically: the new config goes to /sd/STACKCHAN.NEW, gets copied onto /sd/STACKCHAN.RON, and the staging file is removed. Mid-write power loss leaves the old file intact.

The firmware does not tear down Wi-Fi on save — that would drop the operator's HTTP session if the SSID changed. Reboot to apply.

Crafting a config offline

just config runs a host-only tool over the same stackchan_net::config API the firmware validates with at boot, so a file it accepts will load on the device:

just config list                                         # accepted palette + face-geometry wire names
just config template --palette cute --face-geometry chibi > STACKCHAN.RON
just config validate STACKCHAN.RON                        # exits non-zero on a bad file

Status codes

| Code | Where it shows up |
| --- | --- |
| 200 | GET responses, dashboard, successful PUT |
| 202 | POST /camera/capture (the SD write continues asynchronously) |
| 204 | POST /emotion / /look-at / /reset / /speak / /volume / /mute on success |
| 400 | Malformed JSON, missing required fields, unknown field, invalid emotion / phrase / locale, volume.level > 100, validation failure on PUT /settings |
| 401 | Write route called without a valid bearer token (when auth is enabled) |
| 404 | Path not in the matcher |
| 405 | Method not allowed |
| 413 | Request body exceeds MAX_BODY_BYTES (1024) |
| 431 | Headers exceed REQUEST_BUF_BYTES before \r\n\r\n is reached |
| 500 | SD write failed during PUT /settings |
| 503 | PUT /settings with no SD; GET /settings before the config snapshot is loaded; no free SSE subscriber slot |

Error responses have Content-Type: text/plain with a single-line reason. Operators triage from defmt boot logs; the HTTP body is only a hint.

Dashboard

GET / returns a self-contained HTML page embedded in the firmware binary via include_bytes!. Vanilla DOM + EventSource + fetch, no framework, no external CDN. The dashboard:

  • Subscribes to /state/stream for live state.
  • Has a button per emotion (POST /emotion with a 30 s hold).
  • Renders pan/tilt sliders that POST /look-at on release.
  • Volume slider (POST /volume with a 250 ms client-side debounce so dragging doesn't burn an SD write per pointer move) and a mute toggle (POST /mute).
  • Loads /settings into an editable form; submits PUT /settings on save and surfaces the reboot_required hint via toast.

To customise it, edit the HTML in crates/stackchan-firmware/src/net/dashboard.html and re-flash. Embedding via include_bytes! instead of serving from SD is intentional: the dashboard works on a card-less device.

Auth

Write routes (PUT, all POST) accept an optional bearer token. The token is stored in auth.token of the persisted config and read once at boot into a snapshot that the HTTP handler consults on every write request. An empty token (the default) disables the gate; a non-empty token requires Authorization: Bearer <token> and returns 401 on mismatch.

The comparison is constant-time across the byte length, so a co-located attacker can't leak the token byte by byte through timing. On a LAN, though, network jitter dominates any sub-microsecond difference.
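The gate's logic can be sketched in a few lines. This Python version uses the standard library's hmac.compare_digest as a stand-in for the firmware's constant-time comparison; is_authorized is a hypothetical name and the actual Rust implementation is not shown in this document.

```python
import hmac
from typing import Optional


def is_authorized(authorization: Optional[str], configured_token: str) -> bool:
    """Decide whether a write request passes the bearer gate (illustrative)."""
    if configured_token == "":
        return True  # empty token (the default) disables the gate
    prefix = "Bearer "
    if authorization is None or not authorization.startswith(prefix):
        return False
    presented = authorization[len(prefix):]
    # compare_digest avoids an early exit on the first mismatching byte.
    return hmac.compare_digest(presented.encode("utf-8"),
                               configured_token.encode("utf-8"))
```

A caller would map a False result to the 401 response shown in the transcript that follows.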

$ curl -X POST http://stackchan.local/emotion
HTTP/1.1 401 Unauthorized
unauthorized

$ curl -X POST http://stackchan.local/emotion \
       -H 'Authorization: Bearer s3cret' \
       -H 'Content-Type: application/json' \
       -d '{"emotion":"happy"}'
HTTP/1.1 204 No Content

To enable auth on a fresh kit:

$ curl -X PUT http://stackchan.local/settings \
       -H 'Content-Type: application/json' \
       -d '{"wifi":{...},"mdns":{...},"time":{...},
            "auth":{"token":"s3cret"}}'
$ # reboot the device — Wi-Fi keeps the boot config until restart

The dashboard at GET / reads the configured token from localStorage and prompts the operator on its first 401. The typed value is persisted to the device on PUT /settings and to localStorage, so a freshly configured browser keeps writing without re-prompting until the device reboots.

Security

What's covered:

  • LAN scope (port 80 binds to 0.0.0.0; reachable inside the network the device joined).
  • Bearer-token gate on writes.
  • PSK and auth-token redaction on GET /settings.
  • Atomic SD writeback for PUT /settings.

What isn't:

  • No TLS. Tokens cross the wire in cleartext. A network observer on the same broadcast domain can capture and replay.
  • No rate limiting. A misconfigured client can hammer 401s without throttling.
  • No replay protection. A captured request can be re-sent.
  • No CSRF protection on the dashboard. Any page on the same LAN that can fetch the dashboard can also drive writes through the operator's browser.

These are acceptable for a desktop toy on a trusted home LAN and explicitly out of scope for v0.x. If you put Stack-chan on an untrusted network, fence it off.

Worker pool sizing

The HTTP layer spawns a fixed pool of worker tasks (HTTP_WORKER_COUNT in net/http.rs). Each worker holds its own rx/tx buffers and accepts one connection at a time. SSE subscribers occupy a worker for the lifetime of the stream; ordinary GET / POST / PUT requests free the worker on response close.

SSE_MAX_SUBSCRIBERS in net/snapshot.rs caps the concurrent SSE consumers. An SSE connection that arrives when no subscriber slot is free gets 503 stream slots exhausted back. The two constants are coupled — bumping concurrency requires raising both.
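The slot accounting behind the 503 can be modelled with a simple counter. This is a hedged Python sketch: SubscriberSlots, status_for_new_stream, and the capacity used in the usage note are illustrative, since the real value of SSE_MAX_SUBSCRIBERS is not stated here.

```python
class SubscriberSlots:
    """Fixed-capacity slot counter (illustrative model of the SSE cap)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.in_use = 0

    def try_acquire(self) -> bool:
        if self.in_use >= self.capacity:
            return False  # every slot is held by a live stream
        self.in_use += 1
        return True

    def release(self) -> None:
        # Called when a stream closes; never drops below zero.
        self.in_use = max(0, self.in_use - 1)


def status_for_new_stream(slots: SubscriberSlots) -> int:
    """Return the status code a new /state/stream request would see."""
    return 200 if slots.try_acquire() else 503
```

With a capacity of 2, the third concurrent stream gets 503 until one of the first two closes and releases its slot. Raising the capacity alone does not help if no worker is free to hold the connection, which is why the two constants move together.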

Related

  • Behavior flags — operator-visible fields persisted through PUT /settings.
  • Signal channels — the typed Signal<…, T> plumbing the render task drains before each /state snapshot; SSE uses PubSubChannel for multi-subscriber /state/stream fan-out.
  • Sidecar agent — the agent that lives outside the firmware and talks to it over HTTP.
  • Dance choreography — wire format for the POST /dance keyframe stream.