
OTAUpdateFlow

Rob Dobson edited this page May 4, 2026 · 2 revisions

OTA Update Flow

End-to-end view of how a Raft device receives a new firmware image, writes it to the inactive OTA partition, verifies the result and reboots into the new app. Covers both delivery paths — HTTP-direct (espFwUpdate) and the framework's file-stream pipeline — and the invariants they share.

The OTA machinery is split across several SysMods so that the same low-level write/verify/restart path can be driven from very different transports (BLE, WebSocket, raw HTTP). The code that actually programs flash lives in ESPOTAUpdate; the routing, framing and back-pressure live in ProtocolExchange and the Comms Channels layer.

At a glance

                  ┌─────────────────────────────────┐
HTTP POST  ─────► │ RaftWebServer                   │
(/espFwUpdate)    │   ↓ chunked body                │ ────────┐
                  └─────────────────────────────────┘         │
                                                              ▼
BLE / WS / serial ─► RICREST  ──► ProtocolExchange    ──► ESPOTAUpdate
(ufStart + FileBlock)             (FileStreamSession)      (worker task)
                                                              │
                                                              ▼
                                                       esp_ota_begin
                                                       esp_ota_write × N
                                                       esp_ota_end
                                                       esp_ota_set_boot_partition
                                                              │
                                                              ▼
                                                       esp_restart  (after 1 s)

Both paths converge on the same SysMod entry points — fileStreamStart(), fileStreamDataBlock(), fileStreamCancelEnd() — so once a transfer is "open", the rest of the pipeline is identical regardless of how the bytes arrived.

Path A — HTTP-direct (espFwUpdate)

The simplest deployment path: a workstation running the raft CLI, an IDE plugin, or curl POSTs the binary as the request body to /espFwUpdate over WiFi.

POST /espFwUpdate HTTP/1.1
Host: my-device.local
Content-Type: application/octet-stream
Content-Length: 1572864

<raw firmware bytes>

Inside the device:

  1. RaftWebServer receives the request and streams the body in chunks.
  2. The endpoint espFwUpdate is registered by ESPOTAUpdate::addRestAPIEndpoints() with three callbacks:
    • apiFirmwareMain — final response after esp_ota_end() returns; populated from _otaStatus.
    • apiFirmwarePart — invoked for every chunk; immediately calls fileStreamDataBlock().
    • apiReadyToReceiveData — back-pressure hook; returns false whenever the worker queue still holds an unprocessed block, so the web layer pauses the body stream until the queue drains.
  3. The first chunk's firstBlock flag triggers fileStreamStart(), which lazily creates a single-slot FreeRTOS queue and spawns the OTA worker task (default core 0, priority 5, 4 KB stack; overridable via the SysType config keys taskCore, taskPriority, taskStack).
  4. Each FileStreamBlock is deep-copied into an OTAUpdateFileBlock (block bytes go into a SpiramAwareAllocator vector) and pushed onto the worker queue. The HTTP task never blocks on flash writes.
  5. The worker pops blocks, calls esp_ota_begin() on the first one, esp_ota_write() on every block, accumulates a CRC-16/CCITT and per-update statistics under a status mutex, then calls esp_ota_end() + esp_ota_set_boot_partition() on the final block.
  6. loop() notices _restartPending is set and, after TIME_TO_WAIT_BEFORE_RESTART_MS (1000 ms) so the HTTP response can flush, calls esp_restart().
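
The back-pressure described in steps 2 to 5 can be modelled off-device with standard C++. This is a minimal sketch, not the ESPOTAUpdate implementation: Block, SingleSlotQueue, tryPush, tryPop and readyToReceive are hypothetical names, and a std::mutex-guarded slot stands in for the single-slot FreeRTOS queue.

```cpp
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the deep-copied block handed to the OTA worker.
struct Block {
    uint32_t filePos = 0;
    std::vector<uint8_t> data;
    bool finalBlock = false;
};

// Host-side model of a single-slot queue (the device uses a FreeRTOS queue).
class SingleSlotQueue {
public:
    // Producer side: copies the block in, or returns false without blocking
    // when the slot is still occupied.
    bool tryPush(const Block& block) {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_full)
            return false;
        _slot = block;  // deep copy, so the caller's buffer can be reused
        _full = true;
        return true;
    }

    // Worker side: moves the pending block out and frees the slot.
    bool tryPop(Block& out) {
        std::lock_guard<std::mutex> lock(_mutex);
        if (!_full)
            return false;
        out = std::move(_slot);
        _slot = Block{};
        _full = false;
        return true;
    }

    // Plays the role of apiReadyToReceiveData: true only when the slot is empty.
    bool readyToReceive() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return !_full;
    }

private:
    mutable std::mutex _mutex;
    Block _slot;
    bool _full = false;
};
```

In this model the transport polls readyToReceive() before delivering the next chunk, so a slow flash write stalls the sender rather than growing a backlog in RAM.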

This path can be disabled per device by setting OTADirect: 0 in the ESPOTAUpdate SysType config. When disabled, fileStreamStart rejects the transfer and only Path B is accepted.

Path B — File-stream over RICREST (BLE / WebSocket / serial)

Used by the raftjs browser library and any RICREST client. The wire protocol is the File Download Protocol's upload counterpart — a ufStart command opens the session, a sequence of binary FileBlock RICREST messages carry the payload, and ufEnd closes it.

The routing inside the device:

  1. Comms layer decodes the framed CommsChannelMsg (RICSerial / RICFrame / RICJSON) and hands it to ProtocolExchange::processEndpointMsg.

  2. The RICREST command frame ufStart with firmware file type/destination is dispatched to a built-in handler that opens a FileStreamSession of content type firmware on the originating channel.

  3. ProtocolExchange looks up its registered FW-update handler via the hook installed at boot by ESPOTAUpdate::setup():

    pSysManager->getProtocolExchange()->setFWUpdateHandler(this);

    That hook tells FileStreamSession to forward every block to ESPOTAUpdate::fileStreamDataBlock() — exactly the same entry point used by Path A.

  4. Each RICREST FileBlock element carries (streamID, filePos, payload) where streamID is the non-zero session ID returned by ufStart and filePos is the lower 24 bits of the FILEBLOCK position word. The session reassembles blocks in order, applies flow control (batch ACKs, heap-watermark throttling, retries), and pushes blocks at the FW handler.

  5. From this point onward the worker-task / esp_ota_* / restart sequence is identical to Path A.
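
The (streamID, filePos) split in step 4 comes down to a few bit operations. The sketch below assumes a 4-byte big-endian position word with streamID in the top 8 bits; the text above only states that filePos is the lower 24 bits, so the streamID placement and the byte order are assumptions, and FileBlockHeader and decodeFileBlockHeader are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

struct FileBlockHeader {
    uint8_t streamID;   // assumed to occupy the top 8 bits of the word
    uint32_t filePos;   // lower 24 bits of the position word
};

// Decode a 4-byte position word (assumed big-endian on the wire).
// Returns false if fewer than 4 bytes are available.
bool decodeFileBlockHeader(const uint8_t* buf, size_t len, FileBlockHeader& out) {
    if (buf == nullptr || len < 4)
        return false;
    const uint32_t word = (uint32_t(buf[0]) << 24) | (uint32_t(buf[1]) << 16) |
                          (uint32_t(buf[2]) << 8) | uint32_t(buf[3]);
    out.streamID = uint8_t(word >> 24);
    out.filePos = word & 0x00FFFFFFu;
    return true;
}
```

A 24-bit position can address offsets up to 16 MiB, which is larger than any single app slot in the partition layout shown later on this page.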

The notable differences from Path A:

| Concern | Path A (HTTP) | Path B (RICREST) |
|---|---|---|
| Transport | TCP, single connection | BLE / WebSocket / serial via channels |
| Framing | Raw octets in HTTP body | RICREST FileBlock elements |
| Flow control | apiReadyToReceiveData pauses body | Batched OKTO acks + heap watermark |
| Multi-session | One at a time | Up to MAX_SIMULTANEOUS_FILE_STREAM_SESSIONS (3) per ProtocolExchange |
| Cancel | Connection close | ufCancel RICREST message |

For full wire-level details of the file-stream side see File Download Protocol (the upload direction reuses the same OKTO batching scheme).

Shared invariants

Regardless of the path, the OTA write side enforces these guarantees:

  • Single in-flight update. _otaDirectInProgress plus the single-slot worker queue mean only one OTA write can be active at a time; concurrent attempts on a different channel are rejected at fileStreamStart.
  • Worker thread isolation. The flash writes happen on a dedicated FreeRTOS task pinned to a single core. Callers (HTTP, BLE, etc.) never block on esp_ota_write, which can take tens of milliseconds.
  • Status mutex. _fwUpdateStatusSemaphore protects the FWUpdateStatus struct (bytes written, throughput, CRC, last result string) so that getDebugJSON() and apiFirmwareMain can read consistent snapshots.
  • CRC-16/CCITT. Computed incrementally as bytes are written; available in getDebugJSON() and as _otaStatus.totalCRC.
  • Deferred restart. The reboot is scheduled by setting _restartPending and is performed from loop() 1 s later, so any pending response (HTTP body, RICREST CmdRespJSON, etc.) has time to leave the device.
  • isBusy() reports true while an update is in progress, which lets SysManager and StatePublisher reduce publish rates and keep busy indicators visible.
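
The incremental CRC in the list above can be reproduced on a host to cross-check the value a device reports. This sketch assumes the CRC-16/CCITT-FALSE variant (polynomial 0x1021, initial value 0xFFFF, no bit reflection); the page does not state which CCITT variant ESPOTAUpdate uses, and crc16CcittUpdate is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed variant: CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF, no reflection).
constexpr uint16_t CRC16_CCITT_INIT = 0xFFFF;

// Fold `len` bytes into a running CRC and return the updated value.
// Call once per block, feeding the previous return value back in.
uint16_t crc16CcittUpdate(uint16_t crc, const uint8_t* data, size_t len) {
    for (size_t i = 0; i < len; ++i) {
        crc = uint16_t(crc ^ (uint16_t(data[i]) << 8));
        for (int bit = 0; bit < 8; ++bit) {
            if (crc & 0x8000)
                crc = uint16_t((crc << 1) ^ 0x1021);
            else
                crc = uint16_t(crc << 1);
        }
    }
    return crc;
}
```

Because each call takes the previous CRC as input, the result over the whole image does not depend on how the transport split it into blocks.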

What the application has to do

For the standard ESP32 setup RaftCoreApp already does the wiring, so adding ESPOTAUpdate to your SysType config is enough. To do the wiring manually:

auto* pProtoExch = (ProtocolExchange*)getSysManager()->getSysMod("ProtoExchange");
auto* pOTA      = (ESPOTAUpdate*)getSysManager()->getSysMod("ESPOTAUpdate");
if (pProtoExch && pOTA)
    pProtoExch->setFWUpdateHandler(pOTA);

Without this hook the file-stream path (Path B) has nowhere to deliver firmware blocks; the HTTP-direct path (Path A) still works because it goes straight into ESPOTAUpdate via its REST endpoint.

Partition layout

OTA requires two app partitions plus an otadata partition. A typical Raft partitions.csv:

# Name,   Type, SubType, Offset,   Size
nvs,      data, nvs,     0x009000, 0x015000
otadata,  data, ota,     0x01e000, 0x002000
app0,     app,  ota_0,   0x020000, 0x1b0000
app1,     app,  ota_1,   0x1d0000, 0x1b0000
fs,       data, 0x83,    0x380000, 0x080000

esp_ota_begin() chooses the inactive app0/app1 slot; esp_ota_set_boot_partition() flips otadata so the bootloader picks the new image on reboot. See Partitions and Flash Layout for the broader picture.
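
A layout like this is easy to sanity-check before flashing. The sketch below transcribes the table above and checks that the partitions are contiguous, that the two app slots sit on 64 KB boundaries (an ESP-IDF requirement for app partitions), and that an image fits in a slot. The 4 MB flash size is an assumption taken from the table's end address, and all type and function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

struct Partition {
    const char* name;
    uint32_t offset;
    uint32_t size;
};

// The table above, transcribed.
constexpr Partition kPartitions[] = {
    {"nvs",     0x009000, 0x015000},
    {"otadata", 0x01e000, 0x002000},
    {"app0",    0x020000, 0x1b0000},
    {"app1",    0x1d0000, 0x1b0000},
    {"fs",      0x380000, 0x080000},
};
constexpr size_t kNumPartitions = sizeof(kPartitions) / sizeof(kPartitions[0]);

// True if each partition starts exactly where the previous one ends
// and the last one ends at or before flashSize.
bool layoutIsContiguous(const Partition* parts, size_t count, uint32_t flashSize) {
    for (size_t i = 1; i < count; ++i) {
        if (parts[i - 1].offset + parts[i - 1].size != parts[i].offset)
            return false;
    }
    return count > 0 && parts[count - 1].offset + parts[count - 1].size <= flashSize;
}

// App partitions must start on a 0x10000 (64 KB) boundary.
bool appOffsetAligned(const Partition& p) {
    return (p.offset % 0x10000u) == 0;
}

// True if an image of imageSize bytes fits in a single app slot.
bool imageFitsSlot(uint32_t imageSize, const Partition& slot) {
    return imageSize <= slot.size;
}
```

With these numbers the 1572864-byte image from the Path A example fits in either 0x1b0000-byte slot with about 192 KB to spare.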

Failure handling

| Failure | What happens |
|---|---|
| esp_ota_begin fails | Worker sets _otaDirectInProgress = false and sets the status field lastOTAUpdateResult to "Failed". Subsequent blocks are still drained but ignored. |
| esp_ota_write returns non-OK | Same as above; the partial image stays on the inactive slot but is never marked bootable. |
| Cancel (Path B ufCancel, or session timeout) | fileStreamCancelEnd() enqueues a cancel marker; the worker calls completeOTAUpdate(true), which abandons the write. |
| Worker queue full / heap low (BLE) | Block sender pauses; transfer resumes on next batch ack. See BLE publish-throughput notes. |
| New image fails to boot | Automatic rollback is not implemented in ESPOTAUpdate today: the new image is unconditionally marked bootable on a clean esp_ota_end. Applications that want rollback must call esp_ota_mark_app_valid_cancel_rollback() themselves after boot, with CONFIG_BOOTLOADER_APP_ROLLBACK_ENABLE=y in sdkconfig. |
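
For applications that opt in to rollback, the first-boot decision can be kept as a small pure function so it is testable off-device. This is only a sketch of one possible policy: ImageState, BootAction and decideBootAction are hypothetical names. On the device the state would come from ESP-IDF's esp_ota_get_state_partition(), and the returned action would map to the ESP-IDF calls named in the comments.

```cpp
// Hypothetical mirror of the image state the app would read on the device
// (ESP-IDF reports ESP_OTA_IMG_PENDING_VERIFY on first boot when rollback is enabled).
enum class ImageState { PendingVerify, Valid, Other };

enum class BootAction {
    None,              // image already confirmed, or rollback not in play
    MarkValid,         // device: esp_ota_mark_app_valid_cancel_rollback()
    RollbackAndReboot  // device: esp_ota_mark_app_invalid_rollback_and_reboot()
};

// Decide what to do once the application's own self-test has run.
BootAction decideBootAction(ImageState state, bool selfTestPassed) {
    if (state != ImageState::PendingVerify)
        return BootAction::None;
    return selfTestPassed ? BootAction::MarkValid : BootAction::RollbackAndReboot;
}
```

What counts as a passing self-test (network up, key SysMods initialised, and so on) is an application decision and is outside what this page specifies.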

Diagnostics

ESPOTAUpdate::getDebugJSON() returns:

{
  "Bps": 41234.5,
  "stMs": 12,
  "bytes": 1572864,
  "wrPS": 65000.0,
  "elapS": 38.2,
  "blk": 4096
}

| Field | Meaning |
|---|---|
| Bps | Average bytes/sec across the whole transfer |
| stMs | esp_ota_begin duration (ms) |
| bytes | Total bytes written so far |
| wrPS | Bytes/sec measured against esp_ota_write time only (excludes transport stalls) |
| elapS | Wall-clock seconds since the transfer started |
| blk | Last block size accepted (bytes) |

These fields are the primary signal for diagnosing slow OTA: a low wrPS indicates flash contention, while a low Bps with a high wrPS indicates a transport bottleneck (typical on BLE; see throughput notes).
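
The two rates can be combined into a single number. If both are measured over the same bytes, then bytes/wrPS is the time spent in esp_ota_write and bytes/Bps is the total time, so Bps/wrPS approximates the fraction of the transfer spent writing flash. The helper below is a hypothetical sketch of that arithmetic, not part of ESPOTAUpdate.

```cpp
// Fraction of wall-clock time spent inside esp_ota_write, derived from the
// two reported rates. Returns 0 when either rate is not positive, and clamps
// to 1 in case rounding makes Bps slightly exceed wrPS.
double flashWriteTimeFraction(double bps, double wrPS) {
    if (bps <= 0.0 || wrPS <= 0.0)
        return 0.0;
    const double fraction = bps / wrPS;
    return fraction > 1.0 ? 1.0 : fraction;
}
```

For the sample output above (Bps 41234.5, wrPS 65000.0) this gives roughly 0.63, meaning about 63% of the time went to flash writes and the remainder to waiting on the transport. A value close to 1 points at flash as the limit, while a small value points at the transport.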
