
Detect OTA job cancellation during download via notify-next #85

Open
MathiasKoch wants to merge 6 commits into rustot-ota-reasons from ota-cancel-detection

@MathiasKoch

Summary

Enables the device to detect cloud-initiated OTA job cancellations while a file transfer is in progress. Previously, a force-cancelled job would go unnoticed until the download completed or timed out. The device now co-subscribes to the AWS IoT Jobs notify-next topic during download and immediately aborts when a job state change notification arrives.

Note: This is PR 4 of 4 in a stacked series. Depends on #84 (OTA failure reason reporting).

Design

During init_file_transfer(), the MQTT data interface now subscribes to two topics instead of one:

  1. The OTA data stream topic (for file blocks)
  2. $aws/things/{thingName}/jobs/notify-next (for job state change notifications)
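The two subscriptions can be sketched as a pair of topic strings. The notify-next format comes from the list above. The data stream layout (`$aws/things/{thingName}/streams/{streamId}/data/cbor`) and all helper names are assumptions for illustration and may not match what this PR actually builds.

```rust
/// Hypothetical helper: topic that carries OTA file blocks.
/// Assumes the CBOR data topic layout of AWS IoT MQTT-based file delivery.
fn stream_data_topic(thing_name: &str, stream_id: &str) -> String {
    format!("$aws/things/{}/streams/{}/data/cbor", thing_name, stream_id)
}

/// Hypothetical helper: topic that carries job state change notifications.
fn notify_next_topic(thing_name: &str) -> String {
    format!("$aws/things/{}/jobs/notify-next", thing_name)
}

/// Both topics that the transfer subscribes to, data stream first.
fn transfer_topics(thing_name: &str, stream_id: &str) -> [String; 2] {
    [
        stream_data_topic(thing_name, stream_id),
        notify_next_topic(thing_name),
    ]
}
```

Returning a fixed-size array of two topics mirrors the subscription count of 2 mentioned in the changelog.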

In next_block(), incoming messages are checked: if a message arrives on a topic that doesn't contain /streams/ (i.e., a notify-next message rather than a data block), the download returns OtaError::UserAbort. The OTA state machine handles this by setting the image state to Aborted(ImageStateReason::UserAbort) and cleanly terminating.
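A minimal sketch of that check and of the resulting image state, assuming simplified stand-ins for the crate's types. `check_incoming` and `image_state_for` are hypothetical names, and the enums here carry only the variants needed for the example.

```rust
#[derive(Debug, PartialEq)]
enum OtaError {
    UserAbort,
}

#[derive(Debug, PartialEq)]
enum ImageStateReason {
    UserAbort,
}

#[derive(Debug, PartialEq)]
enum ImageState {
    Aborted(ImageStateReason),
}

/// A message is a data block only if its topic contains "/streams/".
fn is_stream_topic(topic: &str) -> bool {
    topic.contains("/streams/")
}

/// Hypothetical stand-in for the check inside next_block():
/// pass data blocks through, treat anything else as a cancellation signal.
fn check_incoming<'a>(topic: &str, payload: &'a [u8]) -> Result<&'a [u8], OtaError> {
    if is_stream_topic(topic) {
        Ok(payload)
    } else {
        Err(OtaError::UserAbort)
    }
}

/// Hypothetical stand-in for the state machine's error handling.
fn image_state_for(err: &OtaError) -> ImageState {
    match err {
        OtaError::UserAbort => ImageState::Aborted(ImageStateReason::UserAbort),
    }
}
```

Because the transfer subscribes to exactly two topics, a substring test on the topic is enough to tell them apart in this sketch. The payload of the notify-next message is not inspected.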

AWS IoT Jobs publishes to the notify-next topic whenever the next pending job execution for the thing changes, including when the current job is force-cancelled via cancel_job(force=true).

Changelog

  • Subscribe to both OTA data stream and jobs notify-next topic during file transfer
  • Increase MQTT subscription count from 1 to 2 in MqttTransfer
  • Detect non-data-stream messages as job cancellation signals, return OtaError::UserAbort
  • Handle UserAbort in OTA state machine with Aborted(ImageStateReason::UserAbort) image state
  • Add test_mqtt_ota_cancel integration test with force-cancellation after 3 seconds

MathiasKoch and others added 6 commits February 12, 2026 15:09
subscribe() was only using the first topic and silently dropping the
rest. Callers like shadows and provisioning subscribe to accepted/rejected
pairs and expect messages from both. Create N individual StreamOperations
and merge them via futures::stream::SelectAll.

Extend OTA status updates to include detailed failure reasons (e.g.
SignatureCheckFailed, BadImageState) in job status details. Rework
integration tests to self-provision OTA jobs via AWS SDK instead of
relying on external shell scripts, with automatic cleanup.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

OtaError::Pal was discarding the inner OtaPalError, so the cleanup path
reported Aborted(None) — losing the failure reason (e.g.
SignatureCheckFailed) from job status details. Now OtaError::Pal carries
the OtaPalError through to the job status update.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
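
The fix described in this commit can be illustrated with a small sketch in which the error variant carries its inner cause. The type shapes are assumptions: `Aborted(Option<OtaPalError>)` is a simplified stand-in for the crate's image state, and `verify`, `finish` and `cleanup_state` are hypothetical helpers.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum OtaPalError {
    SignatureCheckFailed,
    BadImageState,
}

#[derive(Debug, PartialEq)]
enum OtaError {
    /// Carries the PAL error instead of discarding it.
    Pal(OtaPalError),
    Timeout,
}

impl From<OtaPalError> for OtaError {
    fn from(e: OtaPalError) -> Self {
        OtaError::Pal(e)
    }
}

#[derive(Debug, PartialEq)]
enum ImageState {
    Aborted(Option<OtaPalError>),
}

/// Hypothetical PAL call that fails when the signature is invalid.
fn verify(signature_ok: bool) -> Result<(), OtaPalError> {
    if signature_ok {
        Ok(())
    } else {
        Err(OtaPalError::SignatureCheckFailed)
    }
}

/// `?` converts OtaPalError into OtaError::Pal via the From impl.
fn finish(signature_ok: bool) -> Result<(), OtaError> {
    verify(signature_ok)?;
    Ok(())
}

/// Cleanup path: the reason survives only if the error carried it.
fn cleanup_state(err: &OtaError) -> ImageState {
    match err {
        OtaError::Pal(inner) => ImageState::Aborted(Some(*inner)),
        OtaError::Timeout => ImageState::Aborted(None),
    }
}
```

With a unit-like `Pal` variant, `cleanup_state` could only ever produce `Aborted(None)`, which is the behaviour the commit message describes as lost failure reasons.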
Progress was only refreshed for InProgress updates, so Failed status
carried the stale value from the last periodic update. Now progress is
also updated on Failed, giving an accurate block count at failure time.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Allow PAL implementations to provide custom key-value pairs that are
included in every job status update. The PAL defines a StatusDetails
associated type implementing StatusDetailsExt, which gets serialized
alongside the base OTA status fields via CombinedStatusDetails.

This threads the extra context through ProgressState, ControlInterface,
and DataInterface as a generic parameter (defaulting to ()).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
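
One way to picture the associated-type design is the sketch below. It keeps the names `StatusDetailsExt`, `StatusDetails` and `CombinedStatusDetails` from the commit message, but everything else is assumed: the method `extra_pairs`, the `progress` field, the `DemoPal` types, and the use of a plain key-value list in place of whatever serialization the crate actually uses.

```rust
/// Extra key-value pairs a PAL wants in every status update (assumed shape).
trait StatusDetailsExt {
    fn extra_pairs(&self) -> Vec<(String, String)>;
}

/// The unit type contributes nothing, so it can serve as the default.
impl StatusDetailsExt for () {
    fn extra_pairs(&self) -> Vec<(String, String)> {
        Vec::new()
    }
}

/// Base status fields plus the PAL-provided extras; E defaults to ().
struct CombinedStatusDetails<'a, E: StatusDetailsExt = ()> {
    progress: &'a str,
    extra: &'a E,
}

impl<'a, E: StatusDetailsExt> CombinedStatusDetails<'a, E> {
    /// Base fields first, then the PAL's extras.
    fn to_pairs(&self) -> Vec<(String, String)> {
        let mut out = vec![("progress".to_string(), self.progress.to_string())];
        out.extend(self.extra.extra_pairs());
        out
    }
}

/// Reduced PAL trait showing only the associated type.
trait OtaPal {
    type StatusDetails: StatusDetailsExt;
    fn status_details(&self) -> Self::StatusDetails;
}

/// Hypothetical PAL that reports which flash slot it writes to.
struct DemoPal {
    slot: u8,
}

struct DemoDetails {
    slot: u8,
}

impl StatusDetailsExt for DemoDetails {
    fn extra_pairs(&self) -> Vec<(String, String)> {
        vec![("slot".to_string(), self.slot.to_string())]
    }
}

impl OtaPal for DemoPal {
    type StatusDetails = DemoDetails;
    fn status_details(&self) -> DemoDetails {
        DemoDetails { slot: self.slot }
    }
}
```

Defaulting the parameter to `()` means code that does not care about extra details can keep using the combined type without naming a parameter.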
Subscribe to the jobs/notify-next topic alongside the data stream in
init_file_transfer. When a force-cancel triggers a notify-next message
during active download, next_block() returns UserAbort and perform_ota
aborts gracefully. The test uses force-cancel (.force(true)) which
transitions the execution to CANCELED and triggers the notification.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>