Detect OTA job cancellation during download via notify-next #85
Open
MathiasKoch wants to merge 6 commits into rustot-ota-reasons from
subscribe() was only using the first topic and silently dropping the rest. Callers like shadows and provisioning subscribe to accepted/rejected pairs and expect messages from both. Create N individual StreamOperations and merge them via futures::stream::SelectAll.
Extend OTA status updates to include detailed failure reasons (e.g. SignatureCheckFailed, BadImageState) in job status details. Rework integration tests to self-provision OTA jobs via AWS SDK instead of relying on external shell scripts, with automatic cleanup. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
OtaError::Pal was discarding the inner OtaPalError, so the cleanup path reported Aborted(None) — losing the failure reason (e.g. SignatureCheckFailed) from job status details. Now OtaError::Pal carries the OtaPalError through to the job status update. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
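A minimal sketch of the idea in this commit, assuming simplified stand-in types: only the names OtaError::Pal, OtaPalError, SignatureCheckFailed, BadImageState and Aborted come from the commit messages; the Timeout variant, the Option payload on Aborted, and the helper functions are hypothetical and only illustrate how the inner error can survive to the cleanup path.

```rust
/// PAL-level failure reasons. The variant names appear in the commit
/// messages; the enum shape itself is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OtaPalError {
    SignatureCheckFailed,
    BadImageState,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OtaError {
    /// Carries the inner PAL error instead of discarding it.
    Pal(OtaPalError),
    /// Hypothetical variant without a PAL reason, for contrast.
    Timeout,
}

impl From<OtaPalError> for OtaError {
    fn from(e: OtaPalError) -> Self {
        OtaError::Pal(e)
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ImageState {
    Aborted(Option<OtaPalError>),
}

/// Cleanup path: the reason is forwarded when there is one.
pub fn cleanup_state(err: OtaError) -> ImageState {
    match err {
        OtaError::Pal(reason) => ImageState::Aborted(Some(reason)),
        OtaError::Timeout => ImageState::Aborted(None),
    }
}

/// Hypothetical PAL step that can fail with a specific reason.
pub fn verify_image(signature_ok: bool) -> Result<(), OtaPalError> {
    if signature_ok {
        Ok(())
    } else {
        Err(OtaPalError::SignatureCheckFailed)
    }
}

/// The `?` operator converts OtaPalError into OtaError via `From`,
/// keeping the inner value intact.
pub fn finish_update(signature_ok: bool) -> Result<(), OtaError> {
    verify_image(signature_ok)?;
    Ok(())
}
```

With a unit-like Pal variant, cleanup_state could only ever produce Aborted(None) for PAL failures, which matches the symptom described above.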
Progress was only refreshed for InProgress updates, so Failed status carried the stale value from the last periodic update. Now progress is also updated on Failed, giving an accurate block count at failure time. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
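The change can be pictured with a small sketch. The types and the "received/total" progress format below are assumptions made for illustration, not the crate's actual definitions; the point is only that Failed now refreshes the progress field in the same way InProgress does.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobStatus {
    InProgress,
    Failed,
    Succeeded,
}

/// Hypothetical status details holding the last reported progress.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct StatusDetails {
    pub progress: Option<String>,
}

/// Refresh the progress field for InProgress and Failed updates;
/// any other status keeps the previously reported value.
pub fn refresh_progress(
    details: &mut StatusDetails,
    status: JobStatus,
    blocks_received: u32,
    blocks_total: u32,
) {
    if matches!(status, JobStatus::InProgress | JobStatus::Failed) {
        details.progress = Some(format!("{}/{}", blocks_received, blocks_total));
    }
}
```

Before the fix, the condition would have matched InProgress only, so a failure at block 42 would still report the count from the last periodic update.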
Allow PAL implementations to provide custom key-value pairs that are included in every job status update. The PAL defines a StatusDetails associated type implementing StatusDetailsExt, which gets serialized alongside the base OTA status fields via CombinedStatusDetails. This threads the extra context through ProgressState, ControlInterface, and DataInterface as a generic parameter (defaulting to ()). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
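One way such an extension point could look is sketched below. The names StatusDetailsExt and CombinedStatusDetails come from the commit message, but the trait method, the map-based merge, and BoardDetails are hypothetical; the real implementation serializes the details rather than building a map.

```rust
use std::collections::BTreeMap;

/// Assumed shape of the extension trait: a PAL returns extra
/// key-value pairs to attach to every job status update.
pub trait StatusDetailsExt {
    fn extra(&self) -> Vec<(&'static str, String)>;
}

/// The unit type contributes nothing, so it works as a default.
impl StatusDetailsExt for () {
    fn extra(&self) -> Vec<(&'static str, String)> {
        Vec::new()
    }
}

/// Base OTA fields plus the PAL-provided extras.
pub struct CombinedStatusDetails<'a, E: StatusDetailsExt> {
    pub progress: &'a str,
    pub extra: &'a E,
}

impl<'a, E: StatusDetailsExt> CombinedStatusDetails<'a, E> {
    /// Merge base fields and extras into one flat map.
    /// Extras are inserted last, so in this sketch a clashing key
    /// would overwrite the base field.
    pub fn to_map(&self) -> BTreeMap<String, String> {
        let mut map = BTreeMap::new();
        map.insert("progress".to_string(), self.progress.to_string());
        for (key, value) in self.extra.extra() {
            map.insert(key.to_string(), value);
        }
        map
    }
}

/// Hypothetical PAL-specific details.
pub struct BoardDetails {
    pub hw_rev: u8,
}

impl StatusDetailsExt for BoardDetails {
    fn extra(&self) -> Vec<(&'static str, String)> {
        vec![("hw_rev", self.hw_rev.to_string())]
    }
}
```

A PAL that has nothing to add can use () and get the base fields only, which mirrors the generic parameter defaulting to () described above.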
Subscribe to the jobs/notify-next topic alongside the data stream in init_file_transfer. When a force-cancel triggers a notify-next message during active download, next_block() returns UserAbort and perform_ota aborts gracefully. The test uses force-cancel (.force(true)) which transitions the execution to CANCELED and triggers the notification. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
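As a rough illustration of the two subscriptions, the helper below builds both topic names. The function name and signature are hypothetical, and the CBOR stream data topic is an assumption based on the AWS IoT MQTT file delivery topic layout; only the notify-next topic is stated in this PR.

```rust
/// Build the two topics for one file transfer (hypothetical helper).
pub fn transfer_topics(thing_name: &str, stream_name: &str) -> [String; 2] {
    [
        // Data blocks for the stream (assumed CBOR variant).
        format!("$aws/things/{}/streams/{}/data/cbor", thing_name, stream_name),
        // Job state change notifications, including force-cancel.
        format!("$aws/things/{}/jobs/notify-next", thing_name),
    ]
}
```

Only the first topic contains the /streams/ segment, which is what lets the receiver tell data blocks apart from job notifications.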
Summary
Enables the device to detect cloud-initiated OTA job cancellations while a file transfer is in progress. Previously, a force-cancelled job would go unnoticed until the download completed or timed out. The device now co-subscribes to the AWS IoT Jobs notify-next topic during download and immediately aborts when a job state change notification arrives.
Design
During init_file_transfer(), the MQTT data interface now subscribes to two topics instead of one:
- the stream data topic (for file blocks)
- $aws/things/{thingName}/jobs/notify-next (for job state change notifications)
In next_block(), incoming messages are checked: if a message arrives on a topic that doesn't contain /streams/ (i.e., a notify-next message rather than a data block), the download returns OtaError::UserAbort. The OTA state machine handles this by setting the image state to Aborted(ImageStateReason::UserAbort) and cleanly terminating.
AWS IoT Jobs publishes to the notify-next topic whenever the next pending job execution changes, including when the current job is force-cancelled via cancel_job(force=true).
Changelog
- MqttTransfer
- OtaError::UserAbort
- UserAbort in OTA state machine with Aborted(ImageStateReason::UserAbort) image state
- test_mqtt_ota_cancel integration test with force-cancellation after 3 seconds
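The topic check described under Design can be sketched roughly as follows. This is a synchronous simplification with stand-in types: Message and image_state_for are hypothetical, the enums are reduced to the single variant discussed here, and the real next_block() works on the MQTT subscription rather than on a borrowed message.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OtaError {
    UserAbort,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ImageStateReason {
    UserAbort,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ImageState {
    Aborted(ImageStateReason),
}

/// Hypothetical stand-in for a received MQTT message.
pub struct Message<'a> {
    pub topic: &'a str,
    pub payload: &'a [u8],
}

/// Return the payload for stream data; treat a message on any other
/// topic (here: notify-next) as a cancellation signal.
pub fn next_block<'a>(msg: &Message<'a>) -> Result<&'a [u8], OtaError> {
    if msg.topic.contains("/streams/") {
        Ok(msg.payload)
    } else {
        Err(OtaError::UserAbort)
    }
}

/// State machine side: map the error to the image state to report.
pub fn image_state_for(err: OtaError) -> ImageState {
    match err {
        OtaError::UserAbort => ImageState::Aborted(ImageStateReason::UserAbort),
    }
}
```

Note that this check treats every non-stream message as an abort without inspecting the notify-next payload; that follows the description above, where any job state change notification during download ends the transfer.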