cuprated: Add graceful shutdown, pt2: service error propagation#586
Open
redsh4de wants to merge 26 commits intoCuprate:mainfrom
Open
cuprated: Add graceful shutdown, pt2: service error propagation#586redsh4de wants to merge 26 commits intoCuprate:mainfrom
redsh4de wants to merge 26 commits intoCuprate:mainfrom
Conversation
aac614a to
1a41630
Compare
91f6723 to
e3d9960
Compare
e3d9960 to
0cf53be
Compare
0cf53be to
15bc617
Compare
15bc617 to
b5cfe8a
Compare
b5cfe8a to
157451c
Compare
157451c to
8adc7f1
Compare
…erHandle, doc nits
8adc7f1 to
79e3e61
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Depends on pt1, PR #585
Building on part 1, this replaces the panic-on-error patterns in cuprated's services with error propagation that makes use of the graceful shutdown mechanism
Why
So we dont panic and insta-crash upon a error, shutdown should be graceful. Internal errors still caused panics before this via PANIC_CRITICAL_SERVICE_ERROR
Where
cupratedmonitor.rs- addedspawn_criticaltoTaskExecutor, which logs errors and triggers shutdown on task endconstants.rs- removedPANIC_CRITICAL_SERVICE_ERRORmanager.rs- usespawn_criticalfor syncer and manager taskshandler.rs- replace all.expect(PANIC_CRITICAL_SERVICE_ERROR)with?/.map_err(), addHandleBlockErrorenum to distinguish validation errors from service errors, functions returnanyhow::Resultinterface.rs- addIncomingBlockError::Servicevariant, replace.expect()with.map_err()syncer.rs- suppressIncomingBlockChannelClosederror if shutting downblockchain.rs-check_add_genesisreturnsanyhow::Resultmanager.rs- sameexpectto?pattern,spawn_critical, functions returnanyhow::Resultincoming_tx.rs- addIncomingTxError::Servicevariant, error handling for incoming tx pathrequest_handler.rs- addHandlerErrorenum (PeervsService), replaceexpectwith typed error propagation, log service errorsserver.rs-spawntospawn_criticalwith?propagationjson_rpc.rs- handleIncomingBlockError::Serviceinsubmit_block, trigger shutdownother_json.rs- pass shutdown token to tx handlertx_handler.rs- handleIncomingTxError::Service, trigger shutdownHow
spawn_criticalwraps each subsystem's future: if it returns an error it is logged and the shutdown token is hit, triggering the same graceful shutdown path as OS signals.Critical tasks returning
Ok(())also trigger shutdown.Validation errors (invalid blocks/transactions from peers) are distinguished from service errors (database/internal failures) using simple enums (
HandleBlockError,HandlerError). Only service errors propagate to shutdown - validation errors are handled gracefully (ban peer, cancel downloader, etc.)