Fix scripts multisig rotation race condition #1260

Closed
ryan-hansen wants to merge 5 commits into WebOfTrust:main from
ryan-hansen:bug-scripts-multisig-rotation-race-condition

Conversation

@ryan-hansen
Collaborator

Problem: The multisig-join.sh CLI script was hanging.

Two causes:
1 - A timing race: after the local rotations, one member could query the other’s KEL and get an answer before the witness had applied that member’s rotation. The group rotation was then built with stale keys and failed with “invalid rotation, new key set unable to satisfy prior next signing threshold”.

2 - When the rotate failed, the companion multisig join had no timeout and waited indefinitely for a group event that never came, so the script’s wait never returned.

Fix:
1 - Added a --timeout option to kli multisig join so it exits with an error instead of hanging when no group multisig event arrives.

2 - Added retry logic in the group rotation CLI so transient ValidationErrors (e.g. from stale member KEL state) are retried for a short window before failing.

3 - Hardened the demo script by using --timeout on all join calls and adding short sleeps before cross-queries so witnesses have time to apply rotations.

4 - Added kering.TimeoutError for join timeouts.
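
For illustration only, the two behaviors described in the fix list (retrying a transient validation failure for a short window, and giving up on a wait after a timeout) can be sketched roughly as below. This is not the code from this PR: `retry_for`, `wait_for`, and both exception classes are hypothetical stand-ins defined locally, and the intervals are arbitrary assumptions.

```python
import time


class ValidationError(Exception):
    """Hypothetical stand-in for a transient validation failure."""


class JoinTimeoutError(Exception):
    """Hypothetical stand-in for a join timeout error."""


def retry_for(action, window, interval=0.5, clock=time.monotonic, sleep=time.sleep):
    """Call action() until it succeeds or `window` seconds have elapsed.

    Only ValidationError is retried. Once the window has closed, the most
    recent ValidationError is re-raised to the caller.
    """
    deadline = clock() + window
    while True:
        try:
            return action()
        except ValidationError:
            if clock() >= deadline:
                raise
            sleep(interval)


def wait_for(poll, timeout, interval=0.5, clock=time.monotonic, sleep=time.sleep):
    """Poll until poll() returns something other than None.

    Raises JoinTimeoutError instead of waiting forever when nothing
    arrives within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        result = poll()
        if result is not None:
            return result
        if clock() >= deadline:
            raise JoinTimeoutError(f"no event arrived within {timeout} seconds")
        sleep(interval)
```

In this sketch a caller would wrap the rotation attempt in `retry_for` and the wait for the group event in `wait_for`, so a stale-state failure gets a few more chances while a wait that can never succeed ends with an error rather than a hang.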

@ryan-hansen ryan-hansen marked this pull request as draft February 28, 2026 09:05
@ryan-hansen
Collaborator Author

Not quite ready yet, apparently.

Collaborator

@SmithSamuelM SmithSamuelM left a comment

Convert this to a non-draft so it can be accepted

@ryan-hansen
Collaborator Author

The decision was made to handle retries and waits, as necessary, at the script level rather than in the actual CLI code. I'm closing this and will push a new PR for that work.

@ryan-hansen ryan-hansen closed this Mar 3, 2026
2 participants