-
-
Notifications
You must be signed in to change notification settings - Fork 14.5k
Fix variable deallocation order in panic unwinding paths #149435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
r? @wesleywiser rustbot has assigned @wesleywiser. Use |
|
r? @dianne |
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like there's a bug causing an assertion failure when building the standard library. I've given it a look and offered a guess at what's causing it below. There's still work to do here beyond fixing that, though.
First, could you add a ui test to demonstrate that this fixes #147875? It looks like it might not yet, since the code for scheduling unwind drops on calls panicking looks unchanged.
Second, after verifying that this results in the correct borrow-checking behavior, we need to make sure that this change doesn't negatively affect codegen. Per the old comment on needs_cleanup, at least at the time it was written, LLVM didn't handle the unnecessary cleanup blocks and StorageDeads particularly well. If you can demonstrate with codegen tests that that's not an issue anymore, and perf isn't too bad, that might be all that's needed. But my expectation is that we'll have to get rid of or ignore the StorageDeads later in compilation (sometime after they serve their purpose in borrowck). Unless there's a reason to keep the StorageDeads around longer, my gut feeling is that this cleanup would be best as a post-borrowck MIR pass (maybe as part of CleanupPostBorrowck?), since then optimization passes can be done on cleaner MIR and we can test it works with MIR tests rather than codegen tests. Could you also add a test for this not affecting later stages of compilation? If you accomplish that by removing the unwind-path StorageDeads as part of a MIR pass, that'd be a mir-opt test.
Before you push again, you'll probably want to run the codegen and mir-opt tests to make sure the former is clean and to bless the latter. Regardless of what approach we take here, if we're changing how the MIR is built, there should be differences in the MIR building test output (part of the mir-opt suite).
|
Reminder, once the PR becomes ready for a review, use |
|
Also, could you change the PR description? #147875 on its own doesn't allow destructors to access freed memory, it doesn't allow for the creation of dangling references, and I'm at least not aware of a safety guarantee that it violates. You should only get unsoundness out of it if you write unsafe code on the assumption that the borrow checker will enforce the relative drop order of locals that may have destructors and those that definitely don't. Of course, per language team decision, consistent drop order is a promise Rust would like to make. But it's not quite the same as the borrow-checker failing to ensure places outlive their references. |
|
So what i did was write this simple rust program panic drop.rs I ran the llvm to get the intermediate representaion and on looking at the IR I cannot find any llvm.lifetime.end statements suggesting to us that on master the StorageDead statements are missing, which according to my understanding means that the borrowchecker does not know when the storage becomes invalid. Let me now write the UI test to see what is up |
|
edit: adjusted wording |
5afe7c2 to
59a7e56
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still needs CI to pass before I can review it properly. I've left a few comments on obvious things, but I don't think reviewing the code changes would be helpful at this point. Please test your changes locally. You don't have to run the whole test suite yourself, but for this change, you'll at least want make sure that the mir-opt and codegen tests all pass, that any relevant ui tests pass, and that tidy passes as well.
Could you rebase onto a more recent commit, also? I don't expect there will be conflicts in the MIR building part of this, but I'm not sure about the rest.
I don't mean to be harsh, but this is a relatively complex and nuanced change. If you're not familiar with what's being changed, why it's being changed, the consequences/needs of that, and general contribution procedure, I'd recommend gaining familiarity with easier issues instead.
59a7e56 to
44fbdb3
Compare
|
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
e94d041 to
67a9bdb
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a full review, but I took a glance through the code and comment changes. I think there's some incorrect logic causing the r-a build failure and some suspicious checks papering over it in other cases. I'll take a closer look once CI passes.
| // Only adjust if the drop matches what unwind_to is pointing to (since we process | ||
| // drops in reverse order, unwind_to might not match the current drop). | ||
| if storage_dead_on_unwind | ||
| && unwind_to != DropIdx::MAX | ||
| && unwind_drops.drop_nodes[unwind_to].data.local == drop_data.local | ||
| && unwind_drops.drop_nodes[unwind_to].data.kind == drop_data.kind | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels suspicious. I think this would need a lot more justification to turn those asserts into conditions, so I have a feeling something is wrong and this is papering over it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unwind_to should always match when processing a Value drop, If it doesn't, we should advance through any interleaved StorageDead/ForLint drops first, The conditional check hides this by silently skipping the adjustment. The conditional check hides the mismatch, but, unwind_to can point to drops from other scopes
We may skip advancing unwind_to when it should advance. This can desync the unwind tree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the real problem is diverge_cleanup_target builds the unwind tree in forward order across multiple scopes and build_scope_drops processes one scope's drops in reverse order.
unwind_to can point to a different drop type (StorageDead/ForLint) when processing a Value drop, or vice versa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the approaches i can think of is when processing a drop, if unwind_to doesn't match, walk forward in the unwind tree until we find a matching drop.
| let unwind_drop = self | ||
| .scopes | ||
| .unwind_drops | ||
| .add_drop(drop_node.data, unwind_indices[drop_node.next]); | ||
| unwind_indices.push(unwind_drop); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should be unconditional. We only want to add StorageDeads if there's a real (or for-fcw) drop, and maybe ideally only before the last of those (if it's not too complicated). Not sure, but this might be what's causing the problem building r-a, since that involves breaking from a loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally there should be some way to write all of this without requiring duplicated logic all over the place. It feels really fragile to have to know in several different parts of that file exactly when unwind drops should be added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option to keep it simple would be to always include StorageDeads when building the MIR, then get rid of empty cleanup paths in a post-borrowck pass. I'm not sure how much worse for perf that would be, but we could try profiling it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets do one thing, ill make the changes, make sure the tests pass and then lets do a profile. Its better to get a sanity check that the changes look good in a first pass. We can always go back and forth on the design and improvments. Lets get the sanity and structure right first.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From your commit message:
This fixes compilation errors that are blocking CI, though these are
pre-existing issues unrelated to the StorageDead changes.
I guess this breakage comes from this PR rather than pre-existing, as this isn't failing in other PRs?
This comment has been minimized.
This comment has been minimized.
This commit fixes several issues related to StorageDead and ForLint drops:
1. Add StorageDead and ForLint drops to unwind_drops for all functions
- Updated diverge_cleanup_target to include StorageDead and ForLint drops
in the unwind_drops tree for all functions (not just coroutines), but only
when there's a cleanup path (i.e., when there are Value or ForLint drops)
- This ensures proper drop ordering for borrow-checking on panic paths
2. Fix break_for_tail_call to handle StorageDead and ForLint drops
- Don't skip StorageDead drops for non-drop types
- Adjust unwind_to pointer for StorageDead and ForLint drops, matching
the behavior in build_scope_drops
- Only adjust unwind_to when it's valid (not DropIdx::MAX)
- This prevents debug assert failures when processing drops in tail calls
3. Fix index out of bounds panic when unwind_to is DropIdx::MAX
- Added checks to ensure unwind_to != DropIdx::MAX before accessing
unwind_drops.drop_nodes[unwind_to]
- Only emit StorageDead on unwind paths when there's actually an unwind path
- Only add entry points to unwind_drops when unwind_to is valid
- This prevents panics when there's no cleanup needed
4. Add test for explicit tail calls with StorageDead drops
- Tests that tail calls work correctly when StorageDead and ForLint drops
are present in the unwind path
- Verifies that unwind_to is correctly adjusted for all drop kinds
These changes make the borrow-checker stricter and more consistent by ensuring
that StorageDead statements are emitted on unwind paths for all functions when
there's a cleanup path, allowing unsafe code to rely on drop order being enforced
consistently.
- Add StorageDead to unwind paths for all functions (not just coroutines) - Modify CleanupPostBorrowck to remove StorageDead from cleanup blocks - Add tests for the fix and StorageDead removal
When processing drops in reverse order, unwind_to might not point to the current drop. Only adjust unwind_to when the drop matches what unwind_to is pointing to, rather than asserting they must match.
Fix lifetime issues in rust-analyzer where automaton doesn't live long enough for op.union(). Move op declaration inside each match arm to ensure proper lifetime scope. This fixes compilation errors that are blocking CI, though these are pre-existing issues unrelated to the StorageDead changes.
When processing drops in reverse order, unwind_to might not point to the current drop. Make the unwind_to adjustment conditional on the drop matching, matching the behavior in build_scope_drops. This prevents assertion failures when unwind_to points to a different drop than the one being processed.
- Remove conditional logic and optimization from diverge_cleanup_target - Remove conditional logic from build_exit_tree - Always add StorageDead when there are Value/ForLint drops - Cleanup passes (CleanupPostBorrowck, RemoveNoopLandingPads) handle removal - Fixes reviewer comments about code duplication and fragility
49c1d6a to
e4790cd
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
@dianne Thanks for the first pass review. I addressed most of the comments here. Let me try and get the tests passing and a general structure right. We can then go over any more design and profiling stuff later on. |
This comment has been minimized.
This comment has been minimized.
d8a1764 to
2ed7b45
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
e6b9a18 to
1556a29
Compare
This comment has been minimized.
This comment has been minimized.
Simplify boolean expressions in scope.rs to fix clippy::needless_bool lint failures, and update test expectations for dropck_trait_cycle_checked and ctfe-arg-bad-borrow to match new StorageDead behavior.
1556a29 to
44b8dda
Compare
|
The job Click to see the possible cause of the failure (guessed by this bot) |
This PR fixes a soundness bug where local variables are deallocated out of order during panic unwinding, allowing destructors to access freed memory. This violates Rust's safety guarantees and has caused real-world unsoundness in crates like generatively.
This PR removes the is_generator check and unconditionally emits StorageDead statements during unwinding for ALL functions, bringing non-generator behavior in line with generators. It ensures that during unwinding, when a local variable goes out of scope, its storage is properly marked as dead via StorageDead, allowing the borrow checker to enforce the
invariant that values must outlive their references even in panic paths.
Fixes #147875