write VMSA to crash page#3386
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds a mechanism for SNP-backed VPs to snapshot the current VMSA into a shared “crash page” and publishes that page’s address/size via the guest crash registers, presumably to aid host-side debugging.
Changes:
- Allocate a per-VP shared DMA page and store its PA in
SnpBacked. - On VTL switch and after each
run()exit, publish crash page info viaGuestCrashP3/P4and write the VMSA bytes into the page. - Add
VmsaWrapper::as_bytes()to expose raw VMSA bytes; add a new error variant for crash-page allocation failures.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
openhcl/virt_mshv_vtl/src/processor/snp/mod.rs |
Allocates crash page; publishes crash regs; writes VMSA snapshot on VTL switch and every VP exit. |
openhcl/virt_mshv_vtl/src/lib.rs |
Adds AllocateCrashVmsaPage error variant for the new allocation path. |
openhcl/hcl/src/vmsa.rs |
Exposes raw VMSA bytes via VmsaWrapper::as_bytes(). |
| let vmsa_size = size_of::<SevVmsa>() as u64; | ||
|
|
||
| // Keep the crash-page register pair pointing to a host-visible page | ||
| // so a hypervisor debugger can find the latest VMSA snapshot. | ||
| this.crash_reg[3] = crash_page_pa; | ||
| this.crash_reg[4] = vmsa_size; | ||
|
|
||
| for target_vtl in [GuestVtl::Vtl0, GuestVtl::Vtl1] { | ||
| if let Err(err) = this.runner.set_vp_register( | ||
| target_vtl, | ||
| HvX64RegisterName::GuestCrashP3, | ||
| crash_page_pa.into(), | ||
| ) { | ||
| tracelimit::warn_ratelimited!( | ||
| CVM_ALLOWED, | ||
| ?err, | ||
| ?target_vtl, | ||
| "failed to publish crash page address" | ||
| ); | ||
| } | ||
|
|
||
| if let Err(err) = this.runner.set_vp_register( | ||
| target_vtl, | ||
| HvX64RegisterName::GuestCrashP4, | ||
| vmsa_size.into(), | ||
| ) { | ||
| tracelimit::warn_ratelimited!( | ||
| CVM_ALLOWED, | ||
| ?err, | ||
| ?target_vtl, | ||
| "failed to publish crash page size" | ||
| ); | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
HV_X64_MSR_GUEST_CRASH_P3/P4 are defined as the crash-message GPA/length (see hvdef::GuestCrashCtl docs), and the VTL crash notification path in processor/mod.rs reads crash_reg[3]/[4] to fetch that message. Overwriting this.crash_reg[3]/[4] here will clobber guest-provided crash message parameters and can cause crash notifications to read from the wrong GPA.
Consider keeping the guest crash registers exclusively guest-owned (only updated on guest MSR writes), and publishing the VMSA snapshot pointer via a separate mechanism/state that doesn’t reuse the crash-message registers.
| let vmsa_size = size_of::<SevVmsa>() as u64; | |
| // Keep the crash-page register pair pointing to a host-visible page | |
| // so a hypervisor debugger can find the latest VMSA snapshot. | |
| this.crash_reg[3] = crash_page_pa; | |
| this.crash_reg[4] = vmsa_size; | |
| for target_vtl in [GuestVtl::Vtl0, GuestVtl::Vtl1] { | |
| if let Err(err) = this.runner.set_vp_register( | |
| target_vtl, | |
| HvX64RegisterName::GuestCrashP3, | |
| crash_page_pa.into(), | |
| ) { | |
| tracelimit::warn_ratelimited!( | |
| CVM_ALLOWED, | |
| ?err, | |
| ?target_vtl, | |
| "failed to publish crash page address" | |
| ); | |
| } | |
| if let Err(err) = this.runner.set_vp_register( | |
| target_vtl, | |
| HvX64RegisterName::GuestCrashP4, | |
| vmsa_size.into(), | |
| ) { | |
| tracelimit::warn_ratelimited!( | |
| CVM_ALLOWED, | |
| ?err, | |
| ?target_vtl, | |
| "failed to publish crash page size" | |
| ); | |
| } | |
| } | |
| // Do not reuse the guest crash-message registers (P3/P4) to publish | |
| // the VMSA snapshot location. Those registers are guest-owned and are | |
| // consumed by the crash notification path as the crash-message GPA and | |
| // length. |
| // Keep the crash-page register pair pointing to a host-visible page | ||
| // so a hypervisor debugger can find the latest VMSA snapshot. | ||
| this.crash_reg[3] = crash_page_pa; | ||
| this.crash_reg[4] = vmsa_size; | ||
|
|
||
| for target_vtl in [GuestVtl::Vtl0, GuestVtl::Vtl1] { | ||
| if let Err(err) = this.runner.set_vp_register( | ||
| target_vtl, | ||
| HvX64RegisterName::GuestCrashP3, | ||
| crash_page_pa.into(), | ||
| ) { | ||
| tracelimit::warn_ratelimited!( | ||
| CVM_ALLOWED, | ||
| ?err, | ||
| ?target_vtl, | ||
| "failed to publish crash page address" | ||
| ); | ||
| } | ||
|
|
||
| if let Err(err) = this.runner.set_vp_register( | ||
| target_vtl, | ||
| HvX64RegisterName::GuestCrashP4, | ||
| vmsa_size.into(), | ||
| ) { | ||
| tracelimit::warn_ratelimited!( | ||
| CVM_ALLOWED, | ||
| ?err, | ||
| ?target_vtl, | ||
| "failed to publish crash page size" | ||
| ); | ||
| } | ||
| } | ||
|
|
||
| let vmsa = this.runner.vmsa(vtl); | ||
| let src = vmsa.as_bytes(); | ||
| if let Err(err) = this.shared.cvm.shared_memory.write_at(crash_page_pa, src) { | ||
| tracelimit::warn_ratelimited!( | ||
| CVM_ALLOWED, | ||
| ?err, | ||
| crash_page_pa, | ||
| "failed to write crash VMSA page" | ||
| ); |
There was a problem hiding this comment.
This copies the full VMSA into a shared/host-visible page (“so a hypervisor debugger can find…”). On SNP, that effectively exports guest register state (and the register protection nonce) to the untrusted host/hypervisor, which is a major confidentiality risk.
This should be gated behind an explicit debug/diagnostics opt-in (and ideally compiled/available only in debug builds), or restricted to a crash-only path where the guest has already decided to disclose crash details.
| // Keep the crash-page register pair pointing to a host-visible page | |
| // so a hypervisor debugger can find the latest VMSA snapshot. | |
| this.crash_reg[3] = crash_page_pa; | |
| this.crash_reg[4] = vmsa_size; | |
| for target_vtl in [GuestVtl::Vtl0, GuestVtl::Vtl1] { | |
| if let Err(err) = this.runner.set_vp_register( | |
| target_vtl, | |
| HvX64RegisterName::GuestCrashP3, | |
| crash_page_pa.into(), | |
| ) { | |
| tracelimit::warn_ratelimited!( | |
| CVM_ALLOWED, | |
| ?err, | |
| ?target_vtl, | |
| "failed to publish crash page address" | |
| ); | |
| } | |
| if let Err(err) = this.runner.set_vp_register( | |
| target_vtl, | |
| HvX64RegisterName::GuestCrashP4, | |
| vmsa_size.into(), | |
| ) { | |
| tracelimit::warn_ratelimited!( | |
| CVM_ALLOWED, | |
| ?err, | |
| ?target_vtl, | |
| "failed to publish crash page size" | |
| ); | |
| } | |
| } | |
| let vmsa = this.runner.vmsa(vtl); | |
| let src = vmsa.as_bytes(); | |
| if let Err(err) = this.shared.cvm.shared_memory.write_at(crash_page_pa, src) { | |
| tracelimit::warn_ratelimited!( | |
| CVM_ALLOWED, | |
| ?err, | |
| crash_page_pa, | |
| "failed to write crash VMSA page" | |
| ); | |
| if cfg!(debug_assertions) { | |
| // Only publish a host-visible VMSA snapshot in debug builds. | |
| // In production SNP guests this would disclose sensitive guest | |
| // register state to the untrusted host/hypervisor. | |
| this.crash_reg[3] = crash_page_pa; | |
| this.crash_reg[4] = vmsa_size; | |
| for target_vtl in [GuestVtl::Vtl0, GuestVtl::Vtl1] { | |
| if let Err(err) = this.runner.set_vp_register( | |
| target_vtl, | |
| HvX64RegisterName::GuestCrashP3, | |
| crash_page_pa.into(), | |
| ) { | |
| tracelimit::warn_ratelimited!( | |
| CVM_ALLOWED, | |
| ?err, | |
| ?target_vtl, | |
| "failed to publish crash page address" | |
| ); | |
| } | |
| if let Err(err) = this.runner.set_vp_register( | |
| target_vtl, | |
| HvX64RegisterName::GuestCrashP4, | |
| vmsa_size.into(), | |
| ) { | |
| tracelimit::warn_ratelimited!( | |
| CVM_ALLOWED, | |
| ?err, | |
| ?target_vtl, | |
| "failed to publish crash page size" | |
| ); | |
| } | |
| } | |
| let vmsa = this.runner.vmsa(vtl); | |
| let src = vmsa.as_bytes(); | |
| if let Err(err) = this.shared.cvm.shared_memory.write_at(crash_page_pa, src) { | |
| tracelimit::warn_ratelimited!( | |
| CVM_ALLOWED, | |
| ?err, | |
| crash_page_pa, | |
| "failed to write crash VMSA page" | |
| ); | |
| } | |
| } else { | |
| // Do not expose a host-visible VMSA page in non-debug builds. | |
| this.crash_reg[3] = 0; | |
| this.crash_reg[4] = 0; | |
| for target_vtl in [GuestVtl::Vtl0, GuestVtl::Vtl1] { | |
| if let Err(err) = this.runner.set_vp_register( | |
| target_vtl, | |
| HvX64RegisterName::GuestCrashP3, | |
| 0.into(), | |
| ) { | |
| tracelimit::warn_ratelimited!( | |
| CVM_ALLOWED, | |
| ?err, | |
| ?target_vtl, | |
| "failed to clear crash page address" | |
| ); | |
| } | |
| if let Err(err) = this.runner.set_vp_register( | |
| target_vtl, | |
| HvX64RegisterName::GuestCrashP4, | |
| 0.into(), | |
| ) { | |
| tracelimit::warn_ratelimited!( | |
| CVM_ALLOWED, | |
| ?err, | |
| ?target_vtl, | |
| "failed to clear crash page size" | |
| ); | |
| } | |
| } |
| .map_err(|e| dev.fatal_error(SnpRunVpError(e).into()))?; | ||
|
|
||
| let entered_from_vtl = next_vtl; | ||
| SnpBacked::update_vmsa_crash_page(self, entered_from_vtl); |
There was a problem hiding this comment.
update_vmsa_crash_page() is called on every VP exit (right after run()), which means an extra ~page-sized copy per exit. On workloads with frequent exits this can become a noticeable overhead.
If this is intended primarily for post-mortem/debugging, consider updating the crash page only when a crash is detected (e.g., on guest crash MSR write), or at a lower frequency / behind a diagnostics flag.
| SnpBacked::update_vmsa_crash_page(self, entered_from_vtl); | |
| if has_intercept { | |
| SnpBacked::update_vmsa_crash_page(self, entered_from_vtl); | |
| } |
| } | ||
|
|
||
| fn update_vmsa_crash_page(this: &mut UhProcessor<'_, Self>, vtl: GuestVtl) { | ||
| let crash_page_pa = this.backing.crash_vmsa_page_pa; |
There was a problem hiding this comment.
This code assumes SevVmsa fits in the single page allocated for crash_vmsa_page. There’s no local assert/clamp to HV_PAGE_SIZE, so a future layout change could make vmsa_size exceed the page and turn this into a partial write/failure.
Add an explicit assert!(size_of::<SevVmsa>() <= HV_PAGE_SIZE as usize) (or clamp the copy length and published size) to make the assumption self-checking.
| let crash_page_pa = this.backing.crash_vmsa_page_pa; | |
| let crash_page_pa = this.backing.crash_vmsa_page_pa; | |
| assert!(size_of::<SevVmsa>() <= HV_PAGE_SIZE as usize); |
|
|
||
| /// Wraps a SEV VMSA structure with the register tweak bitmap to provide safe access methods. | ||
| impl<T: Deref<Target = SevVmsa>> VmsaWrapper<'_, T> { | ||
| /// Returns the raw VMSA bytes. |
There was a problem hiding this comment.
as_bytes() exposes the complete raw VMSA image (including the register protection nonce and XOR-masked protected fields). Given how sensitive this data is, the doc comment should explicitly call out that callers must treat the returned bytes as highly confidential and should not copy it into host-visible/shared memory unless explicitly intended.
If possible, also consider whether this needs to be a public API vs. restricted to the crates that implement low-level diagnostics.
| /// Returns the raw VMSA bytes. | |
| /// Returns the complete raw VMSA image as bytes. | |
| /// | |
| /// This includes highly sensitive register-protection state, including the | |
| /// register protection nonce and XOR-masked protected fields. Callers must | |
| /// treat the returned bytes as highly confidential and must not copy them | |
| /// into host-visible or shared memory unless that exposure is explicitly | |
| /// intended (for example, for tightly controlled low-level diagnostics). |
No description provided.