Skip to content

nvme_driver: only support keepalive for a certain class of devices#3415

Draft
gurasinghMS wants to merge 18 commits intomicrosoft:mainfrom
gurasinghMS:no-keepalive-for-nd2-devices
Draft

nvme_driver: only support keepalive for a certain class of devices#3415
gurasinghMS wants to merge 18 commits intomicrosoft:mainfrom
gurasinghMS:no-keepalive-for-nd2-devices

Conversation

@gurasinghMS
Copy link
Copy Markdown
Contributor

No description provided.

Copilot AI review requested due to automatic review settings May 1, 2026 21:48
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR narrows NVMe “keepalive” (servicing without device reset) behavior in Underhill to only a specific class of devices, as a mitigation for known incompatibilities with certain devices

Changes:

  • Introduces a helper to decide whether a device is “keepalive-compatible” based on its identifier.
  • During servicing shutdown, only enables keepalive (do-not-reset / skip shutdown) for compatible devices; logs when keepalive is requested but disabled per-device.
  • During servicing save, skips persisting NVMe state for incompatible devices.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
openhcl/underhill_core/src/nvme_manager/mod.rs Adds a helper function to gate keepalive compatibility based on an identifier heuristic.
openhcl/underhill_core/src/nvme_manager/manager.rs Applies the compatibility gate to shutdown options and to which devices participate in save/restore.

Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs Outdated
Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs Outdated
Comment thread openhcl/underhill_core/src/nvme_manager/manager.rs Outdated
Comment thread openhcl/underhill_core/src/nvme_manager/manager.rs
Comment on lines +305 to +320
// nvme_keepalive is received from host but it is only valid
// when memory pool allocator supports save/restore. Further,
// as a partial mitigation for known incompatibilities between
// keepalive and NVMe Direct v2 devices, we only honor
// keepalive for ASAP devices (identified by the VPCI
// instance GUID containing `c05b`).
let host_requested_keepalive =
nvme_keepalive && self.context.save_restore_supported;
let device_keepalive =
host_requested_keepalive && is_nvme_keepalive_compatible(&pci_id);
if host_requested_keepalive && !device_keepalive {
tracing::info!(
%pci_id,
"disabling nvme keepalive for non-ASAP device; \
falling back to reset-on-servicing"
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good for a proof of concept + evaluating this fix. As was discussed offline: you'll want to use device identity data to make this analysis more robust.

@mattkur
Copy link
Copy Markdown
Contributor

mattkur commented May 1, 2026

Cool. Thanks for giving this a try. Please make sure to write a CI test that validates this works when there's two devices: device (a) that supports keepalive and device (b) that does not. Hopefully that will be rather quick, and will catch basic problems in this path.

Copilot AI review requested due to automatic review settings May 1, 2026 23:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

Comment thread openhcl/underhill_core/src/nvme_manager/manager.rs
Comment on lines +309 to +314
do_not_reset: nvme_keepalive
&& self.context.save_restore_supported
&& is_nvme_keepalive_compatible(&pci_id),
skip_device_shutdown: nvme_keepalive
&& self.context.save_restore_supported
&& is_nvme_keepalive_compatible(&pci_id),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow. I think the code's logic is correct.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting, makes me wanna think twice about it because when my local copilot first wrote this code it also decided to do !is_nvme_keepalive_compatible(&pci_id) here and I had corrected it.

Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs Outdated
Comment thread openhcl/underhill_core/src/nvme_manager/manager.rs Outdated
Comment on lines +433 to +444
.filter_map(|(pci_id, driver)| {
if is_nvme_keepalive_compatible(pci_id) {
Some((pci_id.clone(), driver.client().clone()))
} else {
tracing::info!(
%pci_id,
"skipping save of nvme device; \
keepalive disabled for this device"
);
None
}
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense. I don't know how long the logging prolongs the time inside the lock (which lock? I don't see locks in this function), so don't know if we should fix this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what lock it is talking about either. But unless there are hundreds of devices, which there shouldn't be, I think this is fine. @alandau you ok moving forward with this?

Copilot AI review requested due to automatic review settings May 4, 2026 19:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs Outdated
Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs Outdated
Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs Outdated
Comment thread vmm_tests/vmm_tests/tests/tests/multiarch/openhcl_servicing.rs
Copy link
Copy Markdown
Contributor

@alandau alandau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a few nits inline.
Do we want to allow keepalive only on c05b, or find out the pciid for the problematic hardware and disable keepalive only for it?

Comment on lines +309 to +314
do_not_reset: nvme_keepalive
&& self.context.save_restore_supported
&& is_nvme_keepalive_compatible(&pci_id),
skip_device_shutdown: nvme_keepalive
&& self.context.save_restore_supported
&& is_nvme_keepalive_compatible(&pci_id),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow. I think the code's logic is correct.

Comment thread openhcl/underhill_core/src/nvme_manager/manager.rs
Comment thread openhcl/underhill_core/src/nvme_manager/manager.rs Outdated
Comment on lines +433 to +444
.filter_map(|(pci_id, driver)| {
if is_nvme_keepalive_compatible(pci_id) {
Some((pci_id.clone(), driver.client().clone()))
} else {
tracing::info!(
%pci_id,
"skipping save of nvme device; \
keepalive disabled for this device"
);
None
}
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense. I don't know how long the logging prolongs the time inside the lock (which lock? I don't see locks in this function), so don't know if we should fix this.

Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs Outdated
Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs Outdated
Comment thread vmm_tests/vmm_tests/tests/tests/multiarch/openhcl_servicing.rs Outdated
Copilot AI review requested due to automatic review settings May 4, 2026 19:32
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Comment thread openhcl/underhill_core/src/nvme_manager/manager.rs
Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs Outdated
Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs Outdated
Comment thread vmm_tests/vmm_tests/tests/tests/multiarch/openhcl_servicing.rs
…over, it will tear down that device state and start again ... at least it should
Copilot AI review requested due to automatic review settings May 4, 2026 20:30
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

openhcl/underhill_core/src/nvme_manager/manager.rs:1

  • The restore path still iterates every saved NVMe entry and always passes Some(&disk.driver_state), even when the PCI ID is not keepalive-compatible. That means a servicing upgrade from an older build that saved all NVMe controllers will still attempt to restore state for non-c05b devices instead of forcing them through the reset path. The save-side filter added in this PR only protects states produced by the new code, not states produced by the previous version.
// Copyright (c) Microsoft Corporation.

Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs
Comment on lines +1331 to +1340
vm.restart_openhcl(igvm_file.clone(), flags).await?;

agent.ping().await?;

// The non-c05b device must issue CREATE_IO_COMPLETION_QUEUE after
// servicing because its keepalive was downgraded to reset.
CancelContext::new()
.with_timeout(Duration::from_secs(60))
.until_cancelled(no_keepalive_create_seen_recv)
.await
source: NvmeSpawnerError,
}

/// Device prefix for nvme devices that are compatibble with keepalive.
Comment on lines +119 to +122
/// Returns whether the given PCI ID corresponds to an NVMe device that is
/// compatible with the keepalive.
/// DEV_NOTE: This is a weak heuristic based approach which is not ideal but is necessary
pub(crate) fn is_nvme_keepalive_compatible(pci_id: &str) -> bool {
@alandau
Copy link
Copy Markdown
Contributor

alandau commented May 4, 2026

LGTM


// Read the PCI vendor/device IDs once and cache the resulting
// keepalive compatibility flag on the `NvmeDriverManager`.
let keepalive_compatible = is_nvme_keepalive_compatible(&pci_id);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alandau do you know if the device would be resolved at this point? Like if the device has not arrived back and reading the config space leads to a stall, it could impact boot times (which is no better than not doing keepalive at all)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't. But I remember at some point we ask VfioDevice to restore and it has to go to /sys to do its magic. You can try seeing if it's doing it past this point or not.

Copilot AI review requested due to automatic review settings May 5, 2026 00:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Comment on lines +364 to +370
let keepalive_compatible = is_nvme_keepalive_compatible(&pci_id);

let driver = NvmeDriverManager::new(
&context.driver_source,
&pci_id,
context.vp_count,
context.save_restore_supported,
context.save_restore_supported && keepalive_compatible,
Comment on lines +1190 to +1194
#[openvmm_test(openhcl_linux_direct_x64 [LATEST_LINUX_DIRECT_TEST_X64])]
async fn servicing_keepalive_per_device_gate(
config: PetriVmBuilder<OpenVmmPetriBackend>,
(igvm_file,): (ResolvedArtifact<impl petri_artifacts_common::tags::IsOpenhclIgvm>,),
) -> Result<(), anyhow::Error> {
Comment on lines +1245 to +1248
.with_hardware_config_fault(
HardwareConfigFaultConfig::new()
.with_vendor_id(0x1414)
.with_device_id(0xb111),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow. Do you?

Comment on lines +143 to +149
let hardware_config_fault = fault_configuration.hardware_config_fault.take();
let vendor_id = hardware_config_fault
.and_then(|f| f.vendor_id)
.unwrap_or(VENDOR_ID);
let device_id = hardware_config_fault
.and_then(|f| f.device_id)
.unwrap_or(0x00a9);
@alandau
Copy link
Copy Markdown
Contributor

alandau commented May 5, 2026

LGTM, a few small ones inline

Copy link
Copy Markdown
Contributor

@alandau alandau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, apparently commenting on the discussion tab doesn't submit pending line comments.

Comment thread openhcl/underhill_core/src/nvme_manager/mod.rs Outdated

// Read the PCI vendor/device IDs once and cache the resulting
// keepalive compatibility flag on the `NvmeDriverManager`.
let keepalive_compatible = is_nvme_keepalive_compatible(&pci_id);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't. But I remember at some point we ask VfioDevice to restore and it has to go to /sys to do its magic. You can try seeing if it's doing it past this point or not.

Comment on lines +1245 to +1248
.with_hardware_config_fault(
HardwareConfigFaultConfig::new()
.with_vendor_id(0x1414)
.with_device_id(0xb111),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow. Do you?

Copilot AI review requested due to automatic review settings May 7, 2026 00:38
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

use nvme_resources::fault::AdminQueueFaultBehavior;
use nvme_resources::fault::AdminQueueFaultConfig;
use nvme_resources::fault::FaultConfiguration;
use nvme_resources::fault::HardwareConfigFaultConfig;
Comment on lines 361 to +365
match guard.entry(pci_id.clone()) {
hash_map::Entry::Occupied(_) => unreachable!(), // We checked above that this entry does not exist.
hash_map::Entry::Vacant(entry) => {
let keepalive_compatible = is_nvme_keepalive_compatible(&pci_id);

Comment on lines 432 to 436
let mut devices_to_save: HashMap<String, NvmeDriverManagerClient> = self
.context
.devices
.write()
.iter()
This reverts commit 4efda83.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants