nvme_driver: adding admin queue logging #3409
gurasinghMS wants to merge 14 commits into microsoft:main from
Conversation
…le commit to fix build pipeline issues
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds detailed admin-queue command logging in the NVMe driver, and adjusts servicing/save related timeouts and defaults to better align with host behavior (including enabling NVMe keepalive by default).
Changes:
- Log admin-queue command submission/completion details (CID/opcode/nsid/cdw10/cdw11 + elapsed).
- Cap/override host-provided servicing timeout hints to 59s and update related logging.
- Change default NVMe keepalive behavior (enabled by default) and update VMM tests’ OpenHCL cmdlines accordingly.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| vmm_tests/vmm_tests/tests/tests/x86_64/openhcl_uefi.rs | Removes explicit keepalive override from test cmdlines to match new defaults. |
| vmm_tests/vmm_tests/tests/tests/multiarch/openhcl_servicing.rs | Removes explicit keepalive override from servicing tests’ cmdlines. |
| vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs | Adds admin command submission/completion logging and threads device id into pending command tracking. |
| vm/devices/storage/disk_nvme/nvme_driver/src/driver.rs | Adds extra queue PFN logging; refactors worker loop; increases log verbosity in several paths. |
| openhcl/underhill_core/src/nvme_manager/device.rs | Adds shutdown-request logging in NVMe manager worker. |
| openhcl/underhill_core/src/dispatch/mod.rs | Changes timeout-hint handling to default/cap to 59s and updates log message. |
| openhcl/openhcl_boot/src/cmdline.rs | Enables NVMe keepalive by default and adjusts cmdline parsing semantics. |
```diff
  enable_vtl2_gpa_pool: Vtl2GpaPoolConfig::Heuristics(Vtl2GpaPoolLookupTable::Release), // use the release config by default
  sidecar: SidecarOptions::default(),
- disable_nvme_keep_alive: true,
+ disable_nvme_keep_alive: false,
```
With the new default disable_nvme_keep_alive: false, the parser no longer disables keepalive when OPENHCL_DISABLE_NVME_KEEP_ALIVE is present without an explicit =... value (because arg becomes None and the branch is skipped). For backward-compatible boolean-flag behavior, treat a missing value as true (disable keepalive), and only treat an explicit =0 as false.
```rust
if arg.is_some_and(|a| a != "0") {
    self.disable_nvme_keep_alive = true;
}
```
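The backward-compatible behavior the review asks for can be sketched as a small standalone function. This is an illustrative sketch, not the actual cmdline parser: the function name and the `Option<&str>` argument shape are assumptions standing in for the real parsing code.

```rust
// Hypothetical sketch of backward-compatible boolean-flag parsing. `arg` is
// the value after `=` (None when the flag appears bare, with no `=value`).
// A bare flag or any value other than "0" disables keepalive; an explicit
// `=0` keeps the new default (keepalive enabled).
fn parse_disable_nvme_keep_alive(arg: Option<&str>) -> bool {
    match arg {
        None => true,       // flag present without a value: disable keepalive
        Some("0") => false, // explicit opt-out of disabling
        Some(_) => true,    // any other explicit value: disable keepalive
    }
}
```

With this shape, `OPENHCL_DISABLE_NVME_KEEP_ALIVE` alone still disables keepalive as it did under the old default, while `OPENHCL_DISABLE_NVME_KEEP_ALIVE=0` opts back in.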
```diff
  // If the host provided timeout hint is >= 59 seconds,
  // we treat that as a signal from the host that no
  // timeout duration was set. We instead limit servicing to
- // 200s in that case.
- let timeout_hint = if timeout_hint >= Duration::from_secs(u16::MAX as u64) {
+ // 59s in that case.
+ let timeout_hint = if timeout_hint >= Duration::from_secs(59) {
      tracing::info!(
          CVM_ALLOWED,
-         "host provided UINT16_MAX timeout hint, defaulting to 200s"
+         host_timeout_hint_ms = timeout_hint.as_millis() as u64,
+         "host provided timeout hint > 59s, defaulting to 59s"
      );
-     Duration::from_secs(200)
+     Duration::from_secs(59)
```
The comment describes a sentinel-value interpretation (“signal … no timeout duration was set”), but the code is implementing a hard cap for any hint >= 59s. Update the comment/log wording to reflect capping (or, if sentinel semantics are intended, keep the old sentinel check and apply a separate cap). Also, the log message says > 59s but the condition is >=—align the message/condition to avoid confusion.
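The distinction the comment draws can be sketched by separating the two behaviors: treat the `u16::MAX` sentinel as "no hint was set", then cap every hint at 59s. This is a hedged sketch, not the PR's code; the function name and constant names are illustrative.

```rust
use std::time::Duration;

// Illustrative sketch: sentinel detection and capping as two distinct steps.
// The 59s cap and the u16::MAX sentinel mirror the values in the diff.
fn effective_timeout(hint: Duration) -> Duration {
    const CAP: Duration = Duration::from_secs(59);
    let sentinel = Duration::from_secs(u16::MAX as u64);
    if hint >= sentinel {
        // Host signalled "no timeout duration was set": fall back to the cap.
        CAP
    } else {
        // Honor the hint, but never wait longer than the cap.
        hint.min(CAP)
    }
}
```

Keeping the sentinel check explicit preserves the old comment's meaning while still enforcing the new 59s ceiling on ordinary hints like the host's 200s.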
```diff
  tracing::info!(
      CVM_ALLOWED,
-     "host provided UINT16_MAX timeout hint, defaulting to 200s"
+     host_timeout_hint_ms = timeout_hint.as_millis() as u64,
```
Casting as_millis() (u128) to u64 can truncate very large durations and produces a potentially misleading telemetry field. Prefer logging the u128 directly (or log seconds as u64 via as_secs()), so the value is always accurate.
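The truncation the review warns about is easy to demonstrate with `std::time::Duration` alone: `as_millis()` returns `u128`, and an `as u64` cast silently keeps only the low 64 bits for very large durations. The function names below are illustrative, not from the PR.

```rust
use std::time::Duration;

// Lossy variant: mirrors the cast in the diff; wraps above u64::MAX ms.
fn millis_lossy(d: Duration) -> u64 {
    d.as_millis() as u64
}

// Accurate variant: keep the u128 (or log seconds via as_secs() as u64).
fn millis_exact(d: Duration) -> u128 {
    d.as_millis()
}
```

For any timeout hint a host would realistically send the two agree, but logging the `u128` (or seconds) removes the edge case entirely.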
```rust
submitted_at: (self.qid == 0).then(|| {
    tracing::info!(
        pci_id = %self.device_id,
        ?cid,
        opcode = ?command.cdw0.opcode(),
        opname = ?spec::AdminOpcode(command.cdw0.opcode()),
        nsid = ?command.nsid,
        cdw10 = command.cdw10,
        cdw11 = command.cdw11,
        "inserted admin command",
    );
    Instant::now()
}),
```
The completion log records elapsed as a Duration debug value. The PR description/examples refer to elapsed_ms=..., which is both a different field name and implies a specific unit. Consider standardizing the field name/unit (e.g., log elapsed_ms as an integer/f64 milliseconds, or rename the example/docs to match elapsed as a structured Duration). This makes log consumption/queries more consistent.
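Two unit-explicit alternatives for the completion-log field can be sketched as plain conversions; the field/function names are illustrative, not from the PR.

```rust
use std::time::Duration;

// Integer milliseconds: unambiguous unit, truncates sub-millisecond remainder.
fn elapsed_ms_int(elapsed: Duration) -> u128 {
    elapsed.as_millis()
}

// Fractional milliseconds: preserves sub-millisecond precision for fast
// admin commands, at the cost of a float field in the log pipeline.
fn elapsed_ms_f64(elapsed: Duration) -> f64 {
    elapsed.as_secs_f64() * 1_000.0
}
```

Either way, naming the field `elapsed_ms` and emitting a number keeps log queries consistent with the examples in the PR description.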
```diff
-         None => break,
-     }
+ loop {
+     let cmd = stop.until_stopped(self.recv.next()).await?;
```
Using await? on until_stopped(...) means the task returns early on stop/cancel without emitting an exit log (the "exiting cleanly" log only happens when the channel closes). If exit logging is important for operational debugging, handle the Cancelled path explicitly (log "exiting due to stop") before returning the cancellation error.
Suggested change:
```diff
- let cmd = stop.until_stopped(self.recv.next()).await?;
+ let cmd = match stop.until_stopped(self.recv.next()).await {
+     Ok(cmd) => cmd,
+     Err(err) => {
+         tracing::info!(
+             pci_id = %self.device.id(),
+             "nvme worker task exiting due to stop"
+         );
+         return Err(err);
+     }
+ };
```
```rust
tracing::info!(pci_id = %self.device.id(), "nvme worker task exiting cleanly");
Ok(())
```
```diff
  impl<D: DeviceBacking> Drop for NvmeDriver<D> {
      fn drop(&mut self) {
-         tracing::trace!(pci_id = ?self.device_id, ka = self.nvme_keepalive, task = self.task.is_some(), "dropping nvme driver");
+         tracing::info!(pci_id = ?self.device_id, ka = self.nvme_keepalive, task = self.task.is_some(), "dropping nvme driver");
```
Several paths that can occur frequently (driver drop lifecycle across tests, IO issuer requests, IO queue creation) were bumped from trace/debug to info. This can materially increase log volume in normal runs and make the new admin-queue logs harder to spot. Consider keeping high-frequency lifecycle events at debug/trace and reserving info for user-actionable state transitions/errors.
Suggested change:
```diff
- tracing::info!(pci_id = ?self.device_id, ka = self.nvme_keepalive, task = self.task.is_some(), "dropping nvme driver");
+ tracing::debug!(pci_id = ?self.device_id, ka = self.nvme_keepalive, task = self.task.is_some(), "dropping nvme driver");
```
```diff
  async fn create_io_issuer(&mut self, state: &mut WorkerState, cpu: u32) {
-     tracing::debug!(cpu, pci_id = ?self.device.id(), "issuer request");
+     tracing::info!(cpu, pci_id = ?self.device.id(), "create io issuer request");
```
Several paths that can occur frequently (driver drop lifecycle across tests, IO issuer requests, IO queue creation) were bumped from trace/debug to info. This can materially increase log volume in normal runs and make the new admin-queue logs harder to spot. Consider keeping high-frequency lifecycle events at debug/trace and reserving info for user-actionable state transitions/errors.
Suggested change:
```diff
- tracing::info!(cpu, pci_id = ?self.device.id(), "create io issuer request");
+ tracing::debug!(cpu, pci_id = ?self.device.id(), "create io issuer request");
```
```rust
self.next_ioq_id += 1;
```
```diff
- tracing::debug!(cpu, qid, iv, pci_id = ?self.device.id(), "creating io queue");
+ tracing::info!(cpu, qid, iv, pci_id = ?self.device.id(), "create_nvme_io_queue_fn: creating io queue");
```
Several paths that can occur frequently (driver drop lifecycle across tests, IO issuer requests, IO queue creation) were bumped from trace/debug to info. This can materially increase log volume in normal runs and make the new admin-queue logs harder to spot. Consider keeping high-frequency lifecycle events at debug/trace and reserving info for user-actionable state transitions/errors.
Suggested change:
```diff
- tracing::info!(cpu, qid, iv, pci_id = ?self.device.id(), "create_nvme_io_queue_fn: creating io queue");
+ tracing::debug!(cpu, qid, iv, pci_id = ?self.device.id(), "create_nvme_io_queue_fn: creating io queue");
```
…over, it will tear down that device state and start again ... at least it should
… figure out keepalive info
Adds logging for commands issued on the admin queue, recording the CID, opcode, and related fields. The number of admin commands processed is (or should be) quite small, so logging these details should not meaningfully impact perf. In a vmm_test run with 2 CPUs (so 2 IO queues created), we observe 12 admin commands being issued.
Output looks like:
NOTE: This PR also decreases the default save timeout duration to 59s. Currently the host provides us a hint of 200s but then terminates VTL2 at 60s. 59s is an acceptable wait for save, given that the expected servicing duration is 1s. This will allow us to get crash dumps when servicing is hung.
Clone of #3246 with the commit history squashed