openvmm_entry: add VM attach commands #3403
Conversation
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a mesh-backed attach socket and new top-level CLI commands (`run`, `attach`, `inspect`) so that later clients can connect to and manage an already-running OpenVMM instance without taking ownership of VM lifetime.
Changes:
- Introduces an attach protocol (`vm_connect`) and attach listener lifecycle in `VmController` (`--mesh-listen`).
- Refactors VM ownership into `VmController` (`CurrentVm`, server-mode `CreateVm`/`TeardownVm`) to share attach resources across CLI and ttrpc/grpc modes.
- Updates CLI parsing to support subcommands while preserving implicit `run`, and documents the new command shape in the Guide.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| openvmm/openvmm_entry/src/vm_controller.rs | Centralizes VM ownership, adds attach/connect and server-mode VM create/teardown RPCs, and manages attach listener lifecycle. |
| openvmm/openvmm_entry/src/vm_connect.rs | Adds the mesh-based attach request/response protocol and a background listener task. |
| openvmm/openvmm_entry/src/ttrpc/mod.rs | Switches ttrpc/grpc service to create/teardown VMs via VmController RPCs and wires controller channels into the worker. |
| openvmm/openvmm_entry/src/repl.rs | Adds “detach” vs “quit VM” behavior for REPL exits to support attach mode. |
| openvmm/openvmm_entry/src/meshworker.rs | Exposes process_mesh() for features requiring a process mesh (external listeners). |
| openvmm/openvmm_entry/src/lib.rs | Adds attach/inspect command handling, connect-to-VM helper, and refactors server-mode startup around VmController. |
| openvmm/openvmm_entry/src/cli_args.rs | Introduces top-level subcommands and legacy parsing fallback, adds --mesh-listen, and extends parsing tests. |
| Guide/src/reference/openvmm/management/cli.md | Documents new subcommand layout and --mesh-listen behavior/security expectations. |
```rust
/// clients.
///
/// Returns a handle that must be kept alive. Call [`AttachListener::shutdown`]
/// to stop accepting, shut down the mesh, join the listener task, and remove
```
The doc comment claims `AttachListener::shutdown` will "shut down the mesh", but the implementation only signals the task to stop and then removes the socket file; it never calls `mesh.shutdown()` (and it doesn't even have access to a mesh handle). Please update the comment to reflect what actually happens (stop accepting + join task + remove socket).
```diff
-/// to stop accepting, shut down the mesh, join the listener task, and remove
+/// to stop accepting new connections, join the listener task, and remove
```
```rust
pub struct VmController {
    pub(crate) driver: DefaultDriver,
    pub(crate) mesh: VmmMesh,
    pub(crate) vm_controller: mesh::Sender<VmControllerRpc>,
    pub(crate) current_vm: Option<CurrentVm>,
    pub(crate) attach_path: Option<PathBuf>,
    pub(crate) attach_listener: Option<vm_connect::AttachListener>,
    pub(crate) exit_on_vm_stop: bool,
```
`VmController` now holds a `vm_controller: mesh::Sender<VmControllerRpc>` to its own RPC channel. Because at least one sender is always kept alive by the controller itself, `rpc_recv.next().await` will never return `None`, so `Event::RpcClosed` can't trigger when the REPL/stdin thread exits without sending `Quit`. In `run_control_inner` the main task awaits `controller_task`, so this can hang indefinitely with a still-running VM.

Consider removing the self-referential sender from `VmController` (or ensuring it is dropped once the attach listener is set up), so that dropping the last external client sender can still close the channel and allow the controller to stop/exit as before.
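A minimal sketch of the hazard and one possible fix, using tokio's mpsc in place of the mesh channel (assumed analogous here: the receiver only completes once every sender is dropped):

```rust
use tokio::sync::mpsc;

struct Controller {
    // Keeping our own sender alive means `rx.recv()` below can never
    // yield None, no matter how many external clients go away.
    self_tx: Option<mpsc::Sender<u32>>,
    rx: mpsc::Receiver<u32>,
}

impl Controller {
    async fn run(mut self) {
        // Drop the self-referential sender once setup (e.g. wiring up the
        // attach listener) is done, so the channel can still close when
        // the last external client sender is dropped.
        let _ = self.self_tx.take();
        while let Some(msg) = self.rx.recv().await {
            println!("rpc: {msg}");
        }
        println!("all senders dropped; controller exits");
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(8);
    let controller = Controller {
        self_tx: Some(tx.clone()),
        rx,
    };
    drop(tx); // the last "external" client goes away
    controller.run().await; // terminates only because run() dropped self_tx
}
```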
```rust
async fn teardown_current_vm_after_worker_stop(&mut self) {
    if let Some(mut current_vm) = self.current_vm.take() {
        if let Some(mut vnc) = current_vm.vnc_worker.take() {
            vnc.stop();
            let _ = vnc.join().await;
        }
        if let Some(mut gdb) = current_vm.gdb_worker.take() {
            gdb.stop();
            let _ = gdb.join().await;
        }
    }
```
`teardown_current_vm_after_worker_stop()` drops `current_vm.vm_worker` without ever calling `join()`. `WorkerEvent::Stopped`/`Failed` only indicates an event was observed; `WorkerHandle::join()` is the only place that actually waits for the worker event stream to terminate and propagates failure. Please join the VM worker here (or reuse `stop_current_vm()` with a no-op stop) so the worker is fully reaped and errors aren't silently lost.
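One possible shape for the fix, reusing only the `WorkerHandle` calls already visible in `stop_current_vm()` (a sketch, not the PR's actual code; the worker has already stopped, so `stop()` is skipped but `join()` still reaps it and surfaces failures):

```rust
async fn teardown_current_vm_after_worker_stop(&mut self) {
    if let Some(mut current_vm) = self.current_vm.take() {
        // The worker already observed Stopped/Failed, but join() is still
        // needed to drain the event stream and propagate any error.
        if let Err(err) = current_vm.vm_worker.join().await {
            tracing::error!(
                error = err.as_ref() as &dyn std::error::Error,
                "vm worker join failed"
            );
        }
        if let Some(mut vnc) = current_vm.vnc_worker.take() {
            vnc.stop();
            let _ = vnc.join().await;
        }
        if let Some(mut gdb) = current_vm.gdb_worker.take() {
            gdb.stop();
            let _ = gdb.join().await;
        }
    }
}
```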
```rust
    transport,
})
.await?;
// Signal the the parent process that the server is ready.
```
The comment has a duplicated word: "Signal the the parent process". Please fix to "Signal the parent process" to avoid confusion in this startup/readiness path.
```diff
-// Signal the the parent process that the server is ready.
+// Signal the parent process that the server is ready.
```
```rust
/// Start a mesh listener that hands out VM control channels to connecting
/// clients.
///
/// Returns a handle that must be kept alive. Call [`AttachListener::shutdown`]
/// to stop accepting, shut down the mesh, join the listener task, and remove
/// the socket file.
pub async fn start_attach_listener(
    mesh: &mesh_process::Mesh,
    driver: &impl Spawn,
    path: &Path,
    vm_controller: mesh::Sender<VmControllerRpc>,
) -> anyhow::Result<AttachListener> {
```
The doc comment for `start_attach_listener` says `AttachListener::shutdown` will "shut down the mesh", but the implementation only stops the listener task and removes the socket file. Please update the comment to match the actual behavior (or extend `shutdown` to also shut down the mesh if that is the intent).
```rust
impl VmConnectResponse {
    pub fn into_repl_resources(self) -> repl::ReplResources {
        let (send, recv) = mesh::channel();
        drop(send);
        repl::ReplResources {
            vm_rpc: self.vm_rpc,
            vm_controller: self.vm_controller,
            vm_controller_events: recv,
            scsi_rpc: self.scsi_rpc,
            nvme_vtl2_rpc: self.nvme_vtl2_rpc,
            shutdown_ic: self.shutdown_ic,
            kvp_ic: self.kvp_ic,
            console_in: None,
            has_vtl2: self.has_vtl2,
            quit_behavior: repl::ReplQuitBehavior::Detach,
        }
    }
```
`VmConnectResponse::into_repl_resources` constructs `vm_controller_events` as a receiver whose sender is immediately dropped. This means an attached REPL will never receive `VmControllerEvent::WorkerStopped`/`GuestHalt` events, so `run_repl` can't automatically react when the VM stops (it will only exit when the user types `quit`). Consider extending the attach protocol to provide a real event stream (e.g., a per-attach subscription/broadcast receiver) or otherwise signal disconnect/VM termination to the REPL.
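A sketch of one way such a subscription could look. All names here are illustrative (not the PR's actual types), and it assumes `VmControllerEvent` is `Clone` and can be carried over a mesh channel:

```rust
struct AttachEventFanout {
    // One sender per attached client; hypothetical field, not in the PR.
    senders: Vec<mesh::Sender<VmControllerEvent>>,
}

impl AttachEventFanout {
    /// While servicing a Connect RPC: mint a fresh channel and return the
    /// receiver in VmConnectResponse instead of a pre-closed one.
    fn subscribe(&mut self) -> mesh::Receiver<VmControllerEvent> {
        let (send, recv) = mesh::channel();
        self.senders.push(send);
        recv
    }

    /// When the controller observes WorkerStopped / GuestHalt: fan the
    /// event out to every attached REPL.
    fn broadcast(&self, event: &VmControllerEvent) {
        for sender in &self.senders {
            sender.send(event.clone());
        }
    }
}
```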
```rust
child.wait()?;
let (status_send, status_recv) = mesh::oneshot();
std::thread::Builder::new()
```
Why do we need to do this whole dance just to wait on the child process?
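Presumably because `std::process::Child::wait()` is a blocking call: running it directly on the async driver would stall the executor, so it gets parked on a dedicated thread and bridged back through a oneshot. A sketch of that pattern (error handling elided; `"child-wait"` is an illustrative thread name):

```rust
// Blocking wait on a dedicated thread, bridged into async via a oneshot.
let (status_send, status_recv) = mesh::oneshot();
std::thread::Builder::new()
    .name("child-wait".into())
    .spawn(move || {
        // child.wait() blocks this thread, not the async executor.
        status_send.send(child.wait());
    })?;
// Later, in async context:
let exit_status = status_recv.await??;
```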
> will reject the command. Use `--pidfile <PATH>` to write the server process ID
> on startup and remove it on clean exit.
>
> The positional `SOCKETPATH` is the management RPC server socket. This is
Is it possible/desired to expose only a mesh socket? I guess that wouldn't have much value, since you couldn't launch a vm through it.
You could imagine only exposing a mesh socket and adding a way to specify the starting params via the CLI. I think we might get there eventually, but not yet. I can also imagine having a single socket support both gRPC and mesh (but not ttrpc and mesh, since ttrpc doesn't have any kind of protocol negotiation).
```rust
.with_context(|| format!("failed to bind to socket {}", path.display()))?;

let transport = match opt.transport {
    #[cfg(feature = "grpc")]
```
Can you double-check that these uses of these features are still cargo-hack clean?
```rust
        vm_controller.send(VmControllerRpc::Quit);
        break;
    }
    ReplQuitBehavior::Detach => {
```
Is there any way to kill the VM (send VmControllerRpc::Quit) when we're in detach behavior?
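One conceivable shape for that (illustrative; a `kill` command is not part of the PR): keep `quit` meaning detach, but let an attached REPL explicitly request teardown.

```rust
// Hypothetical REPL dispatch: `kill` tears the VM down even when attached,
// while `quit` keeps the detach behavior.
"kill" => {
    vm_controller.send(VmControllerRpc::Quit);
    break;
}
"quit" | "q" => match quit_behavior {
    ReplQuitBehavior::Quit => {
        vm_controller.send(VmControllerRpc::Quit);
        break;
    }
    ReplQuitBehavior::Detach => break,
},
```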
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
openvmm/openvmm_entry/src/vm_controller.rs:166
- The `VmController::run` doc comment says the controller runs "until the VM stops", but in server mode `exit_on_vm_stop` is false and the controller keeps running after VM stop/teardown to service future `CreateVm`/`Connect` calls. Update the comment to reflect the actual lifetime conditions (e.g., quit/RPC channel closed).
```rust
impl VmController {
    /// Run the controller, processing RPCs and worker events until the VM
    /// stops or the caller (REPL or ttrpc server) sends Quit.
    pub async fn run(
```
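One possible rewording that matches the server-mode lifetime:

```rust
/// Run the controller, processing RPCs and worker events until Quit is
/// received or the last RPC sender is dropped. In server mode the
/// controller outlives individual VMs and keeps servicing CreateVm and
/// Connect requests after a VM stops.
pub async fn run(
```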
```rust
/// to stop accepting, shut down the mesh, join the listener task, and remove
/// the socket file.
```
`AttachListener::shutdown`'s doc comment says it will "shut down the mesh", but the implementation only stops the listener task and removes the socket file (mesh shutdown happens elsewhere). Please update the comment to match the actual behavior to avoid confusion about ownership/lifetime.
```diff
-/// to stop accepting, shut down the mesh, join the listener task, and remove
-/// the socket file.
+/// to stop accepting new connections, join the listener task, and remove the
+/// socket file.
```
```rust
// Signal the the parent process that the server is ready.
pal::close_stdout().context("failed to close stdout")?;
// Signal the the parent process that the server is ready.
```
Minor grammar: the comment says "Signal the the parent process" (double "the").
```diff
-// Signal the the parent process that the server is ready.
+// Signal the parent process that the server is ready.
```
```rust
async fn stop_current_vm(&mut self) {
    let Some(mut current_vm) = self.current_vm.take() else {
        return;
    };

    current_vm.vm_worker.stop();
    if let Err(err) = current_vm.vm_worker.join().await {
        tracing::error!(
            error = err.as_ref() as &dyn std::error::Error,
            "vm worker join failed"
        );
    }
```
`CurrentVm` now owns `scsi_rpc`/`nvme_vtl2_rpc` (and IC RPC senders). `stop_current_vm()` waits for `vm_worker.join()` while those senders are still alive inside `current_vm`. Given the existing REPL workaround that explicitly drops `scsi_rpc`/`nvme_vtl2_rpc` before quitting (repl.rs notes they can hold up worker stop), this change risks reintroducing a shutdown hang/deadlock because the controller keeps the channels alive during join. Consider dropping/taking these RPC senders (and any other shutdown-blocking resources) before awaiting `vm_worker.join()`, or avoid storing these senders in `CurrentVm` across shutdown.
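A sketch of the suggested ordering (assuming the RPC senders are stored as `Option`s in `CurrentVm`; only the fields visible in the quoted code are shown):

```rust
async fn stop_current_vm(&mut self) {
    let Some(mut current_vm) = self.current_vm.take() else {
        return;
    };

    // Drop the senders that can hold up worker stop *before* awaiting
    // join(), mirroring the existing REPL workaround.
    drop(current_vm.scsi_rpc.take());
    drop(current_vm.nvme_vtl2_rpc.take());

    current_vm.vm_worker.stop();
    if let Err(err) = current_vm.vm_worker.join().await {
        tracing::error!(
            error = err.as_ref() as &dyn std::error::Error,
            "vm worker join failed"
        );
    }
}
```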
```rust
};

let message = if command == "run" {
    "VM launch options must be passed after the `run` subcommand.\n\nUse:\n openvmm run [VM launch options]\n\nFor example:\n openvmm run --memory 4\n\nIf you intended `run` as a value, pass it to the option that should receive it, for example:\n openvmm --memory 4 -c run\n"
```
In the run ordering hint, the example `openvmm run --memory 4` is misleading: `--memory` accepts raw bytes when no suffix is provided (so `4` means 4 bytes, not 4 GiB). Consider using `4GB`/`4G` (or another realistic size) in the example to avoid accidental tiny-memory launches.
| "VM launch options must be passed after the `run` subcommand.\n\nUse:\n openvmm run [VM launch options]\n\nFor example:\n openvmm run --memory 4\n\nIf you intended `run` as a value, pass it to the option that should receive it, for example:\n openvmm --memory 4 -c run\n" | |
| "VM launch options must be passed after the `run` subcommand.\n\nUse:\n openvmm run [VM launch options]\n\nFor example:\n openvmm run --memory 4GB\n\nIf you intended `run` as a value, pass it to the option that should receive it, for example:\n openvmm --memory 4GB -c run\n" |
```rust
// In non-optimized builds, clap uses an embarassing amount of stack space
// to construct the `Command` instance for `Options`, more than the Windows
// default of 1MB. This has been known since 2023:
```
Spelling: "embarassing" should be "embarrassing" in this comment.
```diff
-// In non-optimized builds, clap uses an embarassing amount of stack space
-// to construct the `Command` instance for `Options`, more than the Windows
-// default of 1MB. This has been known since 2023:
+// In non-optimized builds, clap uses an embarrassing amount of stack
+// space to construct the `Command` instance for `Options`, more than the
+// Windows default of 1MB. This has been known since 2023:
```
|
Is there any way to add a full integration test for this functionality? |
Running OpenVMM as a long-lived process currently ties interactive management to the original process or the ttrpc/grpc service path. This adds a mesh-backed attach socket so a later client can connect to the existing VM, open a REPL, or issue a one-shot inspect without taking ownership of VM lifetime.
The entry point now parses `run`, `attach`, and `inspect` as top-level commands while preserving implicit `run` for existing launch invocations. VM ownership is centralized in `VmController`, including server-mode VM creation and teardown, so both normal launches and ttrpc/grpc mode can expose the same attach resources. The Guide documents the new command shape, `--mesh-listen`, and the security and lifetime expectations for the socket.