Skip to content

init: embed a simple DHCP client#589

Open
slp wants to merge 3 commits intomainfrom
init-dhcp
Open

init: embed a simple DHCP client#589
slp wants to merge 3 commits intomainfrom
init-dhcp

Conversation

@slp
Copy link
Copy Markdown
Collaborator

@slp slp commented Mar 16, 2026

If there's a eth0 interface present, configure it with DHCP.

@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Mar 16, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a simple DHCP client to configure the eth0 interface. The implementation is a good start, but I've found several critical issues that need to be addressed. There are bugs in parsing IP addresses from the DHCP response due to incorrect byte order handling, which will lead to incorrect network configuration. Additionally, the DHCP option parsing logic has a security vulnerability that could lead to a buffer over-read when processing a malicious response. I've also identified a double-close bug in init.c and some other minor issues. Please see the detailed comments for suggestions on how to fix these problems.

@slp slp marked this pull request as draft March 16, 2026 10:56
@slp slp force-pushed the init-dhcp branch 3 times, most recently from e0c9001 to 52dc947 Compare March 16, 2026 12:35
@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Mar 16, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a simple DHCP client to configure the eth0 interface at startup. The implementation is a good starting point, but I've identified a few areas for improvement regarding robustness and maintainability. Specifically, I've suggested changing how /etc/resolv.conf is handled to prevent duplicate entries, randomizing the DHCP transaction ID for better compliance with standards, and a long-term suggestion to refactor the manual DHCP packet construction to improve code clarity and reduce brittleness. The changes to the build system and the integration into the init process are correct.

@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Mar 16, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a simple DHCP client to configure the eth0 network interface on startup. The implementation is well-structured, using netlink for network configuration and handling the basic DHCP discovery flow with rapid commit. My review focuses on improving robustness, maintainability, and clarity. Key suggestions include correcting error handling to prevent potential indefinite blocking, refactoring resource management to be more robust, and improving clarity by renaming misleadingly named functions, removing an unused macro, and correcting misleading comments.

@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Mar 16, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a simple DHCP client to configure the eth0 network interface during initialization. The overall approach is sound, but I've identified a few areas for improvement, primarily related to error handling and a discrepancy between documented and actual behavior. Specifically, several function return values are not checked, which could mask network configuration failures. Additionally, the handling of /etc/resolv.conf truncates the file instead of appending as the documentation suggests. Addressing these points will make the DHCP client more robust and reliable.

@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Mar 16, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a DHCP client to configure the eth0 interface. The implementation is a good start, using netlink for configuration and a simple DHCP discover/ack flow with rapid commit. I've found a few areas for improvement regarding error handling and API correctness. Specifically, there's a minor bug in an error message and an opportunity to improve API safety with const.

@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Mar 16, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a simple DHCP client to configure the eth0 network interface during initialization. The implementation uses netlink sockets to configure the interface, address, and routes, and a UDP socket for DHCP communication. The changes are well-contained and the approach of using a temporary link-local address to avoid raw sockets is a good design choice.

My review identifies a few areas for improvement to enhance robustness, error handling, and code clarity. This includes a high-severity issue with a predictable DHCP transaction ID and several medium-severity issues related to error reporting, misleading comments, use of magic numbers, and unchecked return values from system calls. Overall, this is a valuable addition, and addressing these points will make the implementation more robust and maintainable.

Comment on lines +352 to +362
/* Option 53: DHCP Message Type = DISCOVER (1) */
request.options[opt_offset++] = 53;
request.options[opt_offset++] = 1;
request.options[opt_offset++] = 1;

/* Option 80: Rapid Commit (RFC 4039) */
request.options[opt_offset++] = 80;
request.options[opt_offset++] = 0;

/* Option 255: End of options */
request.options[opt_offset++] = 0xff;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The code uses several magic numbers for DHCP options (e.g., 53, 80, 255) and their values. It would improve readability and maintainability to define these as named constants at the top of the file. For example:

#define DHCP_OPT_MESSAGE_TYPE 53
#define DHCP_OPT_MESSAGE_TYPE_DISCOVER 1
#define DHCP_OPT_RAPID_COMMIT 80
#define DHCP_OPT_END 255

This principle should also be applied to the option codes used when parsing the DHCP response (e.g., options 1, 3, 6, 26).

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're only using them once, and they're al properly documented. Moving them to constants makes it less readable. Nack.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean it's a matter of preference but I'd say I'd prefer the constants too in most cases.

@slp slp marked this pull request as ready for review March 16, 2026 15:40
@mz-pdm
Copy link
Copy Markdown
Collaborator

mz-pdm commented Mar 19, 2026

What's the reason to translate the DHCP implementation from Rust to C rather than to translate init.c to init.rs? Is it because of the TEE stuff? Could work in #590 help if this is the case? Or are there other reasons?

@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Mar 20, 2026

What's the reason to translate the DHCP implementation from Rust to C rather than to translate init.c to init.rs? Is it because of the TEE stuff? Could work in #590 help if this is the case? Or are there other reasons?

Translating init.c to Rust is way more disruptive than translating the DHCP client, which is a new feature not used yet in any use case.

If we're to translate init.c to Rust, that's 2.x material. And we'd need to address whether linking against musl instead glibc is acceptable or not.

@mtjhrc
Copy link
Copy Markdown
Collaborator

mtjhrc commented Mar 20, 2026

What's the reason to translate the DHCP implementation from Rust to C rather than to translate init.c to init.rs? Is it because of the TEE stuff? Could work in #590 help if this is the case? Or are there other reasons?

Translating init.c to Rust is way more disruptive than translating the DHCP client, which is a new feature not used yet in any use case.

If we're to translate init.c to Rust, that's 2.x material. And we'd need to address whether linking against musl instead glibc is acceptable or not.

Personally I would too rather rewrite the init in Rust, but for 1.x branch it's safer to not touch it that much.

Btw. instead of musl / glibc I would try going for no-libc and use direct syscalls using rustix crate bindings (we would also need an allocator crate implementation + json parser).

@nohajc
Copy link
Copy Markdown
Contributor

nohajc commented Apr 3, 2026

Interesting results, vmnet-helper DHCP is significantly slower than gvproxy. Furthermore, with vmnet-helper, my VM didn't get any IP assigned (subnet is configured as 172.27.1.0/30). With gvproxy, I got an IP just fine but also encountered Failed to open /etc/resolv.conf: Read-only file system. That's because in my setup, I do use a ro rootfs and before my own script runs (and creates some tmpfs mounts), setting a nameserver will obviously not work.

I think the possibility of read-only root setups is another reason for having an option to disable the built-in DHCP.

vmnet-helper:
	do_dhcp("eth0") took 116.701 ms
	do_dhcp("eth0") took 104.278 ms
	do_dhcp("eth0") took 114.938 ms
	do_dhcp("eth0") took 110.409 ms
	do_dhcp("eth0") took 113.966 ms

gvproxy:
	do_dhcp("eth0") took 2.995 ms
	do_dhcp("eth0") took 2.483 ms
	do_dhcp("eth0") took 2.165 ms
	do_dhcp("eth0") took 0.983 ms
	do_dhcp("eth0") took 0.966 ms

@nirs
Copy link
Copy Markdown
Contributor

nirs commented Apr 7, 2026

@nohajc

Interesting results, vmnet-helper DHCP is significantly slower than gvproxy. Furthermore, with vmnet-helper, my VM didn't get any IP assigned (subnet is configured as 172.27.1.0/30).

vmnet_{start,end}_address require address in the private IP range (RFC 1918), but I tried only 192.168/16. vmnet_network_configuration_create - the modern way to create vment network requires 192.168/16.

If you use static addresses and do no need DHCP, there is a way to disable DHCP for vment but it is not available in vmnet-helper yet, but you can add it.
nirs/vmnet-helper#120

For anylinuxfs use case, why not use multiple vms in the same subnet with --operation-mode=host and --enable-isolation? Creating many subnets will fail when you conflict with another application that created the same subnet before you.

@nohajc
Copy link
Copy Markdown
Contributor

nohajc commented Apr 7, 2026

@nirs I already have a working setup with static addresses and /30 subnets which are picked dynamically from a predefined pool so that they don't conflict with any other IP ranges on the host interfaces.

Haven't tried disabling vmnet DHCP, I just ignore it.

I have considered your approach before but decided to use the smallest subnets separately.

@nohajc
Copy link
Copy Markdown
Contributor

nohajc commented Apr 7, 2026

Haven't used host mode though because sometimes I need internet access too and don't want to manage multiple configuration profiles for mounting, bootstrapping VM images or installing packages. I think we're getting a little off topic here... but thanks!

@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Apr 13, 2026

I'm going to introduce a change to gate this behavior behind an API call.

@slp slp added the 1.x label Apr 13, 2026
@slp slp force-pushed the init-dhcp branch 3 times, most recently from 1ad1d26 to 8fe2451 Compare April 14, 2026 07:09
@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Apr 14, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a standalone DHCP client implementation in C to enable automatic IPv4 configuration for guest network interfaces. Key changes include the DHCP client logic, integration into the init process triggered by a new environment variable, and API updates in libkrun to support the DHCP client flag. The review feedback highlights several security and robustness concerns, such as missing validation for DHCP options and netlink messages, predictable transaction IDs, and an overly short receive timeout.

@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Apr 14, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a standalone C-based DHCP client to libkrun, enabling automated IPv4 network configuration for guest interfaces. The implementation includes support for DHCP DISCOVER/REQUEST flows, MTU configuration, and DNS setup via /etc/resolv.conf. Integration is provided across TAP, unixstream, and unixgram backends, triggered by a new environment variable. Review feedback identifies a critical error handling gap when retrieving the interface MAC address, suggests safer macro definitions using parentheses, and notes that the function documentation for do_dhcp needs to be updated to accurately reflect the implementation.

init/dhcp.c Outdated
Comment on lines +477 to +479
if (ioctl(sock, SIOCGIFHWADDR, &mac_ifr) == 0) {
memcpy(request.chaddr, mac_ifr.ifr_hwaddr.sa_data, 6);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

If ioctl fails to get the MAC address, the DHCP request will be sent with an all-zero chaddr, which will likely be rejected by the DHCP server. This should be treated as a fatal error for the DHCP process.

    if (ioctl(sock, SIOCGIFHWADDR, &mac_ifr) < 0) {
        perror("ioctl(SIOCGIFHWADDR) failed");
        goto cleanup;
    }
    memcpy(request.chaddr, mac_ifr.ifr_hwaddr.sa_data, 6);

/* Send the VFKIT magic after establishing the connection,
as required by gvproxy in vfkit mode. */
#define NET_FLAG_VFKIT 1 << 0
#define NET_FLAG_DHCP_CLIENT 1 << 1
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

It is a good practice to wrap macro definitions in parentheses to avoid operator precedence issues. This also applies to NET_FLAG_VFKIT on the line above.

Suggested change
#define NET_FLAG_DHCP_CLIENT 1 << 1
#define NET_FLAG_DHCP_CLIENT (1 << 1)

init/dhcp.h Outdated
Comment on lines +40 to +48
* 1. Sets up a temporary link-local address (169.254.1.1/16)
* 2. Sends a DHCP DISCOVER message with Rapid Commit option
* 3. Waits up to 100ms for a DHCP ACK response
* 4. Parses the response and configures:
* - IPv4 address with appropriate prefix length
* - Default gateway route
* - DNS servers (overwriting /etc/resolv.conf)
* - Interface MTU
* 5. Cleans up temporary configuration
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The description for do_dhcp seems to be out of sync with the implementation. For example, it mentions setting up a temporary link-local address, but this is not done in do_dhcp. Please update the comment to accurately reflect the function's behavior.

References
  1. Prioritize code readability and focused commits over micro-optimizations.

@sbrivio-rh
Copy link
Copy Markdown

I think you could directly receive the response (DHCPACK) into the request struct, and naturally access fields instead of having to hardcode all this. It wasn't very convenient to do this in Rust but in C it's straightforward and will save some magic numbers (including the "240" later).

I'll sketch an example when I finish reviewing the whole thing. It generally looks good to me and all the bits should always work in practice, so there's no need to wait for my review (of course), but perhaps it's still worth it.

Sorry, I had some unexpected interruptions coming up. I still plan to review this, I'm now giving myself a harder deadline for this week.

@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Apr 14, 2026

/gemini review

@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Apr 14, 2026

I think you could directly receive the response (DHCPACK) into the request struct, and naturally access fields instead of having to hardcode all this. It wasn't very convenient to do this in Rust but in C it's straightforward and will save some magic numbers (including the "240" later).
I'll sketch an example when I finish reviewing the whole thing. It generally looks good to me and all the bits should always work in practice, so there's no need to wait for my review (of course), but perhaps it's still worth it.

Sorry, I had some unexpected interruptions coming up. I still plan to review this, I'm now giving myself a harder deadline for this week.

Thank you @sbrivio-rh !

We're in kind on an stretch to release 1.18, so this one might go in before you review it. But that's fine, if you find anything we should change, we can put it into 1.18.1.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements an embedded DHCP client in the init process to automate IPv4 network configuration. It introduces the NET_FLAG_DHCP_CLIENT flag to enable this feature via the network configuration APIs. Key changes include the DHCP client implementation in init/dhcp.c, its integration into the init binary, and VMM updates to pass the configuration through the kernel command line. Reviewers identified a compilation error due to a missing semicolon and recommended stricter error handling when updating /etc/resolv.conf.

@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Apr 14, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements an embedded DHCP client within the init process to automate IPv4 network configuration for guest VMs, controlled by a new NET_FLAG_DHCP_CLIENT flag. The review identified several issues in the DHCP implementation, including potential out-of-bounds reads during packet parsing, incorrect handling of the Server Identifier per RFC 2131, the use of a non-unique transaction ID (PID 1), and an overly aggressive 100ms timeout that could cause configuration failures.

slp and others added 3 commits April 14, 2026 11:50
If there's a eth0 interface present, and one of the interfaces has been
configured with the newly introduced NET_FLAG_DHCP_CLIENT flag, the
embedded dhcp client will try to obtain an address.

For 2.x we should be more flexible and actually allow both dhcp and
static IP configuration on specific interfaces (identified by their
MAC addresses, since ifnames aren't deterministic), but this should
be good enough for 1.x.

Signed-off-by: Sergio Lopez <slp@redhat.com>
Replace the temporary link-local address (169.254.1.1) workaround with
SO_BINDTODEVICE.

The temp address caused the kernel to use 169.254.1.1 as the source IP in DHCP
packets; gvproxy then tried to reply to that address and failed with
"no route to host". With this change the source IP should be 0.0.0.0, which is
what RFC 2131 requires for DHCPDISCOVER.

Signed-off-by: Matej Hrica <mhrica@redhat.com>
When a server answers DHCPDISCOVER with DHCPOFFER instead of an immediate ACK,
send DHCPREQUEST for the and wait for the final ACK.

This makes DHCP work on macOS hosts when using gvproxy for networking.

Signed-off-by: Matej Hrica <mhrica@redhat.com>
@slp
Copy link
Copy Markdown
Collaborator Author

slp commented Apr 14, 2026

@mtjhrc I've cherry picked your patches from #598 here, and also addressed the pointed out issues. PTAL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants