Integrate HW-MGMT Version 7.0050.2930#4

Draft
abelamit wants to merge 1 commit into master from master_6733121_integrate_7.0050.2930_2026-01-04

@abelamit abelamit commented Jan 4, 2026

Why I did it

Integrate HW-MGMT 7.0050.2930 Changes

How I did it

Run make integrate-mlnx-hw-mgmt

How to verify it

Build an image and run tests from "sonic-mgmt".

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211

Description for the changelog

A picture of a cute animal (not mandatory but encouraged)

abelamit pushed a commit that referenced this pull request May 12, 2026
…dating udevd rules (sonic-net#26343)

- Why I did it
On SONiC SmartSwitch platforms with DPUs, systemd-udevd crashes with SIGABRT on every reboot when DPU firmware initialization is slow. During the initramfs boot phase, a standalone systemd-udevd daemon is started to handle device discovery. If DPU firmware takes longer than the 60-second udevadm settle timeout (BlueField-3 DPUs can take 120 seconds each in the failure case when they are stuck), the initramfs cannot stop this udevd before switch_root. The stale process survives into the real system but is never chrooted into the overlayfs root, leaving it with a broken filesystem view. When dpu-udev-manager.sh writes udev rules, the stale udevd detects the change and crashes on an assertion in systemd's chase() path resolution (assert(path_is_absolute(p)) at chase.c:648), because dir_fd_is_root() returns false for a process whose root still points to the initramfs rootfs rather than the overlayfs.

This hits a known systemd issue, systemd/systemd#29559, which the systemd maintainers do not consider a bug on their side, so this fix is raised for our use case.

Core was generated by `/usr/lib/systemd/systemd-udevd --daemon --resolve-names=never'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f29fe7f695c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007f29fe7f695c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f29fe7a1cc2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f29fe78a4ac in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f29fea50c11 in ?? () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#4  0x00007f29feb94a8b in chase () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#5  0x00007f29feb956e2 in chase_and_opendir () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#6  0x00007f29feb9a609 in conf_files_list_strv () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#7  0x00007f29fea913e8 in config_get_stats_by_path () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#8  0x0000559f295519cf in ?? ()
#9  0x0000559f29553a77 in ?? ()
#10 0x00007f29fec36055 in ?? () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#11 0x00007f29fec3668d in sd_event_dispatch () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#12 0x00007f29fec394a8 in sd_event_run () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#13 0x00007f29fec396c7 in sd_event_loop () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#14 0x0000559f29545820 in ?? ()
#15 0x00007f29fe78bca8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#16 0x00007f29fe78bd65 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#17 0x0000559f29545c51 in ?? ()

- How I did it
Added a kill_stale_udevd() function to dpu-udev-manager.sh that runs before writing the udev rules. It identifies the systemd-managed udevd PID via systemctl show, then kills any other systemd-udevd --daemon process that doesn't match -- these are leftover initramfs instances. If no stale process exists (e.g. DPUs are healthy and the initramfs udevd exited cleanly), the function is a no-op.
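
As a rough illustration of the approach described above (not the code from this commit), the selection logic might look like the sketch below. Only kill_stale_udevd(), the systemctl show lookup, and the "systemd-udevd --daemon" match come from the description. The helper name find_stale_udevd_pids, the use of ps and logger, and the exact log wording (modeled on the message quoted in the verification steps) are assumptions.

```shell
#!/bin/bash
# Hypothetical sketch only; the real dpu-udev-manager.sh may differ.

# Print PIDs of "systemd-udevd --daemon" processes other than the managed one.
# Reads "PID ARGS..." lines on stdin, e.g. from `ps -eo pid=,args=`.
# The helper name is illustrative and does not come from the commit.
find_stale_udevd_pids() {
    local managed_pid="$1"
    local pid args
    while read -r pid args; do
        case "$args" in
            *systemd-udevd*--daemon*)
                # Anything that is not the systemd-managed instance is stale.
                if [ "$pid" != "$managed_pid" ]; then
                    echo "$pid"
                fi
                ;;
        esac
    done
    return 0
}

# Kill leftover initramfs udevd instances; a no-op when none are found.
kill_stale_udevd() {
    local managed_pid stale
    managed_pid="$(systemctl show --property=MainPID --value systemd-udevd.service)"
    for stale in $(ps -eo pid=,args= | find_stale_udevd_pids "$managed_pid"); do
        logger -t dpu-udev-manager \
            "killing stale initramfs udevd PID $stale (systemd udevd is PID $managed_pid)"
        kill "$stale" 2>/dev/null || true
    done
}
```

Keeping the PID selection separate from the kill makes the matching rule easy to exercise against a canned process listing without touching a live system.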

- How to verify it
Deploy the image on a SmartSwitch whose DPUs are in a state where firmware initialization times out (>60s per DPU), for example by stopping the image installation before the firmware install step
Reboot the switch
Verify no new systemd-udevd coredumps in /var/core/
Verify the stale process was killed: journalctl -b 0 | grep dpu-udev-manager should show killing stale initramfs udevd PID (systemd udevd is PID )
Verify systemd-udevd.service is healthy: systemctl status systemd-udevd should show active (running)
Verify DPU udev rules were written: cat /etc/udev/rules.d/92-midplane-intf.rules should contain the DPU interface naming rules
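
The checks above could be scripted roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, core files are assumed to carry the crashing process name in their filename, and the core check here requires zero udevd cores rather than "no new" ones, so clear old cores first or adapt the check.

```shell
#!/bin/bash
# Hypothetical helpers for the verification steps; names are illustrative.

# Count files in a core directory whose name mentions udevd.
# Assumption: core file names include the name of the crashing process.
count_udevd_cores() {
    local core_dir="${1:-/var/core}"
    find "$core_dir" -maxdepth 1 -type f -name '*udevd*' 2>/dev/null | wc -l | tr -d ' '
}

# Succeed when the journal text on stdin records a stale udevd being killed.
stale_kill_logged() {
    grep -q 'killing stale initramfs udevd PID'
}

# Run the live checks on a switch after reboot (not invoked automatically).
verify_udevd_fix() {
    [ "$(count_udevd_cores /var/core)" -eq 0 ] \
        || echo "WARN: udevd core files present in /var/core"
    journalctl -b 0 | grep dpu-udev-manager | stale_kill_logged \
        || echo "INFO: no stale udevd was killed during this boot"
    systemctl is-active --quiet systemd-udevd \
        || echo "FAIL: systemd-udevd is not active"
    [ -s /etc/udev/rules.d/92-midplane-intf.rules ] \
        || echo "FAIL: DPU udev rules file is missing or empty"
}
```

On a healthy system where the initramfs udevd exited cleanly, the INFO line is expected, since there is no stale process to kill.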

Signed-off-by: Hemanth Kumar Tirupati <tirupatihemanthkumar@gmail.com>
abelamit pushed a commit that referenced this pull request May 12, 2026
Why I did it
gnoic is unused inside the PTF container, and its upstream (karimra/gnoic) has not cut a release containing the golang.org/x/crypto v0.45.0 fixes for CVE-2025-58181 (GHSA-j5w8-q4qc-rx2x) and CVE-2025-47914 (GHSA-f6x5-jh6r-wrfv). The latest tag v0.2.1 still ships x/crypto v0.43.0, and the renovate security PR (karimra/gnoic#170) is unmerged.

Carrying a private patched build of an unused tool just to satisfy S360 scans is not worth the maintenance cost.

How I did it
Removed the gnoic build block from dockers/docker-ptf/Dockerfile.j2.
Updated the Go-toolchain install comment to no longer mention gnoic.
Removed the gnoic entry from files/build/versions-public/default/versions-git.
Removed the gnoic line from ThirdPartyLicenses.txt (the shared Apache 2.0 license body is preserved because entry #4 apt-clean still uses it).
grpcurl and gnmic are unaffected — they continue to be built from source with go get golang.org/x/...@latest && go mod tidy, which already covers the related CVEs flagged by S360.

How to verify it
grep -r gnoic dockers/docker-ptf/ files/build/versions-public/ ThirdPartyLicenses.txt returns nothing.
Build docker-ptf; the resulting image no longer contains /usr/local/bin/gnoic.
Re-run the S360 / Qualys ContainerImageScan against the new digest; CVE-2025-58181 and CVE-2025-47914 against /usr/local/bin/gnoic should disappear.
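
The first verification step can be wrapped in a small check suitable for a script or CI job. This is a minimal sketch: the function name gnoic_references_absent is hypothetical, and the explicit missing-path guard is an addition so that a mistyped path is not mistaken for a clean result.

```shell
#!/bin/bash
# Hypothetical helper; the name is illustrative and not part of the commit.

# Return 0 when none of the given files or directories mention gnoic,
# 1 when a reference is found, and 2 when a given path does not exist.
gnoic_references_absent() {
    local path
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "missing path: $path" >&2
            return 2
        fi
    done
    if grep -rq gnoic "$@"; then
        return 1
    fi
    return 0
}

# Example invocation from the repository root (paths taken from the steps above):
#   gnoic_references_absent dockers/docker-ptf/ files/build/versions-public/ ThirdPartyLicenses.txt
```

The binary check inside the built image (/usr/local/bin/gnoic absent) still has to be run against the image itself, since it depends on how the image is tagged locally.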
Which release branch to backport (if applicable)
N/A — master only. Older release branches do not contain the gnoic build block.