[PW_SID:1087275] kdump: reduce vmcore size and capture time via linux,no-dump #1834
linux-riscv-bot wants to merge 12 commits into
When a reserved-memory node contains multiple reg entries (e.g., reg = <base1 size1>, <base2 size2>), the count used for total_reserved_mem_cnt is wrong in two places:

1) __reserved_mem_reserve_reg() returns 0 on success regardless of how many regions it reserved in memblock. The caller in fdt_scan_reserved_mem() then increments count by just 1.

2) fdt_scan_reserved_mem_late() uses of_flat_dt_get_addr_size() which only reads the first reg entry. Subsequent entries are never initialized via fdt_init_reserved_mem_node(), so their metadata is lost.

Fix both issues:

- Make __reserved_mem_reserve_reg() return the actual number of regions successfully reserved. Update the caller to accumulate the returned count.
- Rewrite fdt_scan_reserved_mem_late() to use of_flat_dt_get_addr_size_prop() and iterate all reg entries, initializing each one via fdt_init_reserved_mem_node().

Fixes: 8a6e02d ("of: reserved_mem: Restructure how the reserved memory regions are processed")
Fixes: 00c9a45 ("of: reserved_mem: Add code to dynamically allocate reserved_mem array")
Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Tested-by: Zhao Meijing <zhaomeijing@lixiang.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
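
The counting fix described above can be illustrated with a small userspace model. This is only a sketch, not the kernel code: `struct reg_entry`, `model_reserve()` and `model_reserve_reg()` are hypothetical names, and the "reject zero-sized entries" rule is an assumption standing in for whatever early_init_dt_reserve_memory() rejects.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one <base size> pair of a reg property. */
struct reg_entry {
	uint64_t base;
	uint64_t size;
};

/*
 * Hypothetical stand-in for early_init_dt_reserve_memory().
 * Assumption for this sketch: only zero-sized entries are rejected.
 */
static bool model_reserve(const struct reg_entry *e)
{
	return e->size != 0;
}

/*
 * Model of the fixed behaviour: return how many entries were
 * actually reserved rather than 0 on success, so the caller can
 * accumulate the result into its running total.
 */
static int model_reserve_reg(const struct reg_entry *entries, size_t n)
{
	int reserved = 0;

	for (size_t i = 0; i < n; i++) {
		if (model_reserve(&entries[i]))
			reserved++;
	}
	return reserved;
}
```

A caller in this model would do `count += model_reserve_reg(entries, n);` instead of `count++`, which is the accumulation the commit message describes.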
early_init_dt_reserve_memory() does not validate whether the region falls within physical memory. If a device tree incorrectly specifies a reserved memory region outside the physical address range:

- For the non-nomap path, memblock_reserve() blindly adds the region to memblock.reserved, creating a stale entry that refers to non-existent memory.
- For the nomap path, memblock_mark_nomap() silently fails to match any region in memblock.memory, but still returns success.

Add a memblock_overlaps_region() check at the entry of early_init_dt_reserve_memory() to reject such regions before any memblock operation takes place.

This also simplifies the existing nomap guard: the original "overlaps && is_reserved" condition reduces to just "is_reserved", since the overlap with physical memory is already guaranteed by the new check.

Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Tested-by: Zhao Meijing <zhaomeijing@lixiang.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
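
The entry check can be pictured with the following userspace sketch. It is not the memblock implementation: `struct model_mem_range`, `model_ranges_overlap()` and `model_overlaps_memory()` are hypothetical names, the memory map is a plain array, and address overflow is ignored for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of memblock.memory. */
struct model_mem_range {
	uint64_t base;
	uint64_t size;
};

/* Half-open interval overlap test: [b1, b1+s1) against [b2, b2+s2). */
static bool model_ranges_overlap(uint64_t b1, uint64_t s1,
				 uint64_t b2, uint64_t s2)
{
	return b1 < b2 + s2 && b2 < b1 + s1;
}

/*
 * Returns true when [base, base+size) touches at least one known
 * memory range. In this model, a reservation request would be
 * rejected up front whenever this returns false.
 */
static bool model_overlaps_memory(const struct model_mem_range *mem,
				  size_t n, uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < n; i++) {
		if (model_ranges_overlap(mem[i].base, mem[i].size, base, size))
			return true;
	}
	return false;
}
```

With a single 1 GiB bank at 0x80000000, a region at 0x10000000 fails the check and would be rejected, while a region that only partly straddles the end of the bank still passes, since the check is for overlap rather than containment.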
…_reserved_mem_late()

fdt_scan_reserved_mem_late() iterates all reg entries of every /reserved-memory child and unconditionally initialises each via fdt_init_reserved_mem_node(), while fdt_scan_reserved_mem() in the first pass may have rejected individual entries in early_init_dt_reserve_memory() (e.g. outside physical memory or, on the no-map path, overlapping an existing reservation).

When a single node mixes failing and succeeding reg entries, the first-pass counter only accounts for the successful ones, and the second-pass save then overflows into the wrong slots: the failing entry may be written to reserved_mem[] while the succeeding one is dropped by the "not enough space" guard in fdt_init_reserved_mem_node(). The stored entry does not correspond to any real memblock reservation and misleads consumers such as of_reserved_mem_lookup().

Mirror early_init_dt_reserve_memory()'s preconditions in the per-reg-entry save loop:

- skip the entry if it does not overlap memblock.memory;
- for nomap entries, skip if the region is already reserved.

This keeps reserved_mem[] strictly consistent with the regions that were actually reserved.

Fixes: 8a6e02d ("of: reserved_mem: Restructure how the reserved memory regions are processed")
Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Tested-by: Zhao Meijing <zhaomeijing@lixiang.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
…thing to save

fdt_scan_reserved_mem_late() unconditionally calls alloc_reserved_mem_array() after confirming /reserved-memory exists. Two issues with that:

- When __reserved_mem_check_root() subsequently fails, the call returns right away, leaving the freshly allocated array unused.
- When /reserved-memory exists but fdt_scan_reserved_mem() found no entries to save (total_reserved_mem_cnt stays at its freshly-set value of zero, e.g. empty node or all children disabled), alloc_reserved_mem_array() ends up calling memblock_alloc() with zero size, which returns NULL and logs a "Failed to allocate memory for reserved_mem array" error even though nothing was expected to be allocated.

Move alloc_reserved_mem_array() past the root-node check and gate it on total_reserved_mem_cnt, so the array is only allocated when there is at least one entry that needs a slot.

Fixes: 00c9a45 ("of: reserved_mem: Add code to dynamically allocate reserved_mem array")
Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Tested-by: Zhao Meijing <zhaomeijing@lixiang.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
…ory regions

Add a 'no_dump' field to struct reserved_mem and parse the 'linux,no-dump' device tree property during reserved memory node initialization. This property allows device tree authors to mark specific reserved memory regions that should be excluded from kdump vmcore dumps.

Reserved memory regions used by device firmware (e.g., GPU, DSP, modem) typically contain data that is not useful for kernel crash analysis and can significantly increase vmcore size. The 'linux,no-dump' property provides a declarative way to indicate these regions should be filtered out when constructing the elfcorehdr for kdump.

The property is named with a 'linux,' prefix because kdump/vmcore is Linux-specific and the property is an OS hint rather than a hardware description, matching existing properties such as 'linux,cma-default' and 'linux,usable-memory-range'.

The 'linux,no-dump' property is only effective when the region:

- Does not have 'no-map': these regions are already excluded from vmcore since they are removed from the linear mapping (MEMBLOCK_NOMAP).
- Does not have 'reusable': CMA reusable regions are actively used by the kernel for movable page allocations, and their contents are valuable for crash analysis.

The no-dump status is also printed in the boot log alongside the existing nomap and reusable flags for diagnostic purposes.

Corresponding dt-schema binding update: devicetree-org/dt-schema#193

Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Tested-by: Zhao Meijing <zhaomeijing@lixiang.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
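
The precedence rule stated above reduces to a single boolean expression. The sketch below models it in userspace; `struct model_rmem_flags` and `model_no_dump_effective()` are hypothetical names, and the real patch may structure the check differently.

```c
#include <stdbool.h>

/* Hypothetical view of the three properties found on one node. */
struct model_rmem_flags {
	bool nomap;        /* node carries 'no-map' */
	bool reusable;     /* node carries 'reusable' */
	bool no_dump_prop; /* node carries 'linux,no-dump' */
};

/*
 * 'linux,no-dump' only takes effect when neither 'no-map' nor
 * 'reusable' is present; otherwise the hint is ignored.
 */
static bool model_no_dump_effective(const struct model_rmem_flags *f)
{
	return f->no_dump_prop && !f->nomap && !f->reusable;
}
```

A node that combines 'linux,no-dump' with 'reusable' therefore stays in the vmcore in this model, matching the rationale that CMA contents are valuable for crash analysis.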
Save /memreserve/ entries from the FDT header into the reserved_mem array so they can be consumed as vmcore filtering metadata by kdump.

/memreserve/ regions hold firmware or bootloader state that is not useful for kernel crash analysis, so saved /memreserve/ entries default to no_dump=true and are tagged with name="memreserve" so consumers can distinguish them from /reserved-memory/ child nodes.

Some DTBs declare the same or overlapping range in both /memreserve/ and a /reserved-memory/ child. Commit b413281 ("of: fdt: Scan /memreserve/ last") describes one such case on Khadas Vim3 where the range is in /memreserve/ and also in a /reserved-memory/ child carrying no-map. The /reserved-memory/ node's attributes (no-map, reusable, linux,no-dump) are the explicit declaration and must win over the firmware default. fdt_reserved_mem_save_memreserve() therefore inherits no_dump from the overlapping /reserved-memory/ entry rather than silently applying no_dump=true.

Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Tested-by: Zhao Meijing <zhaomeijing@lixiang.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
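
The inheritance rule can be sketched as a lookup over the already-saved entries. This is a userspace model under stated assumptions: `struct model_rmem` and `model_memreserve_no_dump()` are hypothetical names, and picking the first overlapping entry is an assumption of the sketch, since the commit message does not say how multiple overlaps are resolved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct reserved_mem. */
struct model_rmem {
	const char *name;
	uint64_t base;
	uint64_t size;
	bool no_dump;
};

/*
 * Decide the no_dump value for a /memreserve/ range
 * [base, base+size). If a saved /reserved-memory/ entry overlaps
 * it, inherit that entry's no_dump (assumption: first match wins);
 * otherwise fall back to the firmware default of true.
 */
static bool model_memreserve_no_dump(const struct model_rmem *saved,
				     size_t n, uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < n; i++) {
		if (base < saved[i].base + saved[i].size &&
		    saved[i].base < base + size)
			return saved[i].no_dump;
	}
	return true;
}
```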
Provide two kdump-oriented helpers so that arch kexec_file code does not have to open-code the no-dump filtering loop:

- of_reserved_mem_no_dump_nr_ranges() returns the number of reserved regions flagged with linux,no-dump. Each exclusion may split one existing crash_mem range into two, so callers use this count to pre-size their crash_mem allocation.
- of_reserved_mem_exclude_no_dump() walks the reserved_mem[] array and calls crash_exclude_mem_range() for each no-dump region.

Both helpers are guarded by CONFIG_KEXEC_FILE; empty inline stubs are provided for the !KEXEC_FILE case so architecture code can call them unconditionally.

The consumers are added in the following arm64, riscv and loongarch patches in this series.

Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Tested-by: Zhao Meijing <zhaomeijing@lixiang.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
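
To see why each exclusion can cost one extra slot, consider the following simplified userspace model of range exclusion. It is a sketch, not the kernel's crash_exclude_mem_range(): `struct model_range`, `struct model_crash_mem` and `model_exclude()` are hypothetical names, the array is fixed at 16 slots, and a plain -1 stands in for the error code.

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive [start, end] range. */
struct model_range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical stand-in for struct crash_mem (max_nr_ranges <= 16). */
struct model_crash_mem {
	size_t max_nr_ranges;
	size_t nr_ranges;
	struct model_range ranges[16];
};

/*
 * Remove [mstart, mend] from every range in mem. Returns 0 on
 * success, or -1 when a split needs a slot that was not pre-sized.
 */
static int model_exclude(struct model_crash_mem *mem,
			 uint64_t mstart, uint64_t mend)
{
	size_t i = 0;

	while (i < mem->nr_ranges) {
		struct model_range *r = &mem->ranges[i];

		if (mstart > r->end || mend < r->start) {
			i++;			/* no overlap */
		} else if (mstart <= r->start && mend >= r->end) {
			/* Fully covered: drop slot i, then re-check it. */
			for (size_t j = i; j + 1 < mem->nr_ranges; j++)
				mem->ranges[j] = mem->ranges[j + 1];
			mem->nr_ranges--;
		} else if (mstart <= r->start) {
			r->start = mend + 1;	/* trim the head */
			i++;
		} else if (mend >= r->end) {
			r->end = mstart - 1;	/* trim the tail */
			i++;
		} else {
			/* Hole in the middle: one range becomes two. */
			if (mem->nr_ranges >= mem->max_nr_ranges)
				return -1;
			for (size_t j = mem->nr_ranges; j > i + 1; j--)
				mem->ranges[j] = mem->ranges[j - 1];
			mem->ranges[i + 1].start = mend + 1;
			mem->ranges[i + 1].end = r->end;
			r->end = mstart - 1;
			mem->nr_ranges++;
			i += 2;
		}
	}
	return 0;
}
```

In this model, a crash_mem sized for two ranges absorbs one mid-range exclusion, and a second one fails for lack of a slot. That is the situation the pre-sizing count from of_reserved_mem_no_dump_nr_ranges() is meant to avoid.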
Exclude reserved memory regions marked with the linux,no-dump property from the elfcorehdr PT_LOAD segments when preparing kdump vmcore.

Device firmware memory regions (e.g., GPU, DSP, modem) reserved via the device tree typically contain data that is not useful for kernel crash analysis and can significantly increase vmcore size. By honoring the no_dump flag in the reserved_mem array, these regions are filtered out from the crash dump, resulting in smaller and more focused vmcore files.

Use the common of_reserved_mem_exclude_no_dump() helper to perform the exclusion, and pre-size the crash_mem array via of_reserved_mem_no_dump_nr_ranges().

Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Tested-by: Zhao Meijing <zhaomeijing@lixiang.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
Apply the same no-dump reserved memory filtering to RISC-V kdump as was done for arm64. Use of_reserved_mem_exclude_no_dump() to drop flagged regions from the elfcorehdr PT_LOAD segments, and of_reserved_mem_no_dump_nr_ranges() to pre-size the crash_mem array.

Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
Apply the same no-dump reserved memory filtering to LoongArch kdump as was done for arm64. Use of_reserved_mem_exclude_no_dump() to drop flagged regions from the elfcorehdr PT_LOAD segments, and of_reserved_mem_no_dump_nr_ranges() to pre-size the crash_mem array.

Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
Describe the new 'linux,no-dump' reserved-memory device tree property and the automatic exclusion of /memreserve/ entries from the kdump vmcore.

The section covers:

- The two mechanisms that exclude reserved memory from the vmcore (firmware /memreserve/ entries and linux,no-dump child nodes).
- Intended use cases (firmware-owned GPU, DSP and modem carveouts).
- Interaction with the existing 'no-map' and 'reusable' flags, with the silent-ignore precedence implemented by the kernel.
- Architectures honouring the hint (arm64, riscv, loongarch).
- An illustrative reserved-memory DTS snippet.

The DT binding for the property itself is maintained in dt-schema.

Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
Patch 1: "[01/11] of: reserved_mem: fix region count for nodes with multiple reg entries"
Patch 2: "[02/11] of: reserved_mem: reject reserved memory outside physical address range"
Patch 11: "[11/11] Documentation: admin-guide: kdump: document linux,no-dump DT property"
60ec8ef to 5927802
PR for series 1087275 applied to workflow
Name: kdump: reduce vmcore size and capture time via linux,no-dump
URL: https://patchwork.kernel.org/project/linux-riscv/list/?series=1087275
Version: 1