[PW_SID:977217] [RFC] kernel/cpu: in freeze_secondary_cpus() ensure primary cpu is of domain type#590
linux-riscv-bot wants to merge 2 commits into
On an x86 machine, when cpu 0 is isolated with "isolcpus=", initiating suspend to memory triggers a warning, followed by a kernel crash. This is on a defconfig + CONFIG_ENERGY_MODEL kernel:

$ cat /proc/version
Linux version 6.16.0-rc4 (shashank@machine) (gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, GNU ld (GNU Binutils for Ubuntu) 2.42) #56 SMP PREEMPT_DYNAMIC Mon Jun 30 16:27:42 JST 2025

$ cat /proc/cmdline
kernel-6.16-rc4 console=tty0 initrd=ramdisk.cpio.lz4 console=ttyS4,115200n8 no_console_suspend ignore_loglevel isolcpus=0

$ echo mem > /sys/power/state
[ 124.899083] PM: suspend entry (deep)
<snip>
[ 125.169816] smpboot: CPU 2 is now offline
[ 125.174167] ------------[ cut here ]------------
[ 125.178838] WARNING: CPU: 1 PID: 20 at kernel/sched/topology.c:2459 build_sched_domains+0x1246/0x1550
[ 125.188117] Modules linked in:
[ 125.191232] CPU: 1 UID: 0 PID: 20 Comm: cpuhp/1 Tainted: G S 6.16.0-rc4 #56 PREEMPT(voluntary)
[ 125.201453] Tainted: [S]=CPU_OUT_OF_SPEC
<snip>
[ 125.303753] Call Trace:
[ 125.306248] <TASK>
[ 125.308394] ? cpu_attach_domain+0x3f1/0x730
[ 125.312710] ? __kmalloc_cache_noprof+0x26a/0x300
[ 125.317465] partition_sched_domains+0x294/0x7f0
[ 125.322136] cpuset_reset_sched_domains+0x1e/0x30
[ 125.326893] sched_cpu_deactivate+0x11d/0x160
[ 125.331298] ? __pfx_sched_cpu_deactivate+0x10/0x10
[ 125.336225] cpuhp_invoke_callback+0x107/0x470
[ 125.340714] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 125.345385] cpuhp_thread_fun+0xdb/0x160
[ 125.349352] smpboot_thread_fn+0xeb/0x220
[ 125.353411] kthread+0xf3/0x1f0
[ 125.356600] ? __pfx_kthread+0x10/0x10
[ 125.360402] ? __pfx_kthread+0x10/0x10
[ 125.364204] ret_from_fork+0x7d/0xd0
[ 125.367832] ? __pfx_kthread+0x10/0x10
[ 125.371634] ret_from_fork_asm+0x1a/0x30
[ 125.375614] </TASK>
[ 125.377848] ---[ end trace 0000000000000000 ]---
[ 125.382511] BUG: unable to handle page fault for address: 0000000087520483
[ 125.389436] #PF: supervisor read access in kernel mode
[ 125.394613] #PF: error_code(0x0000) - not-present page
[ 125.399800] PGD 0 P4D 0
[ 125.402374] Oops: Oops: 0000 [#1] SMP NOPTI
[ 125.406601] CPU: 1 UID: 0 PID: 20 Comm: cpuhp/1 Tainted: G S W 6.16.0-rc4 #56 PREEMPT(voluntary)
[ 125.416819] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
<snip>
[ 125.430828] RIP: 0010:partition_sched_domains+0x36d/0x7f0
[ 125.436265] Code: 02 00 48 8b 4d 00 41 bc 01 00 00 00 4c 89 c0 74 a0 b8 40 00 00 00 48 85 c9 74 05 f3 48 0f bc c1 48 98 48 8b 04 c5 e0 bb 85 86 <4e> 8b bc 30 c0 0a 00 00 8b 05 f5 8f 75 01 85 c0 0f 84 090
[ 125.455082] RSP: 0018:ffffb185001dfd90 EFLAGS: 00010246
[ 125.460352] RAX: 0000000100000003 RBX: ffff98ac9cae6cd0 RCX: 0000000000000000
[ 125.467529] RDX: 0000000000000000 RSI: ffff98ac80bf1ed8 RDI: 0000000000000040
[ 125.474715] RBP: ffff98ac9cae6cc8 R08: ffff98ac80bf1ed0 R09: fffffffffffffffe
[ 125.481894] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 125.489070] R13: 0000000000000001 R14: ffffffff8751f9c0 R15: 0000000000000004
[ 125.496248] FS: 0000000000000000(0000) GS:ffff98b068749000(0000) knlGS:0000000000000000
[ 125.504379] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 125.510166] CR2: 0000000087520483 CR3: 0000000122f01004 CR4: 0000000000f70ef0
[ 125.517351] PKRU: 55555554
[ 125.520097] Call Trace:
[ 125.522590] <TASK>
[ 125.524733] cpuset_reset_sched_domains+0x1e/0x30
[ 125.529484] sched_cpu_deactivate+0x11d/0x160
[ 125.533894] ? __pfx_sched_cpu_deactivate+0x10/0x10
[ 125.538814] cpuhp_invoke_callback+0x107/0x470
[ 125.543305] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 125.547971] cpuhp_thread_fun+0xdb/0x160
[ 125.551933] smpboot_thread_fn+0xeb/0x220
[ 125.555991] kthread+0xf3/0x1f0
[ 125.559174] ? __pfx_kthread+0x10/0x10
[ 125.562966] ? __pfx_kthread+0x10/0x10
[ 125.566758] ret_from_fork+0x7d/0xd0
[ 125.570383] ? __pfx_kthread+0x10/0x10
[ 125.574184] ret_from_fork_asm+0x1a/0x30
[ 125.578149] </TASK>
[ 125.580382] Modules linked in:
[ 125.583485] CR2: 0000000087520483
[ 125.586853] ---[ end trace 0000000000000000 ]---
[ 125.591507] RIP: 0010:partition_sched_domains+0x36d/0x7f0
[ 125.596954] Code: 02 00 48 8b 4d 00 41 bc 01 00 00 00 4c 89 c0 74 a0 b8 40 00 00 00 48 85 c9 74 05 f3 48 0f bc c1 48 98 48 8b 04 c5 e0 bb 85 86 <4e> 8b bc 30 c0 0a 00 00 8b 05 f5 8f 75 01 85 c0 0f 84 090
[ 125.615763] RSP: 0018:ffffb185001dfd90 EFLAGS: 00010246
[ 125.621032] RAX: 0000000100000003 RBX: ffff98ac9cae6cd0 RCX: 0000000000000000
[ 125.628211] RDX: 0000000000000000 RSI: ffff98ac80bf1ed8 RDI: 0000000000000040
[ 125.635390] RBP: ffff98ac9cae6cc8 R08: ffff98ac80bf1ed0 R09: fffffffffffffffe
[ 125.642568] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 125.649745] R13: 0000000000000001 R14: ffffffff8751f9c0 R15: 0000000000000004
[ 125.656923] FS: 0000000000000000(0000) GS:ffff98b068749000(0000) knlGS:0000000000000000
[ 125.665054] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 125.670849] CR2: 0000000087520483 CR3: 0000000122f01004 CR4: 0000000000f70ef0
[ 125.678026] PKRU: 55555554
[ 125.680773] note: cpuhp/1[20] exited with irqs disabled

This happens because, when offlining the last secondary cpu (cpu 1), build_sched_domains() ends up being passed an empty cpumask, since the only remaining cpu (cpu 0) is isolated. It warns and fails, after which the kernel attempts to build perf domains and crashes.

The same problem occurs during cpu hotplug, but that was fixed by commit 38685e2 ("cpu/hotplug: Don't offline the last non-isolated CPU").

Fix this by ensuring that the primary cpu, the last cpu left standing, is of domain type, so that build_sched_domains() is not passed an empty cpumask.

Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
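
The failure mode described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code or the patch itself: cpumasks are modelled as 64-bit words, and the names `mask64_t`, `domain_cpus()`, `span_after_freeze()` and `pick_primary()` are hypothetical. Falling back to the lowest-numbered non-isolated cpu is one possible policy, assumed here purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, NOT kernel code: one bit per cpu, at most 64 cpus. */
typedef uint64_t mask64_t;

/* CPUs eligible for scheduler domains: online and not isolated. */
static mask64_t domain_cpus(mask64_t online, mask64_t isolated)
{
	return online & ~isolated;
}

/*
 * Mask left for domain building once every cpu except `primary`
 * has been taken offline. A result of 0 models the empty cpumask
 * from the report.
 */
static mask64_t span_after_freeze(mask64_t isolated, unsigned int primary)
{
	assert(primary < 64);
	return domain_cpus((mask64_t)1 << primary, isolated);
}

/*
 * Hypothetical selection policy: keep the requested primary if it is
 * a domain cpu, otherwise fall back to the lowest-numbered online
 * domain cpu. Returns -1 if no such cpu exists.
 */
static int pick_primary(mask64_t online, mask64_t isolated,
			unsigned int requested)
{
	mask64_t candidates = domain_cpus(online, isolated);
	int cpu;

	assert(requested < 64);
	if (candidates & ((mask64_t)1 << requested))
		return (int)requested;

	for (cpu = 0; cpu < 64; cpu++)
		if (candidates & ((mask64_t)1 << cpu))
			return cpu;

	return -1;
}
```

In this model, a 4-cpu machine booted with isolcpus=0 has `online = 0xF` and `isolated = 0x1`. `span_after_freeze(0x1, 0)` evaluates to 0, the empty mask, whereas `pick_primary(0xF, 0x1, 0)` returns 1 and `span_after_freeze(0x1, 1)` is non-empty.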
Patch 1: "[RFC] kernel/cpu: in freeze_secondary_cpus() ensure primary cpu is of domain type"
a7cb30d to d776861
PR for series 977217 applied to workflow__riscv__fixes
Name: [RFC] kernel/cpu: in freeze_secondary_cpus() ensure primary cpu is of domain type
URL: https://patchwork.kernel.org/project/linux-riscv/list/?series=977217
Version: 1