Skip to content

[PW_SID:1060691] Fix bugs and performance of kstack offset randomisation#1543

Closed
linux-riscv-bot wants to merge 2 commits into
workflow__riscv__fixesfrom
pw1060691
Closed

[PW_SID:1060691] Fix bugs and performance of kstack offset randomisation#1543
linux-riscv-bot wants to merge 2 commits into
workflow__riscv__fixesfrom
pw1060691

Conversation

@linux-riscv-bot
Copy link
Copy Markdown

PR for series 1060691 applied to workflow__riscv__fixes

Name: Fix bugs and performance of kstack offset randomisation
URL: https://patchwork.kernel.org/project/linux-riscv/list/?series=1060691
Version: 5

ryanhrob added 2 commits March 3, 2026 15:48
kstack_offset was previously maintained per-cpu, but this caused a
couple of issues. So let's instead make it per-task.

Issue 1: add_random_kstack_offset() and choose_random_kstack_offset()
expected and required to be called with interrupts and preemption
disabled so that it could manipulate per-cpu state. But arm64, loongarch
and risc-v are calling them with interrupts and preemption enabled. I
don't _think_ this causes any functional issues, but it's certainly
unexpected and could lead to manipulating the wrong cpu's state, which
could cause a minor performance degradation due to bouncing the cache
lines. By maintaining the state per-task those functions can safely be
called in preemptible context.

Issue 2: add_random_kstack_offset() is called before executing the
syscall and expands the stack using a previously chosen random offset.
choose_random_kstack_offset() is called after executing the syscall and
chooses and stores a new random offset for the next syscall. With
per-cpu storage for this offset, an attacker could force cpu migration
during the execution of the syscall and prevent the offset from being
updated for the original cpu such that it is predictable for the next
syscall on that cpu. By maintaining the state per-task, this problem
goes away because the per-task random offset is updated after the
syscall regardless of which cpu it is executing on.

Fixes: 39218ff ("stack: Optionally randomize kernel stack offset each syscall")
Closes: https://lore.kernel.org/all/dd8c37bc-795f-4c7a-9086-69e584d8ab24@arm.com/
Cc: stable@vger.kernel.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
Previously different architectures were using random sources of
differing strength and cost to decide the random kstack offset. A number
of architectures (loongarch, powerpc, s390, x86) were using their
timestamp counter, at whatever the frequency happened to be. Other
arches (arm64, riscv) were using entropy from the crng via
get_random_u16().

There have been concerns that in some cases the timestamp counters may
be too weak, because they can be easily guessed or influenced by user
space. And get_random_u16() has been shown to be too costly for the
level of protection kstack offset randomization provides.

So let's use a common, architecture-agnostic source of entropy; a
per-cpu prng, seeded at boot-time from the crng. This has a few
benefits:

  - We can remove choose_random_kstack_offset(); That was only there to
    try to make the timestamp counter value a bit harder to influence
    from user space [*].

  - The architecture code is simplified. All it has to do now is call
    add_random_kstack_offset() in the syscall path.

  - The strength of the randomness can be reasoned about independently
    of the architecture.

  - Arches previously using get_random_u16() now have much faster
    syscall paths, see below results.

[*] Additionally, this gets rid of some redundant work on s390 and x86.
Before this patch, those architectures called
choose_random_kstack_offset() under arch_exit_to_user_mode_prepare(),
which is also called for exception returns to userspace which were *not*
syscalls (e.g. regular interrupts). Getting rid of
choose_random_kstack_offset() avoids a small amount of redundant work
for the non-syscall cases.

In some configurations, add_random_kstack_offset() will now call
instrumentable code, so for a couple of arches, I have moved the call a
bit later to the first point where instrumentation is allowed. This
doesn't impact the efficacy of the mechanism.

There have been some claims that a prng may be less strong than the
timestamp counter if not regularly reseeded. But the prng has a period
of about 2^113. So as long as the prng state remains secret, it should
not be possible to guess. If the prng state can be accessed, we have
bigger problems.

Additionally, we are only consuming 6 bits to randomize the stack, so
there are only 64 possible random offsets. I assert that it would be
trivial for an attacker to brute force by repeating their attack and
waiting for the random stack offset to be the desired one. The prng
approach seems entirely proportional to this level of protection.

Performance data are provided below. The baseline is v6.18 with rndstack
on for each respective arch. (I)/(R) indicate statistically significant
improvement/regression. arm64 platform is AWS Graviton3 (m7g.metal).
x86_64 platform is AWS Sapphire Rapids (m7i.24xlarge):

+-----------------+--------------+---------------+---------------+
| Benchmark       | Result Class |  per-cpu-prng |  per-cpu-prng |
|                 |              | arm64 (metal) |   x86_64 (VM) |
+=================+==============+===============+===============+
| syscall/getpid  | mean (ns)    |    (I) -9.50% |   (I) -17.65% |
|                 | p99 (ns)     |   (I) -59.24% |   (I) -24.41% |
|                 | p99.9 (ns)   |   (I) -59.52% |   (I) -28.52% |
+-----------------+--------------+---------------+---------------+
| syscall/getppid | mean (ns)    |    (I) -9.52% |   (I) -19.24% |
|                 | p99 (ns)     |   (I) -59.25% |   (I) -25.03% |
|                 | p99.9 (ns)   |   (I) -59.50% |   (I) -28.17% |
+-----------------+--------------+---------------+---------------+
| syscall/invalid | mean (ns)    |   (I) -10.31% |   (I) -18.56% |
|                 | p99 (ns)     |   (I) -60.79% |   (I) -20.06% |
|                 | p99.9 (ns)   |   (I) -61.04% |   (I) -25.04% |
+-----------------+--------------+---------------+---------------+

I tested an earlier version of this change on x86 bare metal and it
showed a smaller but still significant improvement. The bare metal
system wasn't available this time around so testing was done in a VM
instance. I'm guessing the cost of rdtsc is higher for VMs.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Linux RISC-V bot <linux.riscv.bot@gmail.com>
@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
build-rv32-defconfig
Desc: Builds riscv32 defconfig
Duration: 140.25 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
build-rv64-clang-allmodconfig
Desc: Builds riscv64 allmodconfig with Clang, and checks for errors and added warnings
Duration: 2245.83 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
build-rv64-gcc-allmodconfig
Desc: Builds riscv64 allmodconfig with GCC, and checks for errors and added warnings
Duration: 3208.66 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
build-rv64-nommu-k210-defconfig
Desc: Builds riscv64 defconfig with NOMMU for K210
Duration: 27.57 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
build-rv64-nommu-k210-virt
Desc: Builds riscv64 defconfig with NOMMU for the virt platform
Duration: 28.88 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
checkpatch
Desc: Runs checkpatch.pl on the patch
Duration: 4.43 seconds
Result: WARNING
Output:

WARNING: Argument 'tsk' is not used in function-like macro
#107: FILE: include/linux/randomize_kstack.h:99:
+#define random_kstack_task_init(tsk)		do { } while (0)

total: 0 errors, 1 warnings, 0 checks, 91 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

Commit cf1acb5a9d5e ("randomize_kstack: Maintain kstack_offset per task") has style problems, please review.

NOTE: Ignored message types: ALLOC_SIZEOF_STRUCT CAMELCASE COMMIT_LOG_LONG_LINE GIT_COMMIT_ID MACRO_ARG_REUSE NO_AUTHOR_SIGN_OFF

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.
total: 0 errors, 1 warnings, 0 checks, 91 lines checked
WARNING: Argument 'tsk' is not used in function-like macro


@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
dtb-warn-rv64
Desc: Checks for Device Tree warnings/errors
Duration: 86.26 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
header-inline
Desc: Detects static functions without inline keyword in header files
Duration: 0.25 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
kdoc
Desc: Detects for kdoc errors
Duration: 0.93 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
module-param
Desc: Detect module_param changes
Duration: 0.28 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
verify-fixes
Desc: Verifies that the Fixes: tags exist
Duration: 0.27 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 1: "[v5,1/2] randomize_kstack: Maintain kstack_offset per task"
verify-signedoff
Desc: Verifies that Signed-off-by: tags are correct
Duration: 0.29 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
build-rv32-defconfig
Desc: Builds riscv32 defconfig
Duration: 139.34 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
build-rv64-clang-allmodconfig
Desc: Builds riscv64 allmodconfig with Clang, and checks for errors and added warnings
Duration: 2282.06 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
build-rv64-gcc-allmodconfig
Desc: Builds riscv64 allmodconfig with GCC, and checks for errors and added warnings
Duration: 3173.01 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
build-rv64-nommu-k210-defconfig
Desc: Builds riscv64 defconfig with NOMMU for K210
Duration: 27.39 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
build-rv64-nommu-k210-virt
Desc: Builds riscv64 defconfig with NOMMU for the virt platform
Duration: 28.73 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
checkpatch
Desc: Runs checkpatch.pl on the patch
Duration: 3.44 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
dtb-warn-rv64
Desc: Checks for Device Tree warnings/errors
Duration: 85.33 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
header-inline
Desc: Detects static functions without inline keyword in header files
Duration: 0.25 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
kdoc
Desc: Detects for kdoc errors
Duration: 0.94 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
module-param
Desc: Detect module_param changes
Duration: 0.31 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
verify-fixes
Desc: Verifies that the Fixes: tags exist
Duration: 0.23 seconds
Result: PASS

@linux-riscv-bot
Copy link
Copy Markdown
Author

Patch 2: "[v5,2/2] randomize_kstack: Unify random source across arches"
verify-signedoff
Desc: Verifies that Signed-off-by: tags are correct
Duration: 0.31 seconds
Result: PASS

@linux-riscv-bot linux-riscv-bot deleted the pw1060691 branch March 11, 2026 01:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants