
Fix GPU kernel InvalidIRError: drop PassthroughRNG path in tau-leaping kernel #589

Closed
ChrisRackauckas-Claude wants to merge 0 commits into SciML:master from ChrisRackauckas-Claude:fix-gpu-passthroughrng-method-error


@ChrisRackauckas-Claude
Contributor

Closes #588.

Summary

On Julia 1.12 + CUDACore (latest), the GPU tau-leaping kernel fails at compile time with InvalidIRError: unsupported call to an unknown function (call to jl_f_throw_methoderror), originating from count_rand → randexp → rand → rng_native_52.

This PR replaces pois_rand(PoissonRandom.PassthroughRNG(), λ) with pois_rand(λ) in ext/JumpProcessesKernelAbstractionsExt.jl (one call site).

Root cause

GPUCompiler's @device_override is implemented via an overlay method table whose entries outrank the regular method table irrespective of method specificity. So although PoissonRandom defines the more specific Random.randexp(::PassthroughRNG) = randexp(), CUDACore's @device_override Random.randexp(rng::AbstractRNG) wins on device. The override is therefore entered with rng::PassthroughRNG, and its body calls Random.rand(rng, UInt52Raw()). That dispatches to rand(::AbstractRNG, ::SamplerTrivial{UInt52Raw{UInt64}}), which calls rng_native_52(rng). PassthroughRNG subtypes Random.AbstractRNG directly, not RandomNumbers.AbstractRNG, which is the only thing that carries an rng_native_52 fallback, so rng_native_52(::PassthroughRNG) is statically a MethodError. GPUCompiler cannot lower the resulting throw (the call to jl_f_throw_methoderror) and rejects the kernel.

I verified this on Julia 1.12.6 by reproducing the error on the CPU:

julia> Random.rand(PoissonRandom.PassthroughRNG(), Random.UInt52Raw())
ERROR: MethodError: no method matching rng_native_52(::PassthroughRNG)
Stacktrace:
 [1] rand(r::PassthroughRNG, ::Random.SamplerTrivial{Random.UInt52Raw{UInt64}, UInt64})
   @ Random Random/src/generation.jl:114
 [2] rand(rng::PassthroughRNG, X::Random.UInt52Raw{UInt64})
   @ Random Random/src/Random.jl:255

These are exactly frames [1] and [2] from the failing GPU run (actions/runs/25629456752).

Why this fix works

pois_rand(λ) dispatches to pois_rand(Random.default_rng(), λ). On device, Random.default_rng() is overridden by CUDACore to return Philox2x32(). Philox2x32 <: RandomNumbers.AbstractRNG{UInt64}, which provides Random.rng_native_52(::RandomNumbers.AbstractRNG) = UInt64 (RandomNumbers.jl/src/common.jl:16). The chain rand(Philox2x32, UInt52Raw()) → _rand52(rng, UInt64) → rand(rng, UInt64) then resolves to CUDACore's explicit Random.rand(::Philox2x32{R}, ::Type{UInt64}) method.

This is semantically equivalent to the previous code: in both cases the device-side RNG state is the same Philox2x32 with per-warp state. The PassthroughRNG was always just an indirection to rand()/randexp()/randn(), which themselves go through default_rng() on device.
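
The passthrough pattern described above can be sketched on the CPU with the standard library alone. SketchPassthroughRNG is a hypothetical stand-in for PoissonRandom.PassthroughRNG (the real definition may differ); the sketch only assumes that the untyped rand/randexp/randn methods forward to the default RNG, as stated in this PR. The typed request at the end is recorded rather than asserted, since its outcome depends on the Julia version.

```julia
using Random

# Hypothetical stand-in for PoissonRandom.PassthroughRNG; the real definition may differ.
struct SketchPassthroughRNG <: Random.AbstractRNG end

# Untyped draws simply forward to the task-local default RNG.
Random.rand(::SketchPassthroughRNG) = rand()
Random.randexp(::SketchPassthroughRNG) = randexp()
Random.randn(::SketchPassthroughRNG) = randn()

rng = SketchPassthroughRNG()
u = rand(rng)      # uniform draw in [0, 1)
e = randexp(rng)   # exponential draw, rate 1
z = randn(rng)     # standard normal draw

# A typed request skips the forwarding methods and goes through Random's
# generic Sampler machinery instead. This PR reports a MethodError on
# rng_native_52 for that path on Julia 1.12.6; here we only record whether
# a MethodError was thrown instead of assuming the outcome.
typed_request_threw = try
    rand(rng, Random.UInt52Raw())
    false
catch err
    err isa MethodError
end
```

On the host the forwarding methods are selected because they are the most specific match; the device-side failure arises only because the overlay method table is consulted first.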

Test plan

  • GPU Tests workflow passes on this PR (the existing test/gpu/regular_jumps.jl SIR/SEIR + SimpleTauLeaping + EnsembleGPUKernel(CUDABackend()) test exercises this exact code path and currently fails on master).
  • Run Tests (CPU) stays green — code path is on the GPU extension only, no CPU paths touched.

Notes

  • Please ignore until reviewed by @ChrisRackauckas.
  • Smallest possible fix; the broader PoissonRandom PassthroughRNG design issue (its Random.rand/randexp/randn family takes no Type arg and so cannot satisfy the standard Sampler→rng_native_52 chain) is out of scope here and arguably belongs upstream in PoissonRandom.

🤖 Generated with Claude Code

@ChrisRackauckas-Claude
Contributor Author

Update — CI revealed a second, independent GPU issue. Pushing a workaround.

After the first commit fixed the MethodError on rng_native_52(::PassthroughRNG), the kernel compile got further and now fails with:

Reason: unsupported call to an unknown function (call to julia.get_pgcstack)
Stacktrace:
 [1] randexp @ CUDACore/src/device/random.jl:345
...
 [1] pois_rand @ PoissonRandom.jl:137
 [3] macro expansion @ JumpProcessesKernelAbstractionsExt.jl:139

Line 345 is result = randexp_unlikely(rng, idx, x) inside CUDACore's @device_override Random.randexp(::AbstractRNG). JuliaGPU/CUDA.jl#3086 (in v6.1.0) replaced the original ziggurat recursion with a while true loop calling @noinline randexp_unlikely and threading Union{Float64, Nothing} for retry. On Julia 1.12 this lowers to a real function call requiring GC-frame setup → get_pgcstack, which GPUCompiler refuses. That's an upstream CUDA.jl bug; flagging here, not opening an issue against CUDA.jl without explicit go-ahead from @ChrisRackauckas.

JumpProcesses-side workaround in the second commit: drop pois_rand from the kernel and inline a count-method Poisson sampler that uses only Random.rand(rng, Float64) and log(). There is no randexp and no @noinline retry helper, so the bad code path is never compiled. Random.rand(rng, Float64) on device dispatches through RandomNumbers.jl's rand(::AbstractRNG{UInt64}, ::Type{Float64}) → rand(rng, UInt52()) → CUDACore's explicit Random.rand(::Philox2x32, ::Type{UInt64}), all of which is fully inlinable.

Cost: O(λ) iterations per Poisson draw. Acceptable for SimpleTauLeaping's small-λ regime (τ is chosen so jump counts per step stay small). If a high-λ use case shows up later, we can add a normal-approximation fast path for λ ≳ 30.
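
For illustration, a count-method Poisson sampler built only from rand(rng, Float64) and log() might look like the sketch below. This is not the code from the second commit: the name count_poisson is hypothetical, and the sketch runs on the host with Xoshiro rather than on device with Philox2x32. It accumulates unit-rate exponential inter-arrival times and counts how many arrivals land before λ, which is where the O(λ) cost comes from.

```julia
using Random

# Hypothetical helper name; not the identifier used in the PR.
# Counts unit-rate exponential arrivals that fall before time λ.
function count_poisson(rng::AbstractRNG, λ::Float64)
    n = 0
    # rand(rng, Float64) lies in [0, 1), so 1 - u lies in (0, 1] and log never sees 0.
    c = -log(1.0 - rand(rng, Float64))
    while c < λ
        n += 1
        c -= log(1.0 - rand(rng, Float64))  # add the next exponential inter-arrival time
    end
    return n
end

# Host-side sanity check with a seeded CPU generator.
rng = Xoshiro(1234)
draws = [count_poisson(rng, 3.0) for _ in 1:100_000]
sample_mean = sum(draws) / length(draws)
sample_var = sum(abs2, draws .- sample_mean) / (length(draws) - 1)
```

For a Poisson distribution both the sample mean and the sample variance should sit close to λ. The loop runs one iteration per counted arrival, so the expected work grows linearly with λ.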

ChrisRackauckas-Claude force-pushed the fix-gpu-passthroughrng-method-error branch from a190c41 to 28088bb on May 11, 2026 00:08

Development

Successfully merging this pull request may close these issues.

GPU kernel InvalidIRError: pois_rand(PassthroughRNG, λ) hits jl_f_throw_methoderror on Julia 1.12 / latest CUDACore