GPU kernel InvalidIRError: pois_rand(PassthroughRNG, λ) hits jl_f_throw_methoderror on Julia 1.12 / latest CUDACore #588

@ChrisRackauckas-Claude

After PR #587 fixes the UndefVarError: validate_pure_leaping_inputs, the GPU tests progress past validation but now fail at kernel compilation because of a separate, pre-existing issue.

Symptom

From SciML/JumpProcesses.jl actions run 25629456752:

LoadError: InvalidIRError: compiling MethodInstance for
  JumpProcessesKernelAbstractionsExt.gpu_simple_tau_leaping_kernel(...)
  resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to jl_f_throw_methoderror)
Stacktrace:
 [1] rand        @ Random/src/generation.jl:114    # Julia 1.12.6
 [2] rand        @ Random/src/Random.jl:255
 [3] randexp     @ CUDACore/src/device/random.jl:339   (device_override)
 [4] count_rand  @ PoissonRandom/MLLFD/src/PoissonRandom.jl:18
 [5] pois_rand   @ PoissonRandom/MLLFD/src/PoissonRandom.jl:137
 [6] kernel body @ ext/JumpProcessesKernelAbstractionsExt.jl:135

Reproducing call site

# ext/JumpProcessesKernelAbstractionsExt.jl:135
counts[k] = pois_rand(PoissonRandom.PassthroughRNG(), rate_cache[k])

Dispatch chain on the GPU device

  • randexp(::PassthroughRNG) (PoissonRandom) → bare randexp() with no rng
  • randexp() → randexp(default_rng()) → randexp(::Philox2x32) (CUDACore @device_override makes default_rng() return Philox2x32() on device)
  • randexp(rng::AbstractRNG) @device_override at CUDACore/src/device/random.jl:339 calls Random.rand(rng, Random.UInt52Raw())
  • rand(rng, X) at Random.jl:255 → Sampler(rng, X, Val(1)) then rand(rng, sampler)
  • rand(r::AbstractRNG, ::SamplerTrivial{UInt52Raw{UInt64}}) at generation.jl:114 → _rand52(r, rng_native_52(r))

Somewhere along this chain on Julia 1.12.6 + latest CUDACore there is a path that resolves to throw(MethodError(...)) which GPUCompiler can't prove unreachable, hence InvalidIRError.
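
One way to narrow down which step introduces the throw is to walk the same sampler steps on the CPU first. The following is a minimal sketch, not a reproduction of the GPU failure: it assumes `Xoshiro` as a stand-in for the device-side Philox2x32, and it relies on the `Random` internals named in the stack trace (`Random.Sampler`, `Random.UInt52Raw`, `Random.rng_native_52`), which may differ between Julia versions.

```julia
using Random

# CPU-side probe of the sampler steps named in the stack trace.
# Assumption: Xoshiro stands in for the device-side Philox2x32.
rng = Xoshiro(1)

# Step: rand(rng, X) builds Sampler(rng, X, Val(1)) before dispatching.
sp = Random.Sampler(rng, Random.UInt52Raw(), Val(1))
println("sampler type: ", typeof(sp))

# Step: generation.jl dispatches on rng_native_52(r).
# Listing its methods shows which RNG types define it in this session.
for m in methods(Random.rng_native_52)
    println(m)
end

# The call that the randexp device override makes, here on a CPU RNG.
x = rand(rng, Random.UInt52Raw())
println("raw draw type: ", typeof(x))
```

Running the same method listing in a session where CUDACore is loaded may show whether Philox2x32 participates in `rng_native_52` dispatch; that part is untested here.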

Environment from the failing run

  • Julia 1.12.6
  • CUDACore version gtlJx
  • PoissonRandom version MLLFD (older 0.4.x, with Random.rand(::PassthroughRNG) = rand() pattern)
  • KernelAbstractions ecO4B
  • GPUCompiler lHkad

Hypotheses to investigate

  1. PoissonRandom PassthroughRNG design hits a Julia-1.12-specific method-table edge case. The PassthroughRNG defines only Random.rand(rng) = rand(), Random.randexp(rng) = randexp(), Random.randn(rng) = randn() (no second-argument methods). On Julia 1.12, perhaps a new Sampler machinery infers a path that calls rand(rng, T) for some T and lacks a method, yielding a statically-reachable throw(MethodError).
  2. CUDACore's randexp(rng::AbstractRNG) override paired with Philox2x32 triggers a path to rand(rng, UInt52Raw()) whose sampler-construction sequence on Julia 1.12 reaches throw(MethodError).
  3. The fix may live in PoissonRandom (define a richer set of Random.rand(rng::PassthroughRNG, ...) methods, or drop PassthroughRNG in favor of using default_rng() on device). Note that PoissonRandom 0.4.7 (master) exists, but the failing run still uses an older artifact directory; confirm that the Project.toml compat allows the latest.
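
To check which PoissonRandom version the active environment actually resolves, a small helper over `Pkg.dependencies()` can be used. This is a sketch under assumptions: `installed_version` is a hypothetical helper name, and it reports only what the currently active project resolves, not what CI resolved in the failing run.

```julia
using Pkg

# Hypothetical helper: look up the resolved version of a package by name
# in the active environment. Returns `nothing` if the package is absent.
function installed_version(name::AbstractString)
    for (_, info) in Pkg.dependencies()
        info.name == name && return info.version
    end
    return nothing
end

println("PoissonRandom: ", installed_version("PoissonRandom"))
```

If this prints an older version than expected, the compat bounds in Project.toml are the first place to look.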

Suggested fix paths (not yet tried)

  • Replace pois_rand(PassthroughRNG(), λ) with pois_rand(λ) so the call uses Random.default_rng(), which CUDACore already overrides on device. This follows conceptually the same chain on the device but skips the PassthroughRNG → bare-function indirection.
  • Hand-roll the few randexp calls inside the kernel using device intrinsics directly instead of going through PoissonRandom (most surgical for this kernel).
  • Update PoissonRandom to define a complete rand/randexp/randn family for PassthroughRNG that takes Type{T} arguments and forwards to bare rand(T) / etc., so dispatch closes statically on Julia 1.12.
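
The third option can be sketched on the CPU as follows. This is an illustration only: `MockPassthroughRNG` is a hypothetical stand-in, not PoissonRandom's actual type, and whether typed forwards like these let dispatch close statically in a GPU kernel on Julia 1.12 has not been tried.

```julia
using Random

# Hypothetical stand-in for PoissonRandom.PassthroughRNG; not the package's type.
struct MockPassthroughRNG <: AbstractRNG end

# Zero-argument forwards, mirroring the pattern described in this issue.
Random.rand(::MockPassthroughRNG) = rand()
Random.randexp(::MockPassthroughRNG) = randexp()
Random.randn(::MockPassthroughRNG) = randn()

# Typed forwards. One concrete method per float type keeps these strictly more
# specific than Random's generic AbstractRNG methods, which avoids ambiguities.
for T in (Float16, Float32, Float64)
    @eval begin
        Random.rand(::MockPassthroughRNG, ::Type{$T}) = rand($T)
        Random.randexp(::MockPassthroughRNG, ::Type{$T}) = randexp($T)
        Random.randn(::MockPassthroughRNG, ::Type{$T}) = randn($T)
    end
end

rng = MockPassthroughRNG()
println(typeof(randexp(rng, Float32)))  # typed call resolves to the forward above
```

The per-type loop is a design choice rather than a requirement: a single `where {T}` method would be shorter, but it can be ambiguous with the float-specific `randn`/`randexp` methods that `Random` defines for any `AbstractRNG`.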

Reproducer

The existing test test/gpu/regular_jumps.jl (SIR / SEIR with SimpleTauLeaping + EnsembleGPUKernel(CUDABackend())) hits this on Julia 1.12 + latest CUDACore.

Why this isn't fixed in #587

PR #587 is a one-line validate_pure_leaping_inputs qualification fix. The IR error is a different and deeper issue that requires separate investigation and benefits from its own focused PR, per the CLAUDE.md small-PR philosophy.

🤖 Reported by Claude Code while iterating on PR #587.
