Skip to content

Fix nested ForwardDiff over NonlinearLeastSquaresProblem default vjp#932

Merged
ChrisRackauckas merged 3 commits into
SciML:masterfrom
ChrisRackauckas-Claude:fix/forwarddiff-nlls-vjp-nested
May 17, 2026
Merged

Fix nested ForwardDiff over NonlinearLeastSquaresProblem default vjp#932
ChrisRackauckas merged 3 commits into
SciML:masterfrom
ChrisRackauckas-Claude:fix/forwarddiff-nlls-vjp-nested

Conversation

@ChrisRackauckas-Claude
Copy link
Copy Markdown
Contributor

@ChrisRackauckas-Claude ChrisRackauckas-Claude commented May 4, 2026

Draft — please ignore until reviewed by @ChrisRackauckas.

Summary

Fixes the MethodError: no method matching Float64(::ForwardDiff.Dual{...}) that triggers in CI for the core test group on master, e.g.

  • lib/SimpleNonlinearSolve/test/core/forward_diff_tests.jl:90ForwardDiff.jl Integration NonlinearLeastSquaresProblem
  • test/forward_ad_tests.jl:128NLLS Hessian SciML/NonlinearSolve.jl#445

These showed up unrelated to the PR that surfaced them (e.g. https://github.com/SciML/NonlinearSolve.jl/actions/runs/25295011499/job/74152282985, https://github.com/SciML/NonlinearSolve.jl/actions/runs/25295011493/job/74152283034) — they regress on master.

Root cause

NonlinearSolveBase.nlls_generate_vjp_function (no user jac/vjp branch) uses DI.pullback with AutoForwardDiff() for small problems. Under DifferentiationInterface, AutoForwardDiff's pullback is emulated via pushforward, and the emulation writes the cotangent into a buffer keyed off eltype(u).

When the returned closure is itself called under an outer ForwardDiff layer — e.g. ForwardDiff.gradient(p -> sum(abs2, solve(NLLS(...; p)).u), p) or any ForwardDiff.hessian of a solve — the inner solve unwraps p to Float64 and computes uu::Vector{Float64}, but the closure runs under nonlinearsolve_∂f_∂p's ForwardDiff.jacobian(Base.Fix1(vjp_fn, uu), p), where p is reseeded as Vector{Dual}. So inside the closure:

  • u::Vector{Float64}
  • p::Vector{Dual{tag_outer, ...}}
  • Base.Fix2(raw_f, p)(u) outputs Vector{Dual{tag_outer, ...}}

DI's pushforward emulation then tries to splat the Dual cotangent values into a Vector{Float64} output buffer (arroftup_to_tupofarr in DI's linalg.jl), and setindex! rejects them.

Fix

When the inner backend is AutoForwardDiff, materialize the Jacobian via DI.jacobian and form 2 * J' * resid directly. DI.jacobian with AutoForwardDiff dispatches to ForwardDiff.jacobian, which handles nested Duals natively using fresh tags per layer.

For the IIP branch, the raw in-place function is wrapped as an OOP closure so that ForwardDiff.jacobian allocates an output buffer of the correctly-promoted Dual type. Using the raw IIP form would force the Dual nesting order of a pre-allocated output buffer, which doesn't compose under nested tags.

Reverse-mode backends — used for larger NLLS problems via select_reverse_mode_autodiff — keep going through DI.pullback (they don't have this issue, since reverse-mode AD has a real pullback rather than pushforward emulation).

Bumps NonlinearSolveBase to 2.25.1.

Test plan

  • Local repro of forward_diff_tests.jl:90's OOP NLLS test passes after fix (gradient computed correctly)
  • Local repro of forward_ad_tests.jl:128's NLLS Hessian test passes for both the IIP gradient and the IIP Hessian after fix (matches FiniteDiff.finite_difference_gradient / finite_difference_hessian to atol=1e-3)
  • Full SimpleNonlinearSolve core group passes locally on Julia 1.11 (35706 Pass, 4 Broken, 35710 Total | 11m11.3s)
  • Runic.jl --check clean

🤖 Generated with Claude Code

The default `nlls_generate_vjp_function` (no user `jac`/`vjp`) used
`DI.pullback` with `AutoForwardDiff()` for small problems. Under DI,
AutoForwardDiff's pullback is emulated via pushforward, which writes
the cotangent into a buffer keyed off `eltype(u)`. When the closure
runs under an outer ForwardDiff layer (`ForwardDiff.gradient(p ->
sum(abs2, solve(NLLS(...; p)).u), p)`, or any `ForwardDiff.hessian`
of a function that solves an NLLS), the captured outer-Dual `p`
makes the function output `Vector{Dual_outer}` while `u` stays
`Vector{Float64}`, so the pushforward emulation fails with
`MethodError: no method matching Float64(::ForwardDiff.Dual{...})`.

When the inner backend is `AutoForwardDiff`, materialize the
Jacobian via `DI.jacobian` (which dispatches to
`ForwardDiff.jacobian`, handling nested Duals natively via fresh
tags) and form `2 * J' * resid`. Reverse-mode backends — used for
larger NLLS problems via `select_reverse_mode_autodiff` — keep
going through `DI.pullback`.

For the IIP branch we wrap `raw_f` as an OOP closure so that
`ForwardDiff.jacobian` allocates an output buffer of the correctly
promoted Dual type under nested differentiation; using the raw IIP
form would force the Dual ordering of the pre-allocated output
buffer, which doesn't compose under nested tags.

Bumps `NonlinearSolveBase` to 2.25.1.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Pull the AutoForwardDiff path out as a top-level branch instead of an
`is_forwarddiff` flag threaded through both IIP and OOP closures. The
OOP path collapses to a one-liner; the IIP path replaces the
`mul!(reshape(du,1,:), reshape(resid,1,:), J, 2, false)` reshape
gymnastics with `mul!(du, J', ff(u), 2, false)` and reuses the OOP
wrapper closure for the residual computation. Same fix, less code.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
@ChrisRackauckas-Claude ChrisRackauckas-Claude force-pushed the fix/forwarddiff-nlls-vjp-nested branch from 0bed6f5 to 1b58c0a Compare May 17, 2026 11:14
@ChrisRackauckas ChrisRackauckas marked this pull request as ready for review May 17, 2026 13:22
@ChrisRackauckas ChrisRackauckas merged commit 34e9c99 into SciML:master May 17, 2026
119 of 121 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants