Fix nested ForwardDiff over NonlinearLeastSquaresProblem default vjp #932
Merged
ChrisRackauckas merged 3 commits on May 17, 2026
The default `nlls_generate_vjp_function` (no user `jac`/`vjp`) used
`DI.pullback` with `AutoForwardDiff()` for small problems. Under DI,
AutoForwardDiff's pullback is emulated via pushforward, which writes
the cotangent into a buffer keyed off `eltype(u)`. When the closure
runs under an outer ForwardDiff layer (`ForwardDiff.gradient(p ->
sum(abs2, solve(NLLS(...; p)).u), p)`, or any `ForwardDiff.hessian`
of a function that solves an NLLS), the captured outer-Dual `p`
makes the function output `Vector{Dual_outer}` while `u` stays
`Vector{Float64}`, so the pushforward emulation fails with
`MethodError: no method matching Float64(::ForwardDiff.Dual{...})`.
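The type mismatch can be reproduced in isolation. The following is a minimal sketch, not DI's actual code path: `raw_f` and the hand-built Dual are hypothetical stand-ins for the user residual and for the outer-seeded `p`.

```julia
using ForwardDiff

# Hypothetical residual; the real one comes from the user's NLLS problem.
raw_f(u, p) = u .* p[1]

u = [1.0, 3.0]                      # inner solution, plain Float64
p = [ForwardDiff.Dual(2.0, 1.0)]    # stand-in for the outer-Dual parameter

out = Base.Fix2(raw_f, p)(u)        # output carries the Dual element type
buf = similar(u)                    # Vector{Float64}, keyed off eltype(u)

# Storing a Dual in the Float64 buffer requires a Dual -> Float64 conversion,
# which ForwardDiff does not define, so setindex! throws.
err = try
    buf[1] = out[1]
    nothing
catch e
    e
end
```

Here `err` holds the caught exception, so the failure can be inspected without aborting the script.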
When the inner backend is `AutoForwardDiff`, materialize the
Jacobian via `DI.jacobian` (which dispatches to
`ForwardDiff.jacobian`, handling nested Duals natively via fresh
tags) and form `2 * J' * resid`. Reverse-mode backends — used for
larger NLLS problems via `select_reverse_mode_autodiff` — keep
going through `DI.pullback`.
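As a rough illustration of the Jacobian-based path, the sketch below calls `ForwardDiff.jacobian` directly rather than going through `DI.jacobian`. The residual `resid` and the helper `vjp_via_jacobian` are hypothetical names chosen for this example, not the code in `NonlinearSolveBase`.

```julia
using ForwardDiff

# Hypothetical out-of-place residual r(u, p); not taken from the PR.
resid(u, p) = [u[1] - p[1], u[2] * p[2] - 1.0, u[1] + u[2]]

# Gradient of sum(abs2, r(u, p)) with respect to u, formed as 2 * J' * r.
function vjp_via_jacobian(u, p)
    f = Base.Fix2(resid, p)
    J = ForwardDiff.jacobian(f, u)   # inner layer, seeds u with its own tag
    return 2 .* (J' * f(u))
end

u = [0.5, 2.0]
p = [1.0, 3.0]

# Outer ForwardDiff layer over p: inside, p is Dual-valued while u stays Float64.
dvjp_dp = ForwardDiff.jacobian(q -> vjp_via_jacobian(u, q), p)
```

Because the inner Jacobian is built from the function's own return value, its element type follows the captured `p`, and no `Float64` buffer is involved.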
For the IIP branch we wrap `raw_f` as an OOP closure so that
`ForwardDiff.jacobian` allocates an output buffer of the correctly
promoted Dual type under nested differentiation; using the raw IIP
form would force the Dual ordering of the pre-allocated output
buffer, which doesn't compose under nested tags.
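One way such a wrapper could look is sketched below. This is an assumption about the general shape, not the PR's implementation: `raw_f!`, `oop_wrapper`, `iip_vjp`, and the way the buffer element type is derived are all hypothetical.

```julia
using ForwardDiff

# Hypothetical in-place residual with 3 outputs; not taken from the PR.
function raw_f!(du, u, p)
    du[1] = u[1] - p[1]
    du[2] = u[2] * p[2] - 1.0
    du[3] = u[1] + u[2]
    return nothing
end

# OOP wrapper: the buffer is allocated on every call, so its element type
# follows whatever number types u and p carry at that moment.
function oop_wrapper(p, n_resid)
    return function (u)
        T = typeof(one(eltype(u)) * one(eltype(p)))
        du = Vector{T}(undef, n_resid)
        raw_f!(du, u, p)
        return du
    end
end

function iip_vjp(u, p)
    ff = oop_wrapper(p, 3)
    J = ForwardDiff.jacobian(ff, u)
    return 2 .* (J' * ff(u))
end

u = [0.5, 2.0]
p = [1.0, 3.0]

# Outer layer over p, with the in-place residual hidden behind the wrapper.
dvjp_dp = ForwardDiff.jacobian(q -> iip_vjp(u, q), p)
```

Deriving `T` from a product of `one` values lets ForwardDiff's own arithmetic decide the nesting order of the two Dual layers, instead of fixing it in a buffer allocated ahead of time.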
Bumps `NonlinearSolveBase` to 2.25.1.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Pull the AutoForwardDiff path out as a top-level branch instead of an `is_forwarddiff` flag threaded through both IIP and OOP closures. The OOP path collapses to a one-liner; the IIP path replaces the `mul!(reshape(du,1,:), reshape(resid,1,:), J, 2, false)` reshape gymnastics with `mul!(du, J', ff(u), 2, false)` and reuses the OOP wrapper closure for the residual computation. Same fix, less code.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
0bed6f5 to 1b58c0a
Summary
Fixes the `MethodError: no method matching Float64(::ForwardDiff.Dual{...})` that triggers in CI for the `core` test group on master, e.g.
- `lib/SimpleNonlinearSolve/test/core/forward_diff_tests.jl:90`: `ForwardDiff.jl Integration NonlinearLeastSquaresProblem`
- `test/forward_ad_tests.jl:128`: `NLLS Hessian` SciML/NonlinearSolve.jl#445
These showed up unrelated to the PR that surfaced them (e.g. https://github.com/SciML/NonlinearSolve.jl/actions/runs/25295011499/job/74152282985, https://github.com/SciML/NonlinearSolve.jl/actions/runs/25295011493/job/74152283034); they regress on master.
Root cause
`NonlinearSolveBase.nlls_generate_vjp_function` (no user `jac`/`vjp` branch) uses `DI.pullback` with `AutoForwardDiff()` for small problems. Under `DifferentiationInterface`, `AutoForwardDiff`'s pullback is emulated via pushforward, and the emulation writes the cotangent into a buffer keyed off `eltype(u)`.
When the returned closure is itself called under an outer ForwardDiff layer, e.g. `ForwardDiff.gradient(p -> sum(abs2, solve(NLLS(...; p)).u), p)` or any `ForwardDiff.hessian` of a solve, the inner solve unwraps `p` to `Float64` and computes `uu::Vector{Float64}`, but the closure runs under `nonlinearsolve_∂f_∂p`'s `ForwardDiff.jacobian(Base.Fix1(vjp_fn, uu), p)`, where `p` is reseeded as `Vector{Dual}`. So inside the closure:
- `u::Vector{Float64}`
- `p::Vector{Dual{tag_outer, ...}}`
- `Base.Fix2(raw_f, p)(u)` outputs `Vector{Dual{tag_outer, ...}}`
DI's pushforward emulation then tries to splat the Dual cotangent values into a `Vector{Float64}` output buffer (`arroftup_to_tupofarr` in DI's `linalg.jl`), and `setindex!` rejects them.
Fix
When the inner backend is `AutoForwardDiff`, materialize the Jacobian via `DI.jacobian` and form `2 * J' * resid` directly. `DI.jacobian` with `AutoForwardDiff` dispatches to `ForwardDiff.jacobian`, which handles nested Duals natively using fresh tags per layer.
For the IIP branch, the raw in-place function is wrapped as an OOP closure so that `ForwardDiff.jacobian` allocates an output buffer of the correctly-promoted Dual type. Using the raw IIP form would force the Dual nesting order of a pre-allocated output buffer, which doesn't compose under nested tags.
Reverse-mode backends, used for larger NLLS problems via `select_reverse_mode_autodiff`, keep going through `DI.pullback` (they don't have this issue, since reverse-mode AD has a real pullback rather than pushforward emulation).
Bumps `NonlinearSolveBase` to 2.25.1.
Test plan
- `forward_diff_tests.jl:90`'s OOP NLLS test passes after fix (gradient computed correctly)
- `forward_ad_tests.jl:128`'s `NLLS Hessian` test passes for both the IIP gradient and the IIP Hessian after fix (matches `FiniteDiff.finite_difference_gradient` / `finite_difference_hessian` to atol=1e-3)
- `SimpleNonlinearSolve` `core` group passes locally on Julia 1.11 (35706 Pass, 4 Broken, 35710 Total | 11m11.3s)
- `Runic.jl --check` clean
🤖 Generated with Claude Code