Add order=2 Hessian preparation via value_gradient_and_hessian!! #163
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #163      +/-   ##
==========================================
+ Coverage   88.14%   88.69%   +0.54%
==========================================
  Files          15       15
  Lines         886      982      +96
==========================================
+ Hits          781      871      +90
- Misses        105      111       +6
Extends `prepare(adtype, problem, x; order=2)` to build Hessian machinery for scalar-valued problems on the DI and Mooncake extensions, returning `(value, gradient, hessian)` from a new `value_gradient_and_hessian!!` generic. Unifies the per-extension caches (`DICache`, `MooncakeCache`) so one struct carries every derivative order, with explicit cross-arity error messages replacing prior `MethodError`s. DI uses the in-place `DI.value_gradient_and_hessian!` with caller-owned buffers; Mooncake uses its native `prepare_hessian_cache` API.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the order=1-prep error case for `value_gradient_and_hessian!!` out of `:edge` and into a new `:hessian_edge` group so Hessian-specific edge checks are only exercised by preparations that support order=2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Just one small suggestion.
Can we validate order before building the evaluator?
Right now order=3 still calls problem once, and the user can get an output-arity error before the actual invalid-order error. See:
julia> using AbstractPPL, ADTypes, DifferentiationInterface, ForwardDiff
julia> calls = Ref(0);
julia> f(x) = (calls[] += 1; [1 2; 3 4]);
julia> AbstractPPL.prepare(AutoForwardDiff(), f, zeros(2); order=3)
ERROR: ArgumentError: A prepared AD evaluator must return a scalar or AbstractVector; got Matrix{Int64}.
Stacktrace:
[1] _ad_output_arity
@ ~/Work/vectorly-ai/AbstractPPL.jl/src/evaluators/Evaluators.jl:277 [inlined]
[2] prepare(adtype::AutoForwardDiff{…}, problem::Function, x::Vector{…}; check_dims::Bool, context::Tuple{}, order::Int64)
@ AbstractPPLDifferentiationInterfaceExt ~/Work/vectorly-ai/AbstractPPL.jl/ext/AbstractPPLDifferentiationInterfaceExt.jl:75
[3] top-level scope
@ REPL[5]:1
Some type information was truncated. Use `show(err)` to see complete types.
julia> calls[]
1
I expected this to throw ArgumentError("`order` must be 1 or 2, got 3.") without calling f.
Maybe both DI and Mooncake paths can do:
order in (1, 2) || throw(ArgumentError("`order` must be 1 or 2, got $order."))
before the AbstractPPL.prepare(problem, ...) / evaluator(x) call.
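The suggested ordering can be sketched in plain Julia. This is only an illustration: `validate_order` and `sketch_prepare` are hypothetical names, not functions from AbstractPPL.jl, and the arity check is a simplified stand-in for the real one. The point is that the order check runs before the problem is probed.

```julia
# Hypothetical helper (assumption): not the AbstractPPL.jl implementation.
function validate_order(order::Integer)
    order in (1, 2) || throw(ArgumentError("`order` must be 1 or 2, got $order."))
    return order
end

# Hypothetical stand-in for `prepare`: validate `order` first, then probe the problem.
function sketch_prepare(problem, x; order::Integer=1)
    validate_order(order)   # fails fast, before `problem` is ever called
    y = problem(x)          # structural probe used for the output-arity check
    y isa Union{Number,AbstractVector} || throw(
        ArgumentError("A prepared AD evaluator must return a scalar or AbstractVector; got $(typeof(y)).")
    )
    return (; order, y)
end

calls = Ref(0)
f(x) = (calls[] += 1; [1 2; 3 4])   # same matrix-valued problem as in the report above

# Capture the error instead of letting it propagate.
err = try
    sketch_prepare(f, zeros(2); order=3)
catch e
    e
end
```

With this ordering, `err` is the invalid-order `ArgumentError` and `calls[]` stays at zero, because `f` is never reached.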
`Prepared` gains an `Order` type parameter (`1` for gradient/jacobian, `2` for Hessian) with an `order(::Prepared)` accessor, so the prep order can be retrieved reliably without inspecting the backend-specific cache type.

`value_and_gradient!!` on an order=2 prep now returns `(value, gradient)` via a dedicated gradient prep built alongside the Hessian prep, so a gradient-only call does no O(n²) Hessian work. For DI's `SecondOrder` backend the gradient prep uses `DI.inner(adtype)` per DI's convention; the same unwrap runs on the hot path so prep and call use matching adtypes.

`order` is now validated up-front via `Evaluators._validate_ad_order` (the check was previously duplicated across both extensions and fired only after the structural prep had already called `problem` once).

DI: `DICache` is replaced by three concrete types (`DIGradientCache`, `DIJacobianCache`, `DIHessianCache`), eliminating the 6-nullable-field struct and runtime `=== nothing` checks. `_di_call_shape` is the shared target-and-constants helper used by both `_prepare_di` (order=1) and the order=2 path; the two preps share one target instance so compiled-tape ReverseDiff sees a consistent `Fix2` closure.

Mooncake: `MooncakeCache` gains a `gradient_cache` field populated only at order=2; `_mooncake_gradient_cache` is now used by the NamedTuple path, the order=1 scalar branch, and the order=2 gradient prep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
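As a rough illustration of carrying the order in a type parameter, here is a minimal sketch. `SketchPrepared`, its `cache` field, and `hessian_supported` are hypothetical names chosen for this example; the actual `Prepared` struct in AbstractPPL.jl may be laid out differently.

```julia
# Hypothetical sketch (assumption): names are illustrative, not AbstractPPL.jl's.
struct SketchPrepared{Order,C}
    cache::C
end

# Convenience constructor: the caller fixes `Order`, the cache type is inferred.
SketchPrepared{Order}(cache::C) where {Order,C} = SketchPrepared{Order,C}(cache)

# The order is read from the type, not from the backend-specific cache.
order(::SketchPrepared{Order}) where {Order} = Order

# Dispatch on the order parameter: order=2 is accepted, anything else errors clearly.
hessian_supported(::SketchPrepared{2}) = true
function hessian_supported(p::SketchPrepared)
    throw(ArgumentError("Hessian requested from a prep with order=$(order(p)); prepare with `order=2`."))
end

p1 = SketchPrepared{1}(:gradient_cache)
p2 = SketchPrepared{2}(:hessian_cache)
```

Because the order lives in the type, a call such as `hessian_supported(p1)` is resolved by dispatch and raises an `ArgumentError` rather than a `MethodError`, without any runtime `=== nothing` check on the cache.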
Thanks for catching this! Fixed in b98f027, together with a few related improvements helpful for AdvancedHMC/AdvancedVI:
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends `prepare(adtype, problem, x; order=2)` to build Hessian machinery for scalar-valued problems on the DI and Mooncake extensions, returning `(value, gradient, hessian)` from a new `value_gradient_and_hessian!!` generic. Unifies the per-extension caches (`DICache`, `MooncakeCache`) so one struct carries every derivative order, with explicit cross-arity error messages replacing prior `MethodError`s. DI uses the in-place `DI.value_gradient_and_hessian!` with caller-owned buffers; Mooncake uses its native `prepare_hessian_cache` API.

A Hessian is needed by some variational inference algorithms; see TuringLang/AdvancedVI.jl#255.
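To make the "caller-owned buffers" idea concrete, here is a small sketch of an in-place routine that returns `(value, gradient, hessian)`. It does not call any AD backend: the hypothetical function `quad_value_gradient_and_hessian!` uses the analytic derivatives of a quadratic, f(x) = x'Ax/2 - b'x, purely as a stand-in, and it assumes `A` is symmetric.

```julia
using LinearAlgebra

# Hypothetical stand-in (assumption): analytic derivatives replace the AD backend.
# Writes the gradient and Hessian into buffers supplied by the caller.
function quad_value_gradient_and_hessian!(grad, hess, A, b, x)
    mul!(grad, A, x)                      # grad = A*x, stored in the caller's buffer
    value = dot(x, grad) / 2 - dot(b, x)  # f(x) = x'Ax/2 - b'x
    grad .-= b                            # grad = A*x - b
    copyto!(hess, A)                      # Hessian of the quadratic is A (A symmetric)
    return value, grad, hess
end

A = [2.0 1.0; 1.0 3.0]
b = [1.0, -1.0]
x = [1.0, 2.0]
grad = similar(x)   # caller-owned gradient buffer
hess = similar(A)   # caller-owned Hessian buffer

value, g, H = quad_value_gradient_and_hessian!(grad, hess, A, b, x)
```

The returned `g` and `H` are the same arrays as `grad` and `hess`, so repeated calls can reuse the buffers without allocating new ones.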