Add order=2 Hessian preparation via value_gradient_and_hessian!! #163

Merged: yebai merged 5 commits into main from hg/hessian-order on May 21, 2026

Conversation

@yebai (Member) commented May 19, 2026

Extends `prepare(adtype, problem, x; order=2)` to build Hessian machinery for scalar-valued problems on the DI and Mooncake extensions, returning `(value, gradient, hessian)` from a new `value_gradient_and_hessian!!` generic. Unifies the per-extension caches (`DICache`, `MooncakeCache`) so one struct carries every derivative order, with explicit cross-arity error messages replacing prior `MethodError`s. DI uses the in-place `DI.value_gradient_and_hessian!` with caller-owned buffers; Mooncake uses its native `prepare_hessian_cache` API.

The Hessian is needed by some variational inference algorithms; see TuringLang/AdvancedVI.jl#255.
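
To make the `(value, gradient, hessian)` return shape concrete, here is a minimal sketch in plain Julia. It does not use AbstractPPL, DI, or Mooncake: `quadratic_value_gradient_and_hessian!!` and the quadratic objective are hypothetical, and the derivatives are written by hand rather than computed by AD. It also assumes that `!!` means "may write into caller-provided buffers", so callers should rely on the returned values.

```julia
using LinearAlgebra

# Hypothetical stand-in for a prepared order=2 evaluator (not the AbstractPPL API).
# Objective: f(x) = 0.5 * x' * A * x + b' * x, with A symmetric.
# Writes into caller-owned buffers and returns (value, gradient, hessian).
function quadratic_value_gradient_and_hessian!!(grad, hess, A, b, x)
    mul!(grad, A, x)                        # grad = A * x
    value = 0.5 * dot(x, grad) + dot(b, x)  # f(x)
    grad .+= b                              # grad = A * x + b
    copyto!(hess, A)                        # the Hessian of a quadratic is constant
    return value, grad, hess
end

A = [2.0 1.0; 1.0 3.0]
b = [1.0, -1.0]
x = [1.0, 2.0]

grad_buffer = similar(x)
hess_buffer = Matrix{Float64}(undef, 2, 2)
value, g, H = quadratic_value_gradient_and_hessian!!(grad_buffer, hess_buffer, A, b, x)
```

The returned `g` and `H` are the same arrays as the buffers passed in, which is the caller-owned-buffer pattern the description attributes to the DI path.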

@github-actions (Contributor) commented

AbstractPPL.jl documentation for PR #163 is available at:
https://TuringLang.github.io/AbstractPPL.jl/previews/PR163/

@codecov (Bot) commented May 19, 2026

Codecov Report

❌ Patch coverage is 95.62044% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.69%. Comparing base (1599516) to head (d70a945).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ext/AbstractPPLTestExt.jl | 89.65% | 3 Missing ⚠️ |
| src/evaluators/Evaluators.jl | 84.61% | 2 Missing ⚠️ |
| ext/AbstractPPLDifferentiationInterfaceExt.jl | 98.24% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #163      +/-   ##
==========================================
+ Coverage   88.14%   88.69%   +0.54%     
==========================================
  Files          15       15              
  Lines         886      982      +96     
==========================================
+ Hits          781      871      +90     
- Misses        105      111       +6     

Extends `prepare(adtype, problem, x; order=2)` to build Hessian machinery
for scalar-valued problems on the DI and Mooncake extensions, returning
`(value, gradient, hessian)` from a new `value_gradient_and_hessian!!`
generic. Unifies the per-extension caches (`DICache`, `MooncakeCache`) so
one struct carries every derivative order, with explicit cross-arity error
messages replacing prior `MethodError`s. DI uses the in-place
`DI.value_gradient_and_hessian!` with caller-owned buffers; Mooncake uses
its native `prepare_hessian_cache` API.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@yebai force-pushed the hg/hessian-order branch from 32c6a0d to 2cc3790 on May 19, 2026 21:40
Move the order=1-prep error case for value_gradient_and_hessian!! out
of :edge and into a new :hessian_edge group so Hessian-specific edge
checks are only exercised by preparations that support order=2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@yebai marked this pull request as ready for review May 19, 2026 21:58
@yebai requested a review from shravanngoswamii May 20, 2026 19:56

@shravanngoswamii (Member) left a comment

Just one small suggestion.

Can we validate order before building the evaluator?

Right now order=3 still calls problem once, and the user can get an output-arity error before the actual invalid-order error. See:

julia> using AbstractPPL, ADTypes, DifferentiationInterface, ForwardDiff

julia> calls = Ref(0);

julia> f(x) = (calls[] += 1; [1 2; 3 4]);

julia> AbstractPPL.prepare(AutoForwardDiff(), f, zeros(2); order=3)
ERROR: ArgumentError: A prepared AD evaluator must return a scalar or AbstractVector; got Matrix{Int64}.
Stacktrace:
 [1] _ad_output_arity
   @ ~/Work/vectorly-ai/AbstractPPL.jl/src/evaluators/Evaluators.jl:277 [inlined]
 [2] prepare(adtype::AutoForwardDiff{…}, problem::Function, x::Vector{…}; check_dims::Bool, context::Tuple{}, order::Int64)
   @ AbstractPPLDifferentiationInterfaceExt ~/Work/vectorly-ai/AbstractPPL.jl/ext/AbstractPPLDifferentiationInterfaceExt.jl:75
 [3] top-level scope
   @ REPL[5]:1
Some type information was truncated. Use `show(err)` to see complete types.

julia> calls[]
1

julia>

I expected this to throw ArgumentError("`order` must be 1 or 2, got 3.") without calling f.

Maybe both DI and Mooncake paths can do:

order in (1, 2) || throw(ArgumentError("`order` must be 1 or 2, got $order."))

before the AbstractPPL.prepare(problem, ...) / evaluator(x).
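
The ordering issue can be reproduced without AbstractPPL. The sketch below is only an illustration: `validate_order` and `toy_prepare` are hypothetical names, and `toy_prepare` mimics just the one relevant behaviour of `prepare` (probing `problem` once to inspect its output). With the check placed first, an invalid `order` is rejected before `f` runs.

```julia
# Hypothetical helper mirroring the suggested one-line check.
validate_order(order) =
    order in (1, 2) || throw(ArgumentError("`order` must be 1 or 2, got $order."))

# Hypothetical stand-in for `prepare`: validate first, then probe the problem once.
function toy_prepare(problem, x; order::Int=1)
    validate_order(order)
    y = problem(x)  # only reached when `order` is valid
    return (; order, output_type=typeof(y))
end

calls = Ref(0)
f(x) = (calls[] += 1; [1 2; 3 4])

# Capture the exception instead of letting it propagate.
err = try
    toy_prepare(f, zeros(2); order=3)
    nothing
catch e
    e
end
```

Here `err` is the invalid-order `ArgumentError` and `calls[]` stays at `0`, which is the behaviour the review asks for.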

`Prepared` gains an `Order` type parameter (`1` for gradient/jacobian, `2`
for Hessian) with an `order(::Prepared)` accessor, so the prep order can be
retrieved reliably without inspecting the backend-specific cache type.

`value_and_gradient!!` on an order=2 prep now returns `(value, gradient)` via
a dedicated gradient prep built alongside the Hessian prep — no O(n²) Hessian
work for a gradient-only call. For DI's `SecondOrder` backend the gradient
prep uses `DI.inner(adtype)` per DI's convention; the same unwrap runs on the
hot path so prep and call use matching adtypes.

`order` is now validated up-front via `Evaluators._validate_ad_order` (was
duplicated across both extensions and fired only after the structural prep
had already called `problem` once).

DI: `DICache` is replaced by three concrete types — `DIGradientCache`,
`DIJacobianCache`, `DIHessianCache` — eliminating the 6-nullable-field
struct and runtime `=== nothing` checks. `_di_call_shape` is the shared
target-and-constants helper used by both `_prepare_di` (order=1) and the
order=2 path; the two preps share one target instance so compiled-tape
ReverseDiff sees a consistent `Fix2` closure.

Mooncake: `MooncakeCache` gains a `gradient_cache` field populated only at
order=2; `_mooncake_gradient_cache` is now used by the NamedTuple path, the
order=1 scalar branch, and the order=2 gradient prep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
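
As a rough illustration of the cache design described in this commit, the sketch below replaces nullable fields with concrete cache types and lets a Hessian cache carry its own gradient cache. All names (`ToyGradientCache`, `ToyHessianCache`, `toy_value_and_gradient!!`, `toy_value_gradient_and_hessian!!`) are hypothetical, the objective `f(x) = sum(x .^ 2)` is differentiated by hand, and the real `DIGradientCache`/`DIHessianCache` hold backend preparation objects instead of plain buffers.

```julia
# Hypothetical caches: one concrete type per derivative order, no `nothing` fields.
struct ToyGradientCache
    grad::Vector{Float64}
end

struct ToyHessianCache
    gradient_cache::ToyGradientCache  # built alongside the Hessian buffer
    hess::Matrix{Float64}
end

# Toy objective f(x) = sum(x .^ 2), with a hand-written gradient 2x.
function toy_value_and_gradient!!(c::ToyGradientCache, x)
    c.grad .= 2 .* x
    return sum(abs2, x), c.grad
end

# A gradient-only call on a Hessian cache forwards to the gradient cache,
# so no n-by-n work is done.
toy_value_and_gradient!!(c::ToyHessianCache, x) =
    toy_value_and_gradient!!(c.gradient_cache, x)

function toy_value_gradient_and_hessian!!(c::ToyHessianCache, x)
    value, grad = toy_value_and_gradient!!(c.gradient_cache, x)
    fill!(c.hess, 0.0)
    for i in eachindex(x)
        c.hess[i, i] = 2.0  # the Hessian of sum(x .^ 2) is 2I
    end
    return value, grad, c.hess
end

# Asking a gradient-only cache for a Hessian fails with an explicit message
# rather than a MethodError.
toy_value_gradient_and_hessian!!(::ToyGradientCache, x) =
    throw(ArgumentError("Hessian requested, but this cache was prepared with order=1."))

x = [1.0, 2.0, 3.0]
hcache = ToyHessianCache(ToyGradientCache(zeros(3)), zeros(3, 3))
value, grad = toy_value_and_gradient!!(hcache, x)
```

Dispatch on the cache type selects the code path, so no runtime `=== nothing` check is needed in this sketch.
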

@yebai (Member, Author) commented May 21, 2026

Thanks for catching this! Fixed in b98f027, together with a few related improvements helpful for AdvancedHMC/AdvancedVI:

  • order is a type parameter of Prepared with a public order(::Prepared) accessor, so it can be retrieved reliably without inspecting cache internals.
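
One way to picture the accessor is the sketch below. `ToyPrepared`, its constructor, and `supports_hessian` are hypothetical and far simpler than the real `Prepared`; the point is only that the order lives in the type, so it can be read back or dispatched on without looking inside the cache.

```julia
# Hypothetical wrapper: `Order` is a type parameter, `cache` is backend-specific.
struct ToyPrepared{Order,C}
    cache::C
end

# Convenience constructor so callers only spell out the order.
ToyPrepared{Order}(cache::C) where {Order,C} = ToyPrepared{Order,C}(cache)

# The order is recovered from the type, not from the cache contents.
order(::ToyPrepared{Order}) where {Order} = Order

grad_prep = ToyPrepared{1}(:some_gradient_cache)
hess_prep = ToyPrepared{2}((; gradient=:g, hessian=:h))

# Downstream code can dispatch on the order directly.
supports_hessian(::ToyPrepared{2}) = true
supports_hessian(::ToyPrepared) = false
```

In this sketch `order(hess_prep)` returns `2` regardless of what the cache holds, which is the property the comment describes as useful for downstream packages.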

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@shravanngoswamii (Member) left a comment

Looks good to me! Thank you!

@yebai merged commit 4390c43 into main May 21, 2026
15 checks passed
@yebai deleted the hg/hessian-order branch May 21, 2026 16:10