Suppose we have a function $f: \mathbb{R}^n \to \mathbb{R}$. Using dual numbers, we can input a dual vector $x + J\epsilon$ and get out a dual number. This induces a function $g: \mathbb{R}^n \to \mathbb{R}^n$ defined by $g(x) = \text{dual parts of } f(x + J\epsilon)$, which is the gradient of $f$ when $J = I$. If we now input $y + I\epsilon'$ into $g$, where $\epsilon_i\epsilon'_j \neq 0$ for all $1 \leq i, j \leq n$, the $\epsilon_i\epsilon'_j$ terms give us the Hessian coefficients.
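To see why the mixed terms carry the Hessian, here is a short expansion. It assumes $J = I$, that $f$ is twice continuously differentiable, that the units commute, and that $\epsilon_i\epsilon_k = 0$ and $\epsilon'_j\epsilon'_l = 0$ within each family (the usual dual number rule applied to each family separately):

$$f(y + I\epsilon + I\epsilon') = f(y) + \sum_{i} \partial_i f(y)\,\epsilon_i + \sum_{j} \partial_j f(y)\,\epsilon'_j + \sum_{i,j} \partial_i \partial_j f(y)\,\epsilon_i\epsilon'_j$$

All higher-order terms vanish because any product of three units contains two from the same family. The coefficient of $\epsilon_i\epsilon'_j$ is therefore $H_{ij} = \partial_i \partial_j f(y)$.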
In practice, allowing the nested constructors `DualVector(::DualVector, ::AbstractMatrix)` and `Dual(::Dual, ::DualVector)` should have the same effect, and would open the door to discussion and examples of second-order methods in DualArrays.jl.
Example:
f(x) = x[1] * x[2]
g(x) = f(DualVector(x, I(length(x)))).partials
d = DualVector([1, 2], I(2))
- In order to evaluate `g(d)`, we start with a `DualVector(d, I(2))`.
- We then pass it through `f`, giving us `Dual(d[1], [1, 0]) * Dual(d[2], [0, 1])`.
- The product rule makes this evaluate to `Dual(d[1] * d[2], [d[2], d[1]])`. (Note: we can overload operations such that `[d[2], d[1]] = d[2] * [1, 0] + d[1] * [0, 1]` returns a `DualVector`. In general, for the second argument to be a `DualVector` instead of a `Vector{Dual}`, operations between `Dual` and `AbstractVector` also need to be overloaded.)
- The Jacobian of `[d[2], d[1]]` is a matrix of nested dual parts, so this is our Hessian. From the definition of `d`, `d[2]` has dual part `[0, 1]` and `d[1]` has dual part `[1, 0]`, which gives the correct Hessian, `[0 1; 1 0]`.
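The steps above can be checked with a small standalone sketch. It does not use the DualArrays.jl API: `MiniDual` and `seed` are hypothetical names made up for this illustration, and each element carries its own partials vector rather than sharing a Jacobian matrix as a `DualVector` would. It only aims to show that nesting duals recovers `[0 1; 1 0]` for this `f`.

```julia
# Hypothetical minimal dual number: value + partials' * ε.
# T may itself be a MiniDual, which is what allows nesting.
struct MiniDual{T}
    value::T
    partials::Vector{T}
end

# Zero and one of a (possibly nested) dual, taken from an instance.
Base.zero(a::MiniDual) = MiniDual(zero(a.value), map(zero, a.partials))
Base.one(a::MiniDual) = MiniDual(one(a.value), map(zero, a.partials))

# Sum rule and product rule, written with `map` so they also apply
# when the partials are duals themselves.
Base.:+(a::MiniDual, b::MiniDual) =
    MiniDual(a.value + b.value, map(+, a.partials, b.partials))
Base.:*(a::MiniDual, b::MiniDual) =
    MiniDual(a.value * b.value,
             map((pa, pb) -> a.value * pb + b.value * pa, a.partials, b.partials))

# Seed x with identity partials, i.e. x[i] + ε_i (stand-in for DualVector(x, I(n))).
seed(x) = [MiniDual(x[i], [i == j ? one(x[i]) : zero(x[i]) for j in eachindex(x)])
           for i in eachindex(x)]

f(x) = x[1] * x[2]
g(x) = f(seed(x)).partials     # dual parts of f, i.e. the gradient

d = seed([1.0, 2.0])           # first layer of duals
gd = g(d)                      # equals [d[2], d[1]], still carrying dual parts

# The nested dual parts of the gradient form the Hessian.
H = [gd[i].partials[j] for i in 1:2, j in 1:2]
```

Here `gd[1]` and `gd[2]` have values `2.0` and `1.0` (the gradient at `[1, 2]`), and `H` is `[0.0 1.0; 1.0 0.0]`, matching the hand calculation.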