Add tagdepth fast path to tag comparison by ChrisRackauckas · Pull Request #807 · JuliaDiff/ForwardDiff.jl

ChrisRackauckas · 2026-05-13T20:29:40Z

Fixes SciML/OrdinaryDiffEq.jl#3381 and superscedes SciML/OrdinaryDiffEq.jl#3587 . Also fixes NonlinearSolve.jl master and superscedes SciML/NonlinearSolve.jl#932

Superscedes #724 and is a better solution to #714.

The crux of the issue is that ForwardDiff.jl's tagging system is somewhat designed around the tag only being used once, i.e. the function is created, the derivative function is called, the tag is set for that derivative as a type of the function being differentiated, and therefore it's unique. Then this ends up working with nested differentiation because you call the inner function first, usually, before the outer function, or only do the combination, and so the tag ordering is set correctly.

Mixing tagging with precompilation then leads to this issue where it's possible for the outer tag to be precompiled before the inner tag. This makes the tag ordering the opposite, and what happens is then that the type promotion mechanism gets confused because it is tied to the tag ordering. This seems pretty fundamental because it's a useful property, it's the core property used to prevent perturbation confusion, but it means that this interaction between nested differentiation and precompilation ends up having odd bugs.

I tried working around this downstream (SciML/OrdinaryDiffEq.jl#3587) but it was very nasty. Basically, you had to make sure you didn't have dual numbers automatically converting Float64s, as then sometimes it could convert to the inner type instead of the outer type, and it wouldn't do the normal conversion of first to the inner to then the wrapped outer type because doing so required the outer type to postdate the inner type.

But, this really then showcases that the bug truly only manifests with nested types. And if you have nested types, you know you don't have perturbation confusion if one tag is nested deeper than the other tag, because there are not the same number of partials. So in the case where the tag depths are not the same, you can do an alternative tag ordering since you will have already proven perturbations aren't confusing. And in that case, you can choose the deeper nested tags to just always be < the less deeper tags.

So added that and poof, tag nesting worked out in these cases with precompilation. So I think this captures the true crux of the problem and solves it at its core.

Fixes SciML/OrdinaryDiffEq.jl#3381 and superscedes SciML/OrdinaryDiffEq.jl#3587 . Also fixes NonlinearSolve.jl master and superscedes SciML/NonlinearSolve.jl#932 Superscedes #724 and is a better solution to #714. The crux of the issue is that ForwardDiff.jl's tagging system is somewhat designed around the tag only being used once, i.e. the function is created, the derivative function is called, the tag is set for that derivative as a type of the function being differentiated, and therefore it's unique. Then this ends up working with nested differentiation because you call the inner function first, usually, before the outer function, or only do the combination, and so the tag ordering is set correctly. Mixing tagging with precompilation then leads to this issue where it's possible for the outer tag to be precompiled before the inner tag. This makes the tag ordering the opposite, and what happens is then that the type promotion mechanism gets confused because it is tied to the tag ordering. This seems pretty fundamental because it's a useful property, it's the core property used to prevent perturbation confusion, but it means that this interaction between nested differentiation and precompilation ends up having odd bugs. I tried working around this downstream (SciML/OrdinaryDiffEq.jl#3587) but it was very nasty. Basically, you had to make sure you didn't have dual numbers automatically converting Float64s, as then sometimes it could convert to the inner type instead of the outer type, and it wouldn't do the normal conversion of first to the inner to then the wrapped outer type because doing so required the outer type to postdate the inner type. But, this really then showcases that the bug truly only manifests with nested types. And if you have nested types, you know you don't have perturbation confusion if one tag is nested deeper than the other tag, because there are not the same number of partials. So in the case where the tag depths are not the same, you can do an alternative tag ordering since you will have already proven perturbations aren't confusing. And in that case, you can choose the deeper nested tags to just always be `<` the less deeper tags. So added that and poof, tag nesting worked out in these cases with precompilation. So I think this captures the true crux of the problem and solves it at its core.

codecov · 2026-05-13T20:37:57Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.80%. Comparing base (7e52ffa) to head (8cc8c72).

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #807      +/-   ##
==========================================
+ Coverage   90.75%   90.80%   +0.05%     
==========================================
  Files          11       11              
  Lines        1071     1077       +6     
==========================================
+ Hits          972      978       +6     
  Misses         99       99

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

devmotion · 2026-05-13T21:16:22Z

+@inline tagdepth(::Type) = 0
+@inline tagdepth(::Type{<:Dual{T,V,N}}) where {T,V,N} = 1 + tagdepth(V)
+@inline tagdepth(::Type{<:Tag{F,V}}) where {F,V} = 1 + tagdepth(V)


Why not just defining this function on ::Type and Type{<:Dual}? Why ::Type{<:Tag} as well?

Because when you nest it's a tag of duals

devmotion · 2026-05-13T21:20:47Z

+      d1 = tagdepth(Tag{F1,V1})
+      d2 = tagdepth(Tag{F2,V2})
+      d1 != d2 && return d1 < d2


You can define Tags basically arbitrarily, so why would this be safe?

If the tags are defined according to the interfaces in the package, T is the type being differentiated. If T is the type being differentiated, then this gaurentees tag ordering by differentiation hierarchy. We previously had PRs closed about documenting this saying that it is non public API, so since this package always obeys that invarient internally and it's purposefully non public, it would be non breaking to enforce it.

devmotion · 2026-05-13T21:21:59Z

+      d1 = tagdepth(Tag{F1,V1})
+      d2 = tagdepth(Tag{F2,V2})
+      d1 != d2 && return d1 < d2
+      return tagcount(Tag{F1,V1}) < tagcount(Tag{F2,V2})


It seems one could still run into the same precompilation-caused problems here, e.g., if V1 === V2 (e.g. both Float64)?

Build an example? All of the examples from before that cannot happen because it's not Float64 in both cases but Dual of Float64, and that tag nesting is exactly the part you missed from the earlier part.

ChrisRackauckas · 2026-05-14T00:07:01Z

@devmotion From your comments I think you missed the core part of this. The tag by design does f and eltype (V). That eltype when nested is itself a Tag Dual. In a nested differentiation context that establishes its own natural ordering for the duals, as in the nested context the promotion of duals to the outer dual via appending zero partials to the inner dual is the natural action.

As a safety check, what I can do is add a proof that V1 nests V2 exactly as well, i.e. the stripped V1 matches V2. In that case the definition is very clear, and this is the actual case that is triggered by precompilation issue.

devmotion · 2026-05-15T21:28:35Z

Following my earlier review comments, I had a gut feeling that the depth-based fast path solves the symptom in some common cases but isn't really fixing the root cause — and I worried it might even produce incorrect results in cases where it triggers. To check, I challenged Claude to construct problematic examples for this PR. Two emerged; both reproduce empirically against this branch on Julia 1.12.

TL;DR

The PR introduces a regression. A 3-level nested differentiation where the innermost derivative is seeded with a constant scalar previously worked on master, but throws DualMismatchError with this PR applied. No precompile race required — it reproduces in a fresh session.
The PR does not fix the same-depth case. When both interacting tags have V === Float64 (i.e. the canonical "outer derivative of an inner scalar derivative whose closure captures the outer point"), the new tagdepth fast path is bypassed and the comparison falls back to the same tagcount branch as master. An inverted tagcount ordering produces a DualMismatchError both with and without the PR.

The mechanism behind (1): tagdepth(::Type{<:Tag{F,V}}) = 1 + tagdepth(V) looks through V only, not F. When the inner derivative is seeded with a literal Float64, the inner tag is Tag{<closure>, Float64} with depth 1 — even though that closure captures a Dual{T_middle, ..., 2} (the outer-nesting context is in F, not in V). The fast path then orders this tag inside the depth-2 middle tag; the binary-op macro at src/dual.jl:154 dispatches to y_body; the result no longer has the inner tag as outermost; and the inner partials(T_inner, result) lands on the throw(DualMismatchError(T,S)) branch at dual.jl:110-115.

Example 1 — regression: 3-level nested derivative with constant inner seed

using ForwardDiff

inner_deriv(d) = ForwardDiff.derivative(y -> y^2 * d, 1.0)     # = 2*d
middle_grad(v) = ForwardDiff.gradient(v) do u
    sum(inner_deriv(ui) * ui for ui in u)                       # = sum(2*ui^2)
end                                                             # grad = [4*u1, 4*u2]
outer_fn(x) = sum(middle_grad([x, 2x]))                         # = 12x

ForwardDiff.derivative(outer_fn, 0.5)                           # analytic: 12.0

	master (`7e52ffa`)	this PR
Result	`12.0`	`DualMismatchError`

Tags involved at the failing multiplication y * d inside inner_deriv:

Tag	`V`	`tagdepth`
`T_outer = Tag{typeof(outer_fn), Float64}`	`Float64`	1
`T_middle = Tag{<gradient closure>, Dual{T_outer, Float64, 1}}`	`Dual{...}`	2
`T_inner = Tag{<derivative closure>, Float64}` (constant seed)	`Float64`	1

The PR's fast path sees tagdepth(T_middle)=2 > tagdepth(T_inner)=1 and orders T_inner ≺ T_middle. But in execution order T_inner is created last — its derivative is the one being extracted by inner_deriv — so it must be the outermost tag in the result type, not the inner one. (Note that the inner tag's F parameter is a closure whose type does contain a Dual{T_middle, ...}, but tagdepth doesn't recurse into F.)

A simpler variant also throws on this PR and works on master:

inner2(d) = ForwardDiff.derivative(y -> y * d, 1.0)             # = d
mid2(v)   = ForwardDiff.gradient(u -> sum(inner2.(u) .* u), v)  # grad of sum(ui^2) = [2*u1, 2*u2]
out2(x)   = sum(mid2([x, x]))                                   # = 4x
ForwardDiff.derivative(out2, 0.5)                               # analytic: 4.0

Example 2 — same-depth still broken: inverted tagcount with V=Float64 on both sides

using ForwardDiff
const Tag = ForwardDiff.Tag

struct OuterF end
struct InnerF
    x_dual::ForwardDiff.Dual{Tag{OuterF, Float64}, Float64, 1}
end
(c::InnerF)(y) = sin(c.x_dual * y)
(::OuterF)(x_dual::ForwardDiff.Dual{Tag{OuterF, Float64}, Float64, 1}) =
    ForwardDiff.derivative(InnerF(x_dual), 1.0)

# Force tagcount to be evaluated in INVERTED order (simulating a precompile race):
ForwardDiff.tagcount(Tag{InnerF, Float64})    # returns 0
ForwardDiff.tagcount(Tag{OuterF, Float64})    # returns 1

# Analytic: d/dx (d/dy sin(x*y)|_{y=1}) at x=0.5 = cos(0.5) - 0.5*sin(0.5) ≈ 0.6378697925882713
ForwardDiff.derivative(OuterF(), 0.5)

	master	this PR
Result	`DualMismatchError`	`DualMismatchError`
Path through `≺`	`tagcount` (inverted)	`tagdepth(...)=1` on both sides → fast path skipped → same `tagcount` (inverted)

Both tags have V === Float64, so the new fast path never fires. This is the case I flagged on line 31 of the diff. The pattern (outer derivative of an inner scalar derivative whose closure captures the outer point) is the canonical scalar nested derivative — it isn't exotic, and it's exactly what would be expected to suffer from precompile-induced tagcount inversion in downstream packages.

riftsim-jarod · 2026-05-17T20:54:59Z

I ran into this one over the past two days, and was surprised to see these comments to be so fresh.

For my case: a process simulation engine differentiates a solver residual with ForwardDiff, the residual calls a thermodynamics library (Clapeyron) that runs its own nested ForwardDiff internally, and a precompile workload baked an inverted tagcount. ~Half of CI runs threw DualMismatchError -- but it's one of those nasty stochastic errors -- it is inconsistent in failure. #807's tagdepth approach does fix it.

I also reproduced @devmotion's regression independently: @devmotion's 3-level / constant-inner-seed example returns 12.0 on master and throws DualMismatchError with #807 applied.

The cause is the one already noted in this thread — tagdepth reads depth off V only, so a tag whose real nesting lives in a captured closure F gets an understated depth and the fast path mis-orders it. The "prove V1 nests V2" safety check mentioned above works. Here is a concrete version that seems to fix all of these issues on my side:

  @inline tagdepth(::Type) = 0
  @inline tagdepth(::Type{<:Dual{T,V,N}}) where {T,V,N} = 1 + tagdepth(V)
  @inline tagdepth(::Type{<:Tag{F,V}}) where {F,V} = 1 + tagdepth(V)

  # Does Tag `target` genuinely appear in X's value-type / tag chain?
  # Deliberately does NOT recurse closure type params F.
  @inline contains_tag(::Type, target) = false
  @inline contains_tag(::Type{<:Tag{F,V}}, target) where {F,V} = contains_tag(V, target)
  @inline contains_tag(::Type{<:Dual{T,V,N}}, target) where {T,V,N} =
      (T === target) || contains_tag(T, target) || contains_tag(V, target)

  @inline function ≺(::Type{Tag{F1,V1}}, ::Type{Tag{F2,V2}}) where {F1,V1,F2,V2}
      T1 = Tag{F1,V1}
      T2 = Tag{F2,V2}
      d1 = tagdepth(T1)
      d2 = tagdepth(T2)
      if d1 != d2
          genuinely_nested = d1 > d2 ? contains_tag(T1, T2) : contains_tag(T2, T1)
          genuinely_nested && return d1 < d2   # depth wins ONLY if nesting is real
      end
      return tagcount(T1) < tagcount(T2)
  end

≺ then takes the depth fast-path only when contains_tag(deeper, shallower) holds — the shallower tag genuinely appears in the deeper tag's value-type chain. contains_tag deliberately does not recurse closure params F, so the case where depth is hidden in a closure falls back to tagcount, unchanged.

Verified: the inverted-tagcount repro → fixed; @devmotion's example → back to 12.0; our full downstream suite (4173 assertions, 108 items) → 0 DualMismatchError (was 21), 0 failures. I don't know this package well enough to know if this causes other unintended problems.

Hope this is helpful.

ChrisRackauckas-Claude · 2026-05-17T21:53:12Z

I asked Claude to dig in on this thread and try five candidate approaches against a unified test matrix. Reporting back since the comparison is concrete.

Test harness (8 cases)

Single Julia 1.11 / ForwardDiff master process; isolated env per approach with OrdinaryDiffEqRosenbrock/SciMLBase/ADTypes and a dev'd ForwardDiff. Each case is "OK" if it solves without error and returns a finite, correct result.

using ForwardDiff, OrdinaryDiffEqRosenbrock, SciMLBase, ADTypes

const results = NamedTuple{(:case, :status, :detail), Tuple{String, String, String}}[]
function runtest(body, name)
    try
        r = body()
        push!(results, (case = name, status = "OK", detail = string(r)[1:min(80, end)]))
    catch e
        push!(results, (case = name, status = "FAIL",
                detail = "$(typeof(e).name.name): $(sprint(showerror, e)[1:min(180, end)])"))
    end
end

# ===== Cases 1-6: SciML/OrdinaryDiffEq.jl#3587 deterministic matrix =====
# Outer ForwardDiff layer over a Rosenbrock23 ODE solve, with the outer Dual
# carried in via `p_dual` (and optionally `u0_dual`). Varies specialization
# mode {Full, Auto, No} × u0 eltype {Float64, Dual}.
function ode!(du, u, p, t)
    du[1] = -p[1] * u[1]
    du[2] = -u[1] - p[2] * u[2]
    return nothing
end
const TestTag = ForwardDiff.Tag{:NestedForwardDiffOuter, Float64}
const p_dual = ForwardDiff.Dual{TestTag, Float64, 2}[
    ForwardDiff.Dual{TestTag}(1.5, ForwardDiff.Partials{2, Float64}((1.0, 0.0))),
    ForwardDiff.Dual{TestTag}(2.0, ForwardDiff.Partials{2, Float64}((0.0, 1.0))),
]
const u0_dual = [ForwardDiff.Dual{TestTag}(1.0, ForwardDiff.Partials{2, Float64}((0.0, 0.0))) for _ in 1:2]
for spec in (SciMLBase.FullSpecialize, SciMLBase.AutoSpecialize, SciMLBase.NoSpecialize)
    for (label, u0) in (("Float64u0", [1.0, 1.0]), ("Dualu0", u0_dual))
        runtest("#3587.$(spec).$(label)") do
            ode_f = ODEFunction{true, spec}(ode!)
            prob = ODEProblem(ode_f, u0, (0.0, 1.0), p_dual)
            sol = solve(prob, Rosenbrock23(autodiff = AutoForwardDiff(chunksize = 2));
                reltol = 1.0e-8, abstol = 1.0e-8)
            SciMLBase.successful_retcode(sol) && all(u -> all(isfinite, u), sol.u) ? "ok" : "bad_retcode"
        end
    end
end

# ===== Case 7: @devmotion's example 1 — 3-level nested derivative with =====
# inner Float64 seed. Reproduces the regression introduced by #807's
# depth-only fast path.
runtest("devmotion.ex1.simple") do
    inner2(d) = ForwardDiff.derivative(y -> y * d, 1.0)
    mid2(v)   = ForwardDiff.gradient(u -> sum(inner2.(u) .* u), v)
    out2(x)   = sum(mid2([x, x]))
    g = ForwardDiff.derivative(out2, 0.5)
    isapprox(g, 4.0; atol = 1.0e-10) ? "ok (g=$g)" : "wrong (g=$g, want 4.0)"
end

# ===== Case 8: @devmotion's example 2 — same-depth (both V=Float64) tags =====
# with tagcount forced into inverted order (simulating a precompile race
# that baked the inner tag's literal before the outer tag's).
struct OuterF_TestSuite end
struct InnerF_TestSuite
    x_dual::ForwardDiff.Dual{ForwardDiff.Tag{OuterF_TestSuite, Float64}, Float64, 1}
end
(c::InnerF_TestSuite)(y) = sin(c.x_dual * y)
(::OuterF_TestSuite)(x_dual::ForwardDiff.Dual{ForwardDiff.Tag{OuterF_TestSuite, Float64}, Float64, 1}) =
    ForwardDiff.derivative(InnerF_TestSuite(x_dual), 1.0)

runtest("devmotion.ex2.inverted_tagcount") do
    ForwardDiff.tagcount(ForwardDiff.Tag{InnerF_TestSuite, Float64})  # → 0
    ForwardDiff.tagcount(ForwardDiff.Tag{OuterF_TestSuite, Float64})  # → 1 (inverted)
    g = ForwardDiff.derivative(OuterF_TestSuite(), 0.5)
    expected = cos(0.5) - 0.5 * sin(0.5)
    isapprox(g, expected; atol = 1.0e-10) ? "ok (g=$g)" : "wrong (g=$g, want $expected)"
end

Summary table

Approach	8-case harness	ForwardDiff testsuite (master = 9035/9035)
Stock master	1/8	9035/9035
PR #807 (current)	6/8	9035/9035
A — OrdinaryDiffEq-side: skip `standardtag` if `eltype(u0)<:Dual`	7/8	n/a
B — runtime-cached non-`@generated` tagcount + eager nested registration	8/8	9019/9035 (2 errored) ❌
C — `tagdepth` recurses into `T.parameters` ∪ `fieldtypes(T)`, depth-bound 6	8/8	9035/9035 ✅
D — depth walker + `tagcount` + `objectid` 3-tier fallback	8/8	9035/9035 ✅
E — depth + `contains_tag` containment guard (the variant just proposed above)	7/8	9035/9035 ✅

Per-case results

Case	Stock	#807	A	B	C	D	E
`#3587.FullSpecialize.Float64u0`	❌	✅	✅	✅	✅	✅	✅
`#3587.FullSpecialize.Dualu0`	❌	✅	✅	✅	✅	✅	✅
`#3587.AutoSpecialize.Float64u0`	❌	✅	✅	✅	✅	✅	✅
`#3587.AutoSpecialize.Dualu0`	❌	✅	✅	✅	✅	✅	✅
`#3587.NoSpecialize.Float64u0`	❌	✅	✅	✅	✅	✅	✅
`#3587.NoSpecialize.Dualu0`	❌	✅	✅	✅	✅	✅	✅
`devmotion.ex1.simple`	✅	❌	✅	✅	✅	✅	✅
`devmotion.ex2.inverted_tagcount`	❌	❌	❌	✅	✅	✅	❌

Code

Approach A — OrdinaryDiffEq-side: skip `standardtag` when `u0` is Dual

Patches OrdinaryDiffEqDifferentiation and OrdinaryDiffEqNonlinearSolve. No ForwardDiff change. Avoids the bug at the source for SciML, but doesn't address pure-ForwardDiff scenarios.

# lib/OrdinaryDiffEqDifferentiation/src/alg_utils.jl
function prepare_ADType(autodiff_alg::AutoForwardDiff, prob, u0, p, ::Val{true})
    if eltype(u0) <: ForwardDiff.Dual
        return _prepare_ADType_fwd(autodiff_alg, prob, u0, nothing)
    end
    tag = ForwardDiff.Tag(OrdinaryDiffEqTag(), eltype(u0))
    return _prepare_ADType_fwd(autodiff_alg, prob, u0, tag)
end

# lib/OrdinaryDiffEqNonlinearSolve/src/initialize_dae.jl
function _tagged_autodiff(u, ::Val{CS} = Val(1)) where {CS}
    if eltype(u) <: ForwardDiff.Dual
        return AutoForwardDiff{CS}()
    end
    return AutoForwardDiff{CS}(ForwardDiff.Tag(OrdinaryDiffEqTag(), eltype(u)))
end

Approach B — Runtime-cached `tagcount` (NOT shippable)

Replaces @generated tagcount with a runtime IdDict{Type,UInt} registry; eagerly registers tags reachable through V and fieldtypes(F) before the outer tag. Conceptually clean (eliminates precompile-baking entirely) but regresses 2 existing FD tests in GradientTest.jl:279 / JacobianTest.jl:317: _register_inner_tags(::Type{Dual{T,V,N}}) unconditionally calls tagcount(T), but T in a Dual parameter can be a bare user marker (OuterTestTag), not a Tag{F,V}. Documented FD usage. Could be fixed by guarding tagcount(T) to only fire when T<:Tag, but I didn't pursue that branch.

const TAGCOUNT_REGISTRY = IdDict{Type,UInt}()
const TAGCOUNT_LOCK = ReentrantLock()

_register_inner_tags(::Type) = nothing
function _register_inner_tags(::Type{Dual{T,V,N}}) where {T,V,N}
    tagcount(T)                     # <-- bug: T may not be <:Tag
    _register_inner_tags(V)
    return nothing
end

_register_field_tags(@nospecialize(F)) = nothing
@generated function _register_field_tags(::Type{F}) where {F}
    if isa(F, DataType) && isconcretetype(F) && fieldcount(F) > 0
        body = Expr(:block)
        for ft in fieldtypes(F); push!(body.args, :(_register_inner_tags($ft))); end
        push!(body.args, :(return nothing)); return body
    else
        return :(return nothing)
    end
end

function tagcount(::Type{Tag{F,V}}) where {F,V}
    T = Tag{F,V}
    v = get(TAGCOUNT_REGISTRY, T, typemax(UInt))
    v === typemax(UInt) || return v
    _register_inner_tags(V)
    F isa Type && _register_field_tags(F)
    @lock TAGCOUNT_LOCK begin
        v2 = get(TAGCOUNT_REGISTRY, T, typemax(UInt))
        v2 === typemax(UInt) || return v2
        new_v = Threads.atomic_add!(TAGCOUNT, UInt(1))
        TAGCOUNT_REGISTRY[T] = new_v
        return new_v
    end
end

Approach C — `tagdepth` recurses into `T.parameters` ∪ `fieldtypes(T)`, depth-bound 6

Strict superset of #807's depth fast-path. The walker is @generated so steady-state runtime cost is zero; tagcount semantics are unchanged (still used as same-depth tiebreaker). Catches both closures-capturing-Duals (via T.parameters) and structs-holding-Dual-fields (via fieldtypes(T)). Hard depth bound at 6 — deeper nesting falls back to tagcount.

@generated function _walk_max_depth(::Type{T}, ::Val{D}) where {T,D}
    D <= 0 && return :(0)
    isa(T, DataType) || return :(0)
    T <: Dual && return :(tagdepth($T))
    T <: Tag  && return :(tagdepth($T))
    ms = Any[]
    if isstructtype(T)
        for p in T.parameters
            p isa Type && push!(ms, :(_walk_max_depth($p, Val($(D-1)))))
        end
        if isconcretetype(T)
            for ft in fieldtypes(T)
                ft isa Type && push!(ms, :(_walk_max_depth($ft, Val($(D-1)))))
            end
        end
    end
    isempty(ms) ? :(0) : Expr(:call, :max, ms...)
end

@inline _safe_tagdepth(x) = 0
@inline _safe_tagdepth(::Type{T}) where {T} = tagdepth(T)

@inline tagdepth(::Type{T}) where {T} = _walk_max_depth(T, Val(6))
@inline tagdepth(::Type{<:Dual{T,V,N}}) where {T,V,N} = 1 + max(_safe_tagdepth(T), tagdepth(V))
@inline tagdepth(::Type{<:Tag{F,V}}) where {F,V}     = 1 + max(_safe_tagdepth(F), tagdepth(V))

@inline function ≺(::Type{Tag{F1,V1}}, ::Type{Tag{F2,V2}}) where {F1,V1,F2,V2}
    d1 = tagdepth(Tag{F1,V1}); d2 = tagdepth(Tag{F2,V2})
    d1 != d2 && return d1 < d2
    return tagcount(Tag{F1,V1}) < tagcount(Tag{F2,V2})
end

Approach D — depth walker + `objectid` 3-tier fallback

Similar shape to C but with an explicit objectid tiebreaker after tagcount. The agent that produced this variant noted C is "the more principled fix"; D adds defensive ordering at the cost of more code.

function _tag_depth_impl(T, seen, budget)
    T isa Type || return 0
    T in seen && return 0
    push!(seen, T); budget[] <= 0 && return 0; budget[] -= 1
    md = 0
    if T isa DataType
        if T <: Tag && length(T.parameters) == 2
            F = T.parameters[1]; V = T.parameters[2]
            d_v = _tag_depth_impl(V, seen, budget)
            d_f = F isa Type ? _tag_depth_impl(F, seen, budget) : 0
            md = max(md, 1 + max(d_v, d_f))
        end
        for p in T.parameters
            d = _tag_depth_impl(p, seen, budget); d > md && (md = d)
        end
        if isconcretetype(T) && !(T <: Tuple) && !(T <: Tag) && !(T <: Dual)
            try
                for ft in fieldtypes(T)
                    d = _tag_depth_impl(ft, seen, budget); d > md && (md = d)
                end
            catch end
        end
    end
    return md
end
function _tag_depth(::Type{Tag{F,V}}) where {F,V}
    seen = Base.IdSet{Any}(); budget = Ref(500)
    d_v = _tag_depth_impl(V, seen, budget)
    d_f = F isa Type ? _tag_depth_impl(F, seen, budget) : 0
    return 1 + max(d_v, d_f)
end
@generated effective_tag_depth(::Type{Tag{F,V}}) where {F,V} = :($(_tag_depth(Tag{F,V})))

@inline function ≺(::Type{Tag{F1,V1}}, ::Type{Tag{F2,V2}}) where {F1,V1,F2,V2}
    d1 = effective_tag_depth(Tag{F1,V1}); d2 = effective_tag_depth(Tag{F2,V2})
    d1 != d2 && return d1 < d2
    tc1 = tagcount(Tag{F1,V1}); tc2 = tagcount(Tag{F2,V2})
    tc1 != tc2 && return tc1 < tc2
    return objectid(Tag{F1,V1}) < objectid(Tag{F2,V2})
end

Approach E — depth + `contains_tag` containment guard

The variant proposed in the comment just above. Strict improvement on #807: depth fast-path only fires when the shallower tag genuinely appears in the deeper tag's V-chain. Closure-F captures don't trigger the fast-path, so they fall back to tagcount — fixing devmotion.ex1's regression. Same-V same-depth cases (devmotion.ex2) aren't addressed; if a downstream forces an inverted tagcount in that shape it still fails.

@inline tagdepth(::Type) = 0
@inline tagdepth(::Type{<:Dual{T,V,N}}) where {T,V,N} = 1 + tagdepth(V)
@inline tagdepth(::Type{<:Tag{F,V}}) where {F,V} = 1 + tagdepth(V)

@inline contains_tag(::Type, target) = false
@inline contains_tag(::Type{<:Tag{F,V}}, target) where {F,V} = contains_tag(V, target)
@inline contains_tag(::Type{<:Dual{T,V,N}}, target) where {T,V,N} =
    (T === target) || contains_tag(T, target) || contains_tag(V, target)

@inline function ≺(::Type{Tag{F1,V1}}, ::Type{Tag{F2,V2}}) where {F1,V1,F2,V2}
    T1 = Tag{F1,V1}; T2 = Tag{F2,V2}
    d1 = tagdepth(T1); d2 = tagdepth(T2)
    if d1 != d2
        genuinely_nested = d1 > d2 ? contains_tag(T1, T2) : contains_tag(T2, T1)
        genuinely_nested && return d1 < d2
    end
    return tagcount(T1) < tagcount(T2)
end

Verdict

If full 8/8 is wanted: C is the cleanest 8/8 fix that survives the full FD testsuite.

If the goal is the smallest defensible improvement on #807: E is the +14-LOC strict superset — it eliminates the regression devmotion identified without taking on the complexity of walking F's parameters/fields. The cost is leaving the same-V precompile-bake-inverted case unaddressed.

A is orthogonal — it's the SciML-side mitigation that ships independently of any ForwardDiff change. Worth pairing with whatever ForwardDiff fix lands as defense in depth.

(All numbers verified locally on Julia 1.11; per-approach diffs and test artifacts are reproducible from /tmp/test_suite.jl. B is listed for completeness but isn't shippable as written.)

ChrisRackauckas requested a review from devmotion May 13, 2026 20:30

devmotion reviewed May 13, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add tagdepth fast path to tag comparison#807

Add tagdepth fast path to tag comparison#807
ChrisRackauckas wants to merge 1 commit into
masterfrom
ChrisRackauckas-patch-1

ChrisRackauckas commented May 13, 2026

Uh oh!

codecov Bot commented May 13, 2026

Uh oh!

devmotion May 13, 2026

Uh oh!

ChrisRackauckas May 13, 2026

Uh oh!

devmotion May 13, 2026

Uh oh!

ChrisRackauckas May 13, 2026

Uh oh!

devmotion May 13, 2026

Uh oh!

ChrisRackauckas May 13, 2026

Uh oh!

ChrisRackauckas commented May 14, 2026

Uh oh!

devmotion commented May 15, 2026

Uh oh!

riftsim-jarod commented May 17, 2026

Uh oh!

ChrisRackauckas-Claude commented May 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

ChrisRackauckas commented May 13, 2026

Uh oh!

codecov Bot commented May 13, 2026

Codecov Report

Uh oh!

devmotion May 13, 2026

Choose a reason for hiding this comment

Uh oh!

ChrisRackauckas May 13, 2026

Choose a reason for hiding this comment

Uh oh!

devmotion May 13, 2026

Choose a reason for hiding this comment

Uh oh!

ChrisRackauckas May 13, 2026

Choose a reason for hiding this comment

Uh oh!

devmotion May 13, 2026

Choose a reason for hiding this comment

Uh oh!

ChrisRackauckas May 13, 2026

Choose a reason for hiding this comment

Uh oh!

ChrisRackauckas commented May 14, 2026

Uh oh!

devmotion commented May 15, 2026

TL;DR

Example 1 — regression: 3-level nested derivative with constant inner seed

Example 2 — same-depth still broken: inverted tagcount with V=Float64 on both sides

Uh oh!

riftsim-jarod commented May 17, 2026

Uh oh!

ChrisRackauckas-Claude commented May 17, 2026

Test harness (8 cases)

Summary table

Per-case results

Code

Approach A — OrdinaryDiffEq-side: skip standardtag when u0 is Dual

Approach B — Runtime-cached tagcount (NOT shippable)

Approach C — tagdepth recurses into T.parameters ∪ fieldtypes(T), depth-bound 6

Approach D — depth walker + objectid 3-tier fallback

Approach E — depth + contains_tag containment guard

Verdict

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Approach A — OrdinaryDiffEq-side: skip `standardtag` when `u0` is Dual

Approach B — Runtime-cached `tagcount` (NOT shippable)

Approach C — `tagdepth` recurses into `T.parameters` ∪ `fieldtypes(T)`, depth-bound 6

Approach D — depth walker + `objectid` 3-tier fallback

Approach E — depth + `contains_tag` containment guard