Skip to content

Add tagdepth fast path to tag comparison#807

Open
ChrisRackauckas wants to merge 1 commit into
masterfrom
ChrisRackauckas-patch-1
Open

Add tagdepth fast path to tag comparison#807
ChrisRackauckas wants to merge 1 commit into
masterfrom
ChrisRackauckas-patch-1

Conversation

@ChrisRackauckas
Copy link
Copy Markdown
Member

Fixes SciML/OrdinaryDiffEq.jl#3381 and superscedes SciML/OrdinaryDiffEq.jl#3587 . Also fixes NonlinearSolve.jl master and superscedes SciML/NonlinearSolve.jl#932

Superscedes #724 and is a better solution to #714.

The crux of the issue is that ForwardDiff.jl's tagging system is somewhat designed around the tag only being used once, i.e. the function is created, the derivative function is called, the tag is set for that derivative as a type of the function being differentiated, and therefore it's unique. Then this ends up working with nested differentiation because you call the inner function first, usually, before the outer function, or only do the combination, and so the tag ordering is set correctly.

Mixing tagging with precompilation then leads to this issue where it's possible for the outer tag to be precompiled before the inner tag. This makes the tag ordering the opposite, and what happens is then that the type promotion mechanism gets confused because it is tied to the tag ordering. This seems pretty fundamental because it's a useful property, it's the core property used to prevent perturbation confusion, but it means that this interaction between nested differentiation and precompilation ends up having odd bugs.

I tried working around this downstream (SciML/OrdinaryDiffEq.jl#3587) but it was very nasty. Basically, you had to make sure you didn't have dual numbers automatically converting Float64s, as then sometimes it could convert to the inner type instead of the outer type, and it wouldn't do the normal conversion of first to the inner to then the wrapped outer type because doing so required the outer type to postdate the inner type.

But, this really then showcases that the bug truly only manifests with nested types. And if you have nested types, you know you don't have perturbation confusion if one tag is nested deeper than the other tag, because there are not the same number of partials. So in the case where the tag depths are not the same, you can do an alternative tag ordering since you will have already proven perturbations aren't confusing. And in that case, you can choose the deeper nested tags to just always be < the less deeper tags.

So added that and poof, tag nesting worked out in these cases with precompilation. So I think this captures the true crux of the problem and solves it at its core.

Fixes SciML/OrdinaryDiffEq.jl#3381 and superscedes SciML/OrdinaryDiffEq.jl#3587 . Also fixes NonlinearSolve.jl master and superscedes SciML/NonlinearSolve.jl#932

Superscedes #724 and is a better solution to #714.

The crux of the issue is that ForwardDiff.jl's tagging system is somewhat designed around the tag only being used once, i.e. the function is created, the derivative function is called, the tag is set for that derivative as a type of the function being differentiated, and therefore it's unique. Then this ends up working with nested differentiation because you call the inner function first, usually, before the outer function, or only do the combination, and so the tag ordering is set correctly.

Mixing tagging with precompilation then leads to this issue where it's possible for the outer tag to be precompiled before the inner tag. This makes the tag ordering the opposite, and what happens is then that the type promotion mechanism gets confused because it is tied to the tag ordering. This seems pretty fundamental because it's a useful property, it's the core property used to prevent perturbation confusion, but it means that this interaction between nested differentiation and precompilation ends up having odd bugs.

I tried working around this downstream (SciML/OrdinaryDiffEq.jl#3587) but it was very nasty. Basically, you had to make sure you didn't have dual numbers automatically converting Float64s, as then sometimes it could convert to the inner type instead of the outer type, and it wouldn't do the normal conversion of first to the inner to then the wrapped outer type because doing so required the outer type to postdate the inner type.

But, this really then showcases that the bug truly only manifests with nested types. And if you have nested types, you know you don't have perturbation confusion if one tag is nested deeper than the other tag, because there are not the same number of partials. So in the case where the tag depths are not the same, you can do an alternative tag ordering since you will have already proven perturbations aren't confusing. And in that case, you can choose the deeper nested tags to just always be `<` the less deeper tags.

So added that and poof, tag nesting worked out in these cases with precompilation. So I think this captures the true crux of the problem and solves it at its core.
@ChrisRackauckas ChrisRackauckas requested a review from devmotion May 13, 2026 20:30
@codecov
Copy link
Copy Markdown

codecov Bot commented May 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.80%. Comparing base (7e52ffa) to head (8cc8c72).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #807      +/-   ##
==========================================
+ Coverage   90.75%   90.80%   +0.05%     
==========================================
  Files          11       11              
  Lines        1071     1077       +6     
==========================================
+ Hits          972      978       +6     
  Misses         99       99              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread src/config.jl
Comment on lines +23 to +25
@inline tagdepth(::Type) = 0
@inline tagdepth(::Type{<:Dual{T,V,N}}) where {T,V,N} = 1 + tagdepth(V)
@inline tagdepth(::Type{<:Tag{F,V}}) where {F,V} = 1 + tagdepth(V)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just defining this function on ::Type and Type{<:Dual}? Why ::Type{<:Tag} as well?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because when you nest it's a tag of duals

Comment thread src/config.jl
Comment on lines +28 to +30
d1 = tagdepth(Tag{F1,V1})
d2 = tagdepth(Tag{F2,V2})
d1 != d2 && return d1 < d2
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can define Tags basically arbitrarily, so why would this be safe?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the tags are defined according to the interfaces in the package, T is the type being differentiated. If T is the type being differentiated, then this gaurentees tag ordering by differentiation hierarchy. We previously had PRs closed about documenting this saying that it is non public API, so since this package always obeys that invarient internally and it's purposefully non public, it would be non breaking to enforce it.

Comment thread src/config.jl
d1 = tagdepth(Tag{F1,V1})
d2 = tagdepth(Tag{F2,V2})
d1 != d2 && return d1 < d2
return tagcount(Tag{F1,V1}) < tagcount(Tag{F2,V2})
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems one could still run into the same precompilation-caused problems here, e.g., if V1 === V2 (e.g. both Float64)?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build an example? All of the examples from before that cannot happen because it's not Float64 in both cases but Dual of Float64, and that tag nesting is exactly the part you missed from the earlier part.

@ChrisRackauckas
Copy link
Copy Markdown
Member Author

@devmotion From your comments I think you missed the core part of this. The tag by design does f and eltype (V). That eltype when nested is itself a Tag Dual. In a nested differentiation context that establishes its own natural ordering for the duals, as in the nested context the promotion of duals to the outer dual via appending zero partials to the inner dual is the natural action.

As a safety check, what I can do is add a proof that V1 nests V2 exactly as well, i.e. the stripped V1 matches V2. In that case the definition is very clear, and this is the actual case that is triggered by precompilation issue.

@devmotion
Copy link
Copy Markdown
Member

Following my earlier review comments, I had a gut feeling that the depth-based fast path solves the symptom in some common cases but isn't really fixing the root cause — and I worried it might even produce incorrect results in cases where it triggers. To check, I challenged Claude to construct problematic examples for this PR. Two emerged; both reproduce empirically against this branch on Julia 1.12.

TL;DR

  1. The PR introduces a regression. A 3-level nested differentiation where the innermost derivative is seeded with a constant scalar previously worked on master, but throws DualMismatchError with this PR applied. No precompile race required — it reproduces in a fresh session.

  2. The PR does not fix the same-depth case. When both interacting tags have V === Float64 (i.e. the canonical "outer derivative of an inner scalar derivative whose closure captures the outer point"), the new tagdepth fast path is bypassed and the comparison falls back to the same tagcount branch as master. An inverted tagcount ordering produces a DualMismatchError both with and without the PR.

The mechanism behind (1): tagdepth(::Type{<:Tag{F,V}}) = 1 + tagdepth(V) looks through V only, not F. When the inner derivative is seeded with a literal Float64, the inner tag is Tag{<closure>, Float64} with depth 1 — even though that closure captures a Dual{T_middle, ..., 2} (the outer-nesting context is in F, not in V). The fast path then orders this tag inside the depth-2 middle tag; the binary-op macro at src/dual.jl:154 dispatches to y_body; the result no longer has the inner tag as outermost; and the inner partials(T_inner, result) lands on the throw(DualMismatchError(T,S)) branch at dual.jl:110-115.

Example 1 — regression: 3-level nested derivative with constant inner seed

using ForwardDiff

inner_deriv(d) = ForwardDiff.derivative(y -> y^2 * d, 1.0)     # = 2*d
middle_grad(v) = ForwardDiff.gradient(v) do u
    sum(inner_deriv(ui) * ui for ui in u)                       # = sum(2*ui^2)
end                                                             # grad = [4*u1, 4*u2]
outer_fn(x) = sum(middle_grad([x, 2x]))                         # = 12x

ForwardDiff.derivative(outer_fn, 0.5)                           # analytic: 12.0
master (7e52ffa) this PR
Result 12.0 DualMismatchError

Tags involved at the failing multiplication y * d inside inner_deriv:

Tag V tagdepth
T_outer = Tag{typeof(outer_fn), Float64} Float64 1
T_middle = Tag{<gradient closure>, Dual{T_outer, Float64, 1}} Dual{...} 2
T_inner = Tag{<derivative closure>, Float64} (constant seed) Float64 1

The PR's fast path sees tagdepth(T_middle)=2 > tagdepth(T_inner)=1 and orders T_inner ≺ T_middle. But in execution order T_inner is created last — its derivative is the one being extracted by inner_deriv — so it must be the outermost tag in the result type, not the inner one. (Note that the inner tag's F parameter is a closure whose type does contain a Dual{T_middle, ...}, but tagdepth doesn't recurse into F.)

A simpler variant also throws on this PR and works on master:

inner2(d) = ForwardDiff.derivative(y -> y * d, 1.0)             # = d
mid2(v)   = ForwardDiff.gradient(u -> sum(inner2.(u) .* u), v)  # grad of sum(ui^2) = [2*u1, 2*u2]
out2(x)   = sum(mid2([x, x]))                                   # = 4x
ForwardDiff.derivative(out2, 0.5)                               # analytic: 4.0

Example 2 — same-depth still broken: inverted tagcount with V=Float64 on both sides

using ForwardDiff
const Tag = ForwardDiff.Tag

struct OuterF end
struct InnerF
    x_dual::ForwardDiff.Dual{Tag{OuterF, Float64}, Float64, 1}
end
(c::InnerF)(y) = sin(c.x_dual * y)
(::OuterF)(x_dual::ForwardDiff.Dual{Tag{OuterF, Float64}, Float64, 1}) =
    ForwardDiff.derivative(InnerF(x_dual), 1.0)

# Force tagcount to be evaluated in INVERTED order (simulating a precompile race):
ForwardDiff.tagcount(Tag{InnerF, Float64})    # returns 0
ForwardDiff.tagcount(Tag{OuterF, Float64})    # returns 1

# Analytic: d/dx (d/dy sin(x*y)|_{y=1}) at x=0.5 = cos(0.5) - 0.5*sin(0.5) ≈ 0.6378697925882713
ForwardDiff.derivative(OuterF(), 0.5)
master this PR
Result DualMismatchError DualMismatchError
Path through tagcount (inverted) tagdepth(...)=1 on both sides → fast path skipped → same tagcount (inverted)

Both tags have V === Float64, so the new fast path never fires. This is the case I flagged on line 31 of the diff. The pattern (outer derivative of an inner scalar derivative whose closure captures the outer point) is the canonical scalar nested derivative — it isn't exotic, and it's exactly what would be expected to suffer from precompile-induced tagcount inversion in downstream packages.

@riftsim-jarod
Copy link
Copy Markdown

I ran into this one over the past two days, and was surprised to see these comments to be so fresh.

For my case: a process simulation engine differentiates a solver residual with ForwardDiff, the residual calls a thermodynamics library (Clapeyron) that runs its own nested ForwardDiff internally, and a precompile workload baked an inverted tagcount. ~Half of CI runs threw DualMismatchError -- but it's one of those nasty stochastic errors -- it is inconsistent in failure. #807's tagdepth approach does fix it.

I also reproduced @devmotion's regression independently: @devmotion's 3-level / constant-inner-seed example returns 12.0 on master and throws DualMismatchError with #807 applied.

The cause is the one already noted in this thread — tagdepth reads depth off V only, so a tag whose real nesting lives in a captured closure F gets an understated depth and the fast path mis-orders it. The "prove V1 nests V2" safety check mentioned above works. Here is a concrete version that seems to fix all of these issues on my side:

  @inline tagdepth(::Type) = 0
  @inline tagdepth(::Type{<:Dual{T,V,N}}) where {T,V,N} = 1 + tagdepth(V)
  @inline tagdepth(::Type{<:Tag{F,V}}) where {F,V} = 1 + tagdepth(V)

  # Does Tag `target` genuinely appear in X's value-type / tag chain?
  # Deliberately does NOT recurse closure type params F.
  @inline contains_tag(::Type, target) = false
  @inline contains_tag(::Type{<:Tag{F,V}}, target) where {F,V} = contains_tag(V, target)
  @inline contains_tag(::Type{<:Dual{T,V,N}}, target) where {T,V,N} =
      (T === target) || contains_tag(T, target) || contains_tag(V, target)

  @inline function (::Type{Tag{F1,V1}}, ::Type{Tag{F2,V2}}) where {F1,V1,F2,V2}
      T1 = Tag{F1,V1}
      T2 = Tag{F2,V2}
      d1 = tagdepth(T1)
      d2 = tagdepth(T2)
      if d1 != d2
          genuinely_nested = d1 > d2 ? contains_tag(T1, T2) : contains_tag(T2, T1)
          genuinely_nested && return d1 < d2   # depth wins ONLY if nesting is real
      end
      return tagcount(T1) < tagcount(T2)
  end

then takes the depth fast-path only when contains_tag(deeper, shallower) holds — the shallower tag genuinely appears in the deeper tag's value-type chain. contains_tag deliberately does not recurse closure params F, so the case where depth is hidden in a closure falls back to tagcount, unchanged.

Verified: the inverted-tagcount repro → fixed; @devmotion's example → back to 12.0; our full downstream suite (4173 assertions, 108 items) → 0 DualMismatchError (was 21), 0 failures. I don't know this package well enough to know if this causes other unintended problems.

Hope this is helpful.

@ChrisRackauckas-Claude
Copy link
Copy Markdown
Contributor

I asked Claude to dig in on this thread and try five candidate approaches against a unified test matrix. Reporting back since the comparison is concrete.

Test harness (8 cases)

Single Julia 1.11 / ForwardDiff master process; isolated env per approach with OrdinaryDiffEqRosenbrock/SciMLBase/ADTypes and a dev'd ForwardDiff. Each case is "OK" if it solves without error and returns a finite, correct result.

using ForwardDiff, OrdinaryDiffEqRosenbrock, SciMLBase, ADTypes

const results = NamedTuple{(:case, :status, :detail), Tuple{String, String, String}}[]
function runtest(body, name)
    try
        r = body()
        push!(results, (case = name, status = "OK", detail = string(r)[1:min(80, end)]))
    catch e
        push!(results, (case = name, status = "FAIL",
                detail = "$(typeof(e).name.name): $(sprint(showerror, e)[1:min(180, end)])"))
    end
end

# ===== Cases 1-6: SciML/OrdinaryDiffEq.jl#3587 deterministic matrix =====
# Outer ForwardDiff layer over a Rosenbrock23 ODE solve, with the outer Dual
# carried in via `p_dual` (and optionally `u0_dual`). Varies specialization
# mode {Full, Auto, No} × u0 eltype {Float64, Dual}.
function ode!(du, u, p, t)
    du[1] = -p[1] * u[1]
    du[2] = -u[1] - p[2] * u[2]
    return nothing
end
const TestTag = ForwardDiff.Tag{:NestedForwardDiffOuter, Float64}
const p_dual = ForwardDiff.Dual{TestTag, Float64, 2}[
    ForwardDiff.Dual{TestTag}(1.5, ForwardDiff.Partials{2, Float64}((1.0, 0.0))),
    ForwardDiff.Dual{TestTag}(2.0, ForwardDiff.Partials{2, Float64}((0.0, 1.0))),
]
const u0_dual = [ForwardDiff.Dual{TestTag}(1.0, ForwardDiff.Partials{2, Float64}((0.0, 0.0))) for _ in 1:2]
for spec in (SciMLBase.FullSpecialize, SciMLBase.AutoSpecialize, SciMLBase.NoSpecialize)
    for (label, u0) in (("Float64u0", [1.0, 1.0]), ("Dualu0", u0_dual))
        runtest("#3587.$(spec).$(label)") do
            ode_f = ODEFunction{true, spec}(ode!)
            prob = ODEProblem(ode_f, u0, (0.0, 1.0), p_dual)
            sol = solve(prob, Rosenbrock23(autodiff = AutoForwardDiff(chunksize = 2));
                reltol = 1.0e-8, abstol = 1.0e-8)
            SciMLBase.successful_retcode(sol) && all(u -> all(isfinite, u), sol.u) ? "ok" : "bad_retcode"
        end
    end
end

# ===== Case 7: @devmotion's example 1 — 3-level nested derivative with =====
# inner Float64 seed. Reproduces the regression introduced by #807's
# depth-only fast path.
runtest("devmotion.ex1.simple") do
    inner2(d) = ForwardDiff.derivative(y -> y * d, 1.0)
    mid2(v)   = ForwardDiff.gradient(u -> sum(inner2.(u) .* u), v)
    out2(x)   = sum(mid2([x, x]))
    g = ForwardDiff.derivative(out2, 0.5)
    isapprox(g, 4.0; atol = 1.0e-10) ? "ok (g=$g)" : "wrong (g=$g, want 4.0)"
end

# ===== Case 8: @devmotion's example 2 — same-depth (both V=Float64) tags =====
# with tagcount forced into inverted order (simulating a precompile race
# that baked the inner tag's literal before the outer tag's).
struct OuterF_TestSuite end
struct InnerF_TestSuite
    x_dual::ForwardDiff.Dual{ForwardDiff.Tag{OuterF_TestSuite, Float64}, Float64, 1}
end
(c::InnerF_TestSuite)(y) = sin(c.x_dual * y)
(::OuterF_TestSuite)(x_dual::ForwardDiff.Dual{ForwardDiff.Tag{OuterF_TestSuite, Float64}, Float64, 1}) =
    ForwardDiff.derivative(InnerF_TestSuite(x_dual), 1.0)

runtest("devmotion.ex2.inverted_tagcount") do
    ForwardDiff.tagcount(ForwardDiff.Tag{InnerF_TestSuite, Float64})  # → 0
    ForwardDiff.tagcount(ForwardDiff.Tag{OuterF_TestSuite, Float64})  # → 1 (inverted)
    g = ForwardDiff.derivative(OuterF_TestSuite(), 0.5)
    expected = cos(0.5) - 0.5 * sin(0.5)
    isapprox(g, expected; atol = 1.0e-10) ? "ok (g=$g)" : "wrong (g=$g, want $expected)"
end

Summary table

Approach 8-case harness ForwardDiff testsuite (master = 9035/9035)
Stock master 1/8 9035/9035
PR #807 (current) 6/8 9035/9035
A — OrdinaryDiffEq-side: skip standardtag if eltype(u0)<:Dual 7/8 n/a
B — runtime-cached non-@generated tagcount + eager nested registration 8/8 9019/9035 (2 errored)
Ctagdepth recurses into T.parametersfieldtypes(T), depth-bound 6 8/8 9035/9035
D — depth walker + tagcount + objectid 3-tier fallback 8/8 9035/9035
E — depth + contains_tag containment guard (the variant just proposed above) 7/8 9035/9035

Per-case results

Case Stock #807 A B C D E
#3587.FullSpecialize.Float64u0
#3587.FullSpecialize.Dualu0
#3587.AutoSpecialize.Float64u0
#3587.AutoSpecialize.Dualu0
#3587.NoSpecialize.Float64u0
#3587.NoSpecialize.Dualu0
devmotion.ex1.simple
devmotion.ex2.inverted_tagcount

Code

Approach A — OrdinaryDiffEq-side: skip standardtag when u0 is Dual

Patches OrdinaryDiffEqDifferentiation and OrdinaryDiffEqNonlinearSolve. No ForwardDiff change. Avoids the bug at the source for SciML, but doesn't address pure-ForwardDiff scenarios.

# lib/OrdinaryDiffEqDifferentiation/src/alg_utils.jl
function prepare_ADType(autodiff_alg::AutoForwardDiff, prob, u0, p, ::Val{true})
    if eltype(u0) <: ForwardDiff.Dual
        return _prepare_ADType_fwd(autodiff_alg, prob, u0, nothing)
    end
    tag = ForwardDiff.Tag(OrdinaryDiffEqTag(), eltype(u0))
    return _prepare_ADType_fwd(autodiff_alg, prob, u0, tag)
end

# lib/OrdinaryDiffEqNonlinearSolve/src/initialize_dae.jl
function _tagged_autodiff(u, ::Val{CS} = Val(1)) where {CS}
    if eltype(u) <: ForwardDiff.Dual
        return AutoForwardDiff{CS}()
    end
    return AutoForwardDiff{CS}(ForwardDiff.Tag(OrdinaryDiffEqTag(), eltype(u)))
end

Approach B — Runtime-cached tagcount (NOT shippable)

Replaces @generated tagcount with a runtime IdDict{Type,UInt} registry; eagerly registers tags reachable through V and fieldtypes(F) before the outer tag. Conceptually clean (eliminates precompile-baking entirely) but regresses 2 existing FD tests in GradientTest.jl:279 / JacobianTest.jl:317: _register_inner_tags(::Type{Dual{T,V,N}}) unconditionally calls tagcount(T), but T in a Dual parameter can be a bare user marker (OuterTestTag), not a Tag{F,V}. Documented FD usage. Could be fixed by guarding tagcount(T) to only fire when T<:Tag, but I didn't pursue that branch.

const TAGCOUNT_REGISTRY = IdDict{Type,UInt}()
const TAGCOUNT_LOCK = ReentrantLock()

_register_inner_tags(::Type) = nothing
function _register_inner_tags(::Type{Dual{T,V,N}}) where {T,V,N}
    tagcount(T)                     # <-- bug: T may not be <:Tag
    _register_inner_tags(V)
    return nothing
end

_register_field_tags(@nospecialize(F)) = nothing
@generated function _register_field_tags(::Type{F}) where {F}
    if isa(F, DataType) && isconcretetype(F) && fieldcount(F) > 0
        body = Expr(:block)
        for ft in fieldtypes(F); push!(body.args, :(_register_inner_tags($ft))); end
        push!(body.args, :(return nothing)); return body
    else
        return :(return nothing)
    end
end

function tagcount(::Type{Tag{F,V}}) where {F,V}
    T = Tag{F,V}
    v = get(TAGCOUNT_REGISTRY, T, typemax(UInt))
    v === typemax(UInt) || return v
    _register_inner_tags(V)
    F isa Type && _register_field_tags(F)
    @lock TAGCOUNT_LOCK begin
        v2 = get(TAGCOUNT_REGISTRY, T, typemax(UInt))
        v2 === typemax(UInt) || return v2
        new_v = Threads.atomic_add!(TAGCOUNT, UInt(1))
        TAGCOUNT_REGISTRY[T] = new_v
        return new_v
    end
end

Approach C — tagdepth recurses into T.parametersfieldtypes(T), depth-bound 6

Strict superset of #807's depth fast-path. The walker is @generated so steady-state runtime cost is zero; tagcount semantics are unchanged (still used as same-depth tiebreaker). Catches both closures-capturing-Duals (via T.parameters) and structs-holding-Dual-fields (via fieldtypes(T)). Hard depth bound at 6 — deeper nesting falls back to tagcount.

@generated function _walk_max_depth(::Type{T}, ::Val{D}) where {T,D}
    D <= 0 && return :(0)
    isa(T, DataType) || return :(0)
    T <: Dual && return :(tagdepth($T))
    T <: Tag  && return :(tagdepth($T))
    ms = Any[]
    if isstructtype(T)
        for p in T.parameters
            p isa Type && push!(ms, :(_walk_max_depth($p, Val($(D-1)))))
        end
        if isconcretetype(T)
            for ft in fieldtypes(T)
                ft isa Type && push!(ms, :(_walk_max_depth($ft, Val($(D-1)))))
            end
        end
    end
    isempty(ms) ? :(0) : Expr(:call, :max, ms...)
end

@inline _safe_tagdepth(x) = 0
@inline _safe_tagdepth(::Type{T}) where {T} = tagdepth(T)

@inline tagdepth(::Type{T}) where {T} = _walk_max_depth(T, Val(6))
@inline tagdepth(::Type{<:Dual{T,V,N}}) where {T,V,N} = 1 + max(_safe_tagdepth(T), tagdepth(V))
@inline tagdepth(::Type{<:Tag{F,V}}) where {F,V}     = 1 + max(_safe_tagdepth(F), tagdepth(V))

@inline function (::Type{Tag{F1,V1}}, ::Type{Tag{F2,V2}}) where {F1,V1,F2,V2}
    d1 = tagdepth(Tag{F1,V1}); d2 = tagdepth(Tag{F2,V2})
    d1 != d2 && return d1 < d2
    return tagcount(Tag{F1,V1}) < tagcount(Tag{F2,V2})
end

Approach D — depth walker + objectid 3-tier fallback

Similar shape to C but with an explicit objectid tiebreaker after tagcount. The agent that produced this variant noted C is "the more principled fix"; D adds defensive ordering at the cost of more code.

function _tag_depth_impl(T, seen, budget)
    T isa Type || return 0
    T in seen && return 0
    push!(seen, T); budget[] <= 0 && return 0; budget[] -= 1
    md = 0
    if T isa DataType
        if T <: Tag && length(T.parameters) == 2
            F = T.parameters[1]; V = T.parameters[2]
            d_v = _tag_depth_impl(V, seen, budget)
            d_f = F isa Type ? _tag_depth_impl(F, seen, budget) : 0
            md = max(md, 1 + max(d_v, d_f))
        end
        for p in T.parameters
            d = _tag_depth_impl(p, seen, budget); d > md && (md = d)
        end
        if isconcretetype(T) && !(T <: Tuple) && !(T <: Tag) && !(T <: Dual)
            try
                for ft in fieldtypes(T)
                    d = _tag_depth_impl(ft, seen, budget); d > md && (md = d)
                end
            catch end
        end
    end
    return md
end
function _tag_depth(::Type{Tag{F,V}}) where {F,V}
    seen = Base.IdSet{Any}(); budget = Ref(500)
    d_v = _tag_depth_impl(V, seen, budget)
    d_f = F isa Type ? _tag_depth_impl(F, seen, budget) : 0
    return 1 + max(d_v, d_f)
end
@generated effective_tag_depth(::Type{Tag{F,V}}) where {F,V} = :($(_tag_depth(Tag{F,V})))

@inline function (::Type{Tag{F1,V1}}, ::Type{Tag{F2,V2}}) where {F1,V1,F2,V2}
    d1 = effective_tag_depth(Tag{F1,V1}); d2 = effective_tag_depth(Tag{F2,V2})
    d1 != d2 && return d1 < d2
    tc1 = tagcount(Tag{F1,V1}); tc2 = tagcount(Tag{F2,V2})
    tc1 != tc2 && return tc1 < tc2
    return objectid(Tag{F1,V1}) < objectid(Tag{F2,V2})
end

Approach E — depth + contains_tag containment guard

The variant proposed in the comment just above. Strict improvement on #807: depth fast-path only fires when the shallower tag genuinely appears in the deeper tag's V-chain. Closure-F captures don't trigger the fast-path, so they fall back to tagcount — fixing devmotion.ex1's regression. Same-V same-depth cases (devmotion.ex2) aren't addressed; if a downstream forces an inverted tagcount in that shape it still fails.

@inline tagdepth(::Type) = 0
@inline tagdepth(::Type{<:Dual{T,V,N}}) where {T,V,N} = 1 + tagdepth(V)
@inline tagdepth(::Type{<:Tag{F,V}}) where {F,V} = 1 + tagdepth(V)

@inline contains_tag(::Type, target) = false
@inline contains_tag(::Type{<:Tag{F,V}}, target) where {F,V} = contains_tag(V, target)
@inline contains_tag(::Type{<:Dual{T,V,N}}, target) where {T,V,N} =
    (T === target) || contains_tag(T, target) || contains_tag(V, target)

@inline function (::Type{Tag{F1,V1}}, ::Type{Tag{F2,V2}}) where {F1,V1,F2,V2}
    T1 = Tag{F1,V1}; T2 = Tag{F2,V2}
    d1 = tagdepth(T1); d2 = tagdepth(T2)
    if d1 != d2
        genuinely_nested = d1 > d2 ? contains_tag(T1, T2) : contains_tag(T2, T1)
        genuinely_nested && return d1 < d2
    end
    return tagcount(T1) < tagcount(T2)
end

Verdict

If full 8/8 is wanted: C is the cleanest 8/8 fix that survives the full FD testsuite.

If the goal is the smallest defensible improvement on #807: E is the +14-LOC strict superset — it eliminates the regression devmotion identified without taking on the complexity of walking F's parameters/fields. The cost is leaving the same-V precompile-bake-inverted case unaddressed.

A is orthogonal — it's the SciML-side mitigation that ships independently of any ForwardDiff change. Worth pairing with whatever ForwardDiff fix lands as defense in depth.

(All numbers verified locally on Julia 1.11; per-approach diffs and test artifacts are reproducible from /tmp/test_suite.jl. B is listed for completeness but isn't shippable as written.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ForwardDiff errors for NonlinearSolve over ODE solve

4 participants