
perf(ax): lazy Chromium descendant BFS, session-scoped invariant caches, throttled run walk, shared TextKit stack #679

Merged
FuJacob merged 2 commits into main from perf/ax-hot-path
Jun 12, 2026

FuJacob commented Jun 12, 2026

Owner

Summary

The AX resolver is the hot loop the app runs many times per second, and four of its costs were paid far more often than the data could change. The Chromium descendant BFS (up to ~200 node visits at 2-4 IPC each) was enumerated eagerly on every poll tick before the first candidate was even evaluated, even though shallow candidates win in the common case and the BFS appends were unreachable whenever they did. The secure-field probe trio, the terminal AXDOMClassList read, and the focused element's role pair were re-fetched per tick despite being invariant while focus stays in one field. The Branch 2.5 static-text-run walk (~300 nodes, the primary caret path in Gmail-class editors) ran unthrottled inside every tick, while its sibling deep walk has had a 100ms throttle since #667-era work. And the hidden TextKit caret estimator (#670) rebuilt its storage/layout-manager/container stack on every keystroke while a ghost was visible.

Changes, all preserving resolution semantics:

  • resolveCandidate stages enumeration: shallow candidates (focused node, two ancestors, their children) are evaluated first and the descendant BFS runs only when none resolves with full capabilities. Evaluation order is unchanged because shallow elements always preceded BFS appends.
  • New FocusSessionScopedCache keyed on focusChangeSequence plus element key caches the secure-field verdict and the terminal-detection class list. Sequence scoping is the safety property: CFHash-based element identities recycle across fields, and a stale "not secure" verdict must never survive a field switch.
  • candidateSnapshot reuses the focused element's already-read role/subrole (CFEqual identity check, no IPC) instead of repeating the two reads on every tick.
  • New StaticTextRunWalkThrottle reuses collected run frames for 100ms while the cheap caret-placement math still reruns against live text and selection, so the caret tracks keystrokes inside slightly stale frames, the same accepted tradeoff as DeepGeometryWalkThrottle. The throttle is restricted to the focused element so its single slot can never serve frames from a different root; deep-walk leaf calls stay unthrottled (they are bounded upstream).
  • TextLayoutCaretEstimator mutates one shared main-actor TextKit stack instead of allocating a fresh trio per estimate.
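
The staging argument in the first bullet can be sketched in a few lines. This is a minimal illustration under assumptions, not the resolver's actual code: Candidate, resolveEager, resolveStaged, and WalkCounter are hypothetical names, a Bool stands in for the capability check, and the fallback for the case where nothing has full capabilities is left out.

```swift
// Hypothetical stand-in for a resolved AX candidate (the real resolver works on AX elements).
struct Candidate: Equatable {
    let name: String
    let hasFullCapabilities: Bool
}

// Eager form: always pay for the descendant walk, then take the first full-capability candidate.
func resolveEager(shallow: [Candidate], descendants: () -> [Candidate]) -> Candidate? {
    let ordered = shallow + descendants()
    return ordered.first(where: { $0.hasFullCapabilities })
}

// Staged form: the descendant walk runs only when no shallow candidate wins.
// Shallow candidates come first in both forms, so the winner is the same.
func resolveStaged(shallow: [Candidate], descendants: () -> [Candidate]) -> Candidate? {
    if let winner = shallow.first(where: { $0.hasFullCapabilities }) {
        return winner
    }
    return descendants().first(where: { $0.hasFullCapabilities })
}

// Counts how often the (expensive) walk is invoked.
final class WalkCounter { var runs = 0 }

let counter = WalkCounter()
let shallowCandidates = [
    Candidate(name: "focused", hasFullCapabilities: false),
    Candidate(name: "ancestor-child", hasFullCapabilities: true),
]
let descendantWalk: () -> [Candidate] = {
    counter.runs += 1
    return [Candidate(name: "deep", hasFullCapabilities: true)]
}

let eager = resolveEager(shallow: shallowCandidates, descendants: descendantWalk)   // walk runs once
let staged = resolveStaged(shallow: shallowCandidates, descendants: descendantWalk) // walk is skipped
print(eager == staged, counter.runs)
```

Both forms pick the same winner; only the eager one invokes the walk when a shallow candidate already qualifies.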

Net effect: a steady-state Chromium tick drops from several hundred IPC round trips to roughly a dozen, and Gmail-class hosts stop paying a 300-node walk per keystroke.

Validation

xcodebuild -project Cotabby.xcodeproj -scheme Cotabby -destination 'platform=macOS' build-for-testing \
  -derivedDataPath build/DerivedData CODE_SIGNING_ALLOWED=NO CODE_SIGNING_REQUIRED=NO
# ** TEST BUILD SUCCEEDED **

xcodebuild ... test-without-building \
  -only-testing:CotabbyTests/FocusSessionScopedCacheTests \
  -only-testing:CotabbyTests/StaticTextRunWalkThrottleTests \
  -only-testing:CotabbyTests/DeepGeometryWalkThrottleTests \
  -only-testing:CotabbyTests/FocusSnapshotResolverSelectionTests \
  -only-testing:CotabbyTests/FocusCapabilityResolverTests \
  -only-testing:CotabbyTests/TextLayoutCaretEstimatorTests \
  -only-testing:CotabbyTests/CaretGeometrySelectorTests \
  -only-testing:CotabbyTests/SuggestionCaretLayoutRepairTests
# Test Suite 'Selected tests' passed; 0 failures

swiftlint lint --quiet
# exit 0

New tests: FocusSessionScopedCacheTests (reuse within a sequence, drop on sequence change, the recycled-identity safety case) and StaticTextRunWalkThrottleTests (window reuse, expiry, field-switch reset, cached-empty-result).
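
For readers who want the shape of the recycled-identity case, here is a rough sketch of a sequence-scoped cache. It is an assumption-laden illustration rather than the shipped FocusSessionScopedCache: the class name SessionCacheSketch, the value(sequence:key:compute:) signature, the UInt64 sequence type, and the Int element key are all hypothetical.

```swift
// Hypothetical sketch: entries live only as long as the focus-change sequence they were stored under.
final class SessionCacheSketch<Key: Hashable, Value> {
    private var currentSequence: UInt64?
    private var values: [Key: Value] = [:]

    // Returns the cached value for `key` within `sequence`, computing it at most once per sequence.
    func value(sequence: UInt64, key: Key, compute: () -> Value) -> Value {
        if currentSequence != sequence {
            // A new focus session: drop everything, even entries whose key happens to match.
            values.removeAll()
            currentSequence = sequence
        }
        if let cached = values[key] {
            return cached
        }
        let fresh = compute()
        values[key] = fresh
        return fresh
    }
}

let secureCache = SessionCacheSketch<Int, Bool>()
var probeCount = 0
let recycledKey = 42 // stands in for a hash-based element identity reused by a different field

// Same field, same sequence: the probe runs once.
let first = secureCache.value(sequence: 1, key: recycledKey) { probeCount += 1; return false }
let again = secureCache.value(sequence: 1, key: recycledKey) { probeCount += 1; return false }
// Field switch: same key, new sequence, so the stale "not secure" verdict is not reused.
let afterSwitch = secureCache.value(sequence: 2, key: recycledKey) { probeCount += 1; return true }
print(first, again, afterSwitch, probeCount)
```

Keying invalidation on the sequence rather than on the element key is what makes the third lookup re-probe even though the key is identical.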

Linked issues

Refs #661

Risk / rollout notes

  • The BFS staging changes behavior only in the case where a shallow candidate has full capabilities AND the BFS used to run anyway; in that case the BFS results were provably unreachable (first-full-capability-wins over an ordered list), so only the wasted IPC is removed. When no shallow candidate qualifies, the BFS runs exactly as before.
  • Caret geometry in Gmail-class hosts can now trail a text reflow by up to 100ms during fast typing (run frames cached; placement math live). This mirrors the existing deep-walk throttle tradeoff that has been shipping since the geometry-cache work.
  • Secure-field caching is intentionally sequence-scoped, not identity-scoped; a field switch always re-probes. Within one field session an element cannot change secure-ness.
  • Coordinate with #676, "Trust native AX caret geometry; scope the layout-estimate override to web content" (estimator provenance gating): this PR does not change estimator gating, only the allocation pattern inside layoutLocalCaret.

🤖 Generated with Claude Code

Greptile Summary

This PR optimises the AX resolver hot loop across four independent dimensions — lazy Chromium descendant BFS, session-scoped caches for invariant AX reads, a throttle for the Branch 2.5 static-text-run walk, and a shared TextKit measurement stack — while explicitly preserving resolution semantics in each case.

  • Staged BFS (resolveCandidate): shallow candidates (focused node, 2 ancestors, their children) are now evaluated before the Chromium descendant BFS. A shallow winner makes BFS results unreachable in the old code too (first-full-capability-wins over an ordered list), so this is a pure IPC saving with no semantic change.
  • FocusSessionScopedCache: caches isSecure (3 AX round trips) and isIntegratedTerminal (1 AX round trip) per focus session. Keyed on focusChangeSequence rather than raw element identity, which correctly handles CFHash recycling across field switches.
  • StaticTextRunWalkThrottle: matches the existing DeepGeometryWalkThrottle contract; restricted to the focused element so the single cache slot cannot serve frames from a different BFS root; caret-placement math still runs live against current text and selection.
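
The throttle contract described in the last bullet might look roughly like the following. This is a sketch under assumptions, not the merged StaticTextRunWalkThrottle: the type name RunWalkThrottleSketch is hypothetical, the parameter list only loosely follows the runs(sequence, interval, now, walk) call shown in the sequence diagram, times are plain seconds as Double, and the window is assumed to start when the frames were stored.

```swift
// Hypothetical single-slot throttle: reuse the last walk result for `interval` seconds
// as long as the focus-change sequence has not moved on.
struct RunWalkThrottleSketch<Frames> {
    private var slot: (sequence: UInt64, storedAt: Double, frames: Frames)?

    init() {}

    mutating func runs(sequence: UInt64, interval: Double, now: Double, walk: () -> Frames) -> Frames {
        if let cached = slot, cached.sequence == sequence, now - cached.storedAt < interval {
            // Within the window for the same field: skip the walk (an empty result is cached too).
            return cached.frames
        }
        let fresh = walk()
        slot = (sequence: sequence, storedAt: now, frames: fresh)
        return fresh
    }
}

var throttle = RunWalkThrottleSketch<[String]>()
var walkCount = 0

// First tick walks; a tick 50ms later reuses the frames; a tick 150ms later walks again.
let tick0 = throttle.runs(sequence: 7, interval: 0.1, now: 0.00) { walkCount += 1; return ["run-a"] }
let tick1 = throttle.runs(sequence: 7, interval: 0.1, now: 0.05) { walkCount += 1; return ["run-b"] }
let tick2 = throttle.runs(sequence: 7, interval: 0.1, now: 0.15) { walkCount += 1; return ["run-c"] }
// A field switch (new sequence) walks immediately, even inside the window.
let tick3 = throttle.runs(sequence: 8, interval: 0.1, now: 0.16) { walkCount += 1; return ["run-d"] }
print(tick0, tick1, tick2, tick3, walkCount)
```

Passing the current time in as a parameter keeps the window logic deterministic under test, and the caller would still rerun placement math against live text on every tick.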

Confidence Score: 5/5

Safe to merge. All four optimisations are strictly additive: they reduce IPC work without altering which candidate wins, which secure-field verdict is returned, or how caret placement math is computed.

The BFS staging is provably equivalent to the old single-pass loop because shallow candidates always preceded BFS appends. The session-scoped caches clear on every sequence increment, preventing stale secure-field verdicts from surviving field switches. The static-run-walk throttle mirrors the existing deep-walk throttle pattern exactly and is correctly scoped to the focused element. The shared TextKit stack is safe under @MainActor confinement. New tests cover the key invariants for both new types.

No files require special attention. The most load-bearing change is in FocusSnapshotResolver.swift and the logic there preserves the existing resolution order.

Important Files Changed

Filename | Overview
Cotabby/Services/Focus/FocusSnapshotResolver.swift | Core resolver refactored: staged BFS (shallow-first, BFS only when no shallow winner), role/subrole reuse via FocusedElementReading, secure-field and terminal-detection caches, staticRunThrottle plumbing. Logic is semantics-preserving and well-guarded.
Cotabby/Services/Focus/FocusSessionScopedCache.swift | New generic session-scoped cache keyed on focusChangeSequence + element key. Sequence invalidation is the correct safety mechanism for CFHash recycling. The nonisolated deinit workaround matches the existing codebase pattern.
Cotabby/Services/Focus/StaticTextRunWalkThrottle.swift | New throttle for the Branch 2.5 static-text-run walk, mirroring DeepGeometryWalkThrottle. Sequence-keyed to handle Chrome node recycling; its 'now' timestamp parameter is injectable for testability.
Cotabby/Services/Focus/AXTextGeometryResolver.swift | Added staticRunThrottle and focusChangeSequence parameters to resolveCaretRect / resolveCaretFromChildTextRuns. Passing nil keeps the previous unthrottled path for non-focused candidates.
Cotabby/Support/TextLayoutCaretEstimator.swift | Replaced per-call NSTextStorage/NSLayoutManager/NSTextContainer allocation with a single shared MeasurementStack. Safe under @MainActor; setAttributedString + container.size both invalidate layout before ensureLayout is called.
CotabbyTests/FocusSessionScopedCacheTests.swift | Tests cover reuse within a sequence, independent key tracking, and sequence-change invalidation including the recycled-identity safety case.
CotabbyTests/StaticTextRunWalkThrottleTests.swift | Tests cover window reuse, expiry after interval, immediate field-switch reset, and cached-empty-result.

Sequence Diagram

sequenceDiagram
    participant FT as FocusTracker
    participant FSR as FocusSnapshotResolver
    participant FESC as FocusSessionScopedCache
    participant STRWT as StaticTextRunWalkThrottle
    participant ATGR as AXTextGeometryResolver
    participant TLCE as TextLayoutCaretEstimator

    FT->>FSR: resolveSnapshot(focusedElement, focusChangeSequence)
    FSR->>FSR: read role/subrole once
    FSR->>FSR: shallowCandidateElements()
    FSR->>FSR: winner(in: shallow)
    alt shallow candidate wins
        FSR-->>FT: FocusSnapshot (BFS never runs)
    else no shallow winner
        FSR->>FSR: appendEditableDescendants (BFS)
        FSR->>FSR: winner(in: deepCandidates)
    end
    FSR->>FESC: isSecure lookup
    FESC-->>FSR: cached Bool
    FSR->>FESC: isTerminal lookup
    FESC-->>FSR: cached Bool
    FSR->>ATGR: resolveCaretRect(staticRunThrottle?)
    ATGR->>STRWT: runs(sequence, interval, now, walk)
    alt within window
        STRWT-->>ATGR: cached TextRuns
    else expired or field switched
        STRWT-->>ATGR: fresh TextRuns
    end
    ATGR->>TLCE: estimate — shared MeasurementStack
    TLCE-->>ATGR: LocalCaretPosition
    ATGR-->>FSR: CaretGeometryResult
    FSR-->>FT: FocusSnapshot

Reviews (3): Last reviewed commit: "fix(tests): nonisolated deinit on the ne..."

… field attributes per tick

One focus resolve in Chromium paid an eager bounded descendant BFS (up to
~200 visits, several IPC each) before the first candidate was even
evaluated, plus per-tick re-reads of attributes that cannot change while
focus stays in one field: the secure-field probe trio, the terminal
AXDOMClassList, and the focused element's role pair. The Branch 2.5
static-text-run walk (~300 nodes, Gmail-class hosts) also ran unthrottled
inside every tick, and the hidden TextKit caret estimator rebuilt its
storage/layout-manager/container stack on every estimate.

The BFS now runs only when no shallow candidate resolves with full
capabilities (evaluation order unchanged: shallow always preceded BFS
appends, so any shallow winner made BFS results unreachable). Invariant
reads are cached per focus-change sequence so recycled element identities
can never leak a stale secure verdict across fields. The run walk reuses
collected frames for 100ms (the deep-walk tradeoff) while caret placement
still reruns against live text. The estimator now mutates one shared
TextKit stack.
@FuJacob FuJacob force-pushed the perf/ax-hot-path branch from 73d7ac2 to 5cd4dba Compare June 12, 2026 01:58
Both new @MainActor cache classes are deallocated inside app-hosted unit
tests, which routes their teardown through the isolated-deinit
back-deploy shim and aborts the CI test run after every test has passed
(562 tests, 0 failures, then a crash-restart). Same documented
workaround as EmojiUsageStore and SystemMetricsStore.
@FuJacob FuJacob merged commit a926881 into main Jun 12, 2026
4 checks passed