perf(ram): lazy SymSpell and emoji catalog, bounded emoji frequency map #682

Merged
FuJacob merged 2 commits into main from perf/lazy-resources
Jun 12, 2026
FuJacob (Owner) commented Jun 12, 2026

Summary

Three idle-footprint cuts, none of which change steady-state behavior:

  • SymSpell loads on first need, not at launch. The corrector already lazy-loads every language with an LRU of two and falls back to NSSpellChecker while an index builds; the only eager part was the launch-time preload. A built index is an [Int: [Int32]] delete-variant map over ~83-100k words (on the order of tens of MB resident), a cost paid by every session for a feature consulted only when the typo gate actually finds a misspelling. The first consultation now triggers the same background build through cachedIndexOrRequestLoad. The cost of staying cold: the first correction or two after launch are ranked by the system spell checker instead of by corpus frequency, which is exactly the designed fallback.
  • Emoji catalog builds on first : capture. EmojiPickerController now takes a matcherProvider and builds the matcher (bundle JSON decode + lowercased index, a few MB) on the first match request instead of at app construction. The cost is a one-time build of roughly tens of milliseconds on first picker use, which is user-paced.
  • Bounded emoji frequency map. EmojiUsageStore.frequency grew one entry per unique emoji forever. It now trims above 300 entries down to 200, preserving current recents and (by construction, since trimming removes the rarest first) heavy hitters. Frequency only breaks ties within a relevance tier, so trimming cannot change which emoji match a query.
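
The load-on-first-need behavior in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the project's code: only the name cachedIndexOrRequestLoad comes from the description above, while ToyIndex, LazyCorrector, and the injected build, schedule, and fallback closures are hypothetical stand-ins (fallback plays the role of the system spell checker). The sketch has no synchronization and assumes all calls arrive on one thread.

```swift
/// Hypothetical stand-in for a built correction index.
struct ToyIndex {
    let corrections: [String: String]
}

/// Sketch of "build on first consultation, fall back while cold".
final class LazyCorrector {
    private enum State {
        case cold
        case loading
        case ready(ToyIndex)
    }

    private var state = State.cold
    private let build: () -> ToyIndex
    private let schedule: (@escaping () -> Void) -> Void
    private let fallback: (String) -> String?

    init(build: @escaping () -> ToyIndex,
         schedule: @escaping (@escaping () -> Void) -> Void,
         fallback: @escaping (String) -> String?) {
        self.build = build
        self.schedule = schedule
        self.fallback = fallback
    }

    /// Returns the index if it is already built. Otherwise requests
    /// exactly one build through `schedule` and returns nil.
    private func cachedIndexOrRequestLoad() -> ToyIndex? {
        if case .ready(let index) = state { return index }
        if case .loading = state { return nil }
        state = .loading
        schedule { [weak self] in
            guard let self = self else { return }
            self.state = .ready(self.build())
        }
        return nil
    }

    /// Uses the index when it is warm and the fallback while it is cold.
    func correction(for word: String) -> String? {
        if let index = cachedIndexOrRequestLoad() {
            return index.corrections[word]
        }
        return fallback(word)
    }
}
```

Injecting schedule keeps the sketch deterministic: a caller can pass a closure that dispatches to a background queue, or one that simply stores the job and runs it later.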

Validation

xcodebuild -project Cotabby.xcodeproj -scheme Cotabby -destination 'platform=macOS' build-for-testing \
  -derivedDataPath build/DerivedData CODE_SIGNING_ALLOWED=NO CODE_SIGNING_REQUIRED=NO
# ** TEST BUILD SUCCEEDED **

xcodebuild ... test-without-building \
  -only-testing:CotabbyTests/EmojiUsageStoreTests \
  -only-testing:CotabbyTests/EmojiPickerControllerTests \
  -only-testing:CotabbyTests/SymSpellCorrectorTests \
  -only-testing:CotabbyTests/EmojiMatcherTests
# 19 tests, 0 failures

swiftlint lint --quiet
# exit 0

New test: test_frequencyIsBoundedAndKeepsHeavyHitters floods the store with 320 unique aliases and asserts the bound plus survivor counts.

Linked issues

Refs #661

Risk / rollout notes

  • The launch-preload removal reverses an explicit earlier choice ("preserve the warm English path"). The tradeoff is documented at the construction site: tens of MB resident in every session versus system-checker ranking for the first correction after launch.
  • First emoji picker open pays a one-time catalog build on the main actor (small, user-initiated). If it ever shows up in practice it can move behind the capture-open hop.
  • Existing persisted frequency blobs above the cap trim on the next emoji commit, not at migration.

🤖 Generated with Claude Code

Greptile Summary

This PR reduces idle memory footprint across three subsystems without changing steady-state behavior: SymSpell index load is deferred until the first correction, the emoji catalog is built on the first picker interaction, and the emoji frequency map is capped at 300 entries (trimming to 200 on overflow).

  • Lazy SymSpell & emoji catalog: Both expensive initializations (tens of MB for SymSpell, a few MB for emoji indexing) are now deferred behind a nil preload and a matcherProvider closure, respectively. The SymSpell fallback (NSSpellChecker) covers the cold path; the emoji catalog builds on first : capture, which is user-paced.
  • Bounded frequency map: trimFrequencyIfNeeded() fires above 300 entries and cuts to 200, protecting the current recents set and using effectiveTarget = max(frequencyTrimTarget, keepAlways.count) to stay correct if recentsCap is ever raised above frequencyTrimTarget.
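
For readers who want to see the trim in code, here is a simplified sketch of the logic described above. The names frequencyCap, frequencyTrimTarget, trimFrequencyIfNeeded, keepAlways, effectiveTarget, overflow, removable, and recentsCap come from this PR; the BoundedFrequency type, its record(_:) method, the default recentsCap of 20, and the alphabetical tie-break are assumptions made for illustration and may differ from EmojiUsageStore.

```swift
/// Hypothetical, simplified model of the bounded frequency map.
struct BoundedFrequency {
    static let frequencyCap = 300
    static let frequencyTrimTarget = 200

    private(set) var frequency: [String: Int] = [:]
    private(set) var recents: [String] = []
    let recentsCap: Int

    init(recentsCap: Int = 20) {  // default value is an assumption
        self.recentsCap = recentsCap
    }

    /// Records one use of `alias`, then trims the map if it outgrew the cap.
    mutating func record(_ alias: String) {
        frequency[alias, default: 0] += 1
        recents.removeAll { $0 == alias }
        recents.insert(alias, at: 0)
        if recents.count > recentsCap {
            recents.removeLast(recents.count - recentsCap)
        }
        trimFrequencyIfNeeded()
    }

    private mutating func trimFrequencyIfNeeded() {
        guard frequency.count > Self.frequencyCap else { return }
        let keepAlways = Set(recents)
        // Never aim below the protected set, even if recentsCap is raised.
        let effectiveTarget = max(Self.frequencyTrimTarget, keepAlways.count)
        let overflow = frequency.count - effectiveTarget
        // Rarest first; ties broken by alias so the outcome is deterministic.
        let removable = frequency
            .filter { !keepAlways.contains($0.key) }
            .sorted { ($0.value, $0.key) < ($1.value, $1.key) }
        for (alias, _) in removable.prefix(overflow) {
            frequency[alias] = nil
        }
    }
}
```

Recording one heavily used alias and then 320 distinct aliases leaves 220 entries in this model: the map trims to 200 when the 301st entry arrives, and the remaining 20 aliases are added without reaching the cap again.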

Confidence Score: 5/5

Safe to merge: all three changes preserve existing runtime behavior, and the previous review concerns about under-trim and loose test assertions are both resolved.

The lazy SymSpell path is a deliberate, documented tradeoff with a safe fallback (NSSpellChecker). The emoji matcher lazy-init is correct and main-actor-isolated. The trim logic is sound: overflow ≤ removable.count is guaranteed by the effectiveTarget = max(frequencyTrimTarget, keepAlways.count) construction, and the new test uses an exact count assertion that would catch both a missing trim and an under-trim.
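
The overflow bound can be checked with a short derivation. The notation is introduced here for illustration: F is the set of frequency keys, K is keepAlways, T is frequencyTrimTarget, and removable is assumed to be the entries of F that are not in K.

```latex
\begin{align*}
T' &= \max(T, \lvert K \rvert) && \text{(effectiveTarget)} \\
\text{overflow} &= \lvert F \rvert - T' \\
  &\le \lvert F \rvert - \lvert K \rvert && \text{since } T' \ge \lvert K \rvert \\
  &\le \lvert F \rvert - \lvert F \cap K \rvert && \text{since } \lvert F \cap K \rvert \le \lvert K \rvert \\
  &= \lvert F \setminus K \rvert = \text{removable.count}
\end{align*}
```

After the trim the map holds exactly T' entries, which is 200 whenever the protected set is no larger than frequencyTrimTarget.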

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| Cotabby/Models/EmojiUsageStore.swift | Adds frequencyCap/frequencyTrimTarget constants and a trimFrequencyIfNeeded() method. Overflow calculation is safe: effectiveTarget = max(target, keepAlways.count) ensures prefix(overflow) never exceeds removable.count even if recentsCap grows beyond frequencyTrimTarget. |
| Cotabby/App/Coordinators/EmojiPickerController.swift | Replaces the injected EmojiMatcher with a @MainActor matcherProvider closure and a hand-rolled lazy var pattern. Correct and thread-safe since the class is @MainActor; cachedMatcher is set exactly once on first access. |
| Cotabby/App/Core/CotabbyAppEnvironment.swift | Removes the SymSpell preload logic (preloadLanguage: nil) and wraps EmojiMatcher construction in a deferred closure; both are clean one-liners with thorough inline rationale. |
| CotabbyTests/EmojiUsageStoreTests.swift | New test floods the store with 320 unique aliases and asserts an exact count of 220, tight enough to catch both a missing trim and an under-trim that removes fewer entries than the target demands. The joy survival assertion validates heavy-hitter protection. |
| CotabbyTests/EmojiPickerControllerTests.swift | Minimal mechanical update to match the new matcherProvider: closure API; all existing test logic is preserved. |
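
The provider-plus-cache pattern noted for EmojiPickerController.swift might look roughly like the sketch below. Only the names matcherProvider and cachedMatcher come from the review; ToyMatcher and ToyPickerController are hypothetical stand-ins, and the sketch leaves out the @MainActor isolation that makes the real class thread-safe, so it is only safe when used from a single thread.

```swift
/// Hypothetical stand-in for the real EmojiMatcher.
struct ToyMatcher {
    let aliases: [String]

    /// Returns the aliases that start with the lowercased query.
    func matches(_ query: String) -> [String] {
        let needle = query.lowercased()
        return aliases.filter { $0.hasPrefix(needle) }
    }
}

/// Sketch of a controller that builds its matcher on first use.
/// The real class is described as @MainActor; isolation is omitted here.
final class ToyPickerController {
    private let matcherProvider: () -> ToyMatcher
    private var cachedMatcher: ToyMatcher?

    init(matcherProvider: @escaping () -> ToyMatcher) {
        self.matcherProvider = matcherProvider
    }

    /// Hand-rolled lazy accessor: runs the provider once, then reuses the result.
    private var matcher: ToyMatcher {
        if let cached = cachedMatcher { return cached }
        let built = matcherProvider()
        cachedMatcher = built
        return built
    }

    func matches(for query: String) -> [String] {
        return matcher.matches(query)
    }
}
```

Constructing the controller does no catalog work in this sketch; the provider runs on the first matches(for:) call and later calls reuse the cached matcher.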

Reviews (2): Last reviewed commit: "review: clamp the frequency trim floor t..."

The English SymSpell index was built eagerly at startup (an in-memory
delete-variant index measured in the tens of MB) for a corrector that is
only consulted once the typo gate finds a misspelling, and whose
designed NSSpellChecker fallback already covers the load window. The
emoji catalog (252KB JSON decoded into a few MB of indexed lowercased
strings) was decoded at app construction for a picker most sessions
never open. Both now build on first use. The emoji frequency map also
grew one entry per unique emoji forever; it is now bounded with
trimming that keeps recents and heavy hitters, which cannot change
match results because frequency only breaks ties within a relevance
tier.
Comment thread CotabbyTests/EmojiUsageStoreTests.swift Outdated
Comment thread Cotabby/Models/EmojiUsageStore.swift
@FuJacob FuJacob merged commit eba22f3 into main Jun 12, 2026
4 checks passed