# Logger

This project compares three storage/indexing strategies for log search, benchmarked on three dataset sizes: Small (~10 entries), Medium (500), and Large (5000).

## Storage & Indexing Options

### Hybrid Index (Bucket + Inverted Index)

An in-memory approach that combines a bucket index keyed on (category + level) with an inverted index over tokenized message text. Token lookup supports prefix matching (e.g. `sess` matches `session`); candidates from both indexes are intersected, then finished with a substring check. Build time is O(n + totalTokens). Typical searches are sub-linear; the worst case remains O(n). Fastest option in RAM, but not persistent.

```mermaid
flowchart LR
  Q["Query (text/category/level)"] --> B["Bucket Index (cat+lvl)"]
  Q --> T["Token Index (inverted)"]
  B --> I["Intersect candidates"]
  T --> I
  I --> F["Final substring filter"]
  F --> R["Results"]
```
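The hybrid flow above can be sketched as follows. This is a minimal illustration in Python (the project itself is presumably Swift); the class and method names are invented for the sketch, and the tokenizer here is a plain whitespace split rather than the project's full tokenizer.

```python
from collections import defaultdict

class HybridLogIndex:
    """Illustrative hybrid index: bucket on (category, level) + inverted token index."""

    def __init__(self):
        self.logs = []                      # all log entries as (message, category, level)
        self.bucket = defaultdict(set)      # (category, level) -> log ids
        self.tokens = defaultdict(set)      # token -> log ids (inverted index)

    def add(self, message, category, level):
        log_id = len(self.logs)
        self.logs.append((message, category, level))
        self.bucket[(category, level)].add(log_id)
        for tok in message.lower().split():  # real tokenizer also strips punctuation etc.
            self.tokens[tok].add(log_id)

    def search(self, text=None, category=None, level=None):
        candidates = None
        if category is not None and level is not None:
            candidates = set(self.bucket[(category, level)])
        if text:
            prefix = text.lower()
            # Prefix matching: union the postings of every token starting with the query.
            hits = set()
            for tok, ids in self.tokens.items():
                if tok.startswith(prefix):
                    hits |= ids
            candidates = hits if candidates is None else candidates & hits
        if candidates is None:
            candidates = set(range(len(self.logs)))
        # Final substring check confirms the match inside the full message.
        return [self.logs[i] for i in sorted(candidates)
                if not text or text.lower() in self.logs[i][0].lower()]
```

For example, after `add("session expired", "auth", "warn")`, a `search(text="sess")` finds the entry via the `session` token's prefix, then the substring filter confirms it.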

### SQLite FTS

An on-disk solution using a plain logs table plus an FTS5 virtual table for message text. Text queries use MATCH, while category/level/date are ordinary SQL filters. Build time is O(n + totalTokens). Search is sub-linear in practice, O(n) in the worst case. This is the most robust option for true full-text search at scale.

```mermaid
flowchart LR
  Q["Query (text/category/level)"] --> M["FTS MATCH"]
  Q --> S["SQL filters"]
  M --> J["Join logs + fts"]
  S --> J
  J --> R["Results"]
```
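A minimal sketch of the logs-table + FTS5 layout, shown here with Python's built-in `sqlite3` (assuming FTS5 is compiled into the SQLite build, as it is in most modern distributions); the table and column names are illustrative, not the project's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logs (id INTEGER PRIMARY KEY, message TEXT,
                       category TEXT, level TEXT, created_at TEXT);
    -- External-content FTS5 table indexing only the message text.
    CREATE VIRTUAL TABLE logs_fts USING fts5(message, content='logs', content_rowid='id');
""")

rows = [
    ("session expired for user", "auth",   "warn",  "2024-01-01"),
    ("disk quota exceeded",      "system", "error", "2024-01-02"),
]
for msg, cat, lvl, ts in rows:
    cur = conn.execute(
        "INSERT INTO logs (message, category, level, created_at) VALUES (?, ?, ?, ?)",
        (msg, cat, lvl, ts))
    conn.execute("INSERT INTO logs_fts (rowid, message) VALUES (?, ?)",
                 (cur.lastrowid, msg))

# Text goes through MATCH ('sess*' is an FTS5 prefix query);
# structured fields stay ordinary SQL filters on the logs table.
hits = conn.execute("""
    SELECT l.message
    FROM logs_fts f JOIN logs l ON l.id = f.rowid
    WHERE f.logs_fts MATCH 'sess*' AND l.level = 'warn'
""").fetchall()
```

The join mirrors the diagram above: the MATCH narrows candidates via the FTS index, the join brings back the full rows, and the SQL predicates finish the filtering.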

### SwiftData (Token-Aware)

A persistent SwiftData model that stores a precomputed tokens array per log. Structured fields are filtered with predicates; token prefix checks plus substring validation then run in memory. Build time is O(n). Text search trends toward O(n), but is faster than a pure `contains` scan.

```mermaid
flowchart LR
  Q["Query (text/category/level)"] --> P["SwiftData predicates (structured)"]
  P --> C["Candidate set"]
  C --> T["Token prefix check (in memory)"]
  T --> F["Substring check"]
  F --> R["Results"]
```
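The two-phase flow above (structured predicate fetch first, then in-memory token checks) can be mirrored in a few lines of Python. This is only an illustration of the filtering order, not the SwiftData API; the field names are invented for the sketch.

```python
logs = [
    {"message": "session expired", "tokens": ["session", "expired"],
     "category": "auth", "level": "warn"},
    {"message": "disk full", "tokens": ["disk", "full"],
     "category": "system", "level": "error"},
]

def search(logs, text, category=None, level=None):
    # Phase 1: structured fields (a predicate-backed fetch in the real model).
    candidates = [l for l in logs
                  if (category is None or l["category"] == category)
                  and (level is None or l["level"] == level)]
    # Phase 2: token prefix check, then substring validation, both in memory.
    q = text.lower()
    return [l for l in candidates
            if any(t.startswith(q) for t in l["tokens"])
            and q in l["message"].lower()]
```

The precomputed `tokens` array is what makes this faster than a pure substring scan: most non-matching rows fail the cheap prefix check without a full `contains` pass over the message.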

## Tokenization & Stop Words

Tokenization lowercases the input, strips diacritics, splits on non-alphanumeric characters, and drops a basic stop-word list (e.g. the, a, an, and, or, to, of, in, on, for, with, is, are, was, were, at, by).
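A minimal sketch of that tokenization pipeline, using the stop words listed above (the project's actual list and normalization order may differ):

```python
import re
import unicodedata

# Stop-word list as given in the README.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "on",
              "for", "with", "is", "are", "was", "were", "at", "by"}

def tokenize(text: str) -> list[str]:
    """Lowercase, strip diacritics, split on non-alphanumerics, drop stop words."""
    # Decompose accented characters, then drop the combining marks (diacritics).
    normalized = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in normalized if not unicodedata.combining(c))
    tokens = re.split(r"[^a-z0-9]+", stripped.lower())
    return [t for t in tokens if t and t not in STOP_WORDS]

tokenize("Sessión expired for user-42 at the gateway")
# → ['session', 'expired', 'user', '42', 'gateway']
```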

## Performance Summary

- **Hybrid Index:** fastest in RAM, not persistent; sub-linear search in typical cases.
- **SQLite FTS:** best full-text performance and durability.
- **SwiftData:** easiest native persistence, but not a true FTS engine.
