Optimize compilation pipeline: ~2x faster template compilation #154

Open
nicbet wants to merge 1 commit into judofyr:master from nicbet:optimize-compilation-pipeline

@nicbet nicbet commented Mar 13, 2026

Summary

The proposed changes optimize Temple's compilation pipeline to achieve ~1.5-2x faster template compilation across all benchmarked template sizes.

The key insight from profiling is that 73% of compilation time was spent in GC, driven overwhelmingly by Ripper object allocations in StaticAnalyzer and StringSplitter. These two filters called Ripper on every :dynamic expression, even though the vast majority of template expressions (e.g. @user.name, item.price) can never be static values or string literals.

The proposed changes benefit every engine built on Temple (ERB, Slim, Haml, etc.) since the optimized code is in the shared filter and generator infrastructure.
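To get a rough feel for why Ripper calls dominate GC pressure, the sketch below counts objects allocated by one `Ripper.lex` call versus one regex check on the same expression. The helper names `allocated_objects` and `compare_allocations` are made up for illustration and are not part of Temple or of this PR's benchmark suite; exact counts vary by Ruby version.

```ruby
require 'ripper'

# Count objects allocated while the block runs (GC.stat is core Ruby).
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Hypothetical helper: compare a Ripper lex against a plain regex check.
def compare_allocations(code)
  {
    ripper: allocated_objects { Ripper.lex(code) },
    regex:  allocated_objects { /[@$]/.match?(code) }
  }
end

p compare_allocations('@user.name')
```

This only illustrates the allocation gap for a single expression; the 73% figure above comes from the StackProf profile, not from this snippet.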

Benchmark results (Ruby 3.4.8, Apple Silicon M1 Max)

Full compilation in iterations/sec (higher is better):

| Template | Before (i/s) | After (i/s) | Speedup |
| --- | --- | --- | --- |
| blog post (49 lines) | 2,085 | 3,866 | 1.85x |
| product listing (121 lines) | 701 | 1,281 | 1.83x |
| admin dashboard (127 lines) | 685 | 1,360 | 1.99x |
| landing page (112 lines) | 1,497 | 2,238 | 1.49x |
| email template (94 lines) | 1,052 | 2,080 | 1.98x |

Per-filter improvement (admin dashboard, 127 lines):

| Filter | Before | After | Speedup |
| --- | --- | --- | --- |
| StaticAnalyzer | 1,648 i/s | 10,244 i/s | 6.2x |
| StringSplitter | 3,216 i/s | 9,586 i/s | 3.0x |

GC samples in the StackProf profile dropped from 19,714 to 8,955 (-55%).

Description of Changes

1. Fast-reject in StaticAnalyzer.static? results in 6.2x speedup

File: lib/temple/static_analyzer.rb

Rationale: StaticAnalyzer determines if a :dynamic expression is actually a compile-time constant (like "hello", 42, true) so it can be pre-evaluated to a :static. It does this by parsing the code with Ripper.lex and checking every token, but first it also runs a full SyntaxChecker parse. For a typical template, nearly every expression contains instance variables (@foo), method calls (user.name), or globals ($var), which can never be static.

Change: Added a DYNAMIC_PATTERN regex (/[@$]|[a-zA-Z_][a-zA-Z_0-9]*\s*[.(]/) that rejects obviously-dynamic code before touching Ripper. The pattern matches an @ or $ sigil (instance, class, and global variables) or an identifier followed by . or ( (method calls). In a typical 127-line admin dashboard template, this fast-rejects 35 of 38 dynamic expressions, avoiding 35 × (SyntaxChecker + Ripper.lex) invocations.

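Why it's safe: The regex can only err in the "reject" direction: it can make static? return false, never true. Rejected code simply stays :dynamic and is evaluated at render time, so it never causes static? to return true for dynamic code. The pattern looks at raw text, so it also rejects a literal that merely contains these characters (for example "user@example.com"); that is a missed optimization, not a change in output.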
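A minimal sketch of the fast-reject idea, assuming a simplified slow path. DYNAMIC_PATTERN is quoted from the description above; `slow_static?`, `STATIC_TOKENS`, and `STATIC_KEYWORDS` are hypothetical stand-ins for Temple's real token check and are not the code in lib/temple/static_analyzer.rb.

```ruby
require 'ripper'

# Pattern quoted from the PR description.
DYNAMIC_PATTERN = /[@$]|[a-zA-Z_][a-zA-Z_0-9]*\s*[.(]/

# Hypothetical, simplified whitelist of token types treated as static.
STATIC_TOKENS = %i[on_tstring_beg on_tstring_content on_tstring_end
                   on_int on_float on_kw on_sp].freeze
STATIC_KEYWORDS = %w[true false nil].freeze

# Stand-in for the expensive path: full parse, then inspect every token.
def slow_static?(code)
  return false if Ripper.sexp(code).nil? # nil means a syntax error
  Ripper.lex(code).all? do |(_pos, type, str, _state)|
    next STATIC_KEYWORDS.include?(str) if type == :on_kw
    STATIC_TOKENS.include?(type)
  end
end

def static?(code)
  return false if code.nil? || code.strip.empty?
  # Cheap textual check first; Ripper only runs for the few survivors.
  return false if DYNAMIC_PATTERN.match?(code)
  slow_static?(code)
end
```

With this sketch, `static?('@user.name')` returns false without invoking Ripper, while `static?('"hello"')` and `static?('42')` still go through the token check.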

2. Fast-reject in StringSplitter#string_literal? results in 3.0x speedup

File: lib/temple/filters/string_splitter.rb

Rationale: StringSplitter splits interpolated string literals like "Hello #{name}" into [:static, "Hello "] + [:dynamic, "name"] so downstream filters can optimize each piece. It checks every :dynamic node by parsing with Ripper.sexp, but most template expressions aren't string literals at all.

Change: Added a STRING_START_PATTERN regex (/\A\s*["'%]/) that rejects code not starting with a string delimiter before invoking Ripper. String literals in Ruby must start with ", ', or % (for %Q, %q, etc.). Expressions like @user.name or item.price fail this check instantly. In the admin dashboard, this fast-rejects 33 of 38 expressions.

Why it's safe: A Ruby string literal must begin with a quote or %-literal prefix. If the code doesn't start with one of these (after whitespace), it cannot be a string literal regardless of what Ripper would say. This is a necessary condition check, not a heuristic.
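The necessary-condition check could look roughly like the sketch below. STRING_START_PATTERN is quoted from the description above; the body of `string_literal?` is an assumed simplification of the Ripper.sexp check and may differ from the actual code in lib/temple/filters/string_splitter.rb.

```ruby
require 'ripper'

# Pattern quoted from the PR description.
STRING_START_PATTERN = /\A\s*["'%]/

def string_literal?(code)
  # Fast reject: anything not starting with ", ' or % skips Ripper entirely.
  return false unless STRING_START_PATTERN.match?(code)

  sexp = Ripper.sexp(code)
  return false if sexp.nil? # syntax error

  type, statements = sexp
  return false unless type == :program && statements.size == 1

  # The single top-level statement must itself be a string literal.
  statements.first.first == :string_literal
end
```

Passing the regex is not sufficient on its own: `'"a" + "b"'` and `'%w[a b]'` start with a matching character but are still rejected by the Ripper check.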

3. ERB Parser: string case instead of regex match

File: lib/temple/erb/parser.rb

Rationale: The ERB parser matched the =/== indicator with when /=/, creating and evaluating a Regexp on every ERB tag.

Change: Replaced when /=/ with when '=', '=='. String equality is faster than regex matching and the set of valid indicators is fixed and small.
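As an illustration of the string-based case, here is a small sketch. The method name `classify_indicator` and the returned symbols are hypothetical and do not mirror the parser's actual structure.

```ruby
# Hypothetical classifier for the text captured right after "<%".
def classify_indicator(indicator)
  case indicator
  when '#'
    :comment
  when '=', '=='
    :output # exact string comparison, no regex engine involved
  else
    :code
  end
end
```

One behavioral detail worth keeping in mind: `/=/` matches any string that contains `=`, while `'=', '=='` matches only those two exact values, so the two forms are equivalent only if the parser never captures any other indicator containing `=`.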

4. Engine: while loop instead of inject

File: lib/temple/engine.rb

Rationale: Engine#call used inject to thread input through the filter chain, which allocates a block object on every call. For engines that compile many templates (e.g. Rails view rendering), this adds up.

Change: Replaced with an index-based while loop. Same semantics, zero block allocation.
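The two styles can be compared side by side in a sketch like this one. `run_chain_inject` and `run_chain_while` are illustrative names; the chain here is just an array of callables, not Temple's actual filter chain.

```ruby
# Block-based version: threads the input through each filter via inject.
def run_chain_inject(chain, input)
  chain.inject(input) { |memo, filter| filter.call(memo) }
end

# Index-based version: same result, no block passed to an iterator.
def run_chain_while(chain, input)
  i = 0
  len = chain.size
  while i < len
    input = chain[i].call(input)
    i += 1
  end
  input
end

# Usage with two toy "filters".
upcase  = ->(s) { s.upcase }
exclaim = ->(s) { "#{s}!" }
puts run_chain_while([upcase, exclaim], 'hello')
```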

5. Generator: fast-path for on_multi + cached buffer

File: lib/temple/generator.rb

Rationale: on_multi is the most frequently called generator method (once per node in the AST). The previous implementation always allocated an array via map + join, even for 0 or 1 children. The buffer method (called for every :static and :dynamic node) did a hash lookup through ImmutableMap on every call.

Change: Added fast-paths for 0 and 1 children in on_multi (avoiding array allocation). Cached the buffer name in @buffer after first lookup.
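A toy generator showing both ideas, under the assumption that nodes are plain arrays such as `[:multi, ...]`, `[:static, text]`, and `[:dynamic, code]`. `SketchGenerator` is a hypothetical class and is much simpler than Temple::Generator.

```ruby
class SketchGenerator
  def initialize(options = {})
    @options = options
    @buffer = nil
  end

  # Cache the buffer name after the first options lookup.
  def buffer
    @buffer ||= @options.fetch(:buffer, '_buf')
  end

  def compile(exp)
    case exp[0]
    when :multi   then on_multi(exp)
    when :static  then "#{buffer} << (#{exp[1].inspect})"
    when :dynamic then "#{buffer} << (#{exp[1]})"
    else raise ArgumentError, "unknown node #{exp[0].inspect}"
    end
  end

  # exp is the whole [:multi, child, ...] node.
  def on_multi(exp)
    case exp.size
    when 1 then ''                 # no children: nothing to emit
    when 2 then compile(exp[1])    # one child: skip map + join
    else exp.drop(1).map { |child| compile(child) }.join('; ')
    end
  end
end

gen = SketchGenerator.new
puts gen.compile([:multi, [:static, 'Hi '], [:dynamic, 'name']])
```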

6. MultiFlattener: avoid intermediate array allocation

File: lib/temple/filters/multi_flattener.rb

Rationale: When flattening nested :multi nodes, exp[1..-1] allocates a new array just to pass to concat. For deeply nested templates this happens hundreds of times per compilation.

Change: Replaced result.concat(exp[1..-1]) with an index-based while loop that appends elements directly. Same result, no intermediate array.
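The flattening loop can be sketched as a standalone function. `flatten_multi` is an illustrative name; the real filter is a Temple::Filter with an on_multi handler, which this sketch only approximates.

```ruby
# Recursively flatten nested [:multi, ...] nodes into a single level.
def flatten_multi(exp)
  return exp unless exp[0] == :multi
  # A :multi with exactly one child collapses to that child.
  return flatten_multi(exp[1]) if exp.size == 2

  result = [:multi]
  i = 1
  while i < exp.size
    child = flatten_multi(exp[i])
    if child[0] == :multi
      # Append grandchildren one by one instead of slicing child[1..-1].
      j = 1
      while j < child.size
        result << child[j]
        j += 1
      end
    else
      result << child
    end
    i += 1
  end
  result
end

nested = [:multi, [:static, 'a'], [:multi, [:static, 'b'], [:static, 'c']]]
p flatten_multi(nested)
```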

7. Dispatcher: constant regex + match?

File: lib/temple/mixins/dispatcher.rb

Rationale: dispatched_methods created a new Regexp object on every call and used =~ (which sets $~ globals) instead of match?.

Change: Moved the regex to a constant DISPATCHED_RE and switched to match?. This is a one-time cost during dispatcher compilation so the impact is small, but it's a correctness improvement (avoids polluting $~).
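The difference between `=~` and `match?` with respect to `$~` can be seen in a short sketch. The pattern assigned to DISPATCHED_RE below is a guess made for illustration and is not necessarily the one used in lib/temple/mixins/dispatcher.rb; the helper method names are hypothetical too.

```ruby
# Hypothetical pattern for handler names such as "on_multi" or "on_html_tag".
DISPATCHED_RE = /\Aon_[a-z0-9_]+\z/

# Select handler-style names using match?, which does not set $~.
def dispatched_names(names)
  names.select { |name| DISPATCHED_RE.match?(name) }
end

# $~ is local to the method frame, so each helper starts with $~ == nil.
def last_match_after_match_p(name)
  DISPATCHED_RE.match?(name)
  $~ # still nil: match? leaves the match globals alone
end

def last_match_after_equal_tilde(name)
  name =~ DISPATCHED_RE
  $~ # MatchData on success: =~ populates the match globals
end

p dispatched_names(%w[on_multi compile on_static call])
```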

Benchmark suite

Added benchmark/run.rb and benchmark/templates.rb with 5 realistic page templates covering different template profiles:

  • Blog post (49 lines) — moderate dynamics, iteration
  • Product listing (121 lines) — heavy dynamics, nested loops, conditionals
  • Admin dashboard (127 lines) — tables, conditionals, many attributes
  • Landing page (112 lines) — mostly static HTML, stress-tests static merging
  • Email template (94 lines) — deeply nested tables, inline styles

Run with ruby benchmark/run.rb or ruby benchmark/run.rb --profile (requires benchmark-ips and optionally stackprof).

Test plan

  • All 162 existing specs pass (bundle exec rspec)
  • Verified ERB <%= and <%== parsing produces identical output
  • Verified StaticAnalyzer.static? correctly identifies static expressions ("hello", 42, true, nil) and rejects dynamic ones (@foo, user.name, $global)
  • Verified StringSplitter correctly splits interpolated strings and passes through non-strings
  • Benchmarked before/after on 5 realistic templates
