Optimize compilation pipeline: ~2x faster template compilation #154
Open
nicbet wants to merge 1 commit into
Summary
The proposed changes optimize Temple's compilation pipeline to achieve ~1.5-2x faster template compilation across all template sizes.
The key insight from profiling is that 73% of compilation time was spent in GC, driven overwhelmingly by Ripper object allocations in `StaticAnalyzer` and `StringSplitter`. These two filters called Ripper on every `:dynamic` expression, even though the vast majority of template expressions (e.g. `@user.name`, `item.price`) can never be static values or string literals.
The proposed changes benefit every engine built on Temple (ERB, Slim, Haml, etc.) since the optimized code is in the shared filter and generator infrastructure.
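The pre-check idea can be sketched in isolation. The snippet below is only an illustration, not the code from this PR: the two regexes are copied from the description of changes, while the helper names `obviously_dynamic?` and `may_be_string_literal?` and the sample expressions are hypothetical. It shows which expressions would still need a full Ripper pass after the cheap regex test.

```ruby
# Patterns as quoted in this PR description; everything else is a hypothetical sketch.
DYNAMIC_PATTERN = /[@$]|[a-zA-Z_][a-zA-Z_0-9]*\s*[.(]/
STRING_START_PATTERN = /\A\s*["'%]/

# True when the code is obviously dynamic, so the static analysis can be skipped.
def obviously_dynamic?(code)
  DYNAMIC_PATTERN.match?(code)
end

# True when the code starts like a string literal and still needs the full parse.
def may_be_string_literal?(code)
  STRING_START_PATTERN.match?(code)
end

# Made-up sample expressions (single quotes keep '#{name}' uninterpolated).
samples = [
  '@user.name', 'item.price', '$global', 'helper(1)',
  '"hello"', '42', 'true', '"Hello #{name}"'
]

samples.each do |code|
  puts format('%-18s dynamic=%-5s string_start=%s',
              code, obviously_dynamic?(code), may_be_string_literal?(code))
end

# Only these expressions would reach the expensive parser in each filter.
needs_static_parse = samples.reject { |code| obviously_dynamic?(code) }
needs_string_parse = samples.select { |code| may_be_string_literal?(code) }

puts "static analysis still parses: #{needs_static_parse.size} of #{samples.size}"
puts "string splitting still parses: #{needs_string_parse.size} of #{samples.size}"
```

One thing the sketch makes visible: a quoted literal that happens to contain `@`, `$`, or `name.` (for example `"a.b"`) also matches `DYNAMIC_PATTERN`. Under the assumption that a match simply short-circuits to "not static", the pre-check is conservative for such strings: it may skip an optimization, but it cannot make dynamic code look static.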
Benchmark results (Ruby 3.4.8, Apple Silicon M1 Max)
Full compilation in iterations/sec (higher is better):
Per-filter improvement (admin dashboard, 127 lines):
GC samples in the StackProf profile dropped from 19,714 to 8,955 (-55%).
Description of Changes
1. Fast-reject in `StaticAnalyzer.static?` results in 6.2x speedup
File: `lib/temple/static_analyzer.rb`
Rationale: `StaticAnalyzer` determines if a `:dynamic` expression is actually a compile-time constant (like `"hello"`, `42`, `true`) so it can be pre-evaluated to a `:static`. It does this by parsing the code with `Ripper.lex` and checking every token, but first it also runs a full `SyntaxChecker` parse. For a typical template, nearly every expression contains instance variables (`@foo`), method calls (`user.name`), or globals (`$var`), which can never be static.
Change: Added a `DYNAMIC_PATTERN` regex (`/[@$]|[a-zA-Z_][a-zA-Z_0-9]*\s*[.(]/`) that rejects obviously-dynamic code before touching Ripper. This matches `@`/`$` (variables) and `identifier.` or `identifier(` (method calls). In a typical 127-line admin dashboard template, this fast-rejects 35 of 38 dynamic expressions, avoiding 35 × (SyntaxChecker + Ripper.lex) invocations.
Why it's safe: The regex only returns false positives in the "reject" direction: it can only cause `static?` to return `false` for code that would have returned `false` anyway after the full Ripper analysis. It never causes `static?` to return `true` for dynamic code.

2. Fast-reject in `StringSplitter#string_literal?` results in 3.0x speedup
File: `lib/temple/filters/string_splitter.rb`
Rationale: `StringSplitter` splits interpolated string literals like `"Hello #{name}"` into `[:static, "Hello "]` + `[:dynamic, "name"]` so downstream filters can optimize each piece. It checks every `:dynamic` node by parsing with `Ripper.sexp`, but most template expressions aren't string literals at all.
Change: Added a `STRING_START_PATTERN` regex (`/\A\s*["'%]/`) that rejects code not starting with a string delimiter before invoking Ripper. String literals in Ruby must start with `"`, `'`, or `%` (for `%Q`, `%q`, etc.). Expressions like `@user.name` or `item.price` fail this check instantly. In the admin dashboard, this fast-rejects 33 of 38 expressions.
Why it's safe: A Ruby string literal must begin with a quote or `%`-literal prefix. If the code doesn't start with one of these (after whitespace), it cannot be a string literal regardless of what Ripper would say. This is a necessary condition check, not a heuristic.

3. ERB Parser: string `case` instead of regex match
File: `lib/temple/erb/parser.rb`
Rationale: The ERB parser matched the `=`/`==` indicator with `when /=/`, creating and evaluating a Regexp on every ERB tag.
Change: Replaced `when /=/` with `when '=', '=='`. String equality is faster than regex matching and the set of valid indicators is fixed and small.

4. Engine: `while` loop instead of `inject`
File: `lib/temple/engine.rb`
Rationale: `Engine#call` used `inject` to thread input through the filter chain, which allocates a block object on every call. For engines that compile many templates (e.g. Rails view rendering), this adds up.
Change: Replaced with an index-based `while` loop. Same semantics, zero block allocation.

5. Generator: fast-path for `on_multi` + cached `buffer`
File: `lib/temple/generator.rb`
Rationale: `on_multi` is the most frequently called generator method (once per node in the AST). The previous implementation always allocated an array via `map` + `join`, even for 0 or 1 children. The `buffer` method (called for every `:static` and `:dynamic` node) did a hash lookup through `ImmutableMap` on every call.
Change: Added fast-paths for 0 and 1 children in `on_multi` (avoiding array allocation). Cached the buffer name in `@buffer` after first lookup.

6. MultiFlattener: avoid intermediate array allocation
File: `lib/temple/filters/multi_flattener.rb`
Rationale: When flattening nested `:multi` nodes, `exp[1..-1]` allocates a new array just to pass to `concat`. For deeply nested templates this happens hundreds of times per compilation.
Change: Replaced `result.concat(exp[1..-1])` with an index-based `while` loop that appends elements directly. Same result, no intermediate array.

7. Dispatcher: constant regex + `match?`
File: `lib/temple/mixins/dispatcher.rb`
Rationale: `dispatched_methods` created a new Regexp object on every call and used `=~` (which sets `$~` globals) instead of `match?`.
Change: Moved the regex to a constant `DISPATCHED_RE` and switched to `match?`. This is a one-time cost during dispatcher compilation so the impact is small, but it's a correctness improvement (avoids polluting `$~`).

Benchmark suite
Added `benchmark/run.rb` and `benchmark/templates.rb` with 5 realistic page templates covering different template profiles:
Run with `ruby benchmark/run.rb` or `ruby benchmark/run.rb --profile` (requires `benchmark-ips` and optionally `stackprof`).

Test plan
- `bundle exec rspec`)
- `<%=` and `<%==` parsing produces identical output
- `StaticAnalyzer.static?` correctly identifies static expressions (`"hello"`, `42`, `true`, `nil`) and rejects dynamic ones (`@foo`, `user.name`, `$global`)
- `StringSplitter` correctly splits interpolated strings and passes through non-strings
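For readers who want to see the loop rewrites from items 4 and 6 side by side, here is a minimal sketch. It does not reproduce Temple's actual `Engine#call` or `MultiFlattener` code: the method names, the lambda-based filter chain, and the sample node are all hypothetical stand-ins, chosen only to show that the index-based `while` variants return the same values as the `inject` and `concat(exp[1..-1])` variants.

```ruby
# Hypothetical stand-ins; not Temple's real classes or method names.

# Item 4, before: thread the input through the chain with inject and a block.
def run_chain_inject(chain, input)
  chain.inject(input) { |memo, filter| filter.call(memo) }
end

# Item 4, after: the same threading with an index-based while loop.
def run_chain_while(chain, input)
  i = 0
  size = chain.size
  while i < size
    input = chain[i].call(input)
    i += 1
  end
  input
end

# Item 6, before: exp[1..-1] builds a temporary array just to feed concat.
def append_children_slice(result, exp)
  result.concat(exp[1..-1])
end

# Item 6, after: append each child directly, skipping the node type at index 0.
def append_children_loop(result, exp)
  i = 1
  size = exp.size
  while i < size
    result << exp[i]
    i += 1
  end
  result
end

# A made-up filter chain of three lambdas.
chain = [->(s) { s.strip }, ->(s) { s.upcase }, ->(s) { "[#{s}]" }]
p run_chain_inject(chain, '  hi  ')
p run_chain_while(chain, '  hi  ')

# A made-up nested :multi node being flattened into a parent result.
nested = [:multi, [:static, 'a'], [:dynamic, 'b']]
p append_children_slice([:multi], nested)
p append_children_loop([:multi], nested)
```

The sketch only demonstrates equivalence of results. Whether the rewritten loops are measurably faster depends on the Ruby version and workload, which is what the benchmark suite above is meant to measure.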