Summary
Add Partial Prerendering (PPR) to merjs — serve pre-rendered static shells from Cloudflare KV edge (~1ms) while streaming dynamic content via WASM resolves. Inspired by Next.js 14 PPR, but adapted to merjs's Zig/WASM architecture.
This RFC debates the approach, identifies problems, and proposes a novel solution that plays to merjs's strengths.
Historical Context — jQuery to PPR
The placeholder/resolve pattern has been reinvented every ~5 years since the 90s:
| Era | Technology | How it works |
|---|---|---|
| 1993 | SSI | `<!--#include virtual="/header.html" -->` — server blocks, assembles fragments |
| 2001 | ESI (Akamai) | `<esi:include src="/api/nav" />` — same, but at CDN edge with per-fragment TTLs |
| 2006 | jQuery `.load()` | `$('#cart').load('/partials/cart')` — client pulls HTML into placeholder divs |
| 2010 | Facebook BigPipe | Server pushes `<script>BigPipe.onPageletArrive({id:"cart",content:"..."})</script>` via chunked HTML |
| 2014 | Marko `<await>` | Declarative out-of-order streaming with client-reorder |
| 2022 | React 18 Suspense | `<Suspense fallback={<Skeleton/>}>` — same BigPipe pattern, React-managed |
| 2023 | Next.js PPR | Static shell from CDN + streamed Suspense holes |
| now | merjs | ? |
merjs already implements the BigPipe pattern exactly — look at `StreamWriter.resolve()` in `mer.zig:42-53`:

```zig
pub fn resolve(self: *StreamWriter, id: []const u8, content: []const u8) void {
    self.write("<div hidden id=\"S:");
    self.write(id);
    self.write("\">");
    self.write(content);
    self.write("</div><script>");
    self.write("(function(){var p=document.getElementById('P:");
    self.write(id);
    self.write("'),s=document.getElementById('S:");
    self.write(id);
    self.write("');if(p&&s){p.outerHTML=s.innerHTML;s.remove()}}())");
    self.write("</script>");
}
```
This is BigPipe.onPageletArrive(), minus the jQuery. The question is: how do we split this across time — shell at build/deploy, resolves at request?
What merjs Already Has
Three rendering modes that form a natural progression:
| Mode | How it works | Where |
|---|---|---|
| SSG (`prerender = true`) | Build-time render → `dist/*.html` | `src/prerender.zig` |
| Shell-first streaming | Layout head flushed, body blocks, tail follows | `src/server.zig:321-349` |
| True streaming (`renderStream`) | Placeholder/resolve — skeletons swap to real content | `src/server.zig:275-318` |
PPR would sit between SSG (fully static) and streaming (fully dynamic): static shell + dynamic holes.
The Debate: What's Good
1. The core insight is real
Most pages are 80% static, 20% dynamic. KV edge read is ~1ms, WASM render is ~5-15ms, external fetch is ~100-500ms. Serving the 80% from edge while overlapping the 20% computation is genuinely faster.
2. merjs is uniquely positioned
Unlike Next.js, which retrofitted PPR onto React's hydration model, merjs has no hydration. The resolve scripts are self-contained inline `<script>` tags — no JS bundle hash dependencies, no React runtime version coupling. The static shell is truly static.
3. The placeholder()/resolve() split already exists
The boundary between "shell" and "dynamic" is already expressed in user code. No compiler analysis needed.
4. Cloudflare KV is a natural fit
KV is designed for read-heavy, write-on-deploy workloads — exactly the access pattern for pre-rendered shells. Two-tier cache (Cache API L1 per-colo + KV L2 global) gives per-colo speed with global durability.
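That two-tier pattern can be sketched generically: check L1 first, fall back to L2, back-fill L1 on an L2 hit, and only hit the origin on a double miss. The tier interface and names (`readThrough`, `mapTier`) are assumptions made for this sketch — real code would wrap `caches.default` and a KV binding behind them:

```javascript
// Two-tier read-through cache sketch: L1 (per-colo, e.g. Cache API) is checked
// first, then L2 (global, e.g. KV). An L2 hit back-fills L1 so the next request
// in the same colo is served locally. All names here are hypothetical.
async function readThrough(l1, l2, key, load) {
  const hot = await l1.get(key);
  if (hot !== undefined) return { value: hot, tier: "L1" };

  const warm = await l2.get(key);
  if (warm !== undefined) {
    await l1.put(key, warm); // back-fill the per-colo tier
    return { value: warm, tier: "L2" };
  }

  const fresh = await load(key); // origin render / fetch
  await l2.put(key, fresh);
  await l1.put(key, fresh);
  return { value: fresh, tier: "origin" };
}

// Minimal in-memory tier for demonstration.
const mapTier = () => {
  const m = new Map();
  return { get: async (k) => m.get(k), put: async (k, v) => { m.set(k, v); } };
};
```

The design point is that L1 misses are cheap to repair (one L2 read) while L2 misses pay the full origin cost exactly once per key.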
The Debate: What's Bad
1. Shell extraction is harder than it looks
renderStream runs linearly — writes shell, then placeholders, then fetches, then resolves. To extract just the shell at build time, you need to run renderStream and stop before resolves. But how does the framework know where the shell ends?
- Convention-based ("everything before first `resolve()`"): fragile if someone writes shell HTML after a resolve
- Two-pass render: runs `renderStream` twice, page code might have side effects
- Separate functions (`renderShell()` + `renderDynamic()`): doubles the API surface
Next.js avoids this by using React's component tree — static components are identified by the absence of headers()/cookies() calls. merjs doesn't have a component tree — it's imperative stream.write() calls. The boundary is implicit in control flow, not explicit in structure.
2. Cache invalidation is the actual hard problem
PPR shifts complexity from rendering to caching:
- Version-aware cache keys (deploy v3 shouldn't serve v2 shells)
- KV propagation takes up to 60 seconds globally
- Stale-while-revalidate during propagation windows
- Per-route TTLs (blog changes hourly, about page changes quarterly)
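A version-aware key scheme, for instance, might look like the following — the `shell:` prefix and the `shellKey` helper are illustrative assumptions, not an existing merjs convention:

```javascript
// Version-aware shell keys: embed the deploy version so a v3 worker never
// reads a v2 shell, even during KV's global propagation window. Stale entries
// are simply never read again and can expire via TTL — no purge needed.
// The key format here is a hypothetical convention for this sketch.
function shellKey(deployVersion, pathname) {
  return `shell:${deployVersion}:${pathname}`;
}
```

This sidesteps the propagation race by construction, but it doubles KV storage during every deploy window — one more example of PPR trading render complexity for cache complexity.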
The current prerender.zig is beautifully simple: render → write file → done. KV cache management adds operational surface area disproportionate to the perf gain for most sites.
3. The WASM/Worker architecture fights you
The current WASM API returns a single buffer (handle() → pointer to complete response). There's no streaming from WASM to JS. To stream resolves progressively, you'd need:
- A streaming WASM API (chunks written to shared memory + signals to JS) — significant complexity
- Or buffer all resolves in WASM, append to shell in JS — loses progressive streaming benefit
4. Is it actually faster for merjs?
Current WASM renders full pages in ~5-15ms. Adding KV lookup (~1ms hit, ~50ms cold miss) + shell parsing + stitching might not beat "just render the whole thing in WASM." PPR shines when SSR is slow (React: 50-200ms). merjs is already fast enough that the delta might be 5-7ms.
5. It breaks the simplicity contract
merjs's pitch: write Zig, get a web app, zero npm, zero complexity. PPR adds:
- New `pub const ppr = true` flag
- Authors must reason about static-vs-dynamic splits
- KV namespace config in `wrangler.toml`
- Cache invalidation on deploy
- Debug complexity ("why stale content?" → KV propagation delay)
Alternative Approaches Considered
Option A: Transparent PPR (auto-detect)
Run every renderStream page in collect mode at build time. Capture write() + placeholder() output as the shell. Everything after first resolve() is dynamic.
Pro: Zero API changes. Con: Relies on convention that writes precede resolves.
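The collect mode could be sketched as a recording stream that captures everything up to the first resolve. `CollectStream` is a hypothetical name invented for this sketch, not a merjs API:

```javascript
// Collect-mode sketch for Option A: record write() and placeholder() output as
// the static shell; the first resolve() ends shell capture. Note the fragility
// the text warns about — a write() after the first resolve() is silently lost.
class CollectStream {
  constructor() { this.shell = ""; this.inShell = true; }
  write(html) { if (this.inShell) this.shell += html; }
  placeholder(id, skeleton) {
    if (this.inShell) this.shell += `<div id="P:${id}">${skeleton}</div>`;
  }
  resolve(_id, _content) {
    this.inShell = false; // everything after the first resolve is dynamic
  }
}
```

Running a page against a `CollectStream` at build time would yield the cacheable shell; the silent-drop behavior is exactly why the convention is fragile.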
Option B: ISR-style (cache at request time)
Skip build-time extraction. First request renders full page, extracts shell, caches in KV. Subsequent requests serve cached shell + run resolves.
Pro: No build changes, works with dynamic routes. Con: First visitor gets no benefit.
Option C: HTMLRewriter composition
Store full static page (with skeleton placeholders) in KV. At request time, stream through Cloudflare's HTMLRewriter — intercept <div id="P:..."> and replace with WASM output.
```js
return new HTMLRewriter()
  .on('div[id^="P:"]', new DynamicSlotHandler(wasm))
  .transform(new Response(cachedShell));
```
Pro: True edge streaming, no WASM streaming API needed. Con: Parsing HTML you just generated (+1-2ms).
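The handler itself would be a standard HTMLRewriter element handler (an object with an `element()` method — that contract is real Workers API). The `renderSlot` callback standing in for the WASM render is a hypothetical shape for this sketch:

```javascript
// Sketch of the DynamicSlotHandler from Option C. For each matched
// <div id="P:..."> the handler asks a render function for the slot's real
// content and replaces the skeleton in the output stream.
// renderSlot is a hypothetical WASM-backed callback, not an existing API.
class DynamicSlotHandler {
  constructor(renderSlot) { this.renderSlot = renderSlot; }
  async element(el) {
    const id = el.getAttribute("id");         // e.g. "P:reviews"
    const slot = id.slice("P:".length);
    const html = await this.renderSlot(slot); // render this slot's content
    el.replace(html, { html: true });         // swap skeleton for real content
  }
}
```

Because HTMLRewriter streams, the shell bytes before each placeholder flush to the client while the slot render is still in flight — which is what makes this option interesting despite the re-parse cost.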
Option D: Full-page caching with TTL
Pages declare pub const cache_ttl = 300;. Worker caches full rendered response in KV. No PPR, just HTTP caching done right.
Pro: 20 lines of JS, zero Zig changes. Con: Doesn't help mixed static/dynamic pages.
Proposed Solution: "Incremental Edge Composition" (IEC)
None of the above are quite right for merjs. Here's a novel approach that plays to merjs's actual strengths:
The Key Insight
merjs's WASM render is already so fast (~8ms) that the bottleneck is never rendering — it's external data fetches (100-500ms). The existing two-phase worker protocol (collect_fetch_urls → parallel fetch → handle) already solves this for full renders. PPR should optimize the same bottleneck, not the render.
IEC = cache the fetched data per URL plus the full rendered page — not a pre-extracted shell.
How It Works
```
REQUEST FLOW:

1. Worker receives request for /product/42
2. Worker calls WASM collect_fetch_urls() — gets ["api.com/product/42", "api.com/reviews/42"]
3. For each URL, check KV cache:
   - KV hit  → use cached response (skip fetch)
   - KV miss → fetch from origin, store in KV with TTL
4. Provide results to WASM (existing provide_fetch_result API)
5. WASM renders full page (~8ms) — this is fast, don't optimize it
6. Cache the FULL rendered page in Cache API (L1, per-colo, short TTL)
7. Return response
```
What's Different From PPR
| | Next.js PPR | merjs IEC |
|---|---|---|
| What's cached | The static shell HTML | The data fetches + full rendered pages |
| What's dynamic | Suspense boundary content | Nothing — re-render is cheap |
| Cache granularity | Per-page shell | Per-fetch-URL data + per-page output |
| Invalidation unit | Deploy (new shell) | Per-URL TTL (data freshness) |
| Cold request | Shell instant + stream dynamic | Full render (~8ms) + cache populate |
| Warm request | Shell instant + stream dynamic | Full page from Cache API (~1ms) |
Why This Is Better For merjs
- **No shell extraction problem.** You don't need to split `renderStream` into static/dynamic parts. The entire page renders in WASM — it's fast enough.
- **Data-level caching is more reusable.** If `/product/42` and `/product/42/reviews` both fetch `api.com/product/42`, the data cache is shared. Shell caching can't do this.
- **No new page API.** Zero changes to how authors write pages. No `ppr = true`, no thinking about static-vs-dynamic boundaries.
- **Graceful degradation.** Cache miss = full render in ~8ms + fetches. That's already fast. Cache hit = ~1ms. There's no "stale shell + fresh dynamic = visual mismatch" problem.
- **Works with dynamic routes.** `/users/:id` pages get cached per-ID automatically. No build-time enumeration needed.
Page Author API (unchanged!)
```zig
// app/product.zig — no changes needed
pub const meta: mer.Meta = .{ .title = "Product" };

pub fn renderStream(req: mer.Request, stream: *mer.StreamWriter) void {
    stream.write("<h1>Product</h1>");
    stream.placeholder("details", "<div class='skeleton'>...</div>");
    stream.placeholder("reviews", "<div class='skeleton'>...</div>");

    const results = mer.fetchAll(req.allocator, &.{
        .{ .url = "https://api.example.com/product/42" },
        .{ .url = "https://api.example.com/reviews/42" },
    });

    stream.resolve("details", formatProduct(results[0]));
    stream.resolve("reviews", formatReviews(results[1]));
}
```
Worker Changes (the only changes needed)
```js
// worker.js — add fetch-level caching + page-level caching
export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);
    const cache = caches.default;

    // L1: Check full-page cache (per-colo, ~0ms)
    const pageKey = new Request(url.href, { method: "GET" });
    const cached = await cache.match(pageKey);
    if (cached) return cached;

    // L2: Render with data-level caching
    const wasm = await getInstance();
    const input = `${request.method} ${url.pathname}`;
    // ... existing collect_fetch_urls() call ...

    // Fetch with per-URL KV caching
    await Promise.all(urls.map(async (fetchUrl) => {
      // Check KV for cached data
      const dataKey = `data:${fetchUrl}`;
      const cachedData = await env.DATA_CACHE.get(dataKey);
      let body;
      if (cachedData) {
        body = cachedData; // KV hit — skip network
      } else {
        const res = await fetch(fetchUrl);
        body = await res.text();
        // Cache in KV with TTL (don't block response)
        ctx.waitUntil(
          env.DATA_CACHE.put(dataKey, body, { expirationTtl: 300 })
        );
      }
      // Provide to WASM (existing API)
      provideToWasm(wasm, fetchUrl, body);
    }));

    // Render full page in WASM (~8ms)
    const response = renderInWasm(wasm, input);

    // Cache full page in Cache API (don't block; clone so the body stream
    // can be both cached and returned)
    if (response.status === 200) {
      const headers = new Headers(response.headers);
      headers.set("Cache-Control", "s-maxage=60");
      const cacheResponse = new Response(response.clone().body, {
        status: response.status,
        headers,
      });
      ctx.waitUntil(cache.put(pageKey, cacheResponse));
    }

    return response;
  }
};
```
wrangler.toml addition
```toml
[[kv_namespaces]]
binding = "DATA_CACHE"
id = "..."
```
Cache Invalidation
- Data cache (KV): TTL-based. Product data expires in 5 min, weather in 1 min. No deploy-time purge needed.
- Page cache (Cache API): Short TTL (30-60s). Per-colo, auto-evicts. Pages automatically refresh as data cache updates.
- Deploy purge: Optional — `wrangler kv:bulk delete` to clear data cache. Page cache expires naturally via TTL.
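The TTL-only invalidation story can be illustrated with a minimal expiring cache, an in-memory stand-in for KV's `expirationTtl` semantics. The injected `now` clock and the `TtlCache` name are constructs of this sketch; real KV expires entries server-side:

```javascript
// Minimal TTL cache illustrating KV-style expirationTtl semantics: entries
// expire passively, so no deploy-time purge is needed. A clock function is
// injected to make the expiry behavior easy to demonstrate.
class TtlCache {
  constructor(now = () => Date.now()) { this.now = now; this.m = new Map(); }
  put(key, value, ttlSeconds) {
    this.m.set(key, { value, expires: this.now() + ttlSeconds * 1000 });
  }
  get(key) {
    const e = this.m.get(key);
    if (!e) return null;
    if (this.now() >= e.expires) { this.m.delete(key); return null; }
    return e.value;
  }
}
```

A product entry put with `ttlSeconds = 300` answers reads for five minutes and then reads as a miss, which is exactly the "pages automatically refresh as data cache updates" behavior described above.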
Performance Characteristics
| Scenario | Latency | What happens |
|---|---|---|
| Full cache hit | ~1ms | Page from Cache API (per-colo) |
| Page miss, data hit | ~10ms | WASM render + data from KV |
| Page miss, data miss | ~100-500ms | WASM render + live fetch + cache populate |
| First request ever | ~100-500ms | Same as above, but populates both caches |
Compare to PPR:
| Scenario | Latency | What happens |
|---|---|---|
| PPR shell hit | ~1ms + streaming | Shell from KV + WASM resolve stream |
| PPR shell miss | ~50ms + render | KV cold read + full fallback render |
IEC's warm path (~1ms) matches PPR's warm path (~1ms shell + streaming overhead). IEC's cold path is simpler and has no "stale shell + fresh data" visual mismatch risk.
Implementation Plan
If we go with IEC, the changes are minimal:
- worker.js — Add data-level KV caching around the existing fetch loop + page-level Cache API caching (~50 lines)
- `wrangler.toml` — Add `DATA_CACHE` KV namespace binding
- Optional: `pub const cache_ttl` — Let pages declare data freshness hints that the worker reads from the WASM route metadata
No changes to: mer.zig, prerender.zig, dispatch.zig, router.zig, codegen.zig, build.zig.
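If the optional `cache_ttl` hint is adopted, its path into the worker could look like the following. The shape of `routeMeta` and the 60-second default are assumptions for this sketch, not current merjs behavior:

```javascript
// Map an optional per-route cache_ttl (declared as pub const cache_ttl in the
// page and surfaced via route metadata) to the Cache-Control value used for
// the page-level Cache API entry. routeMeta's shape and the default TTL are
// hypothetical.
function pageCacheControl(routeMeta, defaultTtl = 60) {
  const ttl = routeMeta && typeof routeMeta.cache_ttl === "number"
    ? routeMeta.cache_ttl
    : defaultTtl;
  if (ttl === 0) return "no-store"; // escape hatch: never cache this page
  return `s-maxage=${ttl}`;
}
```

Treating `cache_ttl = 0` as `no-store` would also answer the cookie-reading escape-hatch question below without any new flag.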
Open Questions
- Should we support per-fetch-URL TTLs? e.g. `mer.fetch(alloc, .{ .url = "...", .cache_ttl = 60 })`
- Should the page-level Cache API TTL be configurable per-route, or global?
- Do we want a `cache_ttl = 0` escape hatch for truly dynamic pages (e.g. reading cookies)?
- Should we add cache status headers (`X-Cache: HIT/MISS`) for debugging?
- Is there a future where we want true PPR (shell + streaming) on top of IEC for pages with very slow external APIs?
TL;DR
Don't cache the shell. Cache the data. merjs's WASM render is fast enough (~8ms) that the bottleneck is external fetches, not rendering. Cache fetch results in KV (data-level) + cache full pages in Cache API (page-level). Zero Zig changes, ~50 lines of worker.js, same performance as PPR on warm requests, simpler mental model, no static/dynamic boundary problem.