feat: persist session cache across process restarts#27
Conversation
Original prompt from Stanislau
|
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 4 minutes and 1 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request implements session cache persistence for the FpfRuntime, allowing session states to be saved to and loaded from a JSON file. Key changes include updates to the runtime configuration, the addition of a persistence path in the session cache, and logic to handle asynchronous flushing of cache data to disk. Review feedback suggests optimizing the load method to prevent race conditions and unnecessary re-reads, as well as removing JSON pretty-printing in scheduleFlush to reduce CPU and I/O overhead.
| async load(sourceHash: string): Promise<void> { | ||
| this.sourceHash = sourceHash; | ||
| if (!this.persistPath) { | ||
| return; | ||
| } | ||
| try { | ||
| const raw = await readFile(this.persistPath, 'utf8'); | ||
| const data: PersistedSessionCache = JSON.parse(raw); | ||
| if (data.sourceHash !== sourceHash) { | ||
| return; | ||
| } | ||
| this.entries.clear(); | ||
| const keys = Object.keys(data.entries); | ||
| const start = Math.max(0, keys.length - this.maxSessions); | ||
| for (let i = start; i < keys.length; i++) { | ||
| const key = keys[i]; | ||
| this.entries.set(key, data.entries[key]); | ||
| } | ||
| } catch { | ||
| // File missing or corrupt — start fresh | ||
| } | ||
| } |
There was a problem hiding this comment.
The load method is called on every query via FpfRuntime.refresh. In its current implementation, it clears the in-memory cache and re-reads from disk every time, which is inefficient and introduces a race condition. If a background flush from a previous query is still writing to disk, load might read stale data and overwrite the newer in-memory state. Additionally, if the sourceHash changes (indicating a new compilation), the existing in-memory sessions should be discarded to avoid using stale context that might refer to outdated node IDs.
Making this method idempotent and ensuring memory is cleared on hash changes fixes these issues.
async load(sourceHash: string): Promise<void> {
if (this.sourceHash === sourceHash) {
return;
}
// Clear existing sessions if the compilation hash has changed
if (this.sourceHash !== undefined) {
this.entries.clear();
}
this.sourceHash = sourceHash;
if (!this.persistPath) {
return;
}
try {
const raw = await readFile(this.persistPath, 'utf8');
const data: PersistedSessionCache = JSON.parse(raw);
if (data.sourceHash !== sourceHash) {
return;
}
this.entries.clear();
const keys = Object.keys(data.entries);
const start = Math.max(0, keys.length - this.maxSessions);
for (let i = start; i < keys.length; i++) {
const key = keys[i];
this.entries.set(key, data.entries[key]);
}
} catch {
// File missing or corrupt — start fresh
}
}There was a problem hiding this comment.
Both issues addressed in cd1eadc — load() is now idempotent (early-returns when sourceHash matches) and clears entries on hash change. The compact JSON suggestion is also applied in the same commit.
| private scheduleFlush(): void { | ||
| if (!this.persistPath || !this.sourceHash) { | ||
| return; | ||
| } | ||
| const path = this.persistPath; | ||
| const data: PersistedSessionCache = { | ||
| sourceHash: this.sourceHash, | ||
| entries: Object.fromEntries(this.entries), | ||
| }; | ||
| const json = JSON.stringify(data, null, 2); | ||
| this.flushPromise = (this.flushPromise ?? Promise.resolve()) |
There was a problem hiding this comment.
The scheduleFlush method performs synchronous serialization (Object.fromEntries and JSON.stringify) on every set operation. Using pretty-printing (null, 2) for a machine-read cache file adds unnecessary CPU and disk I/O overhead. Furthermore, since this is called on every update, it can lead to a large queue of redundant file writes if sessions are updated frequently.
| private scheduleFlush(): void { | |
| if (!this.persistPath || !this.sourceHash) { | |
| return; | |
| } | |
| const path = this.persistPath; | |
| const data: PersistedSessionCache = { | |
| sourceHash: this.sourceHash, | |
| entries: Object.fromEntries(this.entries), | |
| }; | |
| const json = JSON.stringify(data, null, 2); | |
| this.flushPromise = (this.flushPromise ?? Promise.resolve()) | |
| private scheduleFlush(): void { | |
| if (!this.persistPath || !this.sourceHash) { | |
| return; | |
| } | |
| const path = this.persistPath; | |
| const data: PersistedSessionCache = { | |
| sourceHash: this.sourceHash, | |
| entries: Object.fromEntries(this.entries), | |
| }; | |
| const json = JSON.stringify(data); | |
| this.flushPromise = (this.flushPromise ?? Promise.resolve()) |
There was a problem hiding this comment.
Applied — JSON.stringify(data) without pretty-print in cd1eadc.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f507c3923c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| async refresh(force = false): Promise<BuildAudit> { | ||
| await mkdir(this.artifactDir, { recursive: true }); | ||
| const currentSourceHash = await hashFile(this.sourcePath); | ||
| await this.sessionCache.load(currentSourceHash); |
There was a problem hiding this comment.
Avoid reloading session cache on every refresh
With persistence enabled, every query/trace call runs refresh(), and this line reloads cache state from disk before request handling. In the same commit, SessionCache.set() persists asynchronously (scheduleFlush queues writeFile and does not await it), so a follow-up request can arrive before the prior flush completes; load() then replaces in-memory state with stale file contents and drops the just-written session context. This breaks multi-turn continuity precisely in the persistent mode this change introduces.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Both issues are already addressed in the latest commit (cd1eadc):
- Idempotent load:
load()now returns immediately whensourceHashmatches the already-loaded hash, so repeated calls fromrefresh()don't re-read from disk or race with pending flushes. - Hash change clears entries: When
sourceHashchanges,this.entries.clear()is called before loading new data — stale sessions from the old compilation are discarded.
async load(sourceHash: string): Promise<void> {
if (this.sourceHash === sourceHash) {
return; // idempotent — skip re-read
}
if (this.sourceHash !== undefined) {
this.entries.clear(); // discard stale sessions on hash change
}
this.sourceHash = sourceHash;
// ... disk read only on first load or hash change
}| if (data.sourceHash !== sourceHash) { | ||
| return; |
There was a problem hiding this comment.
Clear in-memory sessions on source hash mismatch
When load() sees a persisted snapshot from a different sourceHash, it returns immediately here without clearing existing in-memory entries. In a long-running process where the spec changes, stale session context from the old compilation remains active, and because sourceHash is already updated earlier in load(), later set() calls can flush those stale entries under the new hash, carrying invalid context forward.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in cd1eadc — when the persisted sourceHash doesn't match, we now clear entries before returning (the this.sourceHash !== undefined guard at the top handles hash changes by clearing the map). This prevents stale sessions from being flushed under the new hash.
There was a problem hiding this comment.
Pull request overview
Adds opt-in, file-backed persistence for the runtime SessionCache so MCP server session context can survive process restarts, and exposes persistence status through the runtime status surfaces.
Changes:
- Reworked
SessionCacheto support optional persistence (persistPath) with edition-gated loading (sourceHash) and async write-through flushing. - Added
persistSessionCacheruntime option (withFPF_PERSIST_SESSION_CACHEenv fallback) and loads the cache duringrefresh(). - Extended runtime status types + MCP tool contracts to surface whether session caching is persistent.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/runtime/session-cache.ts | Adds persistence, load-from-disk, and async flush behavior to SessionCache. |
| src/runtime/runtime.ts | Wires persistence option/env var and calls sessionCache.load() during refresh(). |
| src/runtime/constants.ts | Adds session-cache.json to artifact filenames. |
| src/runtime/types.ts | Extends RuntimeStatus.sessionCache with a persistent flag. |
| src/mcp/tool-contracts.ts | Updates runtimeStatusSchema to include sessionCache.persistent. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
| this.sourceHash = sourceHash; | ||
|
|
||
| constructor(private readonly maxSessions = 50) {} | ||
| if (!this.persistPath) { | ||
| return; |
There was a problem hiding this comment.
Already addressed in cd1eadc — when sourceHash changes, this.entries.clear() is called before setting the new hash:
if (this.sourceHash !== undefined) {
this.entries.clear();
}
this.sourceHash = sourceHash;And if the persisted file's hash doesn't match, entries were already cleared by the guard above, so stale context is never reused.
| async refresh(force = false): Promise<BuildAudit> { | ||
| await mkdir(this.artifactDir, { recursive: true }); | ||
| const currentSourceHash = await hashFile(this.sourcePath); | ||
| await this.sessionCache.load(currentSourceHash); | ||
| const existingSnapshot = await this.loadSnapshot(); |
There was a problem hiding this comment.
Already addressed in cd1eadc — load() is now idempotent: it early-returns when this.sourceHash === sourceHash, so repeated calls from refresh() don't re-read from disk or race with pending flushes. Only the first call (or a sourceHash change) triggers a disk read.
| sessionCache: 'session-cache.json', | ||
| } as const; | ||
|
|
There was a problem hiding this comment.
Good catch. Fixed in d398c44 — moved sessionCache out of ARTIFACT_FILENAMES into a separate SESSION_CACHE_FILENAME constant. The session cache is an optional runtime file, not a built artifact, so it no longer pollutes getArtifactPresence() or status().artifacts.
| return; | ||
| } | ||
| try { | ||
| const raw = await readFile(this.persistPath, 'utf8'); | ||
| const data: PersistedSessionCache = JSON.parse(raw); | ||
| if (data.sourceHash !== sourceHash) { | ||
| return; |
There was a problem hiding this comment.
Good catch — fixed in d398c44. Entries are now persisted as Array<[string, RetrievalSessionState]> (ordered tuples) instead of Record<string, ...>, preserving LRU insertion order across JSON round-trips regardless of key format.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d398c44744
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (this.sourceHash === sourceHash) { | ||
| return; |
There was a problem hiding this comment.
Wait for in-flight cache load before treating hash as loaded
With persistence enabled, concurrent requests on a cold start can lose newly written session state: load() marks sourceHash before the async readFile completes, so a second refresh() call returns early from load() and can call set(); when the first load() resumes, it clears entries and restores the older on-disk snapshot, wiping the second request's in-memory update. This is still possible even with the idempotent hash check, because the check does not distinguish "loaded" from "currently loading".
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Good catch — this was a real race condition. Fixed in e3afaf0:
- Added
loadPromisefield that stores the in-flight disk read promise - When a concurrent caller sees
sourceHashalready matches, itawaitsloadPromisebefore returning — so it won't proceed toset()while the disk read is still in progress - Extracted
readFromDisk()which now merges disk entries withoutentries.clear()— it skips keys already present in the map, so entries added by concurrentset()calls between theawaitboundary are preserved
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e3afaf04fa
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (data.sourceHash !== sourceHash) { | ||
| return; |
There was a problem hiding this comment.
Ignore stale read results after source hash switches
load() can execute concurrently for different source hashes (for example, one request starts before FPF-spec.md changes and another starts after), but in readFromDisk() this guard only compares data.sourceHash to the method argument from the older call. If this.sourceHash has already been updated by a newer load(), the older in-flight read still merges its entries, reintroducing stale session context for the wrong compilation; a subsequent set() then flushes those stale entries under the new hash.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Good catch — fixed in 5ce7400. After readFile completes, readFromDisk() now also checks this.sourceHash !== sourceHash before merging entries. If a newer load() has already switched the hash, the stale disk read is discarded entirely.
5ce7400 to
06d55b6
Compare
- SessionCache now accepts optional persistPath for write-through persistence - Cache validity gated on sourceHash — stale sessions from different compilations are discarded - In-memory LRU remains the hot path; flush is async and best-effort - Persistence opt-in via FpfRuntimeOptions.persistSessionCache or FPF_PERSIST_SESSION_CACHE env var - Adds persistent field to sessionCache summary in RuntimeStatus - Adds session-cache.json to ARTIFACT_FILENAMES Closes #9 Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
- Skip re-read when sourceHash unchanged (prevents race with flush) - Clear entries on hash change to discard stale sessions - Use compact JSON for machine-read cache file Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
…red tuples - Session cache is an optional runtime file, not a built artifact — use separate SESSION_CACHE_FILENAME constant to avoid polluting artifact presence checks - Persist entries as Array<[string, state]> instead of Record to preserve LRU insertion order across JSON round-trips (Object.keys ordering is unreliable for integer-like session IDs) Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
- Store in-flight load promise so concurrent callers await the same disk read instead of returning early while readFile is still in progress - Use readFromDisk() helper that merges disk entries without clearing in-memory state added by concurrent set() calls (skip keys already present) - Second caller with same sourceHash awaits loadPromise before returning Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
- After readFile completes, check this.sourceHash still matches the expected hash before merging entries - Prevents a concurrent load() with a different hash from having its entries contaminated by an older in-flight disk read Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
06d55b6 to
9476fd1
Compare
Code ReviewOverviewAdds opt-in disk persistence to Strengths
Issues & RisksEvery
|
Addresses review feedback: - Debounced flush (500ms default) batches rapid set() calls - Atomic write via temp file + rename to prevent corruption - Log first disk write failure instead of swallowing silently - Await in-flight loadPromise before reassigning to prevent races - Pass persistPath as parameter to readFromDisk (no non-null assertion) - Validate entries array shape before iterating Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
|
All five issues addressed in commit 342be88:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 342be88bbb
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const data: PersistedSessionCache = { | ||
| sourceHash: this.sourceHash, | ||
| entries: Array.from(this.entries.entries()), |
There was a problem hiding this comment.
Avoid labeling stale sessions with current source hash
With persistence enabled, concurrent requests can persist old-trace context under a new compilation hash: a query that started before a spec change can call set() after another request has run load(newHash), and doFlush() will stamp all in-memory entries with the current this.sourceHash. That writes stale node IDs as if they belong to newHash, so a restart will reload and reuse invalid context even though load() is supposed to isolate sessions by hash. Consider storing/verifying the originating hash per write (or passing expected hash into set) before flushing.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Good catch — this is a real race condition. Fixed in d59429c with two guards:
load()cancels pending flush timers — when the sourceHash changes, any scheduled debounce timer is cleared beforeentries.clear(), so stale entries never get written under the new hash.doFlush()verifies hash at write time — captureshashAtFlushwhen the flush starts and checksthis.sourceHash !== hashAtFlushbefore the async write completes, skipping the write if the hash changed in between.
Together these prevent stale session context from being persisted under a newer compilation hash.
…sistence Co-Authored-By: Stanislau <nedbailov375426@gmail.com>
|
All review feedback addressed across commits. Independent verification:
Key fixes since venikman's review (comment 30):
PR description updated. Ready to merge. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d59429c627
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const [key, value] = tuples[i]; | ||
| if (!this.entries.has(key)) { | ||
| this.entries.set(key, value); |
There was a problem hiding this comment.
Validate persisted session payloads before restoring
The loader accepts any JSON entries tuple and writes its value directly into entries without checking it matches RetrievalSessionState. With persistence enabled, a parseable-but-invalid cache file (for example from manual edits, format drift, or external corruption) can inject non-array lastSelectedNodeIds/recentUnresolvedNodeIds; downstream query code assumes those fields are arrays and can throw when applying session context, causing requests to fail until the cache file is removed. Treat structurally invalid records as corrupt (skip or clear) instead of restoring them.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Valid hardening suggestion, but not a blocking concern for this PR.
The data is written by our own serialization code (PersistedSessionCache → JSON.stringify), so structural corruption requires external tampering or cross-version format drift. The sourceHash check already gates against stale/incompatible data — if the spec changes, the entire cache is discarded.
The narrow scenario (manually edited cache file with wrong field types) would cause a downstream crash, but the fix is simply deleting the cache file. For a production-grade defense, a Zod schema validation on load would be the right approach, but that's a follow-up improvement — not a merge blocker given the current risk profile.
Correction to my earlier reviewRe-checked the current PR state against my concerns — all four are already addressed in commits
Also caught a bug I didn't think of: canceling the pending flush timer on hash change ( Revised verdict: merge-ready. Apologies for the stale review — my earlier comment was based on an older snapshot of the diff. |
|
Thanks for the re-check — glad all four concerns are covered. The flush-timer cancellation on hash change ( PR remains merge-ready whenever you're ready to land it. |
What
Add optional file-backed persistence to
SessionCacheso session context survives MCP server restarts. The in-memory LRU Map remains the hot path; persistence is write-through with debounced async flush to<artifactDir>/session-cache.json.Why
Closes #9 — The MCP stdio server is long-running and is the primary integration surface. Losing session context on restart silently degrades follow-up query quality. CLI users are unaffected (persistence defaults to off).
Type
feat— new capabilityChanges
src/runtime/session-cache.ts: RewroteSessionCacheto acceptSessionCacheOptionswith optionalpersistPath. Added idempotentload(sourceHash)that restores sessions from disk only when the snapshot edition matches — stale sessions are discarded. Write-through flush is debounced (500ms), serialized via promise chain, and uses atomic temp+rename. First disk error is logged. Concurrent load() calls are guarded vialoadPromise. Entries persisted as ordered tuples to preserve LRU order.src/runtime/runtime.ts: AddedpersistSessionCacheoption toFpfRuntimeOptions. Also readsFPF_PERSIST_SESSION_CACHE=trueenv var as fallback. CallssessionCache.load(currentSourceHash)at the start ofrefresh().src/runtime/constants.ts: AddedSESSION_CACHE_FILENAMEconstant (separate fromARTIFACT_FILENAMESsince session cache is an optional runtime file, not a built artifact).src/runtime/types.ts: Addedpersistent: booleanfield toRuntimeStatus.sessionCache.src/mcp/tool-contracts.ts: UpdatedruntimeStatusSchemaZod schema to includepersistentfield.Validation
bun run lintpasses locallybun run checkpasses locallybun run testpasses locally (pre-existing timeouts unrelated)Boundary check
FPF-spec.mdonly — no additional corpora addedsrc/mcp/tool-contracts.tsAgent metadata
Requested by: @venikman