Feature suggestion: Smart context management for long-running implementations #158

@johnkattenhorn

Problem

When running with SESSION_CONTINUITY=true (the default) on projects with multiple epics and many stories, code quality degrades over time as the context window fills with stale reasoning from earlier iterations. The AI starts making contradictory decisions and loses track of architectural patterns established in earlier stories.

The alternative — SESSION_CONTINUITY=false — avoids this but introduces a different problem: every loop starts completely fresh, spending significant time re-reading specs, re-discovering project state, and re-orienting. This leads to frequent timeouts and wasted API calls on orientation rather than implementation.

Use case

Multi-epic projects where individual epics vary from 3 to 15+ stories. I was working around this manually by running /clear between epics so that related stories shared context, but this doesn't scale: epic size is unpredictable, and some epics are large enough that context still degrades within a single epic.

Proposed approach

We've been experimenting with automatic session resets at natural boundaries combined with rich checkpoint files to bridge fresh sessions:

  1. Epic-boundary reset (SESSION_RESET_ON_EPIC=true) — when the current story belongs to a different epic than the previous loop, Ralph writes a checkpoint and starts a fresh session. Stories within the same epic stay in one session since they share domain context.

  2. Interval safety valve (SESSION_RESET_INTERVAL=8) — forces a fresh session every N loops within the same epic, preventing degradation in large epics with many stories.

  3. Rich checkpoints — on every reset, Ralph writes @loop_checkpoint.md with a structured summary (~2000 chars): progress count, current epic, next task, recently modified files, recent commits, test/quality gate status, and last work summary. The fresh session receives this via --append-system-prompt instead of the normal 500-char context, avoiding the slow re-orientation problem.
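
The three rules above can be sketched roughly as follows. This is a minimal illustration, not the code from the fork: the function names `should_reset_session` and `write_checkpoint`, their argument order, and treating an interval of 0 as "disabled" are all assumptions, and the checkpoint covers only a subset of the fields listed above.

```shell
# Illustrative sketch only; names and argument order are hypothetical.
SESSION_RESET_ON_EPIC="${SESSION_RESET_ON_EPIC:-true}"
SESSION_RESET_INTERVAL="${SESSION_RESET_INTERVAL:-8}"

# should_reset_session <current_epic> <previous_epic> <loops_since_reset>
# Prints the reset reason and returns 0 when a fresh session is due.
should_reset_session() {
  local current_epic="$1" previous_epic="$2" loops_since_reset="$3"

  # 1. Epic-boundary reset: only when a previous epic is known and differs.
  if [ "$SESSION_RESET_ON_EPIC" = "true" ] && [ -n "$previous_epic" ] \
     && [ "$current_epic" != "$previous_epic" ]; then
    echo "epic_boundary"
    return 0
  fi

  # 2. Interval safety valve. Treating 0 as "disabled" is an assumption.
  if [ "$SESSION_RESET_INTERVAL" -gt 0 ] \
     && [ "$loops_since_reset" -ge "$SESSION_RESET_INTERVAL" ]; then
    echo "interval"
    return 0
  fi

  return 1
}

# write_checkpoint <out_file> <epic> <next_task> <summary>
# 3. Writes a structured summary, truncated to roughly 2000 characters.
write_checkpoint() {
  local out="$1" epic="$2" next_task="$3" summary="$4"
  local max_chars=2000
  {
    echo "# Loop checkpoint"
    echo "Current epic: $epic"
    echo "Next task: $next_task"
    echo "Recent commits:"
    # Fall back gracefully when git or its history is unavailable.
    git log --oneline -5 2>/dev/null || echo "(no git history available)"
    echo "Last work summary: $summary"
  } | head -c "$max_chars" > "$out"
}
```

A loop might call `should_reset_session "$epic" "$prev_epic" "$loops"` once per iteration and, when it returns 0, call `write_checkpoint "@loop_checkpoint.md" ...` before starting the fresh session.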

The detection works by extracting the epic number from the first unchecked story ID in @fix_plan.md (e.g., "Story 2.1" → epic "2") and comparing it to the previous loop's epic stored in .ralph/.current_epic.
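
As a rough sketch of that detection step, assuming stories appear in @fix_plan.md as Markdown checkboxes such as `- [ ] Story 2.1: ...` (the exact line format is an assumption, and `extract_current_epic` and `epic_changed` are hypothetical names rather than the functions in the fork):

```shell
# extract_current_epic <fix_plan_path>
# Prints the epic number of the first unchecked story, e.g. "Story 2.1" -> "2".
extract_current_epic() {
  local plan_file="$1"
  # Unchecked checkbox lines that mention a story ID; keep the first one.
  grep -E '^[[:space:]]*[-*] \[ \].*Story [0-9]+\.[0-9]+' "$plan_file" \
    | head -n 1 \
    | grep -oE 'Story [0-9]+\.[0-9]+' \
    | head -n 1 \
    | cut -d' ' -f2 \
    | cut -d. -f1
}

# epic_changed <current_epic> [state_file]
# Returns 0 when the stored epic exists and differs from the current one.
# Always records the current epic for the next loop.
epic_changed() {
  local current="$1" state_file="${2:-.ralph/.current_epic}" previous=""
  if [ -f "$state_file" ]; then
    previous="$(cat "$state_file")"
  fi
  mkdir -p "$(dirname "$state_file")"
  printf '%s\n' "$current" > "$state_file"
  [ -n "$previous" ] && [ "$previous" != "$current" ]
}
```

In this sketch the very first loop never counts as an epic change, because no previous epic has been recorded yet.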

Implementation

We've prototyped this in our fork: https://github.com/johnkattenhorn/bmalph

Changes are contained in:

  • ralph/ralph_loop.sh — 6 new functions (~180 lines) + wiring into execute_claude_code()
  • ralph/templates/ralphrc.template — 2 new config options
  • tests/bash/ralph_loop.bats — 18 new BATS tests (all passing)

The implementation is fully backwards-compatible — existing .ralphrc files work unchanged, and the defaults are designed to improve the current behaviour without requiring configuration.
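
One way such backwards compatibility could work is to apply defaults only after the user's config has been sourced, so keys missing from an older .ralphrc fall back silently. The sketch below assumes .ralphrc is a sourceable shell file and that the default values match the examples given earlier in this issue; `load_session_config` is a hypothetical name.

```shell
# load_session_config <rc_file>
# Sources the config if present, then fills in any keys it did not set.
load_session_config() {
  local rc_file="$1"
  # Older config files simply omit the new keys; that is fine.
  if [ -f "$rc_file" ]; then
    . "$rc_file"
  fi
  SESSION_CONTINUITY="${SESSION_CONTINUITY:-true}"
  # Assumed defaults, taken from the example values in this proposal.
  SESSION_RESET_ON_EPIC="${SESSION_RESET_ON_EPIC:-true}"
  SESSION_RESET_INTERVAL="${SESSION_RESET_INTERVAL:-8}"
}
```

Pass a path containing a slash (for example `load_session_config ./.ralphrc`) so the `.` builtin reads the file from the current directory rather than searching `PATH`.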

Questions

  • Does this direction fit the project's roadmap?
  • Are there other approaches you've considered for context management?
  • Happy to open a PR if this looks useful — or adjust the approach based on your feedback.
