Description
Feature Description
Automatically compact or summarize older parts of long conversations to stay within context window limits while preserving important information.
Problem/Motivation
AWS Strands agents have context window limits (e.g., 200K tokens for Claude Sonnet 4.5). During long sessions:
- Conversations can exceed the context limit
- Users may lose access to older parts of the conversation
- No warning when approaching limits
- No automatic management of conversation length
Proposed Solution
Core Features
- Context monitoring - Track current conversation token usage
- Limit warnings - Alert users when approaching context limits (80%, 90%, 95%)
- Manual compaction - `compact` command to summarize older messages
- Auto-compaction - Optional automatic compaction when reaching a threshold
- Smart summarization - Preserve key information while reducing token count
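The monitoring and warning pieces above could be sketched as follows. The `ContextMonitor` class, its method names, and the one-warning-per-threshold behavior are illustrative assumptions, not part of any existing Strands API:

```python
class ContextMonitor:
    """Track conversation token usage and emit threshold warnings.

    Sketch only: max_tokens and thresholds mirror the proposed config keys.
    """

    def __init__(self, max_tokens=200_000, thresholds=(0.8, 0.9, 0.95)):
        self.max_tokens = max_tokens
        self.thresholds = sorted(thresholds)
        self.used = 0
        self._warned = set()  # thresholds already reported, to avoid repeats

    def add(self, token_count):
        """Record tokens for a new message; return a warning string or None."""
        self.used += token_count
        ratio = self.used / self.max_tokens
        for t in self.thresholds:
            if ratio >= t and t not in self._warned:
                self._warned.add(t)
                return (f"⚠️ Context usage: {self.used // 1000}K / "
                        f"{self.max_tokens // 1000}K tokens ({ratio:.0%})")
        return None
```

Each threshold fires once, so a long session produces at most three warnings rather than one per message.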
Implementation Options
Option A: Rolling Window
- Keep last N messages in full detail
- Summarize or remove older messages
- Simple, predictable behavior
Option B: Intelligent Summarization
- Use the agent itself to summarize old conversation chunks
- Preserve important context (code, decisions, key facts)
- More sophisticated but higher quality
Option C: Hybrid
- Keep recent messages (last 20-30) in full
- Summarize middle sections in chunks
- Drop very old messages beyond threshold
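Option C could look roughly like this. `hybrid_compact`, its parameter names, and the `summarize` callback are hypothetical; in practice the summarizer would be an agent call:

```python
def hybrid_compact(messages, keep_recent=30, drop_before=200, summarize=None):
    """Keep the newest messages in full, summarize the middle, drop the oldest.

    messages: list of {"role": ..., "content": ...} dicts, oldest first.
    summarize: optional callable taking a list of messages, returning a string.
    """
    if len(messages) <= keep_recent:
        return messages  # nothing to compact
    recent = messages[-keep_recent:]
    # Middle section: everything newer than the drop threshold but older
    # than the recent window; anything older than drop_before is discarded.
    middle = messages[max(0, len(messages) - drop_before):-keep_recent]
    out = []
    if summarize and middle:
        out.append({
            "role": "system",
            "content": f"[Summary of {len(middle)} earlier messages] "
                       f"{summarize(middle)}",
        })
    out.extend(recent)
    return out
```

With 100 messages and the defaults above, this yields one summary block plus the 30 most recent messages, matching the `30 full + 1 summary block` shape shown in the User Experience section.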
User Experience
# During conversation - warning appears
⚠️ Context usage: 180K / 200K tokens (90%)
Consider using 'compact' command to summarize older messages.
# Manual compaction
You: compact
Compacting conversation (keeping last 30 messages, summarizing 50 older messages)...
✓ Reduced from 180K to 95K tokens (47% reduction)
✓ Preserved 30 recent messages + summary of earlier conversation
# View context status
You: context
Current usage: 95K / 200K tokens (47%)
Messages: 80 total (30 full + 1 summary block)
Oldest message: 2 hours ago
Configuration
# In ~/.chatrc
context:
max_tokens: 200000 # Model's context limit
warning_thresholds: [0.8, 0.9, 0.95] # Show warnings at 80%, 90%, 95%
auto_compact: false # Enable automatic compaction
auto_compact_threshold: 0.85 # Compact at 85% if auto enabled
preserve_recent: 30 # Always keep last N messages in full
compaction_method: hybrid # rolling | summarize | hybrid
Benefits
- ✅ Never hit context limits unexpectedly
- ✅ Maintain usable conversation history
- ✅ Clear visibility into context usage
- ✅ User control over compaction strategy
- ✅ Preserve important information
Related Commands
- `context` - Show current context usage and statistics
- `compact` - Manually trigger compaction
- `compact --preview` - Preview what would be compacted
- `compact --method=<rolling|summarize>` - Choose compaction strategy
Technical Considerations
Token Counting:
- Need accurate token counting for current conversation
- Use tiktoken for GPT models; for Claude, tiktoken counts are only approximate, so prefer the Anthropic tokenizer where available
- Track running total as conversation progresses
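A counting sketch along these lines would cover the points above. `count_tokens`, `conversation_tokens`, and the per-message overhead constant are assumptions; the chars/4 fallback is a common rough heuristic, not an exact count:

```python
def count_tokens(text: str) -> int:
    """Count tokens, falling back to a chars/4 heuristic if tiktoken is absent."""
    try:
        import tiktoken  # third-party, optional dependency
        enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(text))
    except Exception:
        # Rough heuristic: ~4 characters per token for English text
        return max(1, len(text) // 4)


def conversation_tokens(messages) -> int:
    """Running total for a conversation; +4 per message is an assumed
    allowance for role/formatting overhead."""
    return sum(count_tokens(m["content"]) + 4 for m in messages)
```

In practice the running total would be updated incrementally as each message is appended, rather than recounted from scratch.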
Summarization Quality:
- Test different summarization prompts
- Ensure key information preserved (code, decisions, facts)
- Include metadata (timestamp, message count) in summaries
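One possible shape for the summarization prompt, folding in the metadata requirement. The wording and field choices here are starting points to be tuned, not a fixed spec:

```python
from datetime import datetime, timezone


def build_summary_prompt(messages):
    """Build a summarization prompt that asks the agent to preserve
    code, decisions, and key facts, and embeds basic metadata."""
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return (
        "Summarize the conversation excerpt below for use as compacted context.\n"
        "Preserve exactly: code snippets, decisions made, and key facts.\n"
        f"(Metadata: {len(messages)} messages, summarized at {stamp})\n\n"
        f"{transcript}"
    )
```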
Backward Compatibility:
- Make all features opt-in initially
- Graceful degradation if token counting unavailable
- Don't break existing sessions
Edge Cases:
- Very first message after compaction (context reset)
- Session resume after compaction
- Saving/loading compacted sessions
Priority
- Critical
- High - Prevents frustrating context limit errors
- Medium
- Low
Dependencies
- Token counting library (tiktoken, anthropic tokenizer)
- Optional: summarization prompt engineering
- Configuration system (already exists)
Testing Plan
- Test with conversations of various lengths
- Verify token counting accuracy
- Test summarization quality
- Ensure session save/resume works with compaction
- Performance test with very long conversations
Future Enhancements
- Semantic chunking (group related messages)
- Important message pinning (never compact)
- Compaction history tracking
- Export compacted conversations with summaries