[FEATURE] Conversation compaction to manage context limits #48

@jwesleye

Description

Feature Description

Automatically compact or summarize older parts of long conversations to stay within context window limits while preserving important information.

Problem/Motivation

AWS Strands agents have context window limits (e.g., 200K tokens for Claude Sonnet 4.5). During long sessions:

  • Conversations can exceed the context limit
  • Users may lose access to older parts of the conversation
  • No warning when approaching limits
  • No automatic management of conversation length

Proposed Solution

Core Features

  1. Context monitoring - Track current conversation token usage
  2. Limit warnings - Alert users when approaching context limits (80%, 90%, 95%)
  3. Manual compaction - a compact command to summarize older messages on demand
  4. Auto-compaction - Optional automatic compaction when reaching threshold
  5. Smart summarization - Preserve key information while reducing token count
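Features 1 and 2 could be combined into a single monitor object. The sketch below is a minimal illustration of that idea; the class name, method names, and message format are all hypothetical, not an existing API.

```python
# Hypothetical sketch: track conversation token usage and fire each
# configured warning threshold once as usage crosses it.

class ContextMonitor:
    def __init__(self, max_tokens=200_000, thresholds=(0.8, 0.9, 0.95)):
        self.max_tokens = max_tokens
        self.thresholds = sorted(thresholds)
        self.used = 0
        self._warned = set()  # thresholds already reported, to avoid repeats

    def add(self, tokens):
        """Record tokens for a new message; return any newly crossed warnings."""
        self.used += tokens
        ratio = self.used / self.max_tokens
        warnings = []
        for t in self.thresholds:
            if ratio >= t and t not in self._warned:
                self._warned.add(t)
                warnings.append(
                    f"Context usage: {self.used // 1000}K / "
                    f"{self.max_tokens // 1000}K tokens ({ratio:.0%})"
                )
        return warnings
```

Firing each threshold only once avoids spamming the user with the same warning on every message.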

Implementation Options

Option A: Rolling Window

  • Keep last N messages in full detail
  • Summarize or remove older messages
  • Simple, predictable behavior
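Option A can be expressed in a few lines. This is a minimal sketch assuming messages are oldest-first role/content dicts; the function name and placeholder format are illustrative only.

```python
# Hypothetical sketch of the rolling-window strategy: keep the last N
# messages and replace everything older with a single placeholder stub.

def rolling_window(messages, keep_last=30):
    """Trim `messages` (oldest first) to the last `keep_last`, prefixing
    a stub that records how many older messages were dropped."""
    if len(messages) <= keep_last:
        return list(messages)
    dropped = len(messages) - keep_last
    stub = {"role": "system",
            "content": f"[{dropped} older messages removed by compaction]"}
    return [stub] + list(messages[-keep_last:])
```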

Option B: Intelligent Summarization

  • Use the agent itself to summarize old conversation chunks
  • Preserve important context (code, decisions, key facts)
  • More sophisticated but higher quality

Option C: Hybrid

  • Keep recent messages (last 20-30) in full
  • Summarize middle sections in chunks
  • Drop very old messages beyond threshold
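The hybrid option combines all three behaviors. The sketch below shows one possible shape, with the agent-driven summarization abstracted behind a caller-supplied callback; every name here is an assumption, not the actual design.

```python
# Hypothetical sketch of Option C (hybrid): keep the newest messages in
# full, summarize the middle chunk, drop anything beyond a hard limit.

def hybrid_compact(messages, summarize, keep_recent=30, drop_beyond=200):
    """Compact `messages` (oldest first) into [summary] + recent tail.

    `summarize` is a callback (e.g. a call to the agent itself) that maps
    a list of messages to a summary string.
    """
    recent = messages[-keep_recent:]
    # Messages between the drop threshold and the recent window get summarized;
    # anything older than `drop_beyond` messages is discarded outright.
    middle = messages[-drop_beyond:-keep_recent]
    if not middle:
        return list(recent)
    summary = {"role": "system", "content": summarize(middle)}
    return [summary] + list(recent)
```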

User Experience

# During conversation - warning appears
⚠️  Context usage: 180K / 200K tokens (90%)
Consider using 'compact' command to summarize older messages.

# Manual compaction
You: compact
Compacting conversation (keeping last 30 messages, summarizing 50 older messages)...
✓ Reduced from 180K to 95K tokens (47% reduction)
✓ Preserved 30 recent messages + summary of earlier conversation

# View context status
You: context
Current usage: 95K / 200K tokens (47%)
Messages: 80 total (30 full + 1 summary block)
Oldest message: 2 hours ago

Configuration

# In ~/.chatrc
context:
  max_tokens: 200000          # Model's context limit
  warning_thresholds: [0.8, 0.9, 0.95]  # Show warnings at 80%, 90%, 95%
  auto_compact: false         # Enable automatic compaction
  auto_compact_threshold: 0.85  # Compact at 85% if auto enabled
  preserve_recent: 30         # Always keep last N messages in full
  compaction_method: hybrid   # rolling | summarize | hybrid
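Once the `context:` section of ~/.chatrc is parsed, the settings above might be modeled like this. The dataclass, field defaults, and validation are assumptions mirroring the keys shown; the YAML parsing itself is omitted.

```python
# Hypothetical model for the proposed context settings, with defaults
# matching the sample config above and basic validation.

from dataclasses import dataclass, field

@dataclass
class ContextConfig:
    max_tokens: int = 200_000
    warning_thresholds: list = field(default_factory=lambda: [0.8, 0.9, 0.95])
    auto_compact: bool = False
    auto_compact_threshold: float = 0.85
    preserve_recent: int = 30
    compaction_method: str = "hybrid"  # rolling | summarize | hybrid

    def __post_init__(self):
        if self.compaction_method not in ("rolling", "summarize", "hybrid"):
            raise ValueError(f"unknown compaction_method: {self.compaction_method}")

def load_context_config(raw: dict) -> ContextConfig:
    """Build a config from the parsed `context:` mapping, keeping defaults
    for any key the user did not set and ignoring unknown keys."""
    known = {k: v for k, v in raw.items()
             if k in ContextConfig.__dataclass_fields__}
    return ContextConfig(**known)
```

Ignoring unknown keys (rather than erroring) keeps older configs working as new options are added.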

Benefits

  • ✅ Never hit context limits unexpectedly
  • ✅ Maintain usable conversation history
  • ✅ Clear visibility into context usage
  • ✅ User control over compaction strategy
  • ✅ Preserve important information

Related Commands

  • context - Show current context usage and statistics
  • compact - Manually trigger compaction
  • compact --preview - Preview what would be compacted
  • compact --method=<rolling|summarize> - Choose compaction strategy
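The flags above could be parsed inside the chat REPL with a small argparse wrapper. This is a sketch under the assumption that commands arrive as a single input line; the function name is hypothetical.

```python
# Hypothetical sketch: parse the in-chat `compact` command and its flags
# using argparse on a tokenized input line.

import argparse
import shlex

def parse_compact(line: str) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="compact", add_help=False)
    parser.add_argument("--preview", action="store_true",
                        help="show what would be compacted without doing it")
    parser.add_argument("--method", choices=["rolling", "summarize"],
                        default=None, help="override the configured strategy")
    # Drop the leading 'compact' word; parse only the flags.
    return parser.parse_args(shlex.split(line)[1:])
```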

Technical Considerations

Token Counting:

  • Need accurate token counting for current conversation
  • Use tiktoken for GPT models; for Claude, tiktoken counts are only approximate, so prefer Anthropic's token counting where available
  • Track running total as conversation progresses
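A counter with graceful degradation (per the backward-compatibility note below) might look like this. Note that tiktoken is OpenAI's tokenizer, so its counts are only an approximation for Claude; the chars/4 heuristic is a common rough fallback, and this whole function is a sketch, not the project's implementation.

```python
# Hypothetical sketch: count tokens with tiktoken when installed,
# falling back to a rough characters-per-token heuristic otherwise.

def count_tokens(text: str) -> int:
    try:
        import tiktoken
        enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(text))
    except ImportError:
        # Heuristic fallback: English text averages ~4 characters per token.
        return max(1, len(text) // 4) if text else 0
```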

Summarization Quality:

  • Test different summarization prompts
  • Ensure key information preserved (code, decisions, facts)
  • Include metadata (timestamp, message count) in summaries
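The metadata point could be handled by wrapping the agent-produced summary text in a standard header. The message shape and function name below are guesses at a typical design, not the actual one.

```python
# Hypothetical sketch: wrap a summary with the metadata the issue asks
# for (message count, timestamp range) before inserting it into history.

from datetime import datetime

def build_summary_message(summary_text: str, num_messages: int,
                          oldest_ts: datetime, newest_ts: datetime) -> dict:
    header = (f"[Summary of {num_messages} messages, "
              f"{oldest_ts.isoformat()} to {newest_ts.isoformat()}]")
    return {"role": "system", "content": f"{header}\n{summary_text}"}
```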

Backward Compatibility:

  • Make all features opt-in initially
  • Graceful degradation if token counting unavailable
  • Don't break existing sessions

Edge Cases:

  • Very first message after compaction (context reset)
  • Session resume after compaction
  • Saving/loading compacted sessions

Priority

  • High - Prevents frustrating context limit errors

Dependencies

  • Token counting library (tiktoken or the Anthropic tokenizer)
  • Optional: summarization prompt engineering
  • Configuration system (already exists)

Testing Plan

  1. Test with conversations of various lengths
  2. Verify token counting accuracy
  3. Test summarization quality
  4. Ensure session save/resume works with compaction
  5. Performance test with very long conversations
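Item 1 of the plan lends itself to a parametrized bounds check. The sketch below uses an inline stand-in compactor (the real implementation would be imported instead); names and the keep-30 budget are assumptions taken from the examples above.

```python
# Hypothetical pytest-style check: compaction output never exceeds the
# recent-message budget plus one summary block, at any conversation length.

def compact(messages, keep_recent=30):
    # Stand-in for the real compactor: summary stub + recent tail.
    if len(messages) <= keep_recent:
        return list(messages)
    return [{"role": "system", "content": "summary"}] + messages[-keep_recent:]

def test_compaction_bounds():
    for n in (0, 1, 29, 30, 31, 100, 10_000):
        msgs = [{"role": "user", "content": str(i)} for i in range(n)]
        out = compact(msgs)
        assert len(out) <= 31              # keep_recent + 1 summary block
        if n:                              # the newest message always survives
            assert out[-1]["content"] == str(n - 1)
```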

Future Enhancements

  • Semantic chunking (group related messages)
  • Important message pinning (never compact)
  • Compaction history tracking
  • Export compacted conversations with summaries

Metadata

  • Labels: enhancement (New feature or request)
  • Assignees: none
  • Milestone: none