[ENH] Explicit context compression in MCC build_context #21

@OppaAI

Description

Summary

MCC currently concatenates WMC turns and EMC episodes without explicit compression, priority ordering, or an overflow check. Under heavy use the assembled context could exceed Cosmos's 2048-token limit.

Current Behavior

context = [emc_system_message] + wmc_turns
# No compression, no priority ordering, no overflow check
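To make the risk concrete, here is a minimal sketch of how the current unchecked concatenation blows past the budget as turns accumulate. The 4-characters-per-token ratio and the mock messages are assumptions for illustration, not Cosmos's actual tokenizer or MCC data:

```python
# Sketch only: toy estimator, not Cosmos's tokenizer.
TOKEN_LIMIT = 2048

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token.
    return max(1, len(text) // 4)

# Mock data standing in for real WMC turns and the EMC system message.
wmc_turns = [{"role": "user", "content": "x" * 400} for _ in range(100)]
emc_system_message = {"role": "system", "content": "episode summary " * 20}

context = [emc_system_message] + wmc_turns  # current behavior
total = sum(estimate_tokens(m["content"]) for m in context)
print(total > TOKEN_LIMIT)  # → True: far past the 2048-token limit
```

With 100 turns of ~400 characters each, the estimated total is ~10,000 tokens, roughly five times the limit, and nothing in the current code path notices.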

Proposed Enhancement

Add explicit context compression in build_context():

# Priority ordering:
# 1. Most recent WMC turns (highest priority)
# 2. Highest-relevance EMC episodes (by similarity score)
# 3. Drop lowest priority if total exceeds 80% of context limit

def build_context(self, user_input: str) -> list[dict]:
    wmc_turns = self.wmc.get_event_segments()
    emc_results = self.emc.search(user_input, top_k=EMC_TOP_K)

    # Budget-aware assembly
    total_chunks = 0
    context = []

    # Add WMC turns newest first until the budget is exhausted
    for turn in reversed(wmc_turns):
        cost = _estimate_chunks(turn["content"])
        if total_chunks + cost > CONTEXT_CHUNK_LIMIT:
            break
        context.insert(0, turn)
        total_chunks += cost

    # Add EMC episodes by relevance until the budget is exhausted
    for ep in sorted(emc_results, key=lambda x: x["similarity"], reverse=True):
        cost = _estimate_chunks(ep["content"])
        if total_chunks + cost > CONTEXT_CHUNK_LIMIT:
            break
        # Inject the episode as a system message ahead of the turns
        context.insert(0, {"role": "system", "content": ep["content"]})
        total_chunks += cost

    return context

Impact

  • Assembled context stays within the chunk budget, so Cosmos's 2048-token limit is never exceeded
  • The most relevant content is always included first
  • Graceful degradation when the accumulated context grows large
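The degradation behavior can be exercised with a self-contained mock of the budget loop (toy estimator and data, all names hypothetical): under a tight limit, the newest turns survive and the oldest, most expensive turn is dropped rather than overflowing:

```python
# Self-contained sketch of the budget loop; not the actual MCC classes.
CHUNK_LIMIT = 3

def estimate_chunks(text: str) -> int:
    return max(1, len(text) // 10)  # toy estimator: ~10 chars per chunk

turns = [
    {"role": "user", "content": "oldest turn " * 3},  # 36 chars -> 3 chunks
    {"role": "user", "content": "middle turn"},       # 11 chars -> 1 chunk
    {"role": "user", "content": "newest"},            #  6 chars -> 1 chunk
]

total, context = 0, []
for turn in reversed(turns):  # newest first
    cost = estimate_chunks(turn["content"])
    if total + cost > CHUNK_LIMIT:
        break
    context.insert(0, turn)
    total += cost

# The oldest turn would push the total to 5 > 3, so it is dropped.
print([t["content"] for t in context])  # → ['middle turn', 'newest']
```

Note the loop breaks at the first turn that does not fit; a skip-and-continue variant would pack the budget tighter but could reorder which turns survive.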

Labels

enhancement (New feature or request)
