As the number of conversation turns increases, events accumulated in a Session continue to grow, leading to excessively long context and increased token consumption. Session Summarizer intelligently compresses historical conversations into summaries, effectively controlling session size while preserving key context. It is an essential core component in TRPC Agent for long-conversation scenarios.
Session Summarizer analyzes the conversation history and condenses older events into concise summaries, which provides:
- Session Compression: Compresses long conversation history into concise summaries
- Reduced Token Usage: Reduces token consumption and saves costs
- Preserves Important Context: Retains key information and decisions
- Improved Performance: Reduces the number of events to process
- `SessionSummarizer`: The main summarizer class, responsible for the core logic of session compression.
- `SessionSummary`: A data structure representing a session summary, containing the summary text and metadata.
- `SummarizerSessionManager`: The session summary manager, responsible for automatically triggering and managing summarization at the SessionService level.
from trpc_agent_sdk.sessions import SessionSummarizer, set_summarizer_conversation_threshold
from trpc_agent_sdk.models import OpenAIModel
# Create an LLM model
model = OpenAIModel(
    model_name="deepseek-chat",
    api_key="your-api-key",
    base_url="https://api.deepseek.com/v1"
)
# Summarize after every summarizer_count conversation turns
# If summarizer_count is set to 3, summarization is performed after every 3 turns
summarizer_count = 3
# Create the summarizer
summarizer = SessionSummarizer(
    model=model,
    # If check_summarizer_functions is not set, the default is set_summarizer_conversation_threshold(100)
    # Summarization is triggered when the check functions in check_summarizer_functions return True
    # When multiple check functions exist, AND logic is used by default (summarization occurs only when all functions return True)
    check_summarizer_functions=[
        set_summarizer_conversation_threshold(summarizer_count),  # Conversation count check function, summarizes after every summarizer_count turns
        # set_summarizer_time_interval_threshold(10),  # Time check function, summarizes every 10 seconds
        # set_summarizer_token_threshold(1000),  # Token check function, summarizes every 1000 tokens
        # set_summarizer_events_count_threshold(30),  # Event count check function, summarizes every 30 events
        # set_summarizer_important_content_threshold(),  # Important content check function, determines whether to summarize based on content importance
        # set_summarizer_check_functions_by_and(  # Combined check function with AND logic, triggers summarization when all check functions return True
        #     set_summarizer_conversation_threshold(1),
        #     set_summarizer_time_interval_threshold(10),
        #     set_summarizer_token_threshold(1000),
        #     set_summarizer_important_content_threshold(),
        # ),
        # set_summarizer_check_functions_by_or(  # Combined check function with OR logic, triggers summarization when any check function returns True
        #     set_summarizer_conversation_threshold(1),
        #     set_summarizer_time_interval_threshold(10),
        # )
    ],
    max_summary_length=600,  # Maximum length of the summary text, default is 1000, truncated with ... if exceeded
    keep_recent_count=4,  # Number of recent conversation turns to keep, default is 10
)

Use SummarizerSessionManager with SessionService to enable automatic summarization in the Runner.
Complete Example: Refer to examples/session_summarizer/run_agent.py
from trpc_agent_sdk.sessions import SummarizerSessionManager, InMemorySessionService
from trpc_agent_sdk.runners import Runner
# Create SummarizerSessionManager
summarizer_manager = SummarizerSessionManager(
    model=model,
    summarizer=summarizer,
    auto_summarize=True,  # Default is True; if set to False, automatic summarization is disabled
)
# Use with SessionService
session_service = InMemorySessionService(summarizer_manager=summarizer_manager)
# Create Runner
runner = Runner(
    app_name=app_name,
    agent=agent,
    session_service=session_service
)
# Run the Agent (summarization is triggered automatically)
for i, user_input in enumerate(conversations):
    await run_agent(runner=runner, user_id=user_id, session_id=session_id, user_input=user_input)
    # Summarization should be triggered after every summarizer_count turns
    if i % summarizer_count == 0:
        session = await session_service.get_session(
            app_name=app_name,
            user_id=user_id,
            session_id=session_id
        )
        if session:
            # Get the summary content
            summary = await session_service.summarizer_manager.get_session_summary(session)
            if summary:
                print(f" - Summary text: {summary.summary_text[:100]}...")
                print(f" - Original event count: {summary.original_event_count}")
                print(f" - Compressed event count: {summary.compressed_event_count}")
                print(f" - Compression ratio: {summary.get_compression_ratio():.1f}%")

Workflow:
- Automatic Triggering: After every N conversation turns, `SummarizerSessionManager` automatically checks whether summarization is needed
- Summary Generation: Uses the LLM to compress historical conversations into concise summaries
- Event Compression: Retains the most recent N conversation turns and replaces older conversations with summary text
- Session Update: Updates the event list in the Session
Summary Content Usage:
After each summarization, the content summary.summary_text is injected into the corresponding request prompt in subsequent conversations. This process is transparent to the user.
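The event-compression step of the workflow above can be sketched in plain Python. This is a minimal illustration, not the SDK implementation: `SimpleEvent`, `compress_events`, and the fake `summarize` callable are hypothetical names, and the assumption that one turn equals two events follows the parameter description in this document.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SimpleEvent:
    """Hypothetical stand-in for an SDK Event: only an author and text."""
    author: str
    text: str


def compress_events(
    events: List[SimpleEvent],
    keep_recent_turns: int,
    summarize: Callable[[str], str],
    events_per_turn: int = 2,  # Assumption: one user message plus one assistant reply
) -> List[SimpleEvent]:
    """Replace older events with a single summary event and keep the recent turns."""
    keep = keep_recent_turns * events_per_turn
    split = max(len(events) - keep, 0)
    older, recent = events[:split], events[split:]
    if not older:
        return list(events)  # Nothing old enough to compress
    conversation_text = "\n".join(f"{e.author}: {e.text}" for e in older)
    summary_event = SimpleEvent(author="summary", text=summarize(conversation_text))
    return [summary_event] + recent


# Build 6 turns (12 events); a fake summarizer stands in for the LLM call.
history: List[SimpleEvent] = []
for turn in range(1, 7):
    history.append(SimpleEvent("user", f"question {turn}"))
    history.append(SimpleEvent("assistant", f"answer {turn}"))

compressed = compress_events(
    history,
    keep_recent_turns=2,
    summarize=lambda text: f"{len(text.splitlines())} earlier messages",
)
print(len(history), "->", len(compressed))  # 12 -> 5
```

Here the summary event is placed ahead of the retained events so that later prompts see the summary first, followed by the verbatim recent turns.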
Complete Example: Refer to examples/session_summarizer/run_agent.py
import time
# Build events in the session
session = await create_test_session_with_events(session_service, app_name, user_id, session_id)
# Force manual summarization (force=True bypasses trigger conditions)
await session_service.summarizer_manager.create_session_summary(session, force=True)
if session:
    summary = await session_service.summarizer_manager.get_session_summary(session)
    if summary:
        print(f" - Summary text: {summary.summary_text[:100]}...")
        print(f" - Summary time: {time.ctime(summary.summary_timestamp)}")
        print(f" - Original event count: {summary.original_event_count}")
        print(f" - Compressed event count: {summary.compressed_event_count}")
        print(f" - Compression ratio: {summary.get_compression_ratio():.1f}%")

SessionSummarizer parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `LLMModel` | Required | LLM model used for generating summaries |
| `check_summarizer_functions` | `List[CheckSummarizerFunction]` | `[set_summarizer_conversation_threshold(100)]` | List of check functions that trigger summarization. When multiple check functions exist, AND logic is used by default, meaning summarization occurs only when all functions return True |
| `max_summary_length` | `int` | 1000 | Maximum length of the generated summary |
| `keep_recent_count` | `int` | 10 | Number of recent conversation turns to keep after compression (each turn typically contains 2 events: a user message and an assistant response) |
| `summarizer_prompt` | `str` | `DEFAULT_SUMMARIZER_PROMPT` | Custom summary prompt template |
SummarizerSessionManager parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `LLMModel` | Required | LLM model used for generating summaries |
| `summarizer` | `SessionSummarizer` | None | Summarizer instance; if not provided, one is created with default configuration |
| `auto_summarize` | `bool` | True | Whether to enable automatic summarization; if set to False, automatic summarization is disabled |
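To make the effect of `max_summary_length` and `keep_recent_count` concrete, here is a minimal sketch. It assumes that length is measured in characters and that the `...` marker is appended after the cut; the SDK may count or truncate differently. `truncate_summary` and `recent_event_budget` are hypothetical helpers, not SDK functions.

```python
def truncate_summary(summary_text: str, max_summary_length: int = 1000) -> str:
    """Cut the summary at max_summary_length characters and append '...' (assumed behavior)."""
    if len(summary_text) <= max_summary_length:
        return summary_text
    return summary_text[:max_summary_length] + "..."


def recent_event_budget(keep_recent_count: int, events_per_turn: int = 2) -> int:
    """Convert a turn count into an approximate number of retained events."""
    return keep_recent_count * events_per_turn


print(truncate_summary("abcdefghij", max_summary_length=4))  # abcd...
print(recent_event_budget(4))  # 8
```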
summarizer = SessionSummarizer(
    model=model,
    check_summarizer_functions=[set_summarizer_conversation_threshold(20)],  # More frequent summarization
    keep_recent_count=5,  # Keep fewer events
)

summarizer = SessionSummarizer(
    model=model,
    check_summarizer_functions=[set_summarizer_conversation_threshold(50)],  # Summarize after more conversation turns
    keep_recent_count=15,  # Keep more context
    max_summary_length=1500,  # Longer summary
)

summarizer = SessionSummarizer(
    model=model,
    check_summarizer_functions=[set_summarizer_events_count_threshold(15)],  # Quick summarization
    keep_recent_count=3,  # Minimum retention
)

Certain events can be marked to skip summarization:
from trpc_agent_sdk.types import EventActions
# Create an event that skips summarization
event = Event(
    invocation_id="inv_123",
    author="system",
    content=Content(parts=[Part.from_text("Debug information")]),
    actions=EventActions(skip_summarization=True)  # Skip summarization
)

# Get summarizer configuration info
metadata = summarizer.get_summary_metadata()
print(f"Model name: {metadata['model_name']}")
print(f"Retained event count: {metadata['keep_recent_count']}")

from trpc_agent_sdk.sessions import SessionSummary
# Get the summary object
summary = await session_service.summarizer_manager.get_session_summary(session)
# Get the compression ratio
compression_ratio = summary.get_compression_ratio()
print(f"Compression ratio: {compression_ratio:.1f}%")
# Convert to dictionary
summary_dict = summary.to_dict()

In multi-Agent scenarios, summarization covers data produced by all Agents. However, different Agents may produce different volumes of data, and the business logic may require summarizing only data produced by a specific Agent.
Complete Implementation: Refer to examples/session_summarizer/agent/filters.py
Usage:
from typing import Any

from trpc_agent_sdk.filter import BaseFilter
from trpc_agent_sdk.models import OpenAIModel
from trpc_agent_sdk.sessions import SessionSummarizer
from trpc_agent_sdk.context import get_invocation_ctx
# AgentContext, FilterResult, InvocationContext, and LlmAgent must also be imported from the SDK;
# see the complete implementation referenced above for the exact import paths.


class AgentSessionSummarizerFilter(BaseFilter):
    """Agent session summarizer filter."""

    def __init__(self, model: OpenAIModel):
        super().__init__()
        # Create the summarizer
        self.summarizer = SessionSummarizer(
            model=model,
            max_summary_length=600,
            keep_recent_count=4,  # Number of recent conversation turns to keep, default is 10
        )

    async def _after_every_stream(self, ctx: AgentContext, req: Any, rsp: FilterResult) -> None:
        """Check whether summarization is needed after each streaming response"""
        # The current agent stream returns one event per response; rsp is of type FilterResult, where rsp.rsp is of type Event
        if not rsp.rsp.partial:
            events = ctx.metadata.get("events", [])
            conversation_text = self.summarizer._extract_conversation_text(events)
            # Trigger summarization when conversation text exceeds 12KB
            if len(conversation_text) > 12 * 1024:
                await self._do_summarize(ctx)
            # Cache the executed events in the context
            if "events" not in ctx.metadata:
                ctx.metadata["events"] = []
            ctx.metadata["events"].append(rsp.rsp)

    async def _after(self, ctx: AgentContext, req: Any, rsp: FilterResult):
        """Post-processing after the entire agent execution completes"""
        await self._do_summarize(ctx)

    async def _do_summarize(self, ctx: AgentContext):
        """Perform summarization"""
        invocation_ctx: InvocationContext = get_invocation_ctx()
        # Get the events produced by this agent
        events = ctx.metadata.pop("events", [])
        # In multi-Agent concurrent execution, a coroutine lock is needed here to ensure ordering
        # Async network operations may yield the coroutine, causing ordering issues
        # Remove the events retained by this agent from the global session
        for event in events:
            if event in invocation_ctx.session.events:
                invocation_ctx.session.events.remove(event)
        session_id = invocation_ctx.session.id
        conversation_text = self.summarizer._extract_conversation_text(events)
        # Summarize the events produced by this agent
        # create_session_summary_by_events is specifically designed for Agent-level summarization
        summary_text, compressed_events = await self.summarizer.create_session_summary_by_events(
            events, session_id, ctx=invocation_ctx
        )
        # Add the compressed events back to the session
        if compressed_events:
            invocation_ctx.session.events.extend(compressed_events)


# Use in an Agent
def create_agent():
    agent = LlmAgent(
        name="analyze",
        model=model,
        description="Tool for analyzing strategies",
        tools=[log_set, metric_set],
        filters=[AgentSessionSummarizerFilter(model)],  # Configure filter
        # ...
    )
    return agent

Summarization using the Filter approach:
- Record Events: Record the events produced by this Agent
- Event Isolation: Remove these events from the global session (to avoid conflicts with SessionService-level summarization)
- Perform Summarization: Summarize the events
- Event Replacement: Append the summarized events back to the global session
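The filter code notes that a coroutine lock is needed when several Agents run concurrently. The sketch below shows one way to guard the isolate, summarize, and replace steps with `asyncio.Lock`. It is an illustration under assumptions: events are plain strings, `summarize_agent_events` and `fake_summarize` are hypothetical names, and the lock is held across the awaited call, which serializes summarizations in exchange for a predictable order.

```python
import asyncio
from typing import Awaitable, Callable, List


async def summarize_agent_events(
    session_events: List[str],
    agent_events: List[str],
    lock: asyncio.Lock,
    summarize: Callable[[List[str]], Awaitable[str]],
) -> None:
    """Isolate one agent's events, summarize them, and append the result."""
    async with lock:  # Held across the await so another agent cannot interleave
        for event in agent_events:
            if event in session_events:
                session_events.remove(event)
        summary = await summarize(agent_events)
        session_events.append(summary)


async def fake_summarize(events: List[str]) -> str:
    await asyncio.sleep(0)  # Yield control, as a real LLM call would
    return "summary of " + ", ".join(events)


async def main() -> List[str]:
    lock = asyncio.Lock()
    session_events = ["a1", "b1", "a2", "b2"]
    await asyncio.gather(
        summarize_agent_events(session_events, ["a1", "a2"], lock, fake_summarize),
        summarize_agent_events(session_events, ["b1", "b2"], lock, fake_summarize),
    )
    return session_events


result = asyncio.run(main())
print(result)  # ['summary of a1, a2', 'summary of b1, b2']
```

A finer-grained variant could release the lock during the LLM call and reacquire it only for the list mutations, at the cost of summaries landing in completion order rather than start order.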
Comparison of the Two Summarization Approaches:
| Feature | SessionService-Level Summarization | Agent-Level Summarization |
|---|---|---|
| Trigger Timing | After every N conversation turns | After each Agent execution or when text exceeds a threshold |
| Summarization Scope | All events in the entire Session | Events produced by a single Agent |
| Applicable Scenarios | Single-Agent scenarios | Multi-Agent collaboration scenarios |
| Configuration Method | SessionService initialization | Agent Filter configuration |
| Advantages | Simple to use, automatically managed | Finer-grained control, supports multiple Agents |
The summarizer is triggered when user-defined trigger conditions are met. The framework provides several built-in trigger conditions:
- `set_summarizer_conversation_threshold(conversation_count)`: Sets the conversation count threshold. Summarization is performed after the conversation count reaches `conversation_count`. Default `conversation_count` is 100
- `set_summarizer_token_threshold(token_count)`: Sets the session token threshold. Summarization is performed after the token count reaches `token_count`
- `set_summarizer_events_count_threshold(event_count)`: Sets the event count threshold. Summarization is performed after the event count reaches `event_count`. Default `event_count` is 30
- `set_summarizer_time_interval_threshold(time_interval)`: Sets the time interval threshold. Summarization is performed after the conversation interval reaches `time_interval`. Default `time_interval` is 300s (5 minutes)
- `set_summarizer_important_content_threshold(important_content_count)`: Triggers summarization when any message part's stripped text length (after removing leading and trailing whitespace) exceeds `important_content_count` characters. Default `important_content_count` is 10
- `set_summarizer_check_functions_by_and(funcs: list[CheckSummarizerFunction])`: Combined check function. Summarization is performed when all functions in `funcs` return True (AND logic)
- `set_summarizer_check_functions_by_or(funcs: list[CheckSummarizerFunction])`: Combined check function. Summarization is performed when any function in `funcs` returns True (OR logic)
Trigger Logic:
- When multiple check functions exist, AND logic is used by default, meaning summarization occurs only when all functions return True
- You can explicitly specify the logic using `set_summarizer_check_functions_by_and` or `set_summarizer_check_functions_by_or`
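The AND/OR trigger logic can be illustrated with ordinary closures. This is a sketch of the pattern only: the factory names, the `SessionStats` dictionary, and the use of `>=` for "reaches" are assumptions for illustration and do not mirror the SDK's actual check-function signature.

```python
from typing import Callable, Dict, List

# Hypothetical session snapshot: plain counters instead of the SDK's Session object.
SessionStats = Dict[str, float]
CheckFn = Callable[[SessionStats], bool]


def conversation_threshold(count: int) -> CheckFn:
    return lambda stats: stats["conversation_count"] >= count


def token_threshold(tokens: int) -> CheckFn:
    return lambda stats: stats["token_count"] >= tokens


def check_by_and(*checks: CheckFn) -> CheckFn:
    """True only when every wrapped check is True."""
    return lambda stats: all(check(stats) for check in checks)


def check_by_or(*checks: CheckFn) -> CheckFn:
    """True when at least one wrapped check is True."""
    return lambda stats: any(check(stats) for check in checks)


def should_summarize(checks: List[CheckFn], stats: SessionStats) -> bool:
    """A top-level list is combined with AND, matching the documented default."""
    return all(check(stats) for check in checks)


stats = {"conversation_count": 5, "token_count": 400}
both = [conversation_threshold(3), token_threshold(1000)]
either = [check_by_or(conversation_threshold(3), token_threshold(1000))]
print(should_summarize(both, stats))    # False
print(should_summarize(either, stats))  # True
```

Wrapping an OR group inside the top-level list keeps the default AND semantics for everything else in the list.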
Summary generation uses the default prompt template:
Please summarize the following conversation, focusing on:
1. Key decisions made
2. Important information shared
3. Actions taken or planned
4. Context that should be remembered for future interactions
Keep the summary concise but comprehensive. Focus on what would be most important to remember for continuing the conversation.
Conversation:
{conversation_text}
Summary:
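The `{conversation_text}` placeholder in the template above is where the conversation history is substituted. A minimal sketch of that substitution follows, using plain string replacement; `render_prompt` is a hypothetical helper and the SDK may fill the template through a different mechanism.

```python
def render_prompt(template: str, conversation_text: str) -> str:
    """Fill the {conversation_text} placeholder; reject templates that lack it."""
    placeholder = "{conversation_text}"
    if placeholder not in template:
        raise ValueError("template must contain the {conversation_text} placeholder")
    return template.replace(placeholder, conversation_text)


template = "Conversation:\n{conversation_text}\nSummary:"
print(render_prompt(template, "user: hi\nassistant: hello"))
```

Checking for the placeholder up front surfaces a misconfigured custom template before any LLM call is made.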
Custom Prompt Template:
To replace the default prompt template, use the following approach:
from textwrap import dedent
your_summarizer_prompt = dedent("""\
Please summarize the following conversation, focusing on:
1. Key decisions
2. Important information
3. Action plans
Conversation:
{conversation_text}
Summary:""")
# conversation_text represents the conversation content; this placeholder is required
summarizer = SessionSummarizer(
    model=model,
    summarizer_prompt=your_summarizer_prompt,
    # ...
)

Complete Example Output: Refer to examples/session_summarizer/README.md
After turn 4 → Summarization triggered (configured with set_summarizer_conversation_threshold(3))
After turn 7 → Summarization triggered
After turn 13 → Summarization triggered
Explanation:
- Configured with `set_summarizer_conversation_threshold(3)`, summarization is triggered after every 3 conversation turns
- Summarization is automatically performed when the conversation turn count reaches the threshold
| Turn | Original Event Count | Compressed Event Count | Compression Ratio |
|---|---|---|---|
| Turn 4 | 8 | 5 | 37.5% |
| Turn 7 | 8 | 5 | 37.5% |
| Turn 13 | 13 | 5 | 61.5% |
| Manual Summary | 39 | 5 | 87.2% |
Explanation:
- `keep_recent_count=4` is configured to retain the most recent 4 conversation turns (8 events: 4 turns × 2 events/turn)
- Older conversations are compressed into summary text
- The compression ratio gradually increases as the conversation progresses
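The ratios in the table are consistent with the formula (original − compressed) / original × 100. The sketch below reproduces them; the formula is inferred from the table values, and `compression_ratio` is a hypothetical helper rather than the SDK's `get_compression_ratio` implementation.

```python
def compression_ratio(original_event_count: int, compressed_event_count: int) -> float:
    """Percentage of events removed by compression (formula inferred from the table)."""
    if original_event_count <= 0:
        return 0.0
    removed = original_event_count - compressed_event_count
    return removed / original_event_count * 100


for label, original, compressed in [
    ("Turn 4", 8, 5),
    ("Turn 13", 13, 5),
    ("Manual Summary", 39, 5),
]:
    print(f"{label}: {compression_ratio(original, compressed):.1f}%")
# Turn 4: 37.5%
# Turn 13: 61.5%
# Manual Summary: 87.2%
```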
The summary text contains:
- ✅ Key Decisions: User's learning choices and plans
- ✅ Important Information: Python concepts and knowledge points
- ✅ Action Plans: Project practice and learning paths
Explanation:
- The LLM-generated summary retains the core information of the conversation
- The summary text is clearly formatted, facilitating subsequent retrieval and usage
- Adjust the conversation count in `set_summarizer_conversation_threshold` based on conversation frequency
- Adjust `keep_recent_count` based on memory constraints
- Adjust `max_summary_length` based on model capabilities
- Use `skip_summarization` to mark unimportant debug information
- Filter out system events before summarization
- Preserve user intent and key decisions
- Choose an appropriate model to balance quality and cost
- Implement summary caching to reduce redundant computation
- Monitor API call frequency and costs
- Use Agent-level summarization (Filter approach) to avoid conflicts
- Configure different summarization strategies for different Agents
- Ensure concurrency safety by adding coroutine locks when necessary
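As an illustration of the content-filtering practices above, the following sketch drops events flagged with `skip_summarization` and events authored by `system` before text is handed to a summarizer. The `Actions` and `SimpleEvent` classes are simplified, hypothetical stand-ins for the SDK's `EventActions` and `Event`, and `events_to_summarize` is not an SDK function.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Actions:
    """Hypothetical stand-in for EventActions."""
    skip_summarization: bool = False


@dataclass
class SimpleEvent:
    """Hypothetical stand-in for Event."""
    author: str
    text: str
    actions: Actions = field(default_factory=Actions)


def events_to_summarize(events: List[SimpleEvent]) -> List[SimpleEvent]:
    """Keep only events that are neither flagged to skip nor authored by 'system'."""
    return [
        event
        for event in events
        if not event.actions.skip_summarization and event.author != "system"
    ]


events = [
    SimpleEvent("user", "How do I read a file in Python?"),
    SimpleEvent("assistant", "Use open() in a with block."),
    SimpleEvent("assistant", "Debug information", Actions(skip_summarization=True)),
    SimpleEvent("system", "Heartbeat"),
]
kept = events_to_summarize(events)
print([event.text for event in kept])
```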
A: The summarizer is specifically designed to preserve key information, including decisions, important data, and context. It is recommended to retain sufficient recent events via the keep_recent_count parameter.
A: Adjust the set_summarizer_conversation_threshold parameter to control summarization frequency, and use skip_summarization to mark events that should not be summarized.
A: The summarizer includes error handling mechanisms. On failure, it returns the original session without affecting the normal conversation flow.
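The fallback behavior described in this answer can be pictured with the sketch below, which returns the original events unchanged when the summarization call raises. It is an assumption-laden illustration: `summarize_or_keep`, `failing_summarize`, and the string-based events are hypothetical, and the SDK's own error handling may differ in detail.

```python
import asyncio
from typing import Awaitable, Callable, List


async def summarize_or_keep(
    events: List[str],
    summarize: Callable[[List[str]], Awaitable[str]],
) -> List[str]:
    """Return a one-element summary list, or the original events if summarization fails."""
    try:
        summary = await summarize(events)
    except Exception as exc:  # Broad catch is deliberate in this sketch
        print(f"Summarization failed, keeping original events: {exc}")
        return list(events)
    return [summary]


async def failing_summarize(events: List[str]) -> str:
    raise RuntimeError("model unavailable")


async def working_summarize(events: List[str]) -> str:
    return f"summary of {len(events)} events"


kept_on_failure = asyncio.run(summarize_or_keep(["e1", "e2"], failing_summarize))
summarized = asyncio.run(summarize_or_keep(["e1", "e2"], working_summarize))
print(kept_on_failure, summarized)
```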
A: Summary quality can be evaluated using metrics such as compression ratio, information coverage, and user feedback.
A: Perform the following checks:
- Verify that the API key is correct
- Confirm that the network connection is functioning properly
- Verify that the model name is correct
A: Solutions:
- Adjust the `max_summary_length` parameter
- Use a higher-quality model (e.g., GPT-4)
- Ensure the conversation content contains sufficient information
- Customize the `summarizer_prompt` template
A: Solutions:
- Adjust the `keep_recent_count` parameter
- Lower the conversation summarization threshold set by `set_summarizer_conversation_threshold` for more frequent summarization
- Check whether too many events are marked to skip summarization
A: Solution: Refer to 4. Agent-Level Summarization (Filter Approach) in the Advanced Features section. Use AgentSessionSummarizerFilter for summarization within an Agent Filter.
Session Summarizer uses Agno's summarizer.py as a starting point, with the following key differences:
- Data Structure: TRPC Agent uses a more complex Event structure
- Model Invocation: Uses LlmRequest and generate_async
- Integration: Deep integration with Session Service
- Configuration Options: Provides more customization options
- Multi-Agent Support: Supports Agent-level summarization (Filter approach)
See the complete summarization usage examples:
- 📁 Example Code: `examples/session_summarizer/run_agent.py`
- 📁 Example Documentation: `examples/session_summarizer/README.md`
- 📁 Agent Filter Implementation: `examples/session_summarizer/agent/filters.py`
The examples demonstrate two summarization approaches:
- SessionService-Level Summarization: Uses `SummarizerSessionManager` for automatic summarization at the session service level
- Agent-Level Summarization: Uses `AgentSessionSummarizerFilter` for summarization within an Agent Filter
Both approaches can be combined. Choose the most suitable approach based on your actual requirements.