dnd-context-compressor

Context compression pipeline for running LLM-powered D&D multiplayer sessions on a Raspberry Pi 4 (8GB).

1 LLM call per round, no matter how many players.

The Problem

Running a 5-player D&D session with local LLMs (Ollama) on a Pi 4 means:

Small context windows (~1100 tokens for phi3-mini)
Slow generation (~12s per call)
5 sequential LLM calls × ~12s each = ~60 seconds per round, which is unplayable
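
The arithmetic behind these numbers can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: `round_latency_seconds` is a hypothetical helper, not part of the library, and it assumes a merged call takes about as long as a single per-player call.

```python
def round_latency_seconds(llm_calls: int, seconds_per_call: float) -> float:
    """Total wall-clock time when LLM calls run one after another."""
    return llm_calls * seconds_per_call


SECONDS_PER_CALL = 12.0  # approximate phi3-mini figure quoted above

sequential = round_latency_seconds(5, SECONDS_PER_CALL)  # one call per player
merged = round_latency_seconds(1, SECONDS_PER_CALL)      # one call per round

print(f"sequential: {sequential:.0f}s, merged: {merged:.0f}s")
# prints "sequential: 60s, merged: 12s"
```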

The Solution

This library compresses the session history and every player's action into a single prompt, so the DM makes one merged LLM call per round:

Players act → Collect all → Compress → 1 LLM call → Pre-gen A/B → Broadcast → Whispers → Wait

5 Compression Strategies (layered, cheapest first)

#	Strategy	Cost	Description
1	TRIM	Free	Sliding window — drop old context
2	SKELETON	Free	Strip narration, keep state-changing facts
3	SUMMARIZE	1 cheap LLM call	Condense old exchanges to 1-2 sentences
4	MERGE	Free	Combine N player actions into 1 paragraph
5	FORGET	Free	Nuclear reset — only keep current scene
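
To make the free strategies concrete, here is a minimal sketch of TRIM, SKELETON, MERGE, and FORGET as plain functions over a list of history lines. The function names, the `STATE_KEYWORDS` list, and the keyword-matching rule are assumptions for illustration only; the library's real rules may differ. SUMMARIZE is left out because it needs an LLM call.

```python
from typing import List

# Hypothetical markers for "state-changing" lines (an assumption, not the library's rules).
STATE_KEYWORDS = ("hp", "damage", "dies", "takes", "gains", "loses", "opens")


def trim(history: List[str], keep_last: int) -> List[str]:
    """TRIM: sliding window that keeps only the newest entries."""
    return history[-keep_last:] if keep_last > 0 else []


def skeleton(history: List[str]) -> List[str]:
    """SKELETON: keep lines that look like state changes, drop pure narration."""
    return [line for line in history if any(k in line.lower() for k in STATE_KEYWORDS)]


def merge(actions: List[str]) -> str:
    """MERGE: fold N player actions into one paragraph, one sentence per action."""
    return " ".join(a.strip().rstrip(".") + "." for a in actions)


def forget(history: List[str]) -> List[str]:
    """FORGET: keep only the most recent entry as the current scene."""
    return history[-1:]


history = [
    "The wind howls through the ruined keep.",
    "Goblin takes 7 damage.",
    "Moonlight spills across the flagstones.",
    "Kael loses 3 HP.",
]

print(skeleton(history))
print(merge(["I swing my greatsword at the goblin", "I cast shield."]))
```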

Token Budget

Component	Tokens
Context (compressed history + actions)	~800
Generation (DM response)	~300
Total	~1,100
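
A budget check along these lines could look like the sketch below. The 4-characters-per-token estimate is a common rule of thumb for English text, used here as an assumption in place of a real tokenizer; `estimate_tokens` and `fits_context` are hypothetical names.

```python
CONTEXT_BUDGET = 800      # compressed history + actions
GENERATION_BUDGET = 300   # DM response
TOTAL_BUDGET = CONTEXT_BUDGET + GENERATION_BUDGET  # 1100


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token, rounded up (assumption)."""
    return (len(text) + 3) // 4


def fits_context(prompt: str) -> bool:
    """True if the prompt is estimated to fit the context share of the budget."""
    return estimate_tokens(prompt) <= CONTEXT_BUDGET


print(TOTAL_BUDGET, fits_context("Kael swings at the goblin."))
```

Reserving the generation share up front means the prompt never crowds out the DM's reply.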

Components

`ContextCompressor` — Token budget engine

Applies compression strategies automatically to stay within budget.
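
One way to picture "automatically" is a loop that walks the strategies from cheapest to most drastic and stops as soon as the context fits. The sketch below is an illustration under assumptions, not the library's implementation: `compress_to_budget` and `count_tokens` are hypothetical, and whitespace word counts stand in for real token counts.

```python
from typing import Callable, List, Tuple

Strategy = Callable[[List[str]], List[str]]


def count_tokens(lines: List[str]) -> int:
    """Stand-in tokenizer: counts whitespace-separated words (assumption)."""
    return sum(len(line.split()) for line in lines)


def compress_to_budget(
    history: List[str],
    budget: int,
    strategies: List[Tuple[str, Strategy]],
) -> Tuple[List[str], List[str]]:
    """Apply strategies in order until the history fits; report which ones ran."""
    current = list(history)
    applied: List[str] = []
    for name, strategy in strategies:
        if count_tokens(current) <= budget:
            break  # already within budget, skip the harsher strategies
        current = strategy(current)
        applied.append(name)
    return current, applied


history = ["one two three four", "five six seven", "eight nine"]
strategies = [
    ("TRIM", lambda h: h[-2:]),    # keep the two newest lines
    ("FORGET", lambda h: h[-1:]),  # keep only the current scene
]

print(compress_to_budget(history, 5, strategies))
print(compress_to_budget(history, 2, strategies))
```

With a budget of 5 only TRIM runs; with a budget of 2 the loop falls through to FORGET.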

`TurnCollector` — Async action batching

Waits for all players (60s timeout), auto-passes stragglers, merges everything into one prompt.
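
The batching idea can be sketched with `asyncio.Event` and `asyncio.wait_for`. `MiniCollector` below is a hypothetical stand-in, not the real `TurnCollector`; the `"pass"` placeholder and the prompt format are assumptions.

```python
import asyncio
from typing import Dict, List


class MiniCollector:
    """Collects one action per player; fills in 'pass' for anyone who times out."""

    def __init__(self, players: List[str], timeout: float = 60.0) -> None:
        self.players = list(players)
        self.pending = set(players)
        self.actions: Dict[str, str] = {}
        self.timeout = timeout
        self._all_in = asyncio.Event()

    def submit(self, player: str, text: str) -> None:
        if player in self.pending:
            self.pending.discard(player)
            self.actions[player] = text
        if not self.pending:
            self._all_in.set()  # everyone has acted, release the waiter

    async def wait_for_all(self) -> Dict[str, str]:
        try:
            await asyncio.wait_for(self._all_in.wait(), timeout=self.timeout)
        except asyncio.TimeoutError:
            for player in self.pending:  # auto-pass stragglers
                self.actions[player] = "pass"
            self.pending.clear()
        return {p: self.actions[p] for p in self.players}


def merged_prompt(actions: Dict[str, str]) -> str:
    """Merge every player's action into one prompt block."""
    return "\n".join(f"{player}: {text}" for player, text in actions.items())


async def demo() -> Dict[str, str]:
    collector = MiniCollector(["Kael", "Thalia", "Gruk"], timeout=0.05)
    collector.submit("Kael", "attack the goblin")
    collector.submit("Thalia", "cast shield")
    return await collector.wait_for_all()  # Gruk never submits


result = asyncio.run(demo())
print(merged_prompt(result))
```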

`DMPipeline` — Main orchestrator

The full round loop: compress → DM → pre-gen → broadcast → whispers → wait.
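
As a rough illustration of that loop, the sketch below wires the stages together with stub callables. Everything here is hypothetical (`run_round`, its parameters, the stub LLM); it only shows the order of the stages and that the DM model is called once per round.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def run_round(
    history: List[str],
    actions: List[str],
    private_notes: Dict[str, str],
    dm_llm: Callable[[str], Awaitable[str]],
    pregen: Callable[[str], Awaitable[List[str]]],
    broadcast: Callable[[str], Awaitable[None]],
    whisper: Callable[[str, str], Awaitable[None]],
    keep_last: int = 3,
) -> dict:
    context = history[-keep_last:]          # compress (TRIM stands in for all strategies)
    prompt = "\n".join(context + actions)   # one merged prompt for every player
    narration = await dm_llm(prompt)        # DM: the only DM call this round
    options = await pregen(narration)       # pre-gen: A/B choices
    await broadcast(narration)              # broadcast: public narration
    for player, note in private_notes.items():
        await whisper(player, note)         # whispers: private messages
    history.append(narration)
    return {"narration": narration, "options": options}  # caller then waits for new actions


async def demo():
    sent, dm_calls = [], []

    async def dm_llm(prompt: str) -> str:
        dm_calls.append(prompt)
        return "The goblin staggers back."

    async def pregen(narration: str) -> List[str]:
        return ["Press the attack", "Fall back"]

    async def broadcast(text: str) -> None:
        sent.append(("all", text))

    async def whisper(player: str, text: str) -> None:
        sent.append((player, text))

    history = ["r1", "r2", "r3", "r4"]
    result = await run_round(
        history, ["Kael: I attack", "Thalia: I heal Kael"],
        {"Kael": "You spot a hidden lever."},
        dm_llm, pregen, broadcast, whisper,
    )
    return result, sent, dm_calls, history


result, sent, dm_calls, history = asyncio.run(demo())
print(result["narration"], len(dm_calls))
```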

`PreGenerator` — A/B options

Pre-generates choice options while players read narration. Hides LLM latency.
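
The latency-hiding trick amounts to starting option generation as a background task before players begin reading. A minimal sketch with `asyncio.create_task` follows; the function names and sleep durations are assumptions that simulate generation and reading time.

```python
import asyncio
from typing import List, Tuple


async def generate_options(narration: str, log: List[str]) -> Tuple[str, str]:
    """Stand-in for a slow option generator (a real one might call the LLM)."""
    log.append("pregen start")
    await asyncio.sleep(0.01)  # simulated generation latency
    log.append("pregen done")
    return ("A) Press the attack", "B) Fall back")


async def show_and_pregen(narration: str) -> Tuple[Tuple[str, str], List[str]]:
    log: List[str] = []
    task = asyncio.create_task(generate_options(narration, log))  # runs in the background
    log.append("narration shown")
    await asyncio.sleep(0.05)  # simulated time players spend reading
    log.append("reading done")
    options = await task       # already finished, so no extra wait
    return options, log


options, log = asyncio.run(show_and_pregen("The goblin staggers back."))
print(options)
print(log)
```

Because generation finishes while the players are still reading, the options are ready the moment they are needed.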

`WhisperChannel` — Private messages

Player secrets, patron voices, trap notices, DM notes — all routed privately.
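
Private routing can be pictured as a per-player inbox where public messages fan out to everyone and whispers reach one recipient. The `Message` dataclass and `route` function below are hypothetical illustrations, not the `WhisperChannel` API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Message:
    text: str
    recipient: Optional[str] = None  # None means public, otherwise a player name


def route(messages: List[Message], players: List[str]) -> Dict[str, List[str]]:
    """Deliver public messages to everyone and whispers only to their recipient."""
    inboxes: Dict[str, List[str]] = {player: [] for player in players}
    for message in messages:
        targets = players if message.recipient is None else [message.recipient]
        for player in targets:
            if player in inboxes:  # ignore whispers to unknown players
                inboxes[player].append(message.text)
    return inboxes


inboxes = route(
    [
        Message("Kael hesitates, examining the treasure hoard..."),
        Message("You pocket the gem unnoticed.", recipient="Kael"),
    ],
    ["Kael", "Thalia"],
)
print(inboxes["Thalia"])
```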

Quick Start

pip install dnd-context-compressor

Solo Play

from dnd_compressor import DMPipeline, PlayerAction, ActionType

pipeline = DMPipeline()
pipeline.initialize_sync()

# Player acts
action = PlayerAction(
    player_name="Kael",
    player_class="Fighter",
    action_type=ActionType.ATTACK,
    action_text="I swing my greatsword at the goblin",
    target="Goblin Chief",
)

result = pipeline.process_round_sync([action])
print(result.dm_narration)
print(f"A) {result.option_a}")
print(f"B) {result.option_b}")

Multiplayer (async server)

import asyncio
from dnd_compressor import DMPipeline, TurnCollector, PlayerAction, ActionType

async def main():
    pipeline = DMPipeline()
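    # send_to_all_players and send_private_message are callables you supply;
    # they are not defined by this library.
    await pipeline.initialize(
        broadcast_fn=send_to_all_players,  # your websocket broadcast
        whisper_fn=send_private_message,   # your private msg sender
    )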

    collector = pipeline.collector
    collector.start_round(["Kael", "Thalia", "Gruk"])

    # Players submit actions via websocket...
    # collector.submit_action(action)

    # Wait for all (60s timeout, auto-pass stragglers)
    actions = await collector.wait_for_all()

    # One LLM call for everything
    result = await pipeline.process_round(actions)
    # Broadcast + whispers happen automatically via callbacks

asyncio.run(main())

Secret Actions

# Rogue tries to steal something — only DM knows
secret = PlayerAction(
    player_name="Kael",
    player_class="Rogue",
    action_type=ActionType.WHISPER,
    action_text="I secretly pocket the enchanted gem",
    is_private=True,
)

# DM processes it privately, other players see:
# "Kael hesitates, examining the treasure hoard..."

Configuration

from dnd_compressor.pipeline import LLMConfig, DMPipeline

config = LLMConfig(
    base_url="http://localhost:11434",  # Ollama
    model="phi3:mini",                  # Default choice on Pi 4
    max_gen_tokens=300,
    temperature=0.8,
    num_ctx=1100,                       # 800 context + 300 gen
)

pipeline = DMPipeline(llm_config=config)
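
For orientation, these settings correspond to fields that Ollama's `/api/generate` endpoint accepts in its `options` object (`num_ctx`, `num_predict`, `temperature`). The sketch below builds such a request body without sending it. How `LLMConfig` maps its fields internally is not documented here, so `build_ollama_payload` and the `max_gen_tokens` to `num_predict` mapping are assumptions.

```python
import json


def build_ollama_payload(
    model: str,
    prompt: str,
    num_ctx: int,
    max_gen_tokens: int,
    temperature: float,
) -> dict:
    """Build a request body for POST {base_url}/api/generate (not sent here)."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON response instead of a stream
        "options": {
            "num_ctx": num_ctx,             # total context window
            "num_predict": max_gen_tokens,  # assumed mapping: cap on generated tokens
            "temperature": temperature,
        },
    }


payload = build_ollama_payload("phi3:mini", "The party enters the cave.", 1100, 300, 0.8)
print(json.dumps(payload, indent=2))
```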

Supported Models (Pi 4 8GB)

Model	Speed	Quality	Recommended
phi3:mini	~12s/call	Good	Default
mistral:7b-q4	~18s/call	Great	For narrative-heavy
llama3:8b-q4	~20s/call	Best	If you can wait
tinyllama	~5s/call	Basic	Speed priority

Architecture

┌─────────────────────────────────────────────────┐
│                  DMPipeline                      │
│                                                  │
│  ┌──────────┐  ┌───────────┐  ┌──────────────┐ │
│  │  Turn     │  │  Context  │  │    Pre-      │ │
│  │ Collector │→ │Compressor │→ │  Generator   │ │
│  └──────────┘  └───────────┘  └──────────────┘ │
│       ↑              │              │            │
│       │              ↓              ↓            │
│  Players        Ollama (1       A/B Options     │
│  (1-5)          call/round)     (async)         │
│       ↑              │              │            │
│       │              ↓              │            │
│  ┌──────────┐  ┌───────────┐       │            │
│  │ Whisper  │← │ Broadcast │←──────┘            │
│  │ Channel  │  │  (public) │                    │
│  └──────────┘  └───────────┘                    │
└─────────────────────────────────────────────────┘

Development

git clone https://github.com/TCMG-v1/dnd-context-compressor.git
cd dnd-context-compressor
pip install -e ".[dev]"
pytest

License

MIT — Created by TCMG-v1 · Co-created with Claude, Grok, Perplexity
