# Pactum: The AI Contract Runtime

First-class, versioned, testable interfaces and deterministic runtimes for AI components.

Quickstart · Features · Examples · Documentation
AI bugs are non-reproducible. LLM-powered features are a nightmare to test, debug, and audit:

- **Non-reproducible bugs** – randomness, model updates, and temperature all conspire against you
- **Can't unit-test AI logic** – no clear input/output boundaries, no determinism
- **Silent failures** – prompt template or memory schema changes break things without warning
- **No compliance trail** – who called which LLM, with what data, when?
- **Slow feedback loops** – no way to iterate quickly on AI behavior changes
Pactum introduces **AI Contracts** – declarative, versioned, enforceable interfaces for AI components:
```python
from pactum import contract, PactRuntime, MemorySchema

@contract(
    name="customer_support_reply:v1",
    inputs={"query": str, "customer_id": str},
    outputs={"reply": str, "intent": str},
    memory=MemorySchema(keys={"customer_profile": {"type": "json", "version": 1}}),
    allowed_tools=["kb_retriever", "crm_get"],
    nondet_budget={"tokens": 8}
)
def support_reply(ctx, inputs):
    snippets = ctx.tools.kb_retriever(inputs["query"], top_k=3)
    prompt = f"Query: {inputs['query']}\nContext: {snippets}\nAnswer concisely."
    result = ctx.llm.complete(prompt, temperature=0.7, max_tokens=256)
    return {"reply": result.text, "intent": result.classification}

# Run with full tracing and snapshotting
runtime = PactRuntime(seed=42)
result = runtime.run(support_reply, {"query": "Where's my order?", "customer_id": "C-12345"})
print(f"Snapshot: {result.snapshot_id}")  # Replay this anytime!
```

## Features

| Feature | Description |
|---|---|
| AI Contracts | Declarative input/output schemas, memory schemas, tool access controls |
| Deterministic Runtime | Seeded PRNG, token-level tracing, reproducible execution |
| Snapshot Store | Content-addressed (SHA-256), Git-friendly execution snapshots |
| Replay Engine | Deterministic replay from any snapshot – reproduce any bug |
| Test Harness | Mock generation, regression testing from snapshots |
| Fuzzing | Auto-generate random inputs to find contract violations |
| Plugin System | Hooks for LLMs, tools, memory, validators (`before_run`, `after_run`, etc.) |
| CLI | `init`, `run`, `replay`, `test`, `mock`, `fuzz` – everything from the terminal |
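"Content-addressed" means a snapshot's ID is derived from its bytes rather than assigned, so identical executions produce identical IDs. A minimal sketch of that idea using only the standard library (the store's actual serialization format is not documented here, so the canonical-JSON encoding below is an assumption):

```python
import hashlib
import json

def snapshot_id(snapshot: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the same snapshot
    # content always serializes to the same bytes, hence the same ID.
    payload = json.dumps(snapshot, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]  # short prefix, Git-style

snap = {"inputs": {"query": "Where's my order?"}, "seed": 42}
# Key order doesn't matter; content does.
assert snapshot_id(snap) == snapshot_id({"seed": 42, "inputs": {"query": "Where's my order?"}})
assert snapshot_id(snap) != snapshot_id({"inputs": {"query": "Where's my order?"}, "seed": 43})
```

Because the ID is a pure function of the content, snapshots are deduplicated for free and safe to commit to Git.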
## Quickstart

```shell
pip install -e ".[dev]"
pactum init --name my-ai-project
```

This creates:

- `pactum.yaml` – configuration
- `.pactum/snapshots/` – snapshot storage
- `contracts/example.py` – starter contract
- `tests/test_example.py` – starter test
```shell
pactum run contracts/example.py:hello_world --input-file input.json --seed 42
pactum replay --snapshot 9f2c3a
pactum test --ci
pactum fuzz contracts/example.py:hello_world --iterations 1000 --seed 42
```

## Architecture

```
┌───────────────────┐
│   AI Component    │
│   (@contract)     │
└─────────┬─────────┘
          │
┌─────────▼─────────┐
│  Pactum Runtime   │ ← Seeded PRNG, Tool Access, Tracer
│                   │
│  ┌─────────────┐  │
│  │  Validator  │  │ ← Input/Output, Memory, Tools, Budget
│  └─────────────┘  │
│  ┌─────────────┐  │
│  │   Tracer    │  │ ← Token-level event recording
│  └─────────────┘  │
└─────────┬─────────┘
          │
┌─────────▼─────────┐
│  Snapshot Store   │ ← Content-addressed (SHA-256)
│    (.pactum/)     │
└─────────┬─────────┘
          │
┌─────────▼─────────────────────────────┐
│     Replay Engine / Test Harness      │
│     Mocks · Fuzzing · CI Integration  │
└─────────┬─────────────────────────────┘
          │
┌─────────▼─────────────────────────────┐
│               Plugins                 │
│     LLM · Memory · Tool · Validator   │
└───────────────────────────────────────┘
```
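The reproducibility story rests on one principle: all randomness flows through a PRNG seeded per run, so replaying with the same seed replays the same decisions. A self-contained illustration using Python's `random.Random` (this is the general technique, not Pactum's internals):

```python
import random

def run_with_seed(seed: int) -> list[int]:
    # Each run gets its own PRNG instance; nothing touches global state,
    # so two runs with the same seed make identical "sampling" decisions.
    rng = random.Random(seed)
    return [rng.randrange(100) for _ in range(5)]

assert run_with_seed(42) == run_with_seed(42)  # same seed, same trace
assert run_with_seed(42) != run_with_seed(43)  # new seed, new trace
```

Recording the seed in the snapshot is what lets the replay engine turn a one-off production failure into a repeatable test case.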
## Project Layout

```
pactum/
├── core/        # Runtime, tracer, contract DSL, validator
├── cli/         # Command-line interface
├── plugins/     # LLM adapters, tools, memory, plugin base
├── snapshot/    # Content-addressed snapshot store
└── testing/     # Test harness, mock generator, fuzzer
examples/
├── support-bot/ # Customer support contract example
└── rag-app/     # RAG retrieval example
docs/
├── concepts/    # What are contracts, runtime, snapshots
├── cli/         # CLI reference
└── api/         # Python SDK reference
tests/           # Full test suite
```
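The validator in `core/` enforces the declared input/output schemas before and after a run. A minimal sketch of how such a type-map check might work (the function name and error messages are illustrative, not Pactum's API):

```python
def validate(schema: dict, data: dict, label: str) -> None:
    # Every declared key must be present with the declared type;
    # undeclared keys are rejected so the contract stays exhaustive.
    missing = schema.keys() - data.keys()
    extra = data.keys() - schema.keys()
    if missing or extra:
        raise ValueError(f"{label}: missing={sorted(missing)}, extra={sorted(extra)}")
    for key, expected in schema.items():
        if not isinstance(data[key], expected):
            raise TypeError(f"{label}.{key}: expected {expected.__name__}, "
                            f"got {type(data[key]).__name__}")

# Matches the inputs schema from the contract example above
validate({"query": str, "customer_id": str},
         {"query": "Where's my order?", "customer_id": "C-123"}, "inputs")
```

Running the same check on outputs is what turns a silent schema drift into a hard, traceable failure.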
## Examples

Test a contract end to end with a stub LLM adapter and a fake tool registry – no API key needed:

```python
from pactum import contract, PactRuntime
from pactum.plugins.llm_adapter import StubAdapter
from pactum.plugins.tool_adapter import ToolRegistry

@contract(
    name="support:v1",
    inputs={"query": str, "customer_id": str},
    outputs={"reply": str, "intent": str},
    allowed_tools=["kb_retriever"],
    nondet_budget={"tokens": 256},
)
def support_reply(ctx, inputs):
    docs = ctx.tools.kb_retriever(inputs["query"])
    result = ctx.llm.complete(f"Answer: {inputs['query']}\nContext: {docs}")
    return {"reply": result.text, "intent": "general"}

tools = ToolRegistry()
tools.register("kb_retriever", lambda q, **kw: ["FAQ: Check tracking page."])

runtime = PactRuntime(
    llm_adapter=StubAdapter(default_response="Your order is on its way!"),
    tool_registry=tools,
    seed=42,
)
result = runtime.run(support_reply, {"query": "Where's my order?", "customer_id": "C-123"})
print(result.outputs)      # {"reply": "Your order is on its way!", "intent": "general"}
print(result.snapshot_id)  # Replay this anytime!
```

## Plugins

```python
from pactum.plugins.base import PactumPlugin

class MetricsPlugin(PactumPlugin):
    @property
    def name(self):
        return "metrics"

    def after_run(self, ctx, inputs, outputs, trace):
        tokens = sum(e.get("token_count", 0) or 0 for e in trace)
        print(f"Total tokens used: {tokens}")
```

Plugin hooks:

- `before_run(context, inputs)` – before contract execution
- `after_run(context, inputs, outputs, trace)` – after execution
- `on_trace(event)` – on each trace event
- `on_snapshot_commit(snapshot_id, data)` – when a snapshot is stored
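Under the hood, a hook system like this is just ordered dispatch: the runtime calls each registered plugin's hook if the plugin defines it. A stripped-down sketch of the pattern (the `Runtime` and `Logger` classes below are hypothetical, not Pactum's internals):

```python
class Runtime:
    def __init__(self, plugins=None):
        self.plugins = list(plugins or [])

    def _emit(self, hook, *args):
        # Call the hook on every plugin that implements it, in registration order.
        for plugin in self.plugins:
            fn = getattr(plugin, hook, None)
            if callable(fn):
                fn(*args)

    def run(self, fn, inputs):
        self._emit("before_run", None, inputs)
        outputs = fn(inputs)
        self._emit("after_run", None, inputs, outputs, [])  # empty trace for brevity
        return outputs

class Logger:
    def __init__(self):
        self.events = []
    def before_run(self, ctx, inputs):
        self.events.append(("before", inputs))
    def after_run(self, ctx, inputs, outputs, trace):
        self.events.append(("after", outputs))

log = Logger()
Runtime([log]).run(lambda i: {"reply": i["q"].upper()}, {"q": "hi"})
assert log.events == [("before", {"q": "hi"}), ("after", {"reply": "HI"})]
```

Because hooks are looked up by name, a plugin only implements the events it cares about, and the runtime stays oblivious to what any particular plugin does.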
## Contributing

Contributions are welcome! See the docs for architecture details.
```shell
# Install in dev mode
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Run with coverage
pytest tests/ -v --cov=pactum --cov-report=term-missing
```

## License

MIT License – see LICENSE for details.
**Pactum** – Build AI components that are observable, reproducible, testable, and safe.