TestCase

gitpavleenbali edited this page Feb 17, 2026 · 2 revisions

A TestCase represents a single evaluation scenario for testing agent behavior.

Import

from pyai.evaluation import TestCase

Constructor

TestCase(
    input: str,                      # Input prompt/question
    expected_output: str = None,     # Expected response (optional)
    criteria: list[str] = None,      # Criteria to evaluate
    metadata: dict = None,           # Additional metadata
    name: str = None,                # Test case name
    tags: list[str] = None,          # Tags for filtering
    timeout: float = 30.0            # Timeout in seconds
)
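To make the defaults in the signature above concrete, here is a minimal sketch of a stand-in class with the same fields. `TestCaseSketch` is a hypothetical name used only for illustration; it is not pyai's implementation, and the real class may validate or store its arguments differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TestCaseSketch:
    """Hypothetical stand-in mirroring the documented constructor (not pyai code)."""
    input: str                              # Input prompt/question
    expected_output: Optional[str] = None   # Expected response (optional)
    criteria: Optional[list[str]] = None    # Criteria to evaluate
    metadata: Optional[dict] = None         # Additional metadata
    name: Optional[str] = None              # Test case name
    tags: Optional[list[str]] = None        # Tags for filtering
    timeout: float = 30.0                   # Timeout in seconds


# Only `input` is required; everything else falls back to its default.
case = TestCaseSketch(input="What is the capital of France?")
print(case.expected_output, case.timeout)
```

Under this sketch, only `input` has no default, which matches the signature above.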

Creating Test Cases

Basic Test Case

test = TestCase(
    input="What is the capital of France?",
    expected_output="Paris",
    criteria=["accuracy"]
)

Open-Ended Test Case

test = TestCase(
    input="Write a haiku about coding",
    expected_output=None,  # No exact expected output
    criteria=["creativity", "format"]
)
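The two examples above differ in whether `expected_output` is set. As a rough illustration of why that matters, the sketch below shows one way a checker could treat the two situations. `exact_match` is a hypothetical helper, and the normalization it applies (trimming and case folding) is an assumption; how pyai actually scores criteria such as "accuracy" or "creativity" is not described on this page.

```python
from typing import Optional


def exact_match(expected: Optional[str], actual: str) -> Optional[bool]:
    """Hypothetical check: compare answers, or return None when nothing is expected."""
    if expected is None:
        # Open-ended case: there is no reference answer to compare against.
        return None
    # Assumed normalization: ignore surrounding whitespace and letter case.
    return expected.strip().casefold() == actual.strip().casefold()


closed = exact_match("Paris", " paris ")
open_ended = exact_match(None, "Semicolons fall / like quiet rain on my screen")
```

Returning `None` rather than `False` for the open-ended case keeps "not comparable" distinct from "wrong", so such cases can be routed to criteria-based evaluation instead.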

With Metadata

test = TestCase(
    input="Solve: 15 * 7 + 3",
    expected_output="108",
    name="math_test_001",
    tags=["math", "arithmetic"],
    metadata={
        "difficulty": "easy",
        "category": "multiplication"
    }
)
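The constructor describes `tags` as "Tags for filtering". A minimal sketch of such filtering follows, assuming test cases expose `tags` as a plain list. `TestCaseSketch` and `select_by_tag` are hypothetical names introduced for illustration, not part of pyai.

```python
from dataclasses import dataclass, field


@dataclass
class TestCaseSketch:
    """Hypothetical stand-in with only the fields needed for this example."""
    input: str
    name: str = ""
    tags: list[str] = field(default_factory=list)


def select_by_tag(cases: list[TestCaseSketch], tag: str) -> list[TestCaseSketch]:
    """Return the cases whose tag list contains `tag`."""
    return [case for case in cases if tag in case.tags]


cases = [
    TestCaseSketch(input="Solve: 15 * 7 + 3", name="math_test_001",
                   tags=["math", "arithmetic"]),
    TestCaseSketch(input="Write a haiku about coding", name="haiku_001",
                   tags=["creative"]),
]
math_cases = select_by_tag(cases, "math")
```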

Class Methods

from_dict()

Create a test case from a dictionary whose keys match the constructor parameters:

data = {
    "input": "What is 2+2?",
    "expected_output": "4",
    "criteria": ["accuracy"]
}
test = TestCase.from_dict(data)
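For readers curious how a `from_dict()`-style constructor can work, here is a minimal sketch on a hypothetical stand-in class. Rejecting unknown keys is a design choice made for this example; whether pyai's `from_dict()` ignores, rejects, or stores extra keys is not stated on this page.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class TestCaseSketch:
    """Hypothetical stand-in; not pyai's TestCase."""
    input: str
    expected_output: Optional[str] = None
    criteria: Optional[list[str]] = None

    @classmethod
    def from_dict(cls, data: dict) -> "TestCaseSketch":
        # Assumed behavior: fail loudly on keys that are not known fields.
        known = {f.name for f in fields(cls)}
        unknown = set(data) - known
        if unknown:
            raise ValueError(f"unknown keys: {sorted(unknown)}")
        return cls(**data)


test = TestCaseSketch.from_dict({
    "input": "What is 2+2?",
    "expected_output": "4",
    "criteria": ["accuracy"],
})
```

Failing on unknown keys catches typos such as `expected_ouput` early, at the cost of being stricter than a loader that silently ignores them.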

from_yaml()

Load a test case from a YAML file. Given the following file:

# test_case.yaml
input: "Summarize this article"
expected_output: null
criteria:
  - relevance
  - conciseness
tags:
  - summarization

# Python
test = TestCase.from_yaml("test_case.yaml")

Batch Creation

# Create multiple test cases
test_cases = [
    TestCase(input="2+2", expected_output="4"),
    TestCase(input="3*4", expected_output="12"),
    TestCase(input="10/2", expected_output="5"),
]

# Or from a list of dicts
test_cases = TestCase.batch_create([
    {"input": "Hello", "expected_output": "Hi"},
    {"input": "Goodbye", "expected_output": "Bye"},
])
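A list of dicts like the one above often comes from a file. The sketch below shows the same pattern with JSON input, using only the standard library. `TestCaseSketch` and this `batch_create` function are hypothetical stand-ins, so treat it as an illustration of the idea rather than pyai's behavior.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class TestCaseSketch:
    """Hypothetical stand-in; not pyai's TestCase."""
    input: str
    expected_output: Optional[str] = None


def batch_create(items: list[dict]) -> list[TestCaseSketch]:
    """Build one test case per dict, preserving order."""
    return [TestCaseSketch(**item) for item in items]


# In practice this string would be read from a file.
raw = """
[
  {"input": "Hello", "expected_output": "Hi"},
  {"input": "Goodbye", "expected_output": "Bye"}
]
"""
test_cases = batch_create(json.loads(raw))
```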

Properties

| Property | Type | Description |
| --- | --- | --- |
| input | str | The input prompt |
| expected_output | str | Expected response |
| criteria | list | Evaluation criteria |
| name | str | Test case identifier |
| tags | list | Categorization tags |
| metadata | dict | Additional data |
