A professional test case generation tool for AI systems. Generates diverse inputs, edge cases, and adversarial test cases to evaluate AI robustness, with export capabilities for documentation and analysis.
- Features
- Installation
- Quick Start
- Usage
- Test Case Categories
- Export Formats
- Project Structure
- API Reference
- License
| Feature | Description |
|---|---|
| Diverse Input Generation | Multilingual text, varying formats, tones, lengths, and topics |
| Edge Case Detection | Empty inputs, Unicode boundaries, numeric limits, encoding variations |
| Adversarial Testing | Prompt injection tests, semantic traps, reasoning challenges |
| Multiple Export Formats | JSON and CSV output with full metadata |
| Web Interface | Modern, responsive UI with dark/light theme support |
| Command Line Tool | Scriptable CLI for automation and CI/CD integration |
| Python API | Programmatic access for custom integrations |
| Severity Classification | Low, Medium, High, and Critical severity levels |
| Filtering & Search | Filter by category, severity, or search by content |
- Python 3.10 or higher
# Clone or navigate to the project directory
cd redboxgenerator
# Install dependencies
pip install -r requirements.txt
# Start the web interface
python app.py
Open http://localhost:5000 in your browser.
# Generate all test case types
python main.py generate --all -n 20 -o test_cases.json
# View available options
python main.py --help
- Start the server: python app.py
- Open http://localhost:5000
- Select test case categories (Diverse, Edge Cases, Adversarial)
- Set the number of cases per category
- Optionally filter by severity level
- Click "Generate Test Cases"
- View results in the Results tab
- Export to JSON or CSV from the Export tab
Interface Features:
- Dark/Light theme toggle
- Real-time statistics dashboard
- Searchable and filterable results table
- Detailed view modal for each test case
- One-click export to JSON or CSV
# Generate all categories (default: 10 per category)
python main.py generate --all -o output.json
# Generate specific categories
python main.py generate --diverse --edge-cases -n 25 -o tests.json
# Generate only adversarial tests
python main.py generate --adversarial -n 50 -o adversarial.json
# Filter by severity
python main.py generate --all --severity critical -o critical_tests.json
# Export to CSV format
python main.py generate --all -n 20 -f csv -o tests.csv
# Include summary statistics
python main.py generate --all -n 30 -o tests.json --summary
# Split output by category
python main.py generate --all --split-by-category -o output/tests.json
# List available test types
python main.py list
CLI Options:
| Option | Description |
|---|---|
| `--all` | Generate all test case categories |
| `--diverse` | Generate diverse input tests |
| `--edge-cases` | Generate edge case tests |
| `--adversarial` | Generate adversarial input tests |
| `-n, --count` | Number of test cases per category (default: 10) |
| `-o, --output` | Output file path (required) |
| `-f, --format` | Output format: json or csv (default: json) |
| `--severity` | Filter by severity: low, medium, high, critical |
| `--split-by-category` | Create separate files per category |
| `--include-metadata` | Include metadata in output |
| `--summary` | Generate additional summary file |
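For CI/CD use, the options above can be assembled programmatically. The sketch below is a minimal, hypothetical wrapper: `build_generate_command` and `run_generate` are not part of the project, and the code assumes `main.py` sits in the working directory. It only uses flags documented in the table.

```python
import subprocess
import sys


def build_generate_command(output, categories=None, count=10, fmt="json",
                           severity=None, summary=False):
    """Build an argv list for `python main.py generate` (hypothetical helper)."""
    cmd = [sys.executable, "main.py", "generate"]
    if categories:
        # Each name maps to its documented flag, e.g. "edge-cases" -> --edge-cases
        cmd += [f"--{name}" for name in categories]
    else:
        cmd.append("--all")
    cmd += ["-n", str(count), "-f", fmt, "-o", output]
    if severity:
        cmd += ["--severity", severity]
    if summary:
        cmd.append("--summary")
    return cmd


def run_generate(**kwargs):
    """Run the CLI; raises CalledProcessError if it exits with a non-zero code."""
    return subprocess.run(build_generate_command(**kwargs), check=True)
```

A pipeline step could then call `run_generate(output="critical.json", severity="critical", summary=True)` and let a non-zero exit code fail the job.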
from redbox_generator import TestCaseGenerator, CSVExporter, JSONExporter
# Initialize generator
generator = TestCaseGenerator()
# Generate all test case types
all_cases = generator.generate_all(count_per_category=15)
# Or generate specific categories
diverse_cases = generator.generate_diverse(count=20)
edge_cases = generator.generate_edge_cases(count=20)
adversarial_cases = generator.generate_adversarial(count=20)
# Get summary statistics
summary = generator.get_summary()
print(f"Total: {summary['total']}")
print(f"By category: {summary['by_category']}")
# Filter results
from redbox_generator.generators.base import TestCaseCategory, TestCaseSeverity
critical_cases = generator.filter_by_severity(TestCaseSeverity.CRITICAL)
adversarial_only = generator.filter_by_category(TestCaseCategory.ADVERSARIAL)
# Export to JSON
json_exporter = JSONExporter()
json_exporter.export(all_cases, "output/test_cases.json")
json_exporter.export_summary(all_cases, "output/summary.json")
# Export to CSV
csv_exporter = CSVExporter()
csv_exporter.export(all_cases, "output/test_cases.csv")
# Export by category (creates separate files)
json_exporter.export_by_category(all_cases, "output/by_category/")
Tests AI handling of varied input types:
| Subcategory | Description |
|---|---|
| Multilingual | Non-English languages, mixed scripts |
| Format Variety | Questions, instructions, conversational styles |
| Length Variety | Single words to long paragraphs |
| Tone Variety | Formal, informal, technical communication |
| Topic Variety | Wide range of subject matters |
| Mixed Content | Code blocks, URLs, mathematical expressions |
Tests boundary conditions and unusual inputs:
| Subcategory | Description |
|---|---|
| Empty Input | Empty strings, whitespace variations |
| Special Characters | Symbols, emojis, combining characters |
| Boundary Length | Very short to very long inputs |
| Unicode | Zero-width characters, BOM, emoji sequences |
| Numeric | Large numbers, scientific notation, NaN |
| Format | Markdown, HTML, SQL, JSON embedded content |
| Encoding | URL encoding, Base64, HTML entities |
| Null-like | Strings like "null", "None", "undefined" |
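To make these subcategories concrete, the following sketch lists a few hand-written inputs of each kind. The values are illustrative choices for this README, not the generator's actual output, and `describe` is a hypothetical helper for inspecting them.

```python
import base64
from urllib.parse import quote

# Illustrative edge-case inputs keyed by subcategory (not the generator's real data).
EDGE_CASE_EXAMPLES = {
    "empty_input": ["", " ", "\t\n"],
    # zero-width space, BOM-prefixed text, "e" followed by a combining acute accent
    "unicode": ["\u200b", "\ufeffhello", "e\u0301"],
    "numeric": [str(2**63), "1e308", "NaN"],
    # URL encoding, Base64, HTML entities
    "encoding": [quote("a b&c"), base64.b64encode(b"hello").decode("ascii"), "&lt;b&gt;"],
    "null_like": ["null", "None", "undefined"],
}


def describe(text):
    """Return character count, UTF-8 byte count, and whether the text is blank after stripping."""
    return {
        "chars": len(text),
        "bytes": len(text.encode("utf-8")),
        "blank": text.strip() == "",
    }
```

Inspecting the zero-width space with `describe("\u200b")` shows why such inputs are worth testing: the string renders as nothing, yet it is one character long and `str.strip()` does not remove it.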
Tests AI resilience to manipulation attempts:
| Subcategory | Description |
|---|---|
| Prompt Structure | Instruction override attempts, role injection |
| Context Confusion | False memory claims, nonexistent context |
| Instruction Conflicts | Contradictory or impossible requests |
| Format Manipulation | Unusual output format requests |
| Semantic Traps | Self-referential paradoxes, negation traps |
| Reasoning Challenges | Logic puzzles, cognitive bias tests |
| Ambiguity | Syntactically or semantically ambiguous input |
| Consistency | Requests for contradictory responses |
| Boundary Probing | Attempts to find guideline edges |
| Misdirection | Roleplay bypasses, emotional manipulation |
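One way to use these cases is to feed each `input_data` to the system under test and record the response next to `expected_behavior` for later review. The sketch below assumes test cases are plain dicts shaped like the JSON export; `run_cases` and the `echo_model` stub are hypothetical and not part of RedBox Generator.

```python
from typing import Callable

SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def run_cases(cases: list[dict], model: Callable[[str], str]) -> list[dict]:
    """Call the model on every case, most severe first, capturing any exception."""
    ordered = sorted(cases, key=lambda c: SEVERITY_ORDER.index(c["severity"]), reverse=True)
    results = []
    for case in ordered:
        try:
            response, error = model(str(case["input_data"])), None
        except Exception as exc:  # a crash is itself a finding worth recording
            response, error = None, repr(exc)
        results.append({
            "id": case["id"],
            "severity": case["severity"],
            "expected_behavior": case["expected_behavior"],
            "response": response,
            "error": error,
        })
    return results


def echo_model(prompt: str) -> str:
    """Stand-in model: rejects blank prompts, otherwise echoes the input."""
    if not prompt.strip():
        raise ValueError("empty prompt")
    return f"echo: {prompt}"
```

Replacing `echo_model` with a function that calls a real system gives a simple review log; judging whether a response matches `expected_behavior` is left to a human or a separate scoring step.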
{
  "generator": "RedBox Generator",
  "version": "1.0.0",
  "generated_at": "2024-01-15T10:30:00.000000",
  "total_count": 30,
  "test_cases": [
    {
      "id": "a1b2c3d4",
      "name": "Multilingual Input #1",
      "description": "Tests AI handling of non-English input",
      "category": "diverse",
      "subcategory": "multilingual",
      "input_data": "Bonjour, comment ca va?",
      "expected_behavior": "Should handle multilingual input gracefully",
      "severity": "medium",
      "tags": ["multilingual", "language", "i18n"],
      "created_at": "2024-01-15T10:30:00.000000",
      "metadata": {}
    }
  ]
}

| Column | Description |
|---|---|
| id | Unique test case identifier |
| name | Test case name |
| description | Detailed description |
| category | diverse, edge_case, or adversarial |
| subcategory | Specific test type |
| input_data | The test input to use |
| expected_behavior | Expected AI response behavior |
| severity | low, medium, high, or critical |
| tags | Comma-separated tags |
| created_at | Generation timestamp |
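The two formats carry the same fields, so a JSON export can be flattened into CSV rows with the standard library. This sketch assumes the JSON layout shown above and the column order from the table; `json_to_csv` is a hypothetical helper, not the project's `CSVExporter`.

```python
import csv
import io
import json

CSV_COLUMNS = ["id", "name", "description", "category", "subcategory",
               "input_data", "expected_behavior", "severity", "tags", "created_at"]


def json_to_csv(json_text: str) -> str:
    """Convert a JSON export (as text) to CSV text with comma-separated tags."""
    document = json.loads(json_text)
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys without a CSV column, such as "metadata"
    writer = csv.DictWriter(buffer, fieldnames=CSV_COLUMNS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for case in document["test_cases"]:
        row = dict(case)
        row["tags"] = ",".join(case.get("tags", []))  # list -> "a,b,c"
        writer.writerow(row)
    return buffer.getvalue()
```

Because tags are joined with commas, the `csv` module quotes that cell automatically, so the file still parses back into one `tags` column.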
redboxgenerator/
├── app.py                      # Flask web application
├── main.py                     # CLI entry point
├── requirements.txt            # Python dependencies
├── README.md                   # Documentation
│
├── redbox_generator/           # Core library
│   ├── __init__.py
│   ├── cli.py                  # Command-line interface
│   │
│   ├── generators/             # Test case generators
│   │   ├── __init__.py
│   │   ├── base.py             # Base classes and main generator
│   │   ├── diverse.py          # Diverse input generator
│   │   ├── edge_cases.py       # Edge case generator
│   │   └── adversarial.py      # Adversarial input generator
│   │
│   └── exporters/              # Export functionality
│       ├── __init__.py
│       ├── csv_exporter.py     # CSV export
│       └── json_exporter.py    # JSON export
│
├── templates/                  # HTML templates
│   └── index.html
│
├── static/                     # Static assets
│   ├── css/
│   │   └── style.css
│   └── js/
│       └── app.js
│
└── output/                     # Generated test files
Main class for generating test cases.
generator = TestCaseGenerator()
# Methods
generator.generate_all(count_per_category=10) # Generate all types
generator.generate_diverse(count=10) # Diverse inputs only
generator.generate_edge_cases(count=10) # Edge cases only
generator.generate_adversarial(count=10) # Adversarial only
generator.get_all_cases() # Get all generated cases
generator.filter_by_category(category) # Filter by category
generator.filter_by_severity(severity) # Filter by severity
generator.get_summary() # Get statistics
generator.clear() # Clear all cases
Data class representing a single test case.
from dataclasses import dataclass
from typing import Any

@dataclass
class TestCase:
    id: str                      # Unique identifier
    name: str                    # Test case name
    description: str             # Description
    category: TestCaseCategory   # diverse, edge_case, adversarial
    subcategory: str             # Specific type
    input_data: Any              # Test input
    expected_behavior: str       # Expected behavior
    severity: TestCaseSeverity   # low, medium, high, critical
    tags: list[str]              # Tags for filtering
    created_at: str              # ISO timestamp
    metadata: dict               # Additional metadata

# JSON Exporter
json_exporter = JSONExporter(indent=2, ensure_ascii=False)
json_exporter.export(cases, "output.json")
json_exporter.export_summary(cases, "summary.json")
json_exporter.export_by_category(cases, "output_dir/")
json_exporter.export_by_severity(cases, "output_dir/")
json_exporter.to_json_string(cases)
# CSV Exporter
csv_exporter = CSVExporter(fields=None) # None uses defaults
csv_exporter.export(cases, "output.csv")
csv_exporter.export_summary(cases, "summary.csv")
csv_exporter.export_by_category(cases, "output_dir/")
MIT License
Copyright (c) 2024 RedBox Generator
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.