RedBox Generator

RedBox Generator is a test case generation tool for AI systems. It generates diverse inputs, edge cases, and adversarial test cases for evaluating AI robustness, and exports them as JSON or CSV for documentation and analysis.

Features

Feature	Description
Diverse Input Generation	Multilingual text, varying formats, tones, lengths, and topics
Edge Case Detection	Empty inputs, Unicode boundaries, numeric limits, encoding variations
Adversarial Testing	Prompt injection tests, semantic traps, reasoning challenges
Multiple Export Formats	JSON and CSV output with full metadata
Web Interface	Modern, responsive UI with dark/light theme support
Command Line Tool	Scriptable CLI for automation and CI/CD integration
Python API	Programmatic access for custom integrations
Severity Classification	Low, Medium, High, and Critical severity levels
Filtering & Search	Filter by category, severity, or search by content

Installation

Prerequisites

Python 3.10 or higher

Setup

# Clone or navigate to the project directory
cd redboxgenerator

# Install dependencies
pip install -r requirements.txt

Quick Start

Web Interface

python app.py

Open http://localhost:5000 in your browser.

Command Line

# Generate all test case types
python main.py generate --all -n 20 -o test_cases.json

# View available options
python main.py --help

Usage

Web Interface

1. Start the server: python app.py
2. Open http://localhost:5000
3. Select test case categories (Diverse, Edge Cases, Adversarial)
4. Set the number of cases per category
5. Optionally filter by severity level
6. Click "Generate Test Cases"
7. View results in the Results tab
8. Export to JSON or CSV from the Export tab

Interface Features:

Dark/Light theme toggle
Real-time statistics dashboard
Searchable and filterable results table
Detailed view modal for each test case
One-click export to JSON or CSV

Command Line Interface

# Generate all categories (default: 10 per category)
python main.py generate --all -o output.json

# Generate specific categories
python main.py generate --diverse --edge-cases -n 25 -o tests.json

# Generate only adversarial tests
python main.py generate --adversarial -n 50 -o adversarial.json

# Filter by severity
python main.py generate --all --severity critical -o critical_tests.json

# Export to CSV format
python main.py generate --all -n 20 -f csv -o tests.csv

# Include summary statistics
python main.py generate --all -n 30 -o tests.json --summary

# Split output by category
python main.py generate --all --split-by-category -o output/tests.json

# List available test types
python main.py list

CLI Options:

Option	Description
`--all`	Generate all test case categories
`--diverse`	Generate diverse input tests
`--edge-cases`	Generate edge case tests
`--adversarial`	Generate adversarial input tests
`-n, --count`	Number of test cases per category (default: 10)
`-o, --output`	Output file path (required)
`-f, --format`	Output format: `json` or `csv` (default: json)
`--severity`	Filter by severity: low, medium, high, critical
`--split-by-category`	Create separate files per category
`--include-metadata`	Include metadata in output
`--summary`	Generate additional summary file
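
For CI pipelines it can help to assemble the command in one place and validate the arguments before the CLI runs. The following is a minimal sketch, not part of RedBox Generator: `build_generate_command` and `run_generate` are hypothetical helper names, only flags from the table above are used, and the sketch assumes it is run from the project directory and that the CLI exits with a non-zero status on failure.

```python
import subprocess
import sys

# Values taken from the CLI options table above.
SEVERITIES = {"low", "medium", "high", "critical"}
FORMATS = {"json", "csv"}


def build_generate_command(output, count=10, fmt="json", severity=None, summary=False):
    """Build the argv list for `python main.py generate --all ...` (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if severity is not None and severity not in SEVERITIES:
        raise ValueError(f"unsupported severity: {severity}")
    cmd = [sys.executable, "main.py", "generate", "--all",
           "-n", str(count), "-f", fmt, "-o", output]
    if severity is not None:
        cmd += ["--severity", severity]
    if summary:
        cmd.append("--summary")
    return cmd


def run_generate(**kwargs):
    """Run the CLI; subprocess raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_generate_command(**kwargs), check=True)
```

A CI step could then call `run_generate(output="tests.json", count=20, summary=True)` and fail the job if the call raises.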

Python API

from redbox_generator import TestCaseGenerator, CSVExporter, JSONExporter
from redbox_generator.generators.base import TestCaseCategory, TestCaseSeverity

# Initialize generator
generator = TestCaseGenerator()

# Generate all test case types
all_cases = generator.generate_all(count_per_category=15)

# Or generate specific categories
diverse_cases = generator.generate_diverse(count=20)
edge_cases = generator.generate_edge_cases(count=20)
adversarial_cases = generator.generate_adversarial(count=20)

# Get summary statistics
summary = generator.get_summary()
print(f"Total: {summary['total']}")
print(f"By category: {summary['by_category']}")

# Filter results by severity or category
critical_cases = generator.filter_by_severity(TestCaseSeverity.CRITICAL)
adversarial_only = generator.filter_by_category(TestCaseCategory.ADVERSARIAL)

# Export to JSON
json_exporter = JSONExporter()
json_exporter.export(all_cases, "output/test_cases.json")
json_exporter.export_summary(all_cases, "output/summary.json")

# Export to CSV
csv_exporter = CSVExporter()
csv_exporter.export(all_cases, "output/test_cases.csv")

# Export by category (creates separate files)
json_exporter.export_by_category(all_cases, "output/by_category/")
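
Generated cases are only useful once they are replayed against the system under test. The sketch below shows one possible harness. It works on plain dictionaries with the field names of the JSON export, so it makes no assumption about the objects returned by the generator. `run_cases`, `echo_model`, and the sample data are hypothetical and not part of RedBox Generator.

```python
from typing import Callable


def run_cases(cases: list[dict], model: Callable[[str], str]) -> list[dict]:
    """Feed each case's input_data to `model` and record the output or the exception."""
    results = []
    for case in cases:
        record = {"id": case["id"], "severity": case["severity"],
                  "output": None, "error": None}
        try:
            record["output"] = model(str(case["input_data"]))
        except Exception as exc:  # a crash on unusual input is itself a finding
            record["error"] = repr(exc)
        results.append(record)
    return results


def echo_model(prompt: str) -> str:
    """Stand-in for the system under test; rejects blank input."""
    if not prompt.strip():
        raise ValueError("empty prompt")
    return prompt.upper()


# Hypothetical sample data using the exported field names.
sample_cases = [
    {"id": "a1b2c3d4", "severity": "medium", "input_data": "Bonjour"},
    {"id": "e5f6a7b8", "severity": "high", "input_data": "   "},
]
results = run_cases(sample_cases, echo_model)
```

Recording exceptions instead of letting them propagate keeps one failing input from hiding the results for the remaining cases.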

Test Case Categories

Diverse Inputs

Tests AI handling of varied input types:

Subcategory	Description
Multilingual	Non-English languages, mixed scripts
Format Variety	Questions, instructions, conversational styles
Length Variety	Single words to long paragraphs
Tone Variety	Formal, informal, technical communication
Topic Variety	Wide range of subject matters
Mixed Content	Code blocks, URLs, mathematical expressions

Edge Cases

Tests boundary conditions and unusual inputs:

Subcategory	Description
Empty Input	Empty strings, whitespace variations
Special Characters	Symbols, emojis, combining characters
Boundary Length	Very short to very long inputs
Unicode	Zero-width characters, BOM, emoji sequences
Numeric	Large numbers, scientific notation, NaN
Format	Markdown, HTML, SQL, JSON embedded content
Encoding	URL encoding, Base64, HTML entities
Null-like	Strings like "null", "None", "undefined"
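
To make the subcategories above concrete, the sketch below lists a few inputs of the kind each row describes, together with a small helper that reports properties a test harness might check. The samples are illustrative assumptions, not the generator's actual output, and `EDGE_CASE_SAMPLES` and `describe` are hypothetical names.

```python
# Hypothetical sample inputs for some of the subcategories in the table above.
EDGE_CASE_SAMPLES = {
    "empty_input": ["", "   ", "\t\n"],
    "unicode": ["zero\u200bwidth", "\ufeffleading BOM"],  # zero-width space, byte order mark
    "numeric": [str(2**63), "1e308", "NaN"],
    "encoding": ["hello%20world", "aGVsbG8=", "&lt;b&gt;"],  # URL, Base64, HTML entities
    "null_like": ["null", "None", "undefined"],
}


def describe(text: str) -> dict:
    """Report basic properties of an input string."""
    return {
        "length": len(text),
        "is_blank": text.strip() == "",
        "non_ascii": sum(1 for ch in text if ord(ch) > 127),
    }


for subcategory, samples in EDGE_CASE_SAMPLES.items():
    for sample in samples:
        print(subcategory, repr(sample), describe(sample))
```

Writing invisible characters as escape sequences keeps the samples readable and prevents an editor from silently stripping them.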

Adversarial Inputs

Tests AI resilience to manipulation attempts:

Subcategory	Description
Prompt Structure	Instruction override attempts, role injection
Context Confusion	False memory claims, nonexistent context
Instruction Conflicts	Contradictory or impossible requests
Format Manipulation	Unusual output format requests
Semantic Traps	Self-referential paradoxes, negation traps
Reasoning Challenges	Logic puzzles, cognitive bias tests
Ambiguity	Syntactically or semantically ambiguous input
Consistency	Requests for contradictory responses
Boundary Probing	Attempts to find guideline edges
Misdirection	Roleplay bypasses, emotional manipulation

Export Formats

JSON Output

{
  "generator": "RedBox Generator",
  "version": "1.0.0",
  "generated_at": "2024-01-15T10:30:00.000000",
  "total_count": 30,
  "test_cases": [
    {
      "id": "a1b2c3d4",
      "name": "Multilingual Input #1",
      "description": "Tests AI handling of non-English input",
      "category": "diverse",
      "subcategory": "multilingual",
      "input_data": "Bonjour, comment ca va?",
      "expected_behavior": "Should handle multilingual input gracefully",
      "severity": "medium",
      "tags": ["multilingual", "language", "i18n"],
      "created_at": "2024-01-15T10:30:00.000000",
      "metadata": {}
    }
  ]
}
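
An exported file can be inspected with the standard library alone. The following sketch assumes the top-level layout shown above (`total_count` and `test_cases`) and produces counts similar in spirit to the `total` and `by_category` keys of `get_summary()`. The `summarize` function and the second sample case are hypothetical.

```python
import json
from collections import Counter

# Hypothetical sample data following the layout shown above (fields trimmed for brevity).
SAMPLE_EXPORT = """
{
  "generator": "RedBox Generator",
  "version": "1.0.0",
  "total_count": 2,
  "test_cases": [
    {"id": "a1b2c3d4", "category": "diverse", "severity": "medium", "input_data": "Bonjour"},
    {"id": "e5f6a7b8", "category": "edge_case", "severity": "high", "input_data": ""}
  ]
}
"""


def summarize(export: dict) -> dict:
    """Count exported cases by category and severity after a consistency check."""
    cases = export["test_cases"]
    if export.get("total_count") != len(cases):
        raise ValueError("total_count does not match the number of test cases")
    return {
        "total": len(cases),
        "by_category": dict(Counter(c["category"] for c in cases)),
        "by_severity": dict(Counter(c["severity"] for c in cases)),
    }


summary = summarize(json.loads(SAMPLE_EXPORT))
print(summary)
```

For a real file, replace `json.loads(SAMPLE_EXPORT)` with `json.load` on a file opened with `encoding="utf-8"`.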

CSV Output

Column	Description
id	Unique test case identifier
name	Test case name
description	Detailed description
category	diverse, edge_case, or adversarial
subcategory	Specific test type
input_data	The test input to use
expected_behavior	Expected AI response behavior
severity	low, medium, high, or critical
tags	Comma-separated tags
created_at	Generation timestamp
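
Because `tags` holds several values in one column, a consumer has to split it after parsing. The sketch below reads a CSV export with the standard `csv` module. It assumes the file starts with a header row naming the columns above and that fields containing commas are quoted; neither detail is confirmed by this README. `read_cases` is a hypothetical helper.

```python
import csv
import io

# Hypothetical sample row built from the JSON example; header order follows the table above.
SAMPLE_CSV = (
    "id,name,description,category,subcategory,input_data,"
    "expected_behavior,severity,tags,created_at\n"
    "a1b2c3d4,Multilingual Input #1,Tests AI handling of non-English input,"
    "diverse,multilingual,"
    '"Bonjour, comment ca va?",Should handle multilingual input gracefully,medium,'
    '"multilingual,language,i18n",2024-01-15T10:30:00.000000\n'
)


def read_cases(stream) -> list[dict]:
    """Parse CSV rows into dictionaries and split the tags column into a list."""
    rows = []
    for row in csv.DictReader(stream):
        row["tags"] = [t.strip() for t in row["tags"].split(",") if t.strip()]
        rows.append(row)
    return rows


cases = read_cases(io.StringIO(SAMPLE_CSV))
print(cases[0]["tags"])
```

When reading from disk, open the file with `newline=""` as the `csv` module documentation recommends, so that embedded line breaks in `input_data` are handled correctly.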

Project Structure

redboxgenerator/
├── app.py                      # Flask web application
├── main.py                     # CLI entry point
├── requirements.txt            # Python dependencies
├── README.md                   # Documentation
│
├── redbox_generator/           # Core library
│   ├── __init__.py
│   ├── cli.py                  # Command-line interface
│   │
│   ├── generators/             # Test case generators
│   │   ├── __init__.py
│   │   ├── base.py             # Base classes and main generator
│   │   ├── diverse.py          # Diverse input generator
│   │   ├── edge_cases.py       # Edge case generator
│   │   └── adversarial.py      # Adversarial input generator
│   │
│   └── exporters/              # Export functionality
│       ├── __init__.py
│       ├── csv_exporter.py     # CSV export
│       └── json_exporter.py    # JSON export
│
├── templates/                  # HTML templates
│   └── index.html
│
├── static/                     # Static assets
│   ├── css/
│   │   └── style.css
│   └── js/
│       └── app.js
│
└── output/                     # Generated test files

API Reference

TestCaseGenerator

Main class for generating test cases.

from redbox_generator import TestCaseGenerator

generator = TestCaseGenerator()

# Methods
generator.generate_all(count_per_category=10)    # Generate all types
generator.generate_diverse(count=10)              # Diverse inputs only
generator.generate_edge_cases(count=10)           # Edge cases only
generator.generate_adversarial(count=10)          # Adversarial only
generator.get_all_cases()                         # Get all generated cases
generator.filter_by_category(category)            # Filter by category
generator.filter_by_severity(severity)            # Filter by severity
generator.get_summary()                           # Get statistics
generator.clear()                                 # Clear all cases

TestCase

Data class representing a single test case.

from dataclasses import dataclass
from typing import Any

@dataclass
class TestCase:
    id: str                      # Unique identifier
    name: str                    # Test case name
    description: str             # Description
    category: TestCaseCategory   # diverse, edge_case, adversarial
    subcategory: str             # Specific type
    input_data: Any              # Test input
    expected_behavior: str       # Expected behavior
    severity: TestCaseSeverity   # low, medium, high, critical
    tags: list[str]              # Tags for filtering
    created_at: str              # ISO timestamp
    metadata: dict               # Additional metadata
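
The listing above references `TestCaseCategory` and `TestCaseSeverity` without showing them. The sketch below is one plausible, self-contained definition, offered as an assumption and not as the library's code: only the members `ADVERSARIAL` and `CRITICAL` and the string values appear in this README, while the other member names, the field defaults, and the `to_record` helper are hypothetical.

```python
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import Any


class TestCaseCategory(str, Enum):
    DIVERSE = "diverse"          # member name assumed
    EDGE_CASE = "edge_case"      # member name assumed
    ADVERSARIAL = "adversarial"


class TestCaseSeverity(str, Enum):
    LOW = "low"                  # member name assumed
    MEDIUM = "medium"            # member name assumed
    HIGH = "high"                # member name assumed
    CRITICAL = "critical"


@dataclass
class TestCase:
    id: str
    name: str
    description: str
    category: TestCaseCategory
    subcategory: str
    input_data: Any
    expected_behavior: str
    severity: TestCaseSeverity
    tags: list[str] = field(default_factory=list)   # default assumed
    created_at: str = ""                            # default assumed
    metadata: dict = field(default_factory=dict)    # default assumed


def to_record(case: TestCase) -> dict:
    """Convert a TestCase to a JSON-friendly dict with enum values as plain strings."""
    record = asdict(case)
    record["category"] = case.category.value
    record["severity"] = case.severity.value
    return record
```

Using `default_factory` for `tags` and `metadata` avoids sharing one mutable list or dict between instances.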

Exporters

from redbox_generator import CSVExporter, JSONExporter

# JSON Exporter
json_exporter = JSONExporter(indent=2, ensure_ascii=False)
json_exporter.export(cases, "output.json")
json_exporter.export_summary(cases, "summary.json")
json_exporter.export_by_category(cases, "output_dir/")
json_exporter.export_by_severity(cases, "output_dir/")
json_exporter.to_json_string(cases)

# CSV Exporter
csv_exporter = CSVExporter(fields=None)  # None uses defaults
csv_exporter.export(cases, "output.csv")
csv_exporter.export_summary(cases, "summary.csv")
csv_exporter.export_by_category(cases, "output_dir/")

License

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

antonyga/redboxgenerator