Multi-21: Chained Math Calculation Data Generator

Generates synthetic datasets for chained arithmetic completion. The agent must compute several sequentially dependent arithmetic expressions, where each line's result becomes an operand of the next line. The task requires faithful arithmetic execution and write-only completion: only the answer slots may change.

Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.


📌 Basic Information

| Property | Value |
| --- | --- |
| Task ID | Multi-21 |
| Task | Chained Math Calculation |
| Category | Algorithmic Execution |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | varies |
| Output | PNG images + MP4 video |

🚀 Usage

Installation

# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/Multi-21_chained_math_calculation_data-generator.git
cd Multi-21_chained_math_calculation_data-generator

# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

Generate Data

# Generate 50 samples
python examples/generate.py --num-samples 50

# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42

# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset

# Without videos (faster, images only)
python examples/generate.py --num-samples 50 --no-videos

Command-Line Options

| Argument | Description |
| --- | --- |
| --num-samples | Number of tasks to generate (required) |
| --output | Output directory (default: data/questions) |
| --seed | Random seed for reproducibility |
| --no-videos | Skip video generation (images only) |
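
For readers who want to wrap the generator in their own scripts, the options above can be mirrored with `argparse`. This is a minimal sketch, not the code in `examples/generate.py`: the function name `build_parser` and the help strings are assumptions, and only the four flags and the `data/questions` default come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the documented command-line options."""
    parser = argparse.ArgumentParser(
        description="Sketch of the documented generate.py options"
    )
    # Documented as required.
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    # Documented default: data/questions.
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    # Optional seed; None means no fixed seed in this sketch.
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    # Flag only: present means images only.
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(["--num-samples", "50", "--seed", "42", "--no-videos"])
print(args.num_samples, args.output, args.seed, args.no_videos)
```

Passing an explicit list to `parse_args` keeps the example independent of how the script is launched.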

📖 Task Example

Prompt

[Scenario] The image displays a sequence of stacked addition problems from top to bottom. The first problem has two operands, and subsequent problems have one missing operand.
[Rules]
1. The equations must be solved sequentially from top to bottom.
2. The left operand of each row (after the first) is exactly the calculated answer from the previous row.
[Task] Generate a video animating the step-by-step calculation of these chained equations from top to bottom. Fill in the correct intermediate result and the final answer after each '=' sign while leaving all other text unchanged.
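
The two rules in the prompt can be sketched in a few lines of Python. This is an illustration of the chaining rule for the addition case described in the prompt, not the generator's own code; the function name `solve_chain` and the sample numbers are assumptions.

```python
def solve_chain(first_left: int, addends: list[int]) -> list[int]:
    """Solve stacked additions top to bottom.

    Row 1 is `first_left + addends[0]`. For every later row, the left
    operand is the answer of the row above (rule 2 of the prompt).
    """
    results = []
    left = first_left
    for addend in addends:
        left = left + addend   # answer of this row ...
        results.append(left)   # ... which is also the next row's left operand
    return results


# Rows: 123 + 456 = ?, then ? + 78 = ?, then ? + 9 = ?
print(solve_chain(123, [456, 78, 9]))  # [579, 657, 666]
```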

Visual

| Initial Frame | Animation | Final Frame |
| --- | --- | --- |
| Math expressions with empty results | Each line filled with computed result | All expressions completed |

📖 Task Description

Objective

Compute each arithmetic expression in the displayed text and fill in the result after the equals sign, without modifying any other characters in the text.

Task Setup

  • Expressions: Multiple arithmetic lines (e.g., 123 + 456 = ?).
  • Operators: Mix of +, -, ×, ÷, possibly with parentheses.
  • Number range: Multi-digit operands (chosen for non-trivial mental computation).
  • Strict typography: Only the answer slot is mutated; surrounding text, formatting, and unrelated lines must remain pixel-stable.
  • Video reveal: Each expression's answer is filled in sequentially, providing a per-line correctness signal.
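
The strict-typography requirement can be checked mechanically: every pixel that differs between two frames must lie inside an answer slot. The sketch below uses nested lists as stand-in frames so it runs without image libraries. The slot box format `(top, left, bottom, right)` and both function names are assumptions for illustration; the generator's metadata may describe slots differently or not at all.

```python
def changed_pixels(before, after):
    """Return the set of (row, col) positions whose values differ."""
    return {
        (r, c)
        for r, row in enumerate(before)
        for c, value in enumerate(row)
        if value != after[r][c]
    }


def is_write_stable(before, after, slot_boxes):
    """True if every changed pixel falls inside some answer-slot box.

    slot_boxes holds hypothetical (top, left, bottom, right) tuples,
    half-open: rows top..bottom-1 and columns left..right-1.
    """
    def in_any_slot(r, c):
        return any(
            top <= r < bottom and left <= c < right
            for (top, left, bottom, right) in slot_boxes
        )

    return all(in_any_slot(r, c) for r, c in changed_pixels(before, after))


# A 4x6 blank frame with a single answer slot covering row 1, columns 4-5.
blank = [[0] * 6 for _ in range(4)]
slots = [(1, 4, 2, 6)]

filled = [row[:] for row in blank]
filled[1][4] = 1            # write inside the slot
print(is_write_stable(blank, filled, slots))   # True

smudged = [row[:] for row in filled]
smudged[0][0] = 1           # stray change outside any slot
print(is_write_stable(blank, smudged, slots))  # False
```

With real frames the same comparison would run on decoded image arrays, likely with a tolerance for video compression noise.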

Key Features

  • Faithful arithmetic execution: Tests pure computational fidelity (no semantic ambiguity, no planning).
  • Write-stability constraint: The model must update only the answer slots and treat the rest of the frame as immutable, which makes the task a strong test of selective generation.
  • Sequential dependency option: When chained, each line's result feeds the next, so an early arithmetic mistake propagates and is observable.
  • Frame-by-frame evaluation: The video isolates each fill-in step, supporting fine-grained scoring.
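
Because an early mistake propagates down the chain, per-line scoring can separate the line where an error originates from the lines that merely inherit it. The following is a sketch of one possible scoring scheme for the addition case, not the benchmark's official metric; `score_chain` and its two output lists are assumptions.

```python
def score_chain(first_left, addends, predicted):
    """Score predicted answers for a chained addition task.

    Returns two lists of booleans, one entry per line:
      exact: the answer equals the ground truth for that line
      local: the answer is correct given the previous *predicted* answer,
             so a propagated error still counts as locally correct
    """
    exact, local = [], []
    true_left = first_left   # running ground-truth value
    pred_left = first_left   # running value as the model wrote it
    for addend, answer in zip(addends, predicted):
        true_left += addend
        exact.append(answer == true_left)
        local.append(answer == pred_left + addend)
        pred_left = answer
    return exact, local


# The first line is off by 10; later lines add correctly to the wrong value.
exact, local = score_chain(123, [456, 78, 9], [589, 667, 676])
print(exact)  # [False, False, False]
print(local)  # [False, True, True]
```

Here `exact` shows that all three answers are wrong, while `local` shows that only the first line contains an actual arithmetic slip.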

📦 Data Format

data/questions/Multi-21_chained_math_calculation_data-generator_task/Multi-21_chained_math_calculation_data-generator_00000000/
├── first_frame.png            # Expressions with blank '=' slots
├── final_frame.png            # All slots filled with correct results
├── prompt.txt                 # Task instruction
├── ground_truth.mp4           # Animation of stepwise filling
└── question_metadata.json     # Standardized VBVR task metadata

File specifications:

  • Images: 1024×1024 PNG format
  • Video: MP4 format, 16 fps, H.264 codec with yuv420p pixel format
  • Metadata: VBVR canonical schema with task_id, vbvr_task_code, media, parameters
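
A generated sample directory can be sanity-checked against the layout above. This is a minimal sketch: the file names and metadata field names come from this README, but treating the four fields as top-level JSON keys is an assumption, and `check_sample` is a hypothetical helper. Samples generated with `--no-videos` would be expected to lack `ground_truth.mp4`.

```python
import json
from pathlib import Path

# File names as listed in the Data Format section.
EXPECTED_FILES = (
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "ground_truth.mp4",
    "question_metadata.json",
)

# Field names from the file specifications; assumed here to be top-level keys.
METADATA_KEYS = ("task_id", "vbvr_task_code", "media", "parameters")


def check_sample(sample_dir: Path) -> list[str]:
    """Return a list of human-readable problems; empty means the sample looks complete."""
    problems = []
    for name in EXPECTED_FILES:
        if not (sample_dir / name).is_file():
            problems.append(f"missing file: {name}")

    meta_path = sample_dir / "question_metadata.json"
    if meta_path.is_file():
        metadata = json.loads(meta_path.read_text(encoding="utf-8"))
        for key in METADATA_KEYS:
            if key not in metadata:
                problems.append(f"missing metadata key: {key}")
    return problems
```

Calling `check_sample(Path("data/questions/.../..._00000000"))` on a generated sample returns the list of problems found, which makes it easy to run over a whole dataset before training or evaluation.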

🏷️ Tags

arithmetic math-calculation chained-computation algorithmic-execution write-stability multi-step-reasoning


Part of the 36-Task Long-Horizon Multi-Step Reasoning Benchmark.
