VBVR-DataFactory/Multi-23_chained_code_pipeline_data-generator

Multi-23: Chained Code Pipeline Data Generator

Generates synthetic datasets for chained-function code execution. The agent must visually simulate a multi-stage code pipeline in which the output of each function becomes the input to the next. This tests long-horizon symbolic execution, where an error at one stage cascades through every later stage.

Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.


📌 Basic Information

| Property | Value |
|---|---|
| Task ID | Multi-23 |
| Task | Chained Code Pipeline |
| Category | Algorithmic Execution |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | varies |
| Output | PNG images + MP4 video |

🚀 Usage

Installation

# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/Multi-23_chained_code_pipeline_data-generator.git
cd Multi-23_chained_code_pipeline_data-generator

# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

Generate Data

# Generate 50 samples
python examples/generate.py --num-samples 50

# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42

# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset

# Without videos (faster, images only)
python examples/generate.py --num-samples 50 --no-videos

Command-Line Options

| Argument | Description |
|---|---|
| --num-samples | Number of tasks to generate (required) |
| --output | Output directory (default: data/questions) |
| --seed | Random seed for reproducibility |
| --no-videos | Skip video generation (images only) |
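For readers who want to wrap or script the generator, the options above could be declared roughly as follows. This is a minimal sketch, not the contents of examples/generate.py: the `build_parser` helper and the argument types are assumptions, and only the flag names, the required status of `--num-samples`, and the `data/questions` default come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the real script)."""
    parser = argparse.ArgumentParser(description="Generate Multi-23 samples")
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    # Assumption: the seed is an integer and unset by default.
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation (images only)")
    return parser


# Parse the same arguments used in the "Reproducible generation" example.
args = build_parser().parse_args(["--num-samples", "50", "--seed", "42"])
print(args.num_samples, args.output, args.seed, args.no_videos)
# 50 data/questions 42 False
```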

📖 Task Example

Prompt

[Scenario] The image displays a sequential code execution pipeline consisting of three Python functions (A, B, and C) and an initial input value.
[Rules]
1. The initial input value must be processed by Function A.
2. The output of Function A becomes the input to Function B.
3. The output of Function B becomes the input to Function C.
[Task] Generate a video simulating the chained execution of the pipeline. Animate the data packet passing through each function block, updating its value according to the code, and display the final computed output in the terminal.

Visual

  • Initial Frame: Three function blocks + initial input value
  • Animation: Data packet flows A → B → C with value mutation
  • Final Frame: Final terminal output value

📖 Task Description

Objective

Mentally execute a 3-stage code pipeline: feed the initial input through Function A → take its output and feed it to Function B → take that output and feed it to Function C → display the final result. Animate the data packet moving along the pipeline at each stage.

Task Setup

  • Pipeline: Three function blocks (A, B, C) with displayed source code (e.g., lambda x: x*2, lambda x: x+1).
  • Initial value: A printed input value entering Function A.
  • Data packet: A visually moving token that carries the current value, mutating as it transits each function.
  • Terminal output: A display panel that prints the final value at the end of the pipeline.
  • Composition: Final output = C(B(A(input))).
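The composition rule can be illustrated with a short sketch. Functions A and B reuse the example lambdas from the list above. Function C, the input value 5, and the `run_pipeline` helper are assumptions made for illustration and are not taken from the generator.

```python
from typing import Callable, List

# A and B are the example lambdas from the task setup;
# C is a hypothetical third stage added for illustration.
pipeline: List[Callable[[int], int]] = [
    lambda x: x * 2,  # Function A
    lambda x: x + 1,  # Function B
    lambda x: x - 3,  # Function C (assumed)
]


def run_pipeline(stages: List[Callable[[int], int]], value: int) -> List[int]:
    """Return the input followed by the value after each stage."""
    trace = [value]
    for stage in stages:
        value = stage(value)
        trace.append(value)
    return trace


trace = run_pipeline(pipeline, 5)
print(trace)      # [5, 10, 11, 8]
print(trace[-1])  # 8, i.e. C(B(A(5)))
```

Keeping the full trace rather than only the final value mirrors the animation, in which the data packet shows its current value after each function block.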

Key Features

  • Long-horizon symbolic execution: Multi-stage compositions where models must track value mutation through 3+ functions.
  • Error cascade: A miscomputation at stage A propagates through B and C, so the final answer is correct only if every stage was executed correctly.
  • Visual data flow: Each transit is rendered explicitly, providing per-stage intermediate state for fine-grained evaluation.
  • Code-as-input: The model must read printed source code and interpret it symbolically, bridging visual perception and formal computation.
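The sketch below shows why per-stage intermediate values matter when errors cascade. It is illustrative only: the three stage functions, the injected off-by-two fault, and the helper names are hypothetical and do not describe how the benchmark scores outputs.

```python
from typing import Callable, List, Optional

# Hypothetical stages; only x*2 and x+1 appear as examples in this README.
stages: List[Callable[[int], int]] = [
    lambda x: x * 2,  # Function A
    lambda x: x + 1,  # Function B
    lambda x: x - 3,  # Function C (assumed)
]


def stage_outputs(value: int, faulty_stage: Optional[int] = None) -> List[int]:
    """Run the pipeline, optionally adding 2 to the output of one stage."""
    outputs = []
    for i, stage in enumerate(stages):
        value = stage(value)
        if i == faulty_stage:
            value += 2  # injected miscomputation
        outputs.append(value)
    return outputs


def first_divergence(expected: List[int], predicted: List[int]) -> Optional[int]:
    """Return the index of the first stage whose value differs, or None."""
    for i, (e, p) in enumerate(zip(expected, predicted)):
        if e != p:
            return i
    return None


expected = stage_outputs(5)                # [10, 11, 8]
early = stage_outputs(5, faulty_stage=0)   # [12, 13, 10]
late = stage_outputs(5, faulty_stage=2)    # [10, 11, 10]

print(early[-1] == late[-1])               # True: same wrong final answer
print(first_divergence(expected, early))   # 0: the error began at A
print(first_divergence(expected, late))    # 2: the error began at C
```

Both faulty runs end on the same wrong value, so the final answer alone cannot tell them apart. Comparing the per-stage values locates where the execution first went wrong.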

📦 Data Format

data/questions/Multi-23_chained_code_pipeline_data-generator_task/Multi-23_chained_code_pipeline_data-generator_00000000/
├── first_frame.png            # Pipeline + initial input
├── final_frame.png            # Pipeline complete + terminal output
├── prompt.txt                 # Task instruction
├── ground_truth.mp4           # Animation of A → B → C execution
└── question_metadata.json     # Standardized VBVR task metadata

File specifications:

  • Images: 1024×1024 PNG format
  • Video: MP4 format, 16 fps, H.264 codec with yuv420p pixel format
  • Metadata: VBVR canonical schema with task_id, vbvr_task_code, media, parameters
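A quick completeness check for one generated sample could look like the sketch below. It rests on two assumptions that the README does not confirm: the four metadata fields are top-level keys of question_metadata.json, and ground_truth.mp4 is simply absent when `--no-videos` is used. The `check_sample` helper is hypothetical and is not part of the repository.

```python
import json
from pathlib import Path
from typing import List

EXPECTED_FILES = [
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "ground_truth.mp4",
    "question_metadata.json",
]
# Assumption: these fields are top-level keys of the metadata JSON.
EXPECTED_KEYS = ["task_id", "vbvr_task_code", "media", "parameters"]


def check_sample(sample_dir: str, require_video: bool = True) -> List[str]:
    """Return a list of problems; an empty list means the sample looks complete."""
    root = Path(sample_dir)
    problems = []
    for name in EXPECTED_FILES:
        if name == "ground_truth.mp4" and not require_video:
            continue  # assumed to be absent for --no-videos runs
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")

    meta_path = root / "question_metadata.json"
    if meta_path.is_file():
        try:
            meta = json.loads(meta_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"invalid JSON in metadata: {exc}")
        else:
            if not isinstance(meta, dict):
                problems.append("metadata is not a JSON object")
            else:
                for key in EXPECTED_KEYS:
                    if key not in meta:
                        problems.append(f"missing metadata key: {key}")
    return problems
```

Calling `check_sample(path)` on a sample directory returns the list of problems. Pass `require_video=False` for datasets generated with `--no-videos`.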

🏷️ Tags

code-execution chained-pipeline function-composition symbolic-execution algorithmic-execution long-horizon multi-step-reasoning


Part of the 36-Task Long-Horizon Multi-Step Reasoning Benchmark.
