Generates synthetic datasets for chained-function code execution. The agent must visually simulate a multi-stage code pipeline where the output of each function becomes the input to the next — testing long-horizon symbolic execution where errors cascade through stages.
Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.
| Property | Value |
|---|---|
| Task ID | Multi-23 |
| Task | Chained Code Pipeline |
| Category | Algorithmic Execution |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | varies |
| Output | PNG images + MP4 video |
# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/Multi-23_chained_code_pipeline_data-generator.git
cd Multi-23_chained_code_pipeline_data-generator
# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

# Generate 50 samples
python examples/generate.py --num-samples 50
# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42
# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset
# Without videos (faster, images only)
python examples/generate.py --num-samples 50 --no-videos

| Argument | Description |
|---|---|
| `--num-samples` | Number of tasks to generate (required) |
| `--output` | Output directory (default: `data/questions`) |
| `--seed` | Random seed for reproducibility |
| `--no-videos` | Skip video generation (images only) |
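The arguments above could be declared with `argparse` roughly as follows. This is a hypothetical sketch for illustration only, not the actual contents of `examples/generate.py`; the function name `build_parser` and the help strings are assumptions, while the flag names and the `data/questions` default come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments."""
    parser = argparse.ArgumentParser(
        description="Generate chained code pipeline samples"
    )
    # Required: number of tasks to generate.
    parser.add_argument("--num-samples", type=int, required=True)
    # Optional: output directory, with the documented default.
    parser.add_argument("--output", default="data/questions")
    # Optional: seed for reproducible generation.
    parser.add_argument("--seed", type=int, default=None)
    # Flag: skip video generation and write images only.
    parser.add_argument("--no-videos", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--num-samples", "50", "--seed", "42"])
print(args.num_samples, args.output, args.seed, args.no_videos)
# 50 data/questions 42 False
```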
[Scenario] The image displays a sequential code execution pipeline consisting of three Python functions (A, B, and C) and an initial input value.
[Rules]
1. The initial input value must be processed by Function A.
2. The output of Function A becomes the input to Function B.
3. The output of Function B becomes the input to Function C.
[Task] Generate a video simulating the chained execution of the pipeline. Animate the data packet passing through each function block, updating its value according to the code, and display the final computed output in the terminal.
| Initial Frame | Animation | Final Frame |
|---|---|---|
| Three function blocks + initial input value | Data packet flows A → B → C with value mutation | Final terminal output value |
Mentally execute a 3-stage code pipeline: feed the initial input through Function A → take its output and feed it to Function B → take that output and feed it to Function C → display the final result. Animate the data packet moving along the pipeline at each stage.
- Pipeline: Three function blocks (A, B, C) with displayed source code (e.g., `lambda x: x*2`, `lambda x: x+1`).
- Initial value: A printed input value entering Function A.
- Data packet: A visually moving token that carries the current value, mutating as it transits each function.
- Terminal output: A display panel that prints the final value at the end of the pipeline.
- Composition: Final output = `C(B(A(input)))`.
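The composition can be sketched in a few lines of Python. Functions A and B reuse the example lambdas mentioned in the list; Function C (`x ** 2`), the input value `3`, and the helper name `run_pipeline` are assumptions chosen for illustration and may not match what the generator actually samples.

```python
from typing import Callable, List

Stage = Callable[[int], int]


def run_pipeline(stages: List[Stage], value: int) -> List[int]:
    """Return the initial input followed by the value after each stage."""
    trace = [value]
    for stage in stages:
        value = stage(value)  # output of one stage feeds the next
        trace.append(value)
    return trace


func_a: Stage = lambda x: x * 2   # example from the element list
func_b: Stage = lambda x: x + 1   # example from the element list
func_c: Stage = lambda x: x ** 2  # hypothetical third stage

trace = run_pipeline([func_a, func_b, func_c], 3)
print(trace)      # [3, 6, 7, 49]
print(trace[-1])  # 49, i.e. C(B(A(3)))
```

Keeping the full trace rather than only the final value mirrors the animation, where the data packet shows its current value after each function block.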
- Long-horizon symbolic execution: Multi-stage compositions where models must track value mutation through 3+ functions.
- Error cascade: A miscomputation at stage A propagates through B and C, so the final answer is correct only if every stage was computed correctly.
- Visual data flow: Each transit is rendered explicitly, providing per-stage intermediate state for fine-grained evaluation.
- Code-as-input: The model must read printed source code and interpret it symbolically — bridging visual perception and formal computation.
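The error-cascade and per-stage evaluation ideas can be illustrated with a small sketch. It is not the benchmark's scoring code: the helper names, the third stage, and the sample values are hypothetical. It compares a predicted trace of intermediate values against the expected one, and separately checks whether each stage was applied correctly to the input it actually received.

```python
from typing import Callable, List, Optional

Stage = Callable[[int], int]


def first_divergence(expected: List[int], predicted: List[int]) -> Optional[int]:
    """Index of the first trace entry that differs, or None if all match."""
    for index, (want, got) in enumerate(zip(expected, predicted)):
        if want != got:
            return index
    return None


def locally_correct(stages: List[Stage], predicted: List[int]) -> List[bool]:
    """For each stage, was it applied correctly to the value it was given?"""
    return [stage(predicted[i]) == predicted[i + 1] for i, stage in enumerate(stages)]


# Hypothetical pipeline: A doubles, B adds one, C squares.
stages: List[Stage] = [lambda x: x * 2, lambda x: x + 1, lambda x: x ** 2]
expected = [3, 6, 7, 49]   # input, then outputs of A, B, C
predicted = [3, 5, 6, 36]  # A miscomputed as 5; B and C applied faithfully

print(first_divergence(expected, predicted))  # 1 (the output of Function A)
print(locally_correct(stages, predicted))     # [False, True, True]
```

Here a single mistake at Function A makes every later value wrong, even though B and C were executed correctly on their inputs, which is why intermediate states allow finer-grained evaluation than the final answer alone.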
data/questions/Multi-23_chained_code_pipeline_data-generator_task/Multi-23_chained_code_pipeline_data-generator_00000000/
├── first_frame.png # Pipeline + initial input
├── final_frame.png # Pipeline complete + terminal output
├── prompt.txt # Task instruction
├── ground_truth.mp4 # Animation of A → B → C execution
└── question_metadata.json # Standardized VBVR task metadata
File specifications:
- Images: 1024×1024 PNG format
- Video: MP4 format, 16 fps, H.264 + yuv420p
- Metadata: VBVR canonical schema with `task_id`, `vbvr_task_code`, `media`, `parameters`
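A generated sample directory could be sanity-checked against this layout with a short script such as the one below. The file names come from the tree above; the helper `check_sample`, the assumption that the four metadata fields are top-level JSON keys, and the assumption that `--no-videos` simply omits `ground_truth.mp4` are all illustrative guesses rather than documented behavior.

```python
import json
import tempfile
from pathlib import Path
from typing import List

EXPECTED_FILES = [
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "ground_truth.mp4",
    "question_metadata.json",
]
# Assumed to be top-level keys of question_metadata.json.
EXPECTED_KEYS = ["task_id", "vbvr_task_code", "media", "parameters"]


def check_sample(sample_dir: Path, require_video: bool = True) -> List[str]:
    """Return a list of human-readable problems; empty means the sample looks complete."""
    problems = []
    for name in EXPECTED_FILES:
        if name == "ground_truth.mp4" and not require_video:
            continue  # assumed layout for image-only runs
        if not (sample_dir / name).is_file():
            problems.append(f"missing file: {name}")
    meta_path = sample_dir / "question_metadata.json"
    if meta_path.is_file():
        metadata = json.loads(meta_path.read_text(encoding="utf-8"))
        for key in EXPECTED_KEYS:
            if key not in metadata:
                problems.append(f"missing metadata key: {key}")
    return problems


# Demonstration on a throwaway directory with a deliberately incomplete sample.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp)
    for name in ["first_frame.png", "final_frame.png", "prompt.txt"]:
        (sample / name).write_bytes(b"")
    (sample / "question_metadata.json").write_text(
        json.dumps({"task_id": "demo", "vbvr_task_code": "Multi-23", "media": {}}),
        encoding="utf-8",
    )
    problems = check_sample(sample)
    problems_no_video = check_sample(sample, require_video=False)

print(problems)
# ['missing file: ground_truth.mp4', 'missing metadata key: parameters']
```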
code-execution chained-pipeline function-composition symbolic-execution algorithmic-execution long-horizon multi-step-reasoning
Part of the 36-Task Long-Horizon Multi-Step Reasoning Benchmark.


