[FEATURE] Reusable YAML/JSON-based processing pipeline

**Feature description**
Introduce reusable YAML/JSON-based processing pipeline presets that allow users to define and save multi-step workflows. This would let users execute common processing chains with a single command instead of manually repeating each operation every time.

Example configuration:
```yaml
pipeline:
  - extract_frames
  - remove_background
  - resize: 512x512
  - convert: webp
```
This feature would make the project better suited to production media preprocessing and AI dataset engineering, where the same processing chain has to run unchanged across many batches.

---
**Problem this solves**
Currently, users need to manually repeat the same sequence of operations for every dataset or media batch. This becomes inefficient and error-prone when working with large-scale preprocessing workflows or iterative experimentation.

For example, users preparing AI training datasets may repeatedly:
* extract frames
* resize images
* remove backgrounds
* convert formats

Manually running these steps every time reduces reproducibility and slows down workflow automation.

---
**Proposed solution**
Add support for reusable pipeline configuration files using YAML or JSON.

Possible implementation idea:
* Create a `pipelines/` directory for user-defined presets
* Add CLI support such as:

  ```bash
  reframe run pipeline.yaml
  ```
* Parse pipeline steps sequentially
* Allow parameterized operations
* Validate configs before execution
* Provide execution logs and step-level error reporting
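
The parsing, validation, and step-level reporting ideas above could look roughly like the following sketch. It is an illustration only, not the project's implementation: the `STEPS` registry, the stand-in processors, and the `normalize`, `validate`, and `run` helpers are all hypothetical names, and the config is loaded from JSON so the example needs only the standard library.

```python
import json


# Hypothetical stand-in processors: real ones would transform media files.
# Each step here only records that it was called and with which argument.
def _record(name):
    def step(history, arg):
        return history + [(name, arg)]
    return step


# Hypothetical registry mapping step names to callables.
STEPS = {
    name: _record(name)
    for name in ("extract_frames", "remove_background", "resize", "convert")
}


def normalize(raw_steps):
    """Turn each entry into a (name, arg) pair.

    An entry is either a bare name ("extract_frames") or a one-key
    mapping ({"resize": "512x512"}), mirroring the example config.
    """
    normalized = []
    for index, entry in enumerate(raw_steps):
        if isinstance(entry, str):
            normalized.append((entry, None))
        elif isinstance(entry, dict) and len(entry) == 1:
            normalized.append(next(iter(entry.items())))
        else:
            raise ValueError(
                f"step {index}: expected a name or a one-key mapping, got {entry!r}"
            )
    return normalized


def validate(config):
    """Check the whole config before anything runs; return normalized steps."""
    if not isinstance(config, dict) or not isinstance(config.get("pipeline"), list):
        raise ValueError("config must be a mapping with a 'pipeline' list")
    steps = normalize(config["pipeline"])
    unknown = [name for name, _ in steps if name not in STEPS]
    if unknown:
        raise ValueError(f"unknown steps: {unknown}")
    return steps


def run(config):
    """Validate first, then run steps in order and report which step failed."""
    history = []
    for position, (name, arg) in enumerate(validate(config), start=1):
        try:
            history = STEPS[name](history, arg)
        except Exception as exc:
            raise RuntimeError(f"step {position} ({name}) failed: {exc}") from exc
        print(f"[{position}] {name} ok")
    return history


config = json.loads(
    '{"pipeline": ["extract_frames", "remove_background",'
    ' {"resize": "512x512"}, {"convert": "webp"}]}'
)
result = run(config)
```

Validating the full config up front means a typo in the last step is caught before the first step has touched any files, which matters for long-running batches.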

Potential future enhancements:
* conditional steps
* parallel execution
* pipeline templates
* plugin/custom processor support
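
For the plugin/custom processor idea, one possible shape is a decorator-based registry, sketched below. Everything here is an assumption made for illustration: the `processor` decorator, the `get_processor` lookup, and the `watermark` step are hypothetical and do not exist in the project.

```python
from typing import Callable, Dict, Optional

# A processor takes an input path and an optional argument from the config,
# and returns a description of (or path to) the result. Hypothetical signature.
Processor = Callable[[str, Optional[str]], str]

_REGISTRY: Dict[str, Processor] = {}


def processor(name: str) -> Callable[[Processor], Processor]:
    """Register a function under the step name used in pipeline configs."""
    def decorator(func: Processor) -> Processor:
        if name in _REGISTRY:
            raise ValueError(f"processor {name!r} is already registered")
        _REGISTRY[name] = func
        return func
    return decorator


def get_processor(name: str) -> Processor:
    """Look up a registered processor, with a readable error if it is missing."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise KeyError(f"no processor registered as {name!r}") from None


# Hypothetical user-defined step that a plugin might contribute.
@processor("watermark")
def watermark(path: str, arg: Optional[str] = None) -> str:
    return f"{path}+watermark({arg})"


output = get_processor("watermark")("frame.png", "logo")
```

With a registry like this, a config step such as `watermark: logo` would resolve through the same lookup as built-in steps, so validation can reject unknown names uniformly.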

---
**Alternatives considered**
An alternative approach would be shell scripts or manually chained CLI commands. However, scripts are:
* less portable
* harder to validate before they run
* difficult for non-technical users to write and maintain
* not bound to any standardized structure

A built-in pipeline system would provide a cleaner and more maintainable workflow experience.

---

**Additional context**
This feature could significantly improve usability for:
* AI dataset preparation
* bulk media processing
* automated preprocessing workflows
* reproducible experiments

It would also make the project more attractive for production and research use cases where repeatable processing pipelines are essential.

