Converge on a consistent CLI arg convention across recipes #33

Description

@davanstrien

Explore a consistent CLI argument convention across all recipes.

The positional INPUT OUTPUT order is already consistent, but optional flags vary widely, which weakens the "built for agents" promise — an agent that learns one recipe's flags can't assume they transfer. Examples seen in practice:

  • OCR scripts use --max-samples; vllm/classify-dataset.py has no row-limit flag (only --batch-size).
  • --model vs --model-id; --image-column vs --inference-column; varying --max-tokens / --batch-size defaults.

Concrete trigger: a smoke-test assumed --max-samples on classify-dataset.py and failed at argparse — exactly the cross-script surprise this would prevent.

Goal

A documented canonical vocabulary of common flags (e.g. --max-samples, --model, --batch-size, --input-column/--output-column, --max-tokens) that every applicable recipe uses; converge over time.
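As a rough illustration of how that vocabulary could look when inlined in a recipe, here is a minimal sketch. The flag names come from the list above; the `build_parser` name, the defaults, and the help strings are hypothetical placeholders, not values taken from any existing recipe.

```python
import argparse


def build_parser(description: str) -> argparse.ArgumentParser:
    """Hypothetical authoring snippet: meant to be copied into a recipe, not imported."""
    parser = argparse.ArgumentParser(description=description)

    # Positional INPUT OUTPUT order, which is already consistent across recipes.
    parser.add_argument("input", help="Input dataset")
    parser.add_argument("output", help="Output dataset")

    # Canonical optional flags. All defaults below are placeholders (assumptions).
    parser.add_argument("--model", default=None, help="Model identifier")
    parser.add_argument("--input-column", default="text", help="Column to read")
    parser.add_argument("--output-column", default="output", help="Column to write")
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument("--max-tokens", type=int, default=512)
    parser.add_argument(
        "--max-samples",
        type=int,
        default=None,
        help="Process at most this many rows (useful for smoke tests)",
    )
    return parser


# Example: the kind of invocation a smoke test might make.
args = build_parser("Example recipe").parse_args(
    ["in_dataset", "out_dataset", "--max-samples", "10"]
)
print(args.max_samples, args.batch_size)
```

A recipe that has no use for one of these flags would simply omit it; the point of the sketch is only that a flag, when present, carries the canonical name.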

Gate (same as #31)

A convention, not a forced shared argparse module. Keep scripts self-contained and readable — prefer a documented spec + an authoring template/snippet that scripts inline, over a runtime-imported arg lib. Better a consistent convention than a hidden abstraction.
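One way to back a documented convention without any runtime import is a standalone check that reads each recipe's source and reports drift. The sketch below is an assumption about how that could work, not an existing tool: `declared_flags`, `check`, and the `KNOWN_ALIASES` table are hypothetical names, and the only alias listed is the `--model-id` / `--model` pair mentioned in this issue.

```python
import ast

# Non-canonical spellings mapped to their canonical flag (hypothetical table;
# extend it as the documented spec settles each case).
KNOWN_ALIASES = {"--model-id": "--model"}


def declared_flags(source: str) -> set[str]:
    """Collect every long option string passed to an .add_argument(...) call."""
    flags = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "add_argument"
        ):
            for arg in node.args:
                if (
                    isinstance(arg, ast.Constant)
                    and isinstance(arg.value, str)
                    and arg.value.startswith("--")
                ):
                    flags.add(arg.value)
    return flags


def check(source: str, expected: set[str]) -> list[str]:
    """Report known aliases in use, then expected flags that are not declared."""
    flags = declared_flags(source)
    problems = [
        f"{flag}: rename to {KNOWN_ALIASES[flag]}"
        for flag in sorted(flags)
        if flag in KNOWN_ALIASES
    ]
    problems += [
        f"{flag}: expected but not declared"
        for flag in sorted(expected - flags)
    ]
    return problems


# Made-up recipe source for demonstration only.
SAMPLE = '''
import argparse
p = argparse.ArgumentParser()
p.add_argument("input")
p.add_argument("--model-id")
p.add_argument("--batch-size", type=int)
'''

for problem in check(SAMPLE, expected={"--max-samples"}):
    print(problem)
```

Because the check parses source text rather than importing anything, the recipes themselves stay self-contained; which flags count as expected for a given family of scripts would be decided by the spec, not by this sketch.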

Per-family instance: the OCR arg-alignment issue. Related: the DRY exploration (#31).

Labels: enhancement (New feature or request), research (Exploration / research task)
