Implement scalability benchmark sweep #9

Add an automated benchmark that finds the optimal resource configuration for a given system size by running short production trials across a sweep of resource configurations.

## Motivation

Different system sizes have different optimal resource configurations. A 5,000-atom system doesn't benefit from 64 cores the way a 500,000-atom system does. Currently the pipeline uses hardcoded settings (`num_cores: 8`, `num_gpus: 1`). A short benchmark sweep identifies the throughput sweet spot before committing to a full campaign.

**Depends loosely on:** #7, #8 (Cluster autodiscovery — knows available core counts and GPU types)

## Scope

- `mdfactory/performance/benchmark.py`
- `BenchmarkConfig` dataclass: sweep parameters (core counts, GPU replicas, duration)
- `run_benchmark_sweep(system_path, config) -> BenchmarkResult`
- GROMACS md.log parser: extract ns/day from performance section
- Result selection: best ns/day or best ns/day per core-hour
- Short benchmark MDP template (100 ps, minimal output)
- Result persistence as JSON sidecar per campaign
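A minimal sketch of how the config, selection, and JSON persistence pieces could fit together. All field names, the `Trial` type, and the metric strings are hypothetical. "ns/day per core-hour" is read here as ns/day divided by the core-hours a trial consumed, which is only one possible interpretation.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BenchmarkConfig:
    # Hypothetical field names; defaults mirror the sweep values in this issue.
    core_counts: tuple = (1, 2, 4, 8, 16, 32, 64)
    gpu_replicas: tuple = (1, 2, 4, 8, 16)
    duration_ps: float = 100.0
    metric: str = "ns_per_day"  # or "ns_per_day_per_core_hour"


@dataclass(frozen=True)
class Trial:
    # One finished benchmark run (hypothetical structure).
    cores: int
    ns_per_day: float
    core_hours: float


def score(trial, metric):
    """Return the value to maximise for the chosen metric."""
    if metric == "ns_per_day":
        return trial.ns_per_day
    if metric == "ns_per_day_per_core_hour":
        return trial.ns_per_day / trial.core_hours
    raise ValueError(f"unknown metric: {metric}")


def select_best(trials, metric):
    """Pick the trial with the highest score."""
    if not trials:
        raise ValueError("no trials to select from")
    return max(trials, key=lambda t: score(t, metric))


def to_json(trials, metric):
    """Serialise all trials plus the selected optimum for the JSON sidecar."""
    best = select_best(trials, metric)
    return json.dumps(
        {"metric": metric, "trials": [asdict(t) for t in trials], "best": asdict(best)},
        indent=2,
    )
```

Keeping every trial in the sidecar, not only the winner, lets a later run re-select with a different metric without repeating the sweep.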

## Sweep dimensions

- CPU path: vary cores (1, 2, 4, 8, 16, 32, 64) with `OMP_NUM_THREADS`
- GPU path: vary concurrent replicas (1, 2, 4, 8, 16) sharing one GPU
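The CPU path could generate one command per core count along these lines. This is a sketch only: the helper names, the single-rank layout (`-ntmpi 1`), and the `bench_c<N>` output prefix are assumptions, and the GPU path (concurrent replicas on one GPU) is not covered.

```python
import shlex


def cpu_trial_command(tpr_path, cores):
    """Build environment and argv for one CPU trial (assumed layout: 1 rank, N OpenMP threads)."""
    env = {"OMP_NUM_THREADS": str(cores)}
    argv = [
        "srun", "--ntasks=1", f"--cpus-per-task={cores}",
        "gmx", "mdrun",
        "-s", tpr_path,
        "-ntmpi", "1",
        "-ntomp", str(cores),
        "-deffnm", f"bench_c{cores}",  # hypothetical output prefix
    ]
    return env, argv


def render(env, argv):
    """Render one trial as a shell line, e.g. for a dry-run script."""
    prefix = " ".join(f"{key}={shlex.quote(value)}" for key, value in env.items())
    return f"{prefix} {shlex.join(argv)}"


def cpu_sweep_lines(tpr_path, core_counts=(1, 2, 4, 8, 16, 32, 64)):
    """Return one shell line per core count; nothing is executed."""
    return [render(*cpu_trial_command(tpr_path, n)) for n in core_counts]
```

Because the functions only return strings, the same code can serve a dry-run mode that writes scripts without submitting anything.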

## Reuse across HT batch

If all systems in a campaign have similar atom counts (within ±20%), the benchmark runs once on a representative system. The discovered settings then apply to all production runs in the batch.
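One way to express the ±20% check is sketched below. The issue does not say what the tolerance is measured against, so this example assumes it is relative to the (low) median atom count, and the function name is hypothetical.

```python
from statistics import median_low


def pick_representative(atom_counts, tolerance=0.20):
    """Return the name of a representative system, or None if sizes differ too much.

    atom_counts maps system name -> atom count. The reference size is the low
    median, which is always an actual member of the batch (assumption: the
    tolerance is measured relative to this reference).
    """
    if not atom_counts:
        raise ValueError("campaign has no systems")
    reference = median_low(atom_counts.values())
    if all(abs(n - reference) <= tolerance * reference for n in atom_counts.values()):
        # Choose the system closest to the reference; break ties by name.
        return min(atom_counts, key=lambda name: (abs(atom_counts[name] - reference), name))
    return None
```

A return value of `None` signals that the batch is too heterogeneous for a single benchmark, so the caller would need to benchmark per system or per size group.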

## Acceptance criteria

- Parser correctly extracts ns/day from GROMACS md.log samples
- Sweep generates correct srun commands for each configuration
- Result JSON contains all trials with ns/day and selects optimum
- Works without a cluster (dry-run mode that generates scripts without submitting)
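For the parser criterion, a small sketch of extracting ns/day from the `Performance:` line that GROMACS writes at the end of `md.log`. The function name is hypothetical, and the sample in the usage note is illustrative text, not output from a real run.

```python
import re

# Matches the final summary line, e.g. "Performance:       55.987        0.429",
# where the two columns are (ns/day) and (hour/ns).
_PERF_RE = re.compile(
    r"^Performance:\s+(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)", re.MULTILINE
)


def parse_ns_per_day(log_text):
    """Return ns/day from md.log text, using the last Performance line found."""
    matches = _PERF_RE.findall(log_text)
    if not matches:
        raise ValueError("no Performance line found; the run may not have finished")
    ns_per_day, _hours_per_ns = matches[-1]
    return float(ns_per_day)
```

A usage example with an illustrative log tail:

```python
sample = """\
               Core t (s)   Wall t (s)        (%)
       Time:     1234.567      154.321      800.0
                 (ns/day)    (hour/ns)
Performance:       55.987        0.429
"""
print(parse_ns_per_day(sample))  # 55.987
```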
