Add ensemble_batch_size for single-device inference

### Describe the workflow you want to enable

Add an `ensemble_batch_size` option for single-device inference.

### Describe your proposed solution

On devices with large amounts of RAM, such as Strix Halo, this could greatly speed up inference, since more ensemble members would fit in memory per batch.
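
To make the request concrete, here is a minimal sketch of what batching ensemble members on a single device might look like. Everything in it is hypothetical: `iter_batches`, `run_ensemble`, and the `predict_batch` callable are illustrative names, not the project's existing API, and the sketch assumes the model can already process a list of members in one call.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def iter_batches(n_members: int, ensemble_batch_size: int) -> Iterator[range]:
    """Yield index ranges that cover n_members in chunks of ensemble_batch_size."""
    if ensemble_batch_size < 1:
        raise ValueError("ensemble_batch_size must be >= 1")
    for start in range(0, n_members, ensemble_batch_size):
        # The last chunk may be smaller than ensemble_batch_size.
        yield range(start, min(start + ensemble_batch_size, n_members))


def run_ensemble(
    predict_batch: Callable[[List[T]], List[R]],  # hypothetical batched model call
    members: Sequence[T],
    ensemble_batch_size: int = 1,
) -> List[R]:
    """Run all ensemble members on one device, ensemble_batch_size at a time."""
    results: List[R] = []
    for indices in iter_batches(len(members), ensemble_batch_size):
        batch = [members[i] for i in indices]
        # One call per batch instead of one call per member.
        results.extend(predict_batch(batch))
    return results


if __name__ == "__main__":
    # Toy stand-in for a model: doubles each input value.
    print(run_ensemble(lambda batch: [x * 2 for x in batch], [1, 2, 3, 4, 5], 2))
```

With `ensemble_batch_size=1` this degenerates to running one member per call, so a default of 1 would leave behavior on memory-constrained devices unchanged.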

### Describe alternatives you've considered, if relevant

_No response_

### Additional context

_No response_

### Impact

None