Utilizing all NVDEC engines on a single GPU with multiple pipelines #6387

### Describe the question.

While running video decoding benchmarks on my GH200 system, I build a `DALIGenericIterator` with a single pipeline and check the NVDEC engine utilization via:

`nvidia-smi dmon -s u`

My DEC utilization is capped at ~14%, which is consistent with only one of the seven NVDEC engines being used. According to the [documentation](https://docs.nvidia.com/video-technologies/video-codec-sdk/13.0/nvdec-application-note/index.html) I found, the NVIDIA driver should take care of load balancing between the different decoding units.
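For reference, the reasoning behind the "one out of seven" estimate can be sketched in a few lines of Python. This is only an illustration under two assumptions: that the `dec` column reported by `nvidia-smi dmon -s u` is an average over all NVDEC engines, and that the header row names the columns as in the sample below. The sample text is a hand-written stand-in, not captured output, and the helper names `dec_utilization` and `estimate_active_engines` are hypothetical.

```python
# Illustrative stand-in for `nvidia-smi dmon -s u` text (NOT captured output).
SAMPLE = """\
# gpu     sm    mem    enc    dec
# Idx      %      %      %      %
    0     35     12      0     14
    0     36     12      0     14
"""


def dec_utilization(dmon_output: str) -> list[int]:
    """Extract the 'dec' column, locating it by name in the header row."""
    dec_index = None
    values = []
    for line in dmon_output.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            names = stripped.lstrip("#").split()
            if "dec" in names:
                dec_index = names.index("dec")
            continue
        fields = stripped.split()
        if dec_index is not None and len(fields) > dec_index:
            if fields[dec_index].isdigit():
                values.append(int(fields[dec_index]))
    return values


def estimate_active_engines(dec_percent: float, num_engines: int = 7) -> int:
    """Assumes the reported percentage is averaged over all engines."""
    return round(dec_percent / 100 * num_engines)


if __name__ == "__main__":
    readings = dec_utilization(SAMPLE)
    print(readings, [estimate_active_engines(r) for r in readings])
```

Under these assumptions, a reading of ~14% maps to roughly one busy engine and ~100% maps to all seven.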

However, when creating multiple DALI pipelines (e.g., seven), I find my DEC utilization to be close to 100%, indicating that all NVDEC engines are used. In terms of raw decoding performance, running multiple pipelines on the same GPU also gives me a performance boost.
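A multi-pipeline setup like this needs each pipeline to read its own subset of the videos. A minimal sketch of one way to split the file list, assuming a simple round-robin assignment; the helper `shard_filenames` and the file names are hypothetical and not part of DALI:

```python
def shard_filenames(filenames: list[str], num_pipelines: int) -> list[list[str]]:
    """Split a file list round-robin so each pipeline gets a disjoint subset."""
    if num_pipelines < 1:
        raise ValueError("num_pipelines must be at least 1")
    return [filenames[i::num_pipelines] for i in range(num_pipelines)]


if __name__ == "__main__":
    # Hypothetical file names, one shard per pipeline (seven here).
    videos = [f"video_{i:03d}.mp4" for i in range(14)]
    for pipeline_id, shard in enumerate(shard_filenames(videos, 7)):
        print(pipeline_id, shard)
```

Each resulting sublist would then be passed to the reader of one pipeline, all created on the same GPU.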

My questions are:
- Can a single DALI pipeline only use a single decoding unit?
- Is running multiple pipelines on the same GPU the default way to utilize all decoding units? 

### Check for duplicates

- [x] I have searched the [open bugs/issues](https://github.com/NVIDIA/DALI/issues) and have found no duplicates for this bug report
