[Bug] TestTorchTensorRTModule::test_get_layer_info AssertionError on H100 with CUDA 13.x #4151

@apbose

Description

Bug Description

TestTorchTensorRTModule::test_get_layer_info fails with AssertionError: False is not true on H100 GPUs across multiple CUDA 13.x versions. The failure is consistent across r13.2.0 and r13.1.1.

Environment

  • GPU: H100
  • Arch: x86_64
  • CUDA: 13.2.0 / 13.1.1
  • OS: Ubuntu 24.04
  • cuDNN: 8.9.7.29
  • TensorRT: 10.16.0.59
  • TensorRT (Myelin): 2.17.78+7
  • CASK: 5.16.17+1
  • Python: 3.12
  • Package: qa_tar_py3.12

Failing Test

FAILED api/test_classes.py::TestTorchTensorRTModule::test_get_layer_info - AssertionError: False is not true

Reproducible Configurations

GPU           CUDA     OS            Result
H100/x86_64   r13.2.0  Ubuntu 24.04  FAILED
H100/x86_64   r13.1.1  Ubuntu 24.04  FAILED

Steps to Reproduce

  1. Run on an H100 with CUDA 13.x and the environment listed above
  2. Execute: pytest api/test_classes.py::TestTorchTensorRTModule::test_get_layer_info

Expected Behavior

get_layer_info() should return valid, non-empty layer information for the compiled module, and the test assertion in test_get_layer_info should pass.

Additional Context

The test suite overall is healthy (46 passed, 1 skipped), with only this single test failing. The error message (False is not true) suggests get_layer_info() may be returning an empty or falsy result on CUDA 13.x, possibly due to an API change or missing support in the newer CUDA/TensorRT stack.
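The "False is not true" failure mode can be approximated with a small stand-alone check. This is a hedged sketch, not the actual test code: the exact assertion in test_classes.py and the JSON shape returned by get_layer_info() (including the "Layers" key) are assumptions here, based only on the error message and the expectation that the engine inspector emits JSON.

```python
import json

def layer_info_looks_valid(info) -> bool:
    """Sanity-check a string like the one get_layer_info() returns.

    Hypothetical helper -- the real test's assertion may differ. An empty
    string, unparseable JSON, or a missing/empty "Layers" array all yield
    False, any of which would produce "AssertionError: False is not true"
    in a self.assertTrue(...) style check.
    """
    if not info:                      # None or empty string
        return False
    try:
        data = json.loads(info)       # assumes the inspector emits JSON
    except (TypeError, ValueError):
        return False
    # The "Layers" key is an assumption about the inspector's JSON schema
    return bool(data.get("Layers"))

# Inputs that would trip the assertion vs. one that would pass:
print(layer_info_looks_valid(""))                                 # False
print(layer_info_looks_valid('{"Layers": []}'))                   # False
print(layer_info_looks_valid('{"Layers": [{"Name": "conv1"}]}'))  # True
```

On a failing H100/CUDA 13.x setup, printing the raw get_layer_info() output just before the assertion would show whether the newer CUDA/TensorRT stack is returning an empty or restructured payload.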

Metadata

Assignees: No one assigned
Labels: bug (Something isn't working)
Type: No type
Projects: No projects
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests