[Bug] TestWeightStrippedEngine OOM on RTX 3070 (8GB) with CUDA 12.x #4150

@apbose

Bug Description

TestWeightStrippedEngine tests fail with CUDA out-of-memory errors on GPUs with 8GB VRAM (RTX 3070). The tests attempt to allocate memory that exceeds available GPU capacity.

Environment

  • GPU: NVIDIA RTX 3070 (8 GB; 7.67 GiB reported)
  • CUDA: 12.8.1 / 12.6.3
  • OS: Ubuntu 22.04 / 24.04
  • TensorRT (Myelin): 2.10.26+1103
  • cuDNN: 8.9.7.29
  • PyTorch: >=2.0
  • ONNX: 1.16.0

Failing Tests

FAILED models/test_weight_stripped_engine.py::TestWeightStrippedEngine::test_two_TRTRuntime_in_refitting
FAILED models/test_weight_stripped_engine.py::TestWeightStrippedEngine::test_weight_stripped_engine_results
FAILED models/test_weight_stripped_engine.py::TestWeightStrippedEngine::test_weight_stripped_engine_sizes

Error

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB.
GPU 0 has a total capacity of 7.67 GiB ...

In addition, test_weight_stripped_engine_results fails with an AssertionError (likely after a partial OOM or on retry).

Steps to Reproduce

  1. Run on an RTX 3070 (8GB) with the environment listed above
  2. Execute: pytest models/test_weight_stripped_engine.py

Expected Behavior

Tests should either:

  1. Succeed on GPUs with 8GB VRAM, or
  2. Be skipped with an appropriate @unittest.skipIf decorator when insufficient GPU memory is detected
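A minimal sketch of what such a guard could look like. The 10 GiB threshold and the helper name `insufficient_vram` are assumptions for illustration, not values from the Torch-TensorRT test suite:

```python
import unittest

# Assumed minimum VRAM for these tests; not a value taken from the suite.
MIN_VRAM_BYTES = 10 * 1024**3


def insufficient_vram(total_bytes: int, required: int = MIN_VRAM_BYTES) -> bool:
    """Return True when the device's total memory is below the requirement.

    An RTX 3070 reports ~7.67 GiB, so it falls below a 10 GiB floor.
    """
    return total_bytes < required


# In the real test file the total would come from the actual device, e.g.:
#
#   total = torch.cuda.get_device_properties(0).total_memory
#
#   @unittest.skipIf(insufficient_vram(total), "requires >= 10 GiB of GPU memory")
#   class TestWeightStrippedEngine(unittest.TestCase):
#       ...
```

Keeping the comparison in a plain helper makes the threshold testable on machines without a GPU.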

Suggested Fix

Consider adding a memory guard, or reducing the model/input sizes so these tests fit within 8 GB of VRAM. Alternatively, mark them with a minimum VRAM requirement.
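Since the suite runs under pytest, the minimum-VRAM marker could also be expressed as a reusable `pytest.mark.skipif` factory. This is a sketch under stated assumptions: the helper name `requires_vram` and the 10 GiB figure are hypothetical, and the guard degrades to "skip" when torch or a CUDA device is absent:

```python
import pytest


def requires_vram(min_gib: float):
    """Hypothetical helper: build a skipif mark enforcing a VRAM floor."""
    try:
        import torch

        total_gib = (
            torch.cuda.get_device_properties(0).total_memory / 1024**3
            if torch.cuda.is_available()
            else 0.0
        )
    except ImportError:
        # No torch at all: treat as zero available VRAM, so the test skips.
        total_gib = 0.0
    return pytest.mark.skipif(
        total_gib < min_gib,
        reason=f"requires >= {min_gib} GiB of GPU memory",
    )


@requires_vram(10)
def test_weight_stripped_engine_results():
    ...
```

This keeps the requirement next to each test and produces a clear skip reason in the pytest report instead of an OOM traceback.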

Metadata

Labels: bug (Something isn't working)
