Bug Description
TestWeightStrippedEngine tests fail with CUDA out-of-memory errors on GPUs with 8GB VRAM (RTX 3070). The tests attempt to allocate memory that exceeds available GPU capacity.
Environment
- GPU: RTX 3070 (7.67 GiB)
- CUDA: 12.8.1 / 12.6.3
- OS: Ubuntu 22.04 / 24.04
- TensorRT (Myelin): 2.10.26+1103
- cuDNN: 8.9.7.29
- PyTorch: >=2.0
- ONNX: 1.16.0
Failing Tests
FAILED models/test_weight_stripped_engine.py::TestWeightStrippedEngine::test_two_TRTRuntime_in_refitting
FAILED models/test_weight_stripped_engine.py::TestWeightStrippedEngine::test_weight_stripped_engine_results
FAILED models/test_weight_stripped_engine.py::TestWeightStrippedEngine::test_weight_stripped_engine_sizes
Error
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB.
GPU 0 has a total capacity of 7.67 GiB ...
Additionally, test_weight_stripped_engine_results also hits an AssertionError (likely after a partial OOM or on retry).
Steps to Reproduce
- Run on an RTX 3070 (8GB) with the environment listed above
- Execute:
pytest models/test_weight_stripped_engine.py
Expected Behavior
Tests should either:
- Succeed on GPUs with 8GB VRAM, or
- Be skipped with an appropriate @unittest.skipIf decorator when insufficient GPU memory is detected
Suggested Fix
Consider adding a memory guard or reducing model/input sizes for these tests so they fit within 8GB GPUs. Alternatively, mark them with a minimum VRAM requirement.
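A minimal sketch of such a memory guard, assuming the tests are unittest-based as the class name suggests. The helper names `total_vram_gib` and `has_enough_vram` and the `MIN_VRAM_GIB` threshold are hypothetical and not taken from the repository. The real threshold would need to be measured from the tests' peak usage. The test class body here is only a stand-in.

```python
import unittest

# Hypothetical threshold; the real value should come from measuring the
# tests' peak GPU memory usage.
MIN_VRAM_GIB = 12.0


def total_vram_gib(device_index=0):
    """Return total memory of a CUDA device in GiB, or 0.0 if none is usable."""
    try:
        import torch
    except ImportError:
        return 0.0
    if not torch.cuda.is_available():
        return 0.0
    props = torch.cuda.get_device_properties(device_index)
    return props.total_memory / (1024 ** 3)


def has_enough_vram(total_gib, required_gib=MIN_VRAM_GIB):
    """Pure comparison, kept separate so it can be checked without a GPU."""
    return total_gib >= required_gib


@unittest.skipIf(
    not has_enough_vram(total_vram_gib()),
    f"requires at least {MIN_VRAM_GIB} GiB of GPU memory",
)
class TestWeightStrippedEngine(unittest.TestCase):
    # Stand-in for the real test class; the actual test bodies are unchanged.
    def test_weight_stripped_engine_sizes(self):
        pass
```

With this guard, an RTX 3070 reporting 7.67 GiB would skip the class instead of failing with an OOM error. Note that total capacity is a coarse check: memory already held by other processes or earlier tests is not accounted for.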