
CUDA memory management #31

@MacTavish27


Hello. I wanted to run demo.py on a video on my PC, but I am getting the following CUDA out-of-memory error:

```
CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 9.68 GiB total capacity; 6.99 GiB already allocated; 712.56 MiB free; 7.09 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Traceback (most recent call last):
  File "/media/khumoyun/9AF47E94F47E71FD1/Fayzullo/LART/lart_demo/LART/scripts/PHALP.py", line 194, in track
    pred_bbox, pred_bbox_pad, pred_masks, pred_scores, pred_classes, gt_tids, gt_annots = self.get_detections(image_frame, frame_name, t, additional_data, measurments)
  File "/media/khumoyun/9AF47E94F47E71FD1/Fayzullo/LART/lart_demo/LART/scripts/PHALP.py", line 330, in get_detections
    outputs = self.detector(image)
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/phalp/utils/utils_detectron2.py", line 216, in __call__
    predictions = self.model([inputs])[0]
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/detectron2/modeling/meta_arch/rcnn.py", line 150, in forward
    return self.inference(batched_inputs)
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/detectron2/modeling/meta_arch/rcnn.py", line 204, in inference
    features = self.backbone(images.tensor)
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/detectron2/modeling/backbone/vit.py", line 489, in forward
    bottom_up_features = self.net(x)
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/detectron2/modeling/backbone/vit.py", line 357, in forward
    x = blk(x)
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/detectron2/modeling/backbone/vit.py", line 218, in forward
    x = self.attn(x)
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/detectron2/modeling/backbone/vit.py", line 75, in forward
    attn = add_decomposed_rel_pos(attn, q, self.rel_pos_h, self.rel_pos_w, (H, W), (H, W))
  File "/home/khumoyun/miniconda3/envs/lart/lib/python3.10/site-packages/detectron2/modeling/backbone/utils.py", line 122, in add_decomposed_rel_pos
    attn.view(B, q_h, q_w, k_h, k_w) + rel_h[:, :, :, :, None] + rel_w[:, :, :, None, :]
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 9.68 GiB total capacity; 6.99 GiB already allocated; 712.56 MiB free; 7.09 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```

```
Error executing job with overrides: ['video.source=/media/khumoyun/9AF47E94F47E71FD1/Fayzullo/CAGNet_baseline/data/BIT/Bit-frames/box/box_0006']
Traceback (most recent call last):
  File "/media/khumoyun/9AF47E94F47E71FD1/Fayzullo/LART/lart_demo/LART/scripts/demo.py", line 94, in main
    _, pkl_path = phalp_tracker.track()
TypeError: cannot unpack non-iterable NoneType object

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
```

How can I solve this issue? I also tried setting `max_split_size_mb` as suggested, but it didn't help. My GPU is an RTX 3080.
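Not a maintainer fix, just a general sketch of the usual workarounds for this kind of OOM: `PYTORCH_CUDA_ALLOC_CONF` must be set in the environment *before* `torch` is first imported, otherwise it has no effect (which may be why the `max_split_size_mb` suggestion appeared not to help), and capping the input frame resolution also reduces memory, since attention in the ViT detector backbone grows roughly quadratically with the patch count. The `cap_longer_side` helper below is hypothetical, not part of PHALP:

```python
import os

# Must be set before `import torch` anywhere in the process,
# or the CUDA caching allocator will ignore it.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

def cap_longer_side(width: int, height: int, max_side: int = 1280) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side,
    preserving aspect ratio. Halving the resolution roughly quarters the
    attention memory of a ViT backbone, since the patch count is quadratic
    in the side length."""
    longer = max(width, height)
    if longer <= max_side:
        return width, height
    scale = max_side / longer
    return round(width * scale), round(height * scale)

# Example: a 1920x1080 frame would be resized to 1280x720 before detection.
print(cap_longer_side(1920, 1080))
```

Resizing the frames (e.g. with ffmpeg) before passing them to demo.py, or running the detector in FP16 if the pipeline supports it, are other common ways to fit a ViT-based detector into 10 GiB.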
