This happened when I used your method to train on another dataset (with 9 input views).
First, I used your script DNGaussian/dpt/get_depth_map_for_llff_dtu.py to generate the depth maps, and the results look fine.
Second, I customized DNGaussian/scene/dataset_readers.py to read my own data.
An error was raised when I ran the training command:
python mytrain.py \
--source_path "./new_datasets/omni3d/backpack_016" \
--model_path "./Data/test/backpack_016" \
--images "images_2" \
--dataset "DTU" \
--data_device "cuda:1" \
--n_sparse 9 \
--eval \
--test_iterations -1 \
--save_iterations 10000 \
--iterations 10000
As you can see from the console output below, all the cam_infos are read successfully, and the error seems to be raised while processing the depth map:
Reading camera 200/200 [20/02 19:37:14]
Dataset Type: DTU [20/02 19:37:14]
train ['/data/zrt/DNGaussian/new_datasets/omni3d/backpack_016/images_2/00001.jpg', '/data/zrt/DNGaussian/new_datasets/omni3d/backpack_016/images_2/00019.jpg', '/data/zrt/DNGaussian/new_datasets/omni3d/backpack_016/images_2/00036.jpg', '/data/zrt/DNGaussian/new_datasets/omni3d/backpack_016/images_2/00060.jpg', '/data/zrt/DNGaussian/new_datasets/omni3d/backpack_016/images_2/00082.jpg', '/data/zrt/DNGaussian/new_datasets/omni3d/backpack_016/images_2/00097.jpg', '/data/zrt/DNGaussian/new_datasets/omni3d/backpack_016/images_2/00122.jpg', '/data/zrt/DNGaussian/new_datasets/omni3d/backpack_016/images_2/00143.jpg', '/data/zrt/DNGaussian/new_datasets/omni3d/backpack_016/images_2/00167.jpg'] [20/02 19:37:14]
Loading Training Cameras [1.0] [20/02 19:37:14]
Traceback (most recent call last):
File "/data/zrt/DNGaussian/mytrain.py", line 370, in <module>
training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from)
File "/data/zrt/DNGaussian/mytrain.py", line 42, in training
scene = Scene(dataset, gaussians)
File "/data/zrt/DNGaussian/scene/__init__.py", line 79, in __init__
self.train_cameras[resolution_scale] = cameraList_from_camInfos(scene_info.train_cameras, resolution_scale, args)
File "/data/zrt/DNGaussian/utils/camera_utils.py", line 93, in cameraList_from_camInfos
camera_list.append(loadCam(args, id, c, resolution_scale))
File "/data/zrt/DNGaussian/utils/camera_utils.py", line 45, in loadCam
resized_depth_mono = PILtoTorch(cam_info.depth_mono, resolution)
File "/data/zrt/DNGaussian/utils/general_utils.py", line 23, in PILtoTorch
resized_image = torch.from_numpy(np.array(resized_image_PIL)) / 255.0
TypeError: can't convert np.ndarray of type numpy.uint16. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
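For what it's worth, the failure seems to reduce to torch.from_numpy not supporting uint16, so I guess the depth maps are saved as 16-bit images. A minimal reproduction on my setup (the array is just a stand-in for the resized depth map):

import numpy as np
import torch

depth = np.zeros((4, 4), dtype=np.uint16)  # 16-bit PNGs load as uint16 through PIL
torch.from_numpy(depth)  # raises the same TypeError on my torch version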
Have you ever encountered this problem?
I tried to solve it by modifying DNGaussian/utils/general_utils.py:
def PILtoTorch(pil_image, resolution):
    resized_image_PIL = pil_image.resize(resolution)
    resized_image = torch.from_numpy(np.array(resized_image_PIL).astype(np.uint8)) / 255.0
    if len(resized_image.shape) == 3:
        return resized_image.permute(2, 0, 1)
    else:
        return resized_image.unsqueeze(dim=-1).permute(2, 0, 1)
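As an aside, I suspect the astype(np.uint8) cast truncates the 16-bit depth values to their low byte, which may corrupt the depth supervision. A variant I am considering instead, assuming the depth maps really are uint16 and reusing the imports already in general_utils.py (the 65535.0 divisor is my guess for the right normalization):

def PILtoTorch(pil_image, resolution):
    resized_image_PIL = pil_image.resize(resolution)
    arr = np.array(resized_image_PIL)
    if arr.dtype == np.uint16:
        # keep the full 16-bit range instead of truncating to uint8
        resized_image = torch.from_numpy(arr.astype(np.float32)) / 65535.0
    else:
        resized_image = torch.from_numpy(arr) / 255.0
    if len(resized_image.shape) == 3:
        return resized_image.permute(2, 0, 1)
    else:
        return resized_image.unsqueeze(dim=-1).permute(2, 0, 1)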
With the uint8 cast in place, training does start, but at iteration 5800/10000 it hit a CUDA out-of-memory error:
Traceback (most recent call last):
File "/data/zrt/DNGaussian/mytrain.py", line 370, in <module>
training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from)
File "/data/zrt/DNGaussian/mytrain.py", line 185, in training
loss.backward()
File "/data/zrt/anaconda3/envs/DNG/lib/python3.9/site-packages/torch/_tensor.py", line 487, in backward
torch.autograd.backward(
File "/data/zrt/anaconda3/envs/DNG/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.38 GiB (GPU 1; 23.55 GiB total capacity; 17.99 GiB already allocated; 977.19 MiB free; 21.75 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Training progress: 58%|###################### | 5800/10000 [11:04<08:01, 8.72it/s, Loss=0.2403810]
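The message itself suggests setting max_split_size_mb, and reserved memory (21.75 GiB) is indeed much larger than allocated (17.99 GiB), so fragmentation may be part of the problem. A sketch of what I would try, assuming it takes effect when set before the first CUDA allocation (the 128 MB value is just a guess):

import os

# must run before torch makes its first CUDA allocation, e.g. at the top of mytrain.py
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"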
How can I make it work?