Thanks very much for sharing this cool work. I noticed that the padding function will collapse certain inputs, for example, when the image size is 206x500.
From https://github.com/lpiccinelli-eth/UniDepth/blob/8d8cfe4c7ee15297099983607febf0d4f32eb3d6/unidepth/models/unidepthv2/unidepthv2.py#L53C9-L53C49, the padding size will be (-1, 0, 0, 0), which collapses the last dimension in the slicing step (line 88 of the same file at commit 8d8cfe4):

    tensor = tensor[..., pad1_t : shapes[0] - pad1_b, pad1_l : shapes[1] - pad1_r]

resulting in a final output shape of 206x1.
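The collapse comes from how Python handles a negative slice start: a left padding of -1 makes the slice begin at the last element instead of cropping nothing. Here is a minimal sketch of that behaviour, using a plain list as a stand-in for one 500-pixel-wide image row (the names `row`, `pad_l`, `pad_r` are illustrative, not taken from the UniDepth code); tensors follow the same slicing rule:

```python
# Stand-in for one image row of width 500 (a plain list instead of a tensor).
width = 500
row = list(range(width))

# Left/right padding as reported above: (-1, 0).
pad_l, pad_r = -1, 0

# Same slice pattern as the un-padding step: pad_l : width - pad_r.
# With pad_l == -1 this is row[-1:500], i.e. only the last element survives.
cropped = row[pad_l : width - pad_r]

print(len(cropped))  # 1
```

With a padding of (0, 0) the same slice would return all 500 elements, so any negative padding value silently discards almost the whole dimension.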
After replacing the padding function with the following, the issue is resolved.

    from math import ceil

    def get_paddings(original_shape, aspect_ratio_range, eps=1e-6):
        H_ori, W_ori = original_shape
        orig_ar = W_ori / H_ori
        min_ar, max_ar = aspect_ratio_range
        # clamp the aspect ratio into the allowed range
        target_ar = min(max_ar, max(min_ar, orig_ar))
        # if effectively unchanged -> no padding
        if abs(orig_ar - target_ar) < eps:
            return (0, 0, 0, 0), (H_ori, W_ori)
        if orig_ar > target_ar:  # too wide -> increase height
            W_new = W_ori
            H_new = ceil(W_ori / target_ar)  # avoid floor bug
            pad_top = (H_new - H_ori) // 2
            pad_bottom = H_new - H_ori - pad_top
            pad_left = pad_right = 0
        else:  # too tall -> increase width
            H_new = H_ori
            W_new = ceil(H_ori * target_ar)  # avoid floor bug
            pad_left = (W_new - W_ori) // 2
            pad_right = W_new - W_ori - pad_left
            pad_top = pad_bottom = 0
        # sanity: never negative
        assert pad_left >= 0 and pad_right >= 0 and pad_top >= 0 and pad_bottom >= 0
        return (pad_left, pad_right, pad_top, pad_bottom), (H_new, W_new)
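To illustrate the "floor bug" the comments refer to, here is a small sketch of how truncation and `ceil` differ on a value that lands just below an integer. It assumes (this is not confirmed from the UniDepth source) that the computed width comes out slightly under 500 because of floating-point rounding; `almost_500` is a hypothetical stand-in for such a value:

```python
import math

# Hypothetical computed width: slightly below the true width of 500,
# as can happen when a ratio is divided and then multiplied back.
almost_500 = 500.0 - 1e-9

truncated = int(almost_500)          # truncation drops to 499
rounded_up = math.ceil(almost_500)   # ceil recovers 500

# Padding split computed the same way as in get_paddings.
pad_left = (truncated - 500) // 2          # floor division: -1 // 2 == -1
pad_right = (truncated - 500) - pad_left   # -1 - (-1) == 0

print(truncated, rounded_up, pad_left, pad_right)  # 499 500 -1 0
```

Under that assumption a one-pixel shortfall is enough to produce the (-1, 0) left/right padding reported above, which is why the replacement both returns early for an unchanged aspect ratio and rounds up.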