Hi
I'm reading your code. I have a few questions regarding the depth map.
In the dataset's mat file, depth is stored in meters and ranges from 0 to 10 meters. When convert_mat_to_img.py converts the distances to pixel values, it normalizes each depth image by that image's own maximum depth and then multiplies the normalized distance by 255. The model is then trained with labels equal to the png pixel values divided by 255, which are normalized distances rather than distances. So the model's output regresses on per-image normalized distances, not on true distances. Shouldn't it regress on the true distance?
I think you should normalize the distance by 10, which is the maximum depth (I checked in Python that the maximum depth in the data from the NYU2 dataset website is 9.99547 meters). Then the png values can be converted back to true depth in meters and used as labels.
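To make the difference concrete, here is a minimal sketch of the two encodings, not the actual conversion script. The function names, the `MAX_DEPTH_M` constant, and the toy depth values are my own assumptions for illustration; `encode_per_image` only mirrors the behaviour I described above.

```python
import numpy as np

MAX_DEPTH_M = 10.0  # assumed fixed scale; the largest value I measured was 9.99547 m


def encode_per_image(depth_m):
    """Per-image scaling as described above: divide by this image's own maximum."""
    return np.round(depth_m / depth_m.max() * 255.0).astype(np.uint8)


def encode_fixed(depth_m, max_depth=MAX_DEPTH_M):
    """Fixed scaling: the same divisor for every image, so pixels map back to meters."""
    scaled = np.clip(depth_m / max_depth, 0.0, 1.0) * 255.0
    return np.round(scaled).astype(np.uint8)


def decode_fixed(pixels, max_depth=MAX_DEPTH_M):
    """Recover depth in meters from pixels written by encode_fixed."""
    return pixels.astype(np.float32) / 255.0 * max_depth


# Two toy "images" that both contain a point at 2 m.
near = np.array([[1.0, 2.0]])  # farthest point is 2 m
far = np.array([[2.0, 8.0]])   # farthest point is 8 m

# Per-image scaling gives the same 2 m point two different pixel values.
per_image_values = (encode_per_image(near)[0, 1], encode_per_image(far)[0, 0])

# Fixed scaling gives it the same pixel value in both images.
fixed_values = (encode_fixed(near)[0, 1], encode_fixed(far)[0, 0])
```

With an 8-bit png and a 10 meter scale, one pixel step is 10 / 255, about 0.039 meters, so the decoded depth is only accurate to roughly that resolution.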
Also, is invalid_depth needed in the code? From my understanding it indicates the sign of the depth. But can depth values be negative?
By the way, for the scale-invariant loss, is the 0.5 in the following code unnecessary?
cost = tf.reduce_mean(sum_square_d / 55.0*74.0 - 0.5*sqare_sum_d / math.pow(55*74, 2))
There is no 0.5 in formula (3) of the paper.
Is my understanding right?
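To show what the 0.5 changes, here is a minimal NumPy sketch of the loss for a single image with the coefficient exposed as a parameter. The function name, the `lam` parameter, and the random test data are my own assumptions; I also assume the difference is taken between log depths, as in the paper's scale-invariant error.

```python
import numpy as np


def scale_invariant_loss(pred_log, target_log, lam=0.5):
    """Mean squared log difference minus lam times the squared mean log difference."""
    d = pred_log - target_log
    n = d.size  # number of pixels, 55 * 74 for the output size used in the code above
    return np.sum(d ** 2) / n - lam * np.sum(d) ** 2 / n ** 2


rng = np.random.default_rng(0)
target_log = np.log(rng.uniform(0.5, 10.0, size=(55, 74)))

# A prediction that is wrong only by a global scale factor of 2.
pred_log = target_log + np.log(2.0)

loss_full = scale_invariant_loss(pred_log, target_log, lam=1.0)  # coefficient 1, no 0.5
loss_half = scale_invariant_loss(pred_log, target_log, lam=0.5)  # coefficient used in the code
loss_none = scale_invariant_loss(pred_log, target_log, lam=0.0)  # plain mean squared error
```

With `lam=1.0` a pure scale error costs essentially nothing, with `lam=0.0` it is penalized in full, and `lam=0.5` sits halfway between the two. Separately, Python evaluates `sum_square_d / 55.0*74.0` as `(sum_square_d / 55.0) * 74.0`, so the first term in the quoted line may not be divided by the pixel count as intended; the sketch divides by `n` explicitly.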