Description
Hello,
I have noticed a discrepancy in the implementation of the reprojection loss in the PyTorch version compared to what is described in the original D3VO paper.
The paper theoretically defines the per-pixel reprojection loss (assuming a Laplacian noise model) as:
L = (r / σ) + log(σ)
where r is the photometric residual and σ is the predicted uncertainty. This formulation ensures that when the predicted uncertainty is low (i.e., the pixel is reliable), the residual is weighted more heavily, and when the uncertainty is high, the residual's impact is reduced.
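To make the weighting concrete, here is a quick numeric check of that formula (the residual value r = 0.5 is arbitrary, chosen only for illustration):

```python
import math

# Numeric check of the paper's per-pixel loss L = r / sigma + log(sigma)
# for a fixed (arbitrary) residual r = 0.5.
r = 0.5
for sigma in (0.1, 1.0, 10.0):
    loss = r / sigma + math.log(sigma)
    print(f"sigma={sigma}: loss={loss:.3f}")
# sigma=0.1:  loss=2.697  (confident pixel: the residual term r/sigma dominates)
# sigma=1.0:  loss=0.500
# sigma=10.0: loss=2.353  (uncertain pixel: the log(sigma) penalty dominates)
```

Note that the loss is minimized at σ = r, so the network cannot drive the loss to zero simply by inflating the uncertainty everywhere.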
However, in the provided PyTorch code, the reprojection loss is computed as follows:
```python
def compute_reprojection_loss(self, pred, target, sigma):
    """Computes reprojection loss between a batch of predicted and target images"""
    abs_diff = torch.abs(target - pred)
    l1_loss = abs_diff.mean(1, True)

    if self.opt.no_ssim:
        reprojection_loss = l1_loss
    else:
        ssim_loss = self.ssim(pred, target).mean(1, True)
        reprojection_loss = 0.85 * ssim_loss + 0.15 * l1_loss

    # Reference: https://github.com/no-Seaweed/Learning-Deep-Learning-1/blob/master/paper_notes/sfm_learner.md
    # transformed_sigma = (10 * sigma + 0.1)
    # Exp 1
    # transformed_sigma = sigma + 0.001
    # reprojection_loss = (reprojection_loss / transformed_sigma) + torch.log(transformed_sigma)
    reprojection_loss = reprojection_loss * sigma
    return reprojection_loss
```
Here, instead of dividing by σ and adding log(σ), the loss is simply multiplied by σ
(i.e., reprojection_loss = reprojection_loss * sigma).
This contradicts the theoretical formulation: when σ is low (indicating high confidence), the term r/σ should weight the residual heavily and produce a large loss, whereas multiplying by σ scales the loss down instead.
Additionally, there are commented-out lines that suggest experimental attempts (e.g., using transformed_sigma = sigma + 0.001) that would be closer to the r/σ + log(σ) formulation, but these are not used in the final code.
Could we please discuss:
The rationale behind opting for a simple multiplication over the theoretically motivated formulation?
Whether the current approach is empirically validated to work better, and if so, what might be the trade-offs?
Suggestions for reconciling the implementation with the original paper's formulation without sacrificing training stability.
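On the last point, one option is to reuse the commented-out transform so that σ stays bounded away from zero before the division and log. A minimal sketch of what that might look like (the function name and the eps default are my suggestion, not from the repo):

```python
import torch

def reprojection_loss_laplacian(reprojection_loss, sigma, eps=1e-3):
    """Sketch of the paper's L = r / sigma + log(sigma).

    sigma is offset by eps (as in the commented-out `sigma + 0.001`
    experiment) so the division and log stay finite even when the
    network predicts sigma near zero.
    """
    transformed_sigma = sigma + eps
    return reprojection_loss / transformed_sigma + torch.log(transformed_sigma)

# A confident pixel (small sigma) now gets its residual amplified,
# while an uncertain pixel mostly pays the log(sigma) penalty.
r = torch.tensor([0.5])
print(reprojection_loss_laplacian(r, torch.tensor([0.099])))  # ~ 0.5/0.1 + log(0.1) ≈ 2.70
print(reprojection_loss_laplacian(r, torch.tensor([9.999])))  # ~ 0.05 + log(10)   ≈ 2.35
```

Whether this trains stably out of the box is an open question; the `10 * sigma + 0.1` variant in the comments suggests the affine transform of σ itself was part of the tuning, so eps would likely need the same treatment.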
Looking forward to your insights on this matter.