1. Introduction
I have always been uncertain about whether Mean Squared Error (MSE) or Chamfer Distance (CD) is the better reconstruction loss function for a VAE applied to point cloud data.
To address this, I trained with each loss function and compared the resulting reconstruction errors to evaluate which achieves higher reconstruction quality.
2. Comparison Method
The VAE architecture and datasets are the same as in my previous article.
In this experiment, I used a Mixture of Gaussians Variational Autoencoder (MoG-VAE) and compared reconstruction quality when training with CD versus MSE as the reconstruction loss.
To evaluate the quality of the reconstructed shapes, I used CD and Earth Mover’s Distance (EMD) as metrics.
For the definitions of CD and EMD, please refer to my previous article.
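For a concrete reference, here is a minimal NumPy sketch of the symmetric Chamfer Distance, using one common convention (squared nearest-neighbour distances averaged in both directions); the exact formulation in my previous article may differ slightly. EMD additionally requires an optimal one-to-one matching between the two point sets (e.g. via the Hungarian algorithm), so it is not shown here.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer Distance between point sets p (N, 3) and q (M, 3).

    For every point, take the squared distance to its nearest neighbour in
    the other set, average within each direction, and sum both directions.
    """
    # Pairwise squared Euclidean distances, shape (N, M).
    d = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# A cloud compared with itself has zero Chamfer Distance.
pts = np.random.rand(128, 3)
print(chamfer_distance(pts, pts))  # 0.0
```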
3. Pros & Cons of CD
✅ Pros
CD provides higher reconstruction quality than MSE, particularly in preserving finer details of the point cloud.
❌ Cons
CD can make training less stable than MSE, potentially leading to convergence issues.
4. Loss Function for Stable Training
To achieve stable training, I combine the MSE loss $L_{\text{MSE}}$ with the CD loss $L_{\text{CD}}$.
The total loss is defined as:
$$
\mathcal{L} = L_{\text{MSE}} + 2L_{\text{CD}} + L_{D_{\text{KL}}} \tag{1}
$$
$$
\mathcal{L} = L_{\text{MSE}} + 2L_{\text{CD}} + \frac{1}{2} \sum_{k=1}^{K} \pi_k \left( \mu_k^2 + \sigma_k^2 - \log \sigma_k^2 - 1 \right) \tag{2}
$$
where:
- $L_{D_{\text{KL}}}$ is the Kullback–Leibler (KL) divergence, which regularizes the latent space distribution to match a prior distribution, typically a standard normal distribution. In this article, the KL divergence is computed using a 2-component MoG.
- $K$: 2 (number of Gaussian components in the mixture)
- Epochs: 5000
- Learning rate: $1 \times 10^{-4}$
Combining MSE with CD stabilizes training while still allowing CD to promote the learning of local shape detail.
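As a sketch of how the terms of Eq. (2) fit together, here is a NumPy illustration (not my actual training code): `chamfer_distance` follows one common squared-distance convention, each component's latent mean and standard deviation are treated as scalars for brevity, and the weight of 2 on the CD term matches the equation above.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer Distance (squared nearest-neighbour distances)."""
    d = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def mog_kl(pi, mu, sigma):
    """KL term of Eq. (2): 0.5 * sum_k pi_k (mu_k^2 + sigma_k^2 - log sigma_k^2 - 1)."""
    return 0.5 * np.sum(pi * (mu ** 2 + sigma ** 2 - np.log(sigma ** 2) - 1.0))

def total_loss(recon, target, pi, mu, sigma):
    """Total loss of Eq. (2): L = L_MSE + 2 * L_CD + L_KL."""
    mse = np.mean((recon - target) ** 2)
    return mse + 2.0 * chamfer_distance(recon, target) + mog_kl(pi, mu, sigma)

# With a standard-normal posterior (mu = 0, sigma = 1) the KL term vanishes,
# and a perfect reconstruction drives the whole loss to zero.
pi = np.array([0.5, 0.5])
mu = np.zeros(2)
sigma = np.ones(2)
pts = np.random.rand(128, 3)
print(total_loss(pts, pts, pi, mu, sigma))  # 0.0
```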
5. Evaluation
Comparison of MoG-VAE Loss Functions (MSE vs. MSE+CD)
Reconstruction: MoG-VAE (MSE + CD)
Reconstruction: MoG-VAE (MSE)
| Evaluation | CD | CD | CD | EMD | EMD | EMD |
| --- | --- | --- | --- | --- | --- | --- |
| Architecture | VAE | MoG-VAE | MoG-VAE | VAE | MoG-VAE | MoG-VAE |
| Loss | MSE+KL_D | MSE+KL_D | MSE+CD+KL_D | MSE+KL_D | MSE+KL_D | MSE+CD+KL_D |
| Design1 | 0.0245 | 0.0247 | 0.0145 | 0.0167 | 0.0181 | 0.0088 |
| Design2 | 0.0247 | 0.0226 | 0.0154 | 0.0145 | 0.0131 | 0.0082 |
| Design3 | 0.0390 | 0.0161 | 0.0231 | 0.0298 | 0.0088 | 0.0142 |
| Design4 | 0.0303 | 0.0227 | 0.0287 | 0.0195 | 0.0136 | 0.0189 |
| Design5 | 0.0333 | 0.0287 | 0.0286 | 0.0217 | 0.0176 | 0.0177 |
| Design6 | 0.0292 | 0.0174 | 0.0218 | 0.0267 | 0.0111 | 0.0140 |
| Design7 | 0.0463 | 0.0277 | 0.0198 | 0.0400 | 0.0176 | 0.0112 |
| Design8 | 0.0286 | 0.0281 | 0.0236 | 0.0213 | 0.0200 | 0.0161 |
| Design9 | 0.0315 | 0.0316 | 0.0221 | 0.0197 | 0.0200 | 0.0130 |
| Ave. | 0.0319 | 0.0244 | 0.0220 | 0.0233 | 0.0155 | 0.0136 |
As expected, the MSE + CD loss achieves higher reconstruction quality than MSE alone: on average, CD drops from 0.0244 to 0.0220 and EMD from 0.0155 to 0.0136 for the MoG-VAE.
6. Conclusion
I compared MSE + CD against MSE alone as the reconstruction loss for a MoG-VAE. The MSE + CD loss achieves higher reconstruction quality than MSE alone.
However, CD can make training unstable, so you should weigh the additional training cost when using it.
Nevertheless, CD enables high-quality shape generation.
Thank you for reading my article.
7. Code