MMVAE-AVS: Multimodal Variational Auto-encoder based Audio-Visual Segmentation [ICCV 2023]. Checkpoint: https://huggingface.co/OpenNLPLab/MMVAE-AVS