Dear authors,
I am super interested in your brilliant work!
I am very curious how image inpainting is achieved with this framework.
Unlike the original MAE, MAGE makes its predictions in the VQGAN index domain. I can see the general idea of implementing inpainting with MAE, but with MAGE, how do you feed the masked image into VQGAN and then predict in the resulting codebook index domain?
My concern is that a mask at the raw pixel level is not equivalent to a mask in the codebook index domain.
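To make the mismatch concrete, here is a minimal sketch of what I mean. It assumes a 256x256 image and a tokenizer that downsamples by a factor of 16 into a 16x16 token grid; the helper `pixel_mask_to_token_mask` is a hypothetical name of my own, not something from the MAGE code, and the rule "a token is masked if any pixel in its cell is masked" is only my guess at one possible mapping.

```python
import numpy as np

def pixel_mask_to_token_mask(pixel_mask: np.ndarray, patch: int = 16) -> np.ndarray:
    """Hypothetical mapping: a token counts as masked if ANY pixel
    inside its patch x patch cell is masked."""
    h, w = pixel_mask.shape
    assert h % patch == 0 and w % patch == 0, "mask must tile evenly into cells"
    # Split the mask into (rows of cells, patch, cols of cells, patch).
    cells = pixel_mask.reshape(h // patch, patch, w // patch, patch)
    return cells.any(axis=(1, 3))

# A square hole that is not aligned to the assumed 16-pixel grid.
pixel_mask = np.zeros((256, 256), dtype=bool)
pixel_mask[40:100, 40:100] = True

token_mask = pixel_mask_to_token_mask(pixel_mask)

masked_pixels = int(pixel_mask.sum())          # pixels the user actually masked
covered_pixels = int(token_mask.sum()) * 16 * 16  # pixels covered by masked tokens

print(token_mask.shape, int(token_mask.sum()), masked_pixels, covered_pixels)
```

In this sketch the token-level mask covers more pixels than the hole itself, so some visible pixels would be regenerated as well. On top of that, since the VQGAN encoder is convolutional, I would expect the indices of the tokens next to the hole to be influenced by whatever is filled into the masked pixels.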
Could you please explain this in more detail? Thank you again for your great contribution!