liujiaming1996/MAE.pytorch
An unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners

This repository is based on MAE-pytorch; many thanks to its authors!

I am running the extensive experiments described in the paper. The results still differ slightly from those reported in the original.

TODO

  • implement the finetune process
  • reuse the model in modeling_pretrain.py
  • calculate the normalized pixel target
  • add the cls token in the encoder
  • visualization of reconstruction image
  • k-NN and linear probing
  • 2D sine-cosine position embeddings
  • Fine-tuning semantic segmentation on Cityscapes & ADE20K
  • Fine-tuning instance segmentation on COCO
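Two of the TODO items above (the normalized pixel target and 2D sine-cosine position embeddings) come straight from the MAE paper. As background, here is a minimal NumPy sketch of the fixed 2D sine-cosine position embedding: half the channels encode the patch's row via 1D sin-cos frequencies and half encode its column. The function names are illustrative and not taken from this repo's `modeling_pretrain.py`.

```python
import numpy as np

def get_1d_sincos_pos_embed(embed_dim, positions):
    # embed_dim must be even: half the channels get sin, half get cos
    omega = np.arange(embed_dim // 2, dtype=np.float64)
    omega = 1.0 / 10000 ** (omega / (embed_dim / 2))
    out = np.outer(positions, omega)                       # (N, embed_dim/2)
    return np.concatenate([np.sin(out), np.cos(out)], axis=1)

def get_2d_sincos_pos_embed(embed_dim, grid_size):
    # split the channel dimension between the y (row) and x (column) coordinates
    coords = np.arange(grid_size, dtype=np.float64)
    gy, gx = np.meshgrid(coords, coords, indexing="ij")
    emb_y = get_1d_sincos_pos_embed(embed_dim // 2, gy.reshape(-1))
    emb_x = get_1d_sincos_pos_embed(embed_dim // 2, gx.reshape(-1))
    return np.concatenate([emb_y, emb_x], axis=1)          # (grid_size**2, embed_dim)

# ViT-Base with 224x224 input and 16x16 patches -> a 14x14 grid of 196 tokens
pos_embed = get_2d_sincos_pos_embed(768, 14)               # shape (196, 768)
```

Because the embedding is a fixed function of the grid coordinates, it is computed once and registered as a non-learnable buffer rather than a trained parameter.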

Setup

pip install -r requirements.txt

Run

  1. Pretrain & Finetune
bash pretrain.sh
  2. Visualization of reconstruction
# Set the path to save images
OUTPUT_DIR='output/'
# path to image for visualization
IMAGE_PATH='files/ILSVRC2012_val_00031649.JPEG'
# path to pretrain model
MODEL_PATH='/path/to/pretrain/checkpoint.pth'

# Now, it only supports pretrained models with normalized pixel targets
python run_mae_vis.py ${IMAGE_PATH} ${OUTPUT_DIR} ${MODEL_PATH}
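The note above says visualization only supports models pretrained with normalized pixel targets. In the MAE paper this means each patch's raw pixels are normalized by that patch's own mean and variance before being used as the reconstruction target. A minimal PyTorch sketch (the function name is illustrative, not from this repo):

```python
import torch

def normalized_pixel_target(patches, eps=1e-6):
    # patches: (B, N, patch_dim) flattened pixel values, one row per patch
    mean = patches.mean(dim=-1, keepdim=True)
    var = patches.var(dim=-1, keepdim=True)
    # normalize each patch independently to zero mean, unit variance
    return (patches - mean) / (var + eps).sqrt()

# e.g. ViT-Base: 196 patches of 16*16*3 = 768 pixel values each
patches = torch.randn(2, 196, 768)
target = normalized_pixel_target(patches)
```

A checkpoint trained against this target reconstructs per-patch normalized pixels, so the visualization script has to un-normalize them with the statistics of the corresponding input patches before rendering an image.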

Result

| model | pretrain | finetune | accuracy | log | weight |
|---|---|---|---|---|---|
| ViT-Base | 800e (normed pixel) | 100e | 83.2% | - | - |

I would really appreciate your star!
