
CNN-JEPA: Self-Supervised Pretraining Convolutional Neural Networks Using Joint Embedding Predictive Architecture

This repository is the official implementation of CNN-JEPA: Self-Supervised Pretraining Convolutional Neural Networks Using Joint Embedding Predictive Architecture

[arXiv preprint] [Official Publication (IEEE Xplore)]

Algorithm Overview

If you use our code or results, please cite our paper and consider giving this repo a ⭐:

```bibtex
@INPROCEEDINGS{kalapos2024cnnjepa,
  author={Kalapos, András and Gyires-Tóth, Bálint},
  booktitle={2024 International Conference on Machine Learning and Applications (ICMLA)},
  title={CNN-JEPA: Self-Supervised Pretraining Convolutional Neural Networks Using Joint Embedding Predictive Architecture},
  year={2024},
  pages={1111-1114},
  doi={10.1109/ICMLA61862.2024.00169}}
```

Related papers

[1] K. He, X. Chen, S. Xie, Y. Li, P. Dollár, and R. Girshick, Masked Autoencoders Are Scalable Vision Learners, presented at the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 16000–16009. [paper]

[2] K. Tian, Y. Jiang, Q. Diao, C. Lin, L. Wang, and Z. Yuan, Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling, presented at The Eleventh International Conference on Learning Representations, Sep. 2022. [paper] [code]

[3] M. Assran et al., Self-Supervised Learning From Images With a Joint-Embedding Predictive Architecture, presented at the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 15619–15629. [paper] [code]

How to run

Configs are provided for ImageNet-100 and ImageNet-1k.

```shell
PYTHONPATH=. python pretrain/train_ijepacnn.py --config-name ijepacnn_imagenet.yaml
```

Baseline implementations of the following pretraining approaches are also provided:

Setup

We recommend using the provided Docker container to run the code.

Option A: Start the Docker container and connect to it via ssh:

  1. Create a keypair, copy the public key to the root of this repo, and edit the Dockerfile accordingly.
  2. Run `make ssh`.
  3. Connect on port 2222: `ssh root@<hostname> -i <private_key_path> -p 2222`.
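For convenience, the connection from step 3 can be captured in an `~/.ssh/config` entry. This is an optional sketch; the host alias is made up, and `<hostname>` and `<private_key_path>` are the same placeholders as above:

```
Host cnn-jepa-container
    HostName <hostname>
    User root
    Port 2222
    IdentityFile <private_key_path>
```

With this entry in place, `ssh cnn-jepa-container` replaces the full command line.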

Alternatively, to run the container without starting an ssh server, run `make run`.

To customize the Docker build and run steps, edit the `Makefile` or the `Dockerfile`.

⚠️ `make ssh` and `make run` start the container with the `--rm` flag! Only the contents of `/workspace` persist after the container stops (via a simple volume mount)!
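As a rough mental model of what the two targets do, the `Makefile` targets could look like the sketch below. This is an illustrative assumption, not the repo's actual `Makefile` (image tag, mount path, and sshd setup are placeholders); note the `--rm` flag and the single `/workspace` volume mount mentioned in the warning above:

```makefile
IMAGE ?= cnn-jepa          # illustrative image tag, not the repo's actual name
SSH_PORT ?= 2222           # host port forwarded to the container's sshd

build:
	docker build -t $(IMAGE) .

# Disposable container (--rm) running an ssh server; only /workspace
# is volume-mounted, so everything else is lost when the container stops.
ssh: build
	docker run --rm -d -p $(SSH_PORT):22 -v $(PWD)/workspace:/workspace $(IMAGE)

# Same disposable container, but with an interactive shell instead of sshd.
run: build
	docker run --rm -it -v $(PWD)/workspace:/workspace $(IMAGE) /bin/bash
```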

Option B: Install dependencies locally (not tested)

Install the requirements with `pip install -r requirements.txt`.

Datasets

To achieve optimal performance on our HPC cluster, we store the datasets in HDF5 format. If `torchvision.datasets.ImageFolder` datasets are efficient on your system, you can use them instead by editing lines 182-183 in `pretrain/trainer_common.py`.

To use the datasets in HDF5 format, first download the datasets, extract them to their default ImageFolder layout, then convert them to the HDF5 format we use. For the conversion, we provide a function in `data/hdf5_imagefolder.py`.
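The authoritative converter is the function in `data/hdf5_imagefolder.py`. Purely as an illustration of the first half of such a pipeline, the sketch below (an assumption, not the repo's code; the function name `index_imagefolder` is made up) builds the `(file path, class index)` list that the ImageFolder directory convention implies, which a converter would then write into an HDF5 file (e.g. with `h5py`):

```python
from pathlib import Path

# Common image extensions accepted by ImageFolder-style loaders (subset shown).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}

def index_imagefolder(root):
    """Scan an ImageFolder-style tree (root/<class_name>/<image>) and return
    (samples, class_to_idx), mirroring torchvision's convention: classes are
    the sorted subdirectory names, indexed from 0."""
    root = Path(root)
    classes = sorted(d.name for d in root.iterdir() if d.is_dir())
    class_to_idx = {name: i for i, name in enumerate(classes)}
    samples = []
    for name in classes:
        for path in sorted((root / name).rglob("*")):
            if path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((str(path), class_to_idx[name]))
    return samples, class_to_idx
```

A converter would iterate over `samples`, load each image, and append the bytes and label to HDF5 datasets; the repo's actual conversion function may organize the file differently.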

Download the datasets from the following links:

Copyright, acknowledgements

Our implementation is based on:
