CNN-JEPA: Self-Supervised Pretraining Convolutional Neural Networks Using Joint Embedding Predictive Architecture
This repository is the official implementation of CNN-JEPA: Self-Supervised Pretraining Convolutional Neural Networks Using Joint Embedding Predictive Architecture.
[arXiv preprint] [Official Publication (IEEE Xplore)]
If you use our code or results, please cite our paper and consider giving this repo a ⭐:
@INPROCEEDINGS{kalapos2024cnnjepa,
author={Kalapos, András and Gyires-Tóth, Bálint},
booktitle={2024 International Conference on Machine Learning and Applications (ICMLA)},
title={CNN-JEPA: Self-Supervised Pretraining Convolutional Neural Networks Using Joint Embedding Predictive Architecture},
year={2024},
pages={1111-1114},
doi={10.1109/ICMLA61862.2024.00169}}
[1] K. He, X. Chen, S. Xie, Y. Li, P. Dollár, and R. Girshick, "Masked Autoencoders Are Scalable Vision Learners," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 16000–16009. [paper]
[2] K. Tian, Y. Jiang, Q. Diao, C. Lin, L. Wang, and Z. Yuan, "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling," in The Eleventh International Conference on Learning Representations, 2023. [paper] [code]
[3] M. Assran et al., "Self-Supervised Learning From Images With a Joint-Embedding Predictive Architecture," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 15619–15629. [paper] [code]
Configs are provided for ImageNet-100 and ImageNet-1k.
To pretrain with CNN-JEPA, run:

```bash
PYTHONPATH=. python pretrain/train_ijepacnn.py --config-name ijepacnn_imagenet.yaml
```

Baseline implementations of the following pretraining approaches are also provided:
We recommend using the provided Docker container to run the code.
- Create a keypair, copy the public key to the root of this repo, and edit the Dockerfile accordingly.
- Run `make ssh`.
- Connect on port 2222: `ssh root@<hostname> -i <private_key_path> -p 2222`.

Alternatively, to run the container without starting an ssh server, run `make run`.
To customize Docker build and run, edit the Makefile or the Dockerfile.
⚠️ `make ssh` and `make run` start the container with the `--rm` flag! Only the contents of `/workspace` persist if the container is stopped (via a simple volume mount)!
Install the requirements with `pip install -r requirements.txt`.
To achieve optimal performance on our HPC cluster, we store the datasets in HDF5 format. If `torchvision.datasets.ImageFolder` datasets are efficient on your system, you can use them instead by editing lines 182-183 in `pretrain/trainer_common.py`.
To use the datasets in HDF5 format, first download the datasets, extract them to their default ImageFolder format, then convert them to the HDF5 format we use. For the conversion, we provide a function in `data/hdf5_imagefolder.py`.
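The actual converter lives in `data/hdf5_imagefolder.py`; the sketch below only illustrates the general idea, and its names (`imagefolder_to_hdf5`, the `images`/`labels` dataset layout) are our assumptions, not the repo's API. It walks the class subfolders, assigns integer labels in sorted class order (as `ImageFolder` does), and packs the encoded image bytes plus labels into a single HDF5 file:

```python
from pathlib import Path

import h5py
import numpy as np


def imagefolder_to_hdf5(src_dir: str, dst_file: str) -> None:
    """Pack an ImageFolder-style tree (one subfolder per class) into one
    HDF5 file. Illustrative sketch, not the repo's actual converter."""
    root = Path(src_dir)
    # Classes sorted alphabetically, labels assigned by that order,
    # mirroring torchvision.datasets.ImageFolder.
    classes = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_idx = {c: i for i, c in enumerate(classes)}

    samples = [
        (path, class_to_idx[cls])
        for cls in classes
        for path in sorted((root / cls).glob("*"))
        if path.is_file()
    ]

    with h5py.File(dst_file, "w") as f:
        images = f.create_dataset(
            "images", (len(samples),), dtype=h5py.vlen_dtype(np.uint8)
        )
        labels = f.create_dataset("labels", (len(samples),), dtype="int64")
        f.attrs["classes"] = classes
        for i, (path, label) in enumerate(samples):
            # Store the still-encoded bytes; decode (e.g. with PIL) at read time.
            images[i] = np.frombuffer(path.read_bytes(), dtype=np.uint8)
            labels[i] = label
```

Keeping the images encoded keeps the file compact and turns many small-file reads into a few large sequential ones, which is what makes HDF5 attractive on networked HPC filesystems.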
Download the datasets from the following links:
Our implementation is based on:
- SparK
- The official I-JEPA implementation that pretrains Vision Transformers
