This repository contains code for mapping the embedding space of one text encoder to another. We evaluate these mappings ("aligners") on several applications, including: mapping a unimodal text encoder to CLIP's text encoder and evaluating on common multimodal tasks; and mapping text encoders to an embedding space invertible by Vec2Text and inverting their embeddings back to text.
This project was done as part of the Advanced Topics in Deep Learning course at TAU.
Run in an isolated Python 3.8+ environment, and make sure the git submodules are updated.
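If the submodules have not been initialized yet, the standard git command is:

>> git submodule update --init --recursive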
Install the forked packages:
>> cd CLIP_benchmark
>> python setup.py install
>> cd vec2text
>> python setup.py install

Next, install the original CLIP-benchmark repository:
>> cd CLIP_benchmark
>> pip install -e .
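To verify that the clip_benchmark CLI is now available on your PATH (a quick sanity check; it should print a usage message):

>> clip_benchmark --help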
Training an aligner requires:
- Creating datasets of source and target text-encoder embeddings; available via slurm-jobs/make_dataset.slurm.
- Training the aligner; available via slurm-jobs/train_to-{clip,text}.slurm (see the submission example below).
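For example, on a Slurm cluster the two steps can be submitted with sbatch; the script names come from the list above, while any cluster-specific flags (partition, GPU count, etc.) are omitted here:

>> sbatch slurm-jobs/make_dataset.slurm   # step 1: build source/target embedding datasets
>> sbatch slurm-jobs/train_to-clip.slurm  # step 2: train an aligner into CLIP's embedding space
>> sbatch slurm-jobs/train_to-text.slurm  # or: train an aligner into the Vec2Text-invertible space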
The following describes how to evaluate an existing aligner, located in ./out/{aligner_dir}/, on different tasks.
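The commands below assume ${OUT_DIR} points at the aligner's directory, e.g.:

>> export OUT_DIR=./out/{aligner_dir}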
Zero-shot classification:

clip_benchmark eval --dataset cifar10 cifar100 imagenet1k --task zeroshot_classification --model source source+aligner target --pretrained NONE \
--model_type our_experimental_models --model_cache_dir "${OUT_DIR}" \
--output "${OUT_DIR}/benchmark_{dataset}_{model}_{task}.json" --batch_size 1024

Zero-shot retrieval:

clip_benchmark eval --dataset wds/flickr8k wds/flickr30k wds/mscoco_captions \
--task zeroshot_retrieval --model source+aligner target --pretrained NONE \
--dataset_root "https://huggingface.co/datasets/clip-benchmark/wds_{dataset_cleaned}/tree/main" \
--model_type our_experimental_models --model_cache_dir ${OUT_DIR} \
--output "${OUT_DIR}/benchmark_{dataset}_{model}_{task}.json" --batch_size 1024

Text inversion (inverting embeddings back to text with Vec2Text):

python cli.py evaluate text_inversion__nq ${OUT_DIR}
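Each benchmark run writes one JSON file per (dataset, model, task) combination, following the --output template above. For example, to pretty-print one result of the zero-shot classification run (the exact filename depends on which dataset and model you evaluated):

>> python -m json.tool "${OUT_DIR}/benchmark_cifar10_source_zeroshot_classification.json"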