
High-resolution image reconstruction with latent diffusion models from human brain activity

Takagi and Nishimoto, CVPR 2023

[Paper] [Technical Paper] [Project Page] [FAQ] [FAQ(Japanese)]

General Information

This is a repository for reproducing the method we presented (Takagi and Nishimoto, CVPR 2023) for visual experience reconstruction from brain activity using Stable Diffusion.

Building on that work, a follow-up Technical Paper examines how various additional decoding techniques affect the quality of the reconstructed visual experience, including a method for decoding text prompts from brain activity (see figure below). These methods are also available in this repository.

We confirmed that adding these techniques improves accuracy over Takagi and Nishimoto (CVPR 2023). In the figure below, three images generated from different stochastic noise seeds were randomly chosen for each method.

Environment setup

  1. Download nsddata, nsddata_betas, and nsddata_stimuli from NSD and place them in the nsd directory. These are the required files:
  • nsddata_betas/ppdata/subj01/func1pt8mm/betas_fithrf_GLMdenoise_RR/betas_session01.hdf5 (Note: You will need the betas_sessionXX.hdf5 files for sessions 01 to 37)
  • nsddata_stimuli/stimuli/nsd/nsd_stimuli.hdf5
  • nsddata/experiments/nsd/nsd_expdesign.mat
  • nsddata/freesurfer/fsaverage/label/streams.mgz.ctab
  • nsddata/ppdata/subj01/behav/responses.tsv
  • nsddata/ppdata/subj01/func1pt8mm/roi/streams.nii.gz
  2. Navigate to codes/diffusion_sd1/stable-diffusion/ and run conda env create -f environment.yaml. This creates a conda environment named ldm; activate it with conda activate ldm. This environment is used for all methods except the last one (Reconstruction with Decoded Text Prompt + GAN + Decoded Depth).
  3. For the last method (Reconstruction with Decoded Text Prompt + GAN + Decoded Depth), navigate to codes/diffusion_sd2/stablediffusion/ and run conda env create -f environment.yaml to create a separate conda environment named ldm2, then activate it.
  4. Download the Stable Diffusion checkpoint (sd-v1-4.ckpt) and place it under the codes/diffusion_sd1/stable-diffusion/models/ldm/stable-diffusion-v1/ directory.
  5. For incorporating the GAN, download VGG_ILSVRC_19_layers and bvlc_reference_caffenet_generator_ILSVRC2012_Training from https://figshare.com/articles/dataset/brain-decoding-cookbook/21564384, and place them under the codes/gan/models/pytorch/ directory.
  6. For incorporating decoded depth, download the depth-conditional checkpoint (512-depth-ema.ckpt) and place it under the codes/diffusion_sd2/stablediffusion/models/ directory.
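After downloading the NSD files, a quick sanity check can confirm that everything listed in step 1 is in place before you start the pipeline. The helper below is a hypothetical convenience script, not part of the repository; it assumes the NSD data was placed under nsd/ relative to the directory you run it from.

```shell
# Hypothetical sanity check (not part of the repository): verify that all
# required NSD files for subj01 exist under nsd/.
checked=0
check() {
  checked=$((checked + 1))
  [ -e "nsd/$1" ] && echo "ok:      nsd/$1" || echo "MISSING: nsd/$1"
}

# The 37 beta session files for subj01.
for s in $(seq -w 1 37); do
  check "nsddata_betas/ppdata/subj01/func1pt8mm/betas_fithrf_GLMdenoise_RR/betas_session${s}.hdf5"
done

# The remaining required files.
check "nsddata_stimuli/stimuli/nsd/nsd_stimuli.hdf5"
check "nsddata/experiments/nsd/nsd_expdesign.mat"
check "nsddata/freesurfer/fsaverage/label/streams.mgz.ctab"
check "nsddata/ppdata/subj01/behav/responses.tsv"
check "nsddata/ppdata/subj01/func1pt8mm/roi/streams.nii.gz"
echo "checked $checked files"
```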

MRI Preprocessing

cd codes/utils/
python make_subjmri.py --subject subj01

Reconstruction based on CVPR method

cd codes/utils/
python img2feat_sd.py --imgidx 0 73000 --gpu 0
python make_subjstim.py --featname init_latent --use_stim each --subject subj01
python make_subjstim.py --featname init_latent --use_stim ave --subject subj01
python make_subjstim.py --featname c --use_stim each --subject subj01
python make_subjstim.py --featname c --use_stim ave --subject subj01
python ridge.py --target c --roi ventral --subject subj01
python ridge.py --target init_latent --roi early --subject subj01

cd codes/diffusion_sd1/
python diffusion_decoding.py --imgidx 0 --gpu 1 --subject subj01 --method cvpr

Reconstruction with Decoded Text Prompt

cd codes/caption/BLIP/
python img2feat_blip.py --gpu 0

cd codes/utils/
python make_subjstim.py --featname blip --use_stim ave --subject subj01
python make_subjstim.py --featname blip --use_stim each --subject subj01
python ridge.py --target blip --roi early ventral midventral midlateral lateral parietal --subject subj01

cd codes/caption/BLIP/
python decode_captions.py --subject subj01

cd codes/diffusion_sd1/
python diffusion_decoding.py --imgidx 0 --gpu 1 --subject subj01 --method text

Reconstruction with Decoded Text Prompt + GAN

cd codes/gan/bdpy/
python setup.py install

cd codes/gan/
python make_vgg19bdpy.py --imgidx 0 73000

(Run the following two commands once for each VGG19 layer, from conv1_1 through fc8.)
python make_subjstim_vgg19.py --layer conv1_1 --subject subj01
python ridge.py --target conv1_1 --roi early ventral midventral midlateral lateral parietal --subject subj01
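Rather than typing the pair of commands once per layer, the per-layer step can be scripted. The loop below is a hypothetical convenience wrapper, not part of the repository, and assumes the standard VGG19 layer names; it is shown as a dry run that only prints each command, so remove the two leading echo words to actually execute them.

```shell
# Hypothetical convenience loop (not part of the repository): iterate over
# the standard VGG19 layer names from conv1_1 to fc8. Dry run: each command
# is printed, not executed; remove the leading "echo" to run it.
layers="conv1_1 conv1_2 conv2_1 conv2_2
        conv3_1 conv3_2 conv3_3 conv3_4
        conv4_1 conv4_2 conv4_3 conv4_4
        conv5_1 conv5_2 conv5_3 conv5_4
        fc6 fc7 fc8"
for layer in $layers; do
  echo python make_subjstim_vgg19.py --layer "$layer" --subject subj01
  echo python ridge.py --target "$layer" --roi early ventral midventral midlateral lateral parietal --subject subj01
done
```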

python make_vgg19fromdecode.py --subject subj01
python recon_icnn_image_vgg19_dgn_relu7gen_gd.py

cd codes/diffusion_sd1/
python diffusion_decoding.py --imgidx 0 --gpu 0 --subject subj01 --method gan

Reconstruction with Decoded Text Prompt + GAN + Decoded Depth

This method requires pip install -U transformers. Note that this upgrade may break BLIP, so it is recommended to do it in a separate environment.
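One way to keep BLIP working is to clone the environment before upgrading transformers. The commands below are a hypothetical sketch (the environment name ldm2_depth is illustrative, not from the repository), shown as a dry run; remove the leading echo words to execute them.

```shell
# Hypothetical sketch: clone the ldm2 environment so that upgrading
# transformers does not break BLIP in the original environment.
# Dry run: remove the leading "echo" on each line to execute.
env_name=ldm2_depth  # illustrative name, not mandated by the repository
echo conda create -n "$env_name" --clone ldm2
echo conda activate "$env_name"
echo pip install -U transformers
```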

cd codes/depth/
python img2feat_dpt.py --imgidx 0 73000 --gpu 0

cd codes/utils/
(Run the following three commands once for each depth feature, from dpt_emb0 through dpt_emb3.)
python make_subjstim.py --featname dpt_emb0 --use_stim ave --subject subj01
python make_subjstim.py --featname dpt_emb0 --use_stim each --subject subj01
python ridge.py --target dpt_emb0 --roi early ventral midventral midlateral lateral parietal --subject subj01
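As with the VGG19 layers, the per-feature commands can be scripted. This is a hypothetical convenience loop, not part of the repository, shown as a dry run; remove the leading echo words to actually execute the commands.

```shell
# Hypothetical convenience loop (not part of the repository): run the three
# per-feature commands for each decoded depth embedding dpt_emb0..dpt_emb3.
# Dry run: remove the leading "echo" to execute.
for emb in dpt_emb0 dpt_emb1 dpt_emb2 dpt_emb3; do
  echo python make_subjstim.py --featname "$emb" --use_stim ave --subject subj01
  echo python make_subjstim.py --featname "$emb" --use_stim each --subject subj01
  echo python ridge.py --target "$emb" --roi early ventral midventral midlateral lateral parietal --subject subj01
done
```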

cd codes/depth/
python dptemb2dpt.py --gpu 0 --subject subj01

cd codes/diffusion_sd2/
python diffusion_decoding.py --imgidxs 0 1 --gpu 0 --subject subj01

Evaluation

cd codes/utils/
python img2feat_decoded.py --gpu 0 --subject subj01 --method cvpr
python identification.py --usefeat inception --subject subj01 --method cvpr

Citation

Original paper.

@inproceedings{takagi2023high,
  title={High-resolution image reconstruction with latent diffusion models from human brain activity},
  author={Takagi, Yu and Nishimoto, Shinji},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={14453--14463},
  year={2023}
}

Follow-up technical report.

@misc{takagi2023improving,
      title={Improving visual image reconstruction from human brain activity using latent diffusion models via multiple decoded inputs}, 
      author={Takagi, Yu and Nishimoto, Shinji},
      year={2023},
      eprint={2306.11536},
      archivePrefix={arXiv},
      primaryClass={q-bio.NC}
}

Acknowledgement

Our codebase builds on these repositories. We would like to thank the authors.

https://github.com/CompVis/stable-diffusion

https://github.com/Stability-AI/stablediffusion

https://github.com/gallantlab/himalaya

https://github.com/tknapen/nsd_access

https://github.com/salesforce/BLIP

https://github.com/KamitaniLab/bdpy

https://github.com/KamitaniLab/brain-decoding-cookbook-public

https://github.com/isl-org/DPT

Contact

Please email us if you have any questions.
