A comparative overview of multimodal decoding paradigms.
Overall architecture of BrainFLORA.
- 2025/12/21, we released the preprocessed datasets and pretrained checkpoints on 🤗 Hugging Face.
- 2025/07/15, the arXiv paper became public.
- 2025/07/12, we officially released the code.
- 2025/07/05, BrainFLORA was accepted by ACM MM 2025.
Option 1: Using setup script (Recommended)
bash setup.sh
conda activate BrainFLORA
Option 2: Using conda environment file
conda env create -f environment.yml
conda activate BrainFLORA
Option 3: Using pip
pip install -r requirements.txt
Important: Install as editable package
After setting up the environment using any of the above options, install the project in editable mode to enable proper module imports:
pip install -e .
We provide preprocessed datasets ready for training on Hugging Face:
from datasets import load_dataset
# Load the preprocessed BrainFLORA dataset
dataset = load_dataset("LidongYang/BrainFLORA")
To download and preprocess the raw data yourself:
| Dataset | Download path | Dataset | Download path |
|---|---|---|---|
| THINGS-EEG1 | Download | THINGS-EEG2 | Download |
| THINGS-MEG | Download | THINGS-fMRI | Download |
| THINGS-Images | Download | | |
After downloading, use the preprocessing scripts in the data_preparing/ directory to process the raw data.
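Once the preprocessed dataset has been loaded from Hugging Face, it can help to check which splits arrived and how many examples each contains before starting training. The following is a minimal sketch and not part of the repository: `summarize_splits` and `load_and_summarize` are hypothetical helper names, and the split names and sizes depend on how the dataset is published, which is not specified here.

```python
from typing import Dict, Mapping, Sized


def summarize_splits(splits: Mapping[str, Sized]) -> Dict[str, int]:
    """Return {split_name: number_of_examples} for a DatasetDict-like mapping."""
    return {name: len(split) for name, split in splits.items()}


def load_and_summarize(repo_id: str = "LidongYang/BrainFLORA") -> Dict[str, int]:
    """Download the dataset from the Hub and report its split sizes.

    Requires network access and the `datasets` package. The import is kept
    local so that summarize_splits can be used without it.
    """
    from datasets import load_dataset

    # With no `split` argument, load_dataset returns a mapping of splits.
    dataset = load_dataset(repo_id)
    return summarize_splits(dataset)
```

Calling `load_and_summarize()` should return a dictionary of split sizes if the repository loads with the default configuration; if it requires a configuration name or extra arguments, pass them to `load_dataset` accordingly.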
We provide a script to train the modality encoders with joint-subject training on the THINGS-EEG2 dataset. Set your dataset path, then run:
python Retrieval/retrieval_joint_train_medformer.py --logger True --gpu cuda:0 --output_dir ./outputs/contrast
To replicate the results for the other modalities (e.g. MEG, fMRI), run:
# MEG
python Retrieval/retrieval_joint_train_MEG_medformer.py --logger True --gpu cuda:0 --output_dir ./outputs/contrast
# fMRI
python Retrieval/retrieval_joint_train_fMRI_medformer.py --logger True --gpu cuda:0 --output_dir ./outputs/contrast
We provide quick training and inference scripts for the high-level and low-level pipelines of visual reconstruction. Set your dataset path, then run:
# Train and get multimodal neural embeddings aligned with CLIP embeddings:
python train/train_unified_encoder_highlevel_diffprior.py \
--modalities eeg meg fmri \
--gpu cuda:0 \
--output_dir ./outputs/contrast
We provide scripts for visual caption generation:
# Train feature adapter with caption support
python train/train_unified_encoder_highlevel_diffprior_caption.py \
--modalities eeg meg fmri \
--gpu cuda:0 \
--output_dir ./outputs/contrast
For multi-GPU training with accelerate:
accelerate launch train/train_unified_encoder_highlevel_diffprior_parallel.py \
--modalities eeg meg fmri \
--output_dir ./outputs/contrast
We provide scripts to evaluate the models:
cd eval/
FLORA_inference.ipynb
# Reconstruct images by assigning modalities and subjects:
cd eval/
python FLORA_inference_reconst.py
# Get captions from prior latent
cd eval/
FLORA_inference_caption.ipynb
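The evaluation notebooks are the reference for how results are computed. As background for readers new to retrieval-style evaluation, aligned embeddings are commonly scored by top-k retrieval accuracy: each neural embedding is compared with every candidate image embedding by cosine similarity, and a hit is counted when the matching image ranks among the k most similar. The sketch below illustrates that general idea only. It is not taken from the repository, the function names are hypothetical, and it uses plain Python for readability, whereas a vectorized implementation would be preferable for large test sets.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def topk_retrieval_accuracy(
    neural: Sequence[Vector], image: Sequence[Vector], k: int = 1
) -> float:
    """Fraction of queries whose matching image ranks in the top k.

    Row i of `neural` is assumed to correspond to row i of `image`.
    """
    if len(neural) != len(image) or not neural:
        raise ValueError("need two non-empty, equally long embedding lists")
    hits = 0
    for i, query in enumerate(neural):
        scores = [cosine_similarity(query, candidate) for candidate in image]
        # Candidate indices sorted from most to least similar.
        ranked = sorted(range(len(image)), key=lambda j: scores[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(neural)
```

With k equal to the number of candidates the score is always 1.0, so small values such as 1 or 5 are the informative ones.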
If you find our work useful, please consider citing:
@inproceedings{li2025brainflora,
author = {Li, Dongyang and Qin, Haoyang and Wu, Mingyang and Wei, Chen and Liu, Quanying},
title = {BrainFLORA: Uncovering Brain Concept Representation via Multimodal Neural Embeddings},
year = {2025},
isbn = {9798400720352},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3746027.3754996},
doi = {10.1145/3746027.3754996},
booktitle = {Proceedings of the 33rd ACM International Conference on Multimedia},
pages = {5577--5586}
}
@article{li2024visual,
title={Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion},
author={Li, Dongyang and Wei, Chen and Li, Shiying and Zou, Jiachen and Liu, Quanying},
journal={Advances in Neural Information Processing Systems},
volume={37},
pages={102822--102864},
year={2024}
}
@inproceedings{wei2024cocog,
title={CoCoG: controllable visual stimuli generation based on human concept representations},
author={Wei, Chen and Zou, Jiachen and Heinke, Dietmar and Liu, Quanying},
booktitle={Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence},
pages={3178--3186},
year={2024}
}
1. We thank Y. Song et al. for their contributions to dataset preprocessing and neural network architecture; we refer to their work: "Decoding Natural Images from EEG for Object Recognition". Yonghao Song, Bingchuan Liu, Xiang Li, Nanlin Shi, Yijun Wang, and Xiaorong Gao.
2. We also thank the authors of SDRecon for providing their code and results. Some parts of the training script are based on MindEye and MindEye2. Thanks for these excellent research works.
3. The THINGS-EEG2 dataset used in the paper comes from: "A large and rich EEG dataset for modeling human visual object recognition". Alessandro T. Gifford, Kshitij Dwivedi, Gemma Roig, Radoslaw M. Cichy.
4. The THINGS-MEG and THINGS-fMRI datasets come from: "THINGS-data, a multimodal collection of large-scale datasets for investigating object representations in human brain and behavior". Hebart, Martin N., Oliver Contier, Lina Teichmann, Adam H. Rockter, Charles Y. Zheng, Alexis Kidder, Anna Corriveau, Maryam Vaziri-Pashkam, and Chris I. Baker.
5. We use "BrainHub" for visual caption evaluation, from "UMBRAE: Unified Multimodal Brain Decoding (ECCV 2024)". Xia, Weihao and de Charette, Raoul and Oztireli, Cengiz and Xue, Jing-Hao.
Contact Dongyang Li if you have any questions or suggestions.
This repository is released under the MIT license. See LICENSE for additional details.

