B-Free

Official PyTorch implementation of the CVPR 2025 paper "A Bias-Free Training Paradigm for More General AI-generated Image Detection".

News

TODO: batch processing support
2026-01-14: dataset of new generators released
2025-06-13: code released
2025-06-09: dataset of viral images released
2025-05-28: training data released
2025-04-03: Updated arXiv version!
2024-12-23: Paper has been uploaded on arXiv

Overview

A well-designed forensic detector should detect generator specific artifacts rather than reflect data biases. To this end, we propose B-Free, a bias-free training paradigm for AI-generated image detection, where fake images are generated from real ones using the conditioning procedure of stable diffusion models. This ensures semantic alignment between real and fake images, allowing any differences to stem solely from the subtle artifacts introduced by AI generation. Through content-based augmentation, we show significant improvements in both generalization and robustness over state-of-the-art detectors and more calibrated results.

Training Paradigm

To avoid possible biases, we generate synthetic images from self-conditioned reconstructions of real images and include augmentation in the form of inpainted versions. To ensure fake images semantically match the content of real images, we exploit the conditioning mechanism of Stable Diffusion 2.1. More specifically:

For self-conditioning, we use an empty mask (all zeroes) to induce the diffusion steps to regenerate the input.
For the content augmentation, we use an object mask to replace an object with a new one of same category, or a rectangular mask to replace it with a new one of a different category.

Besides the default inpainting, which regenerates the whole image, we consider also a version where the original background is restored (figure above).

Architecture

We adopt a variant of the ViT network proposed in [1] with four registers and use the pretraining based on DINOv2 [2]. During training, we avoid resizing the image and rely on large crops of 504 × 504 pixels. At inference time, we extract crops of 504 × 504 pixels (if the image is larger we average the results of multiple crops).
The official python implementation of this architecture is found here.

[1] Timothée Darcet et al., Vision Transformers Need Registers. (ICLR 2024)
[2] Maxime Oquab et al., DINOv2: Learning Robust Visual Features without Supervision. (TMLR 2024)

Augmentation strategies

Our approach adopts the inpainted++ strategy, which includes self-conditioned images, content augmentation (local inpainting), blurring, JPEG compression, scaling, cut-out, noise, and jittering.

Code

The code is available in the code folder.

Training set

You can download our training dataset here.

Note: training a method on this dataset only partially reflects the inpainted+ version of our augmentation strategy, as the dataset does not include blurring, JPEG, and the extra augmentations of inpainted++ (scaling, cut-out, noise addition, jittering), which are performed at training time.

	Real	Generated
Source	COCO	SD 2.1
Num.	51,517	309,102

To create this dataset we used images collected from the training set of MS-COCO dataset (discarding images with licenses different than Creative Commons) and images generated with Stable Diffusion 2.1.
Specifically, we first extracted the largest central crop and resize it to 512 x 512. Then, for the generations, we used the inpainting code from the official Stable Diffusion 2.1 repository. Note that we did NOT embed the watermark during generation (put_watermark function).

Further details can be found in appendix A of the paper.

Dataset of viral images

You can download our dataset of viral images here.

It comprises real and fake images that went viral on internet and includes multiple web-scraped versions of each image, for a total of about 1400 images.

Dataset of new generators

At this link we provide the extended version of the synthbuster dataset used in our experiments, including images generated with additional generators. Following the same procedure of synthbuster, fake images are generated using captions extracted from real images.

It comprises:

1K images from RAISE, which we resized to a dimension so that the area is about 1024x1024,
1K images generated with FLUX, using captions from RAISE
1K images generated with Stable Diffusion 3.5, using captions from RAISE

License

This software should be used, reproduced and modified only for informational and nonprofit purposes.

By downloading and/or using any of these files, you implicitly agree to all the terms of the license, as specified in the document LICENSE.txt (included in this package)

Bibtex

@InProceedings{Guillaro2024biasfree,
   author    = {Guillaro, Fabrizio and Zingarini, Giada and Usman, Ben and Sud, Avneesh and Cozzolino, Davide and Verdoliva, Luisa},
   title     = {A Bias-Free Training Paradigm for More General AI-generated Image Detection},
   booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
   month     = {June},
   year      = {2025},
   pages     = {18685-18694}
}

Acknowledgments

We gratefully acknowledge the support of this research by a Google Gift. In addition, this work has received funding from the European Union under the Horizon Europe vera.ai project, Grant Agreement number 101070093, and was partially supported by SERICS (PE00000014) under the MUR National Recovery and Resilience Plan, funded by the European Union - NextGenerationEU.

Name		Name	Last commit message	Last commit date
Latest commit History 38 Commits
code		code
docs		docs
training_data		training_data
viral_images_dataset		viral_images_dataset
.gitignore		.gitignore
LICENSE.txt		LICENSE.txt
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

B-Free

News

Overview

Training Paradigm

Architecture

Augmentation strategies

Code

Training set

Dataset of viral images

Dataset of new generators

License

Bibtex

Acknowledgments

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

B-Free

News

Overview

Training Paradigm

Architecture

Augmentation strategies

Code

Training set

Dataset of viral images

Dataset of new generators

License

Bibtex

Acknowledgments

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages