PIGEN (Physics Informed Generation) is a framework for generating novel crystal structures by integrating physics-informed sampling, chemically guided control, and structural evaluation into a denoising diffusion model. Building on DiffCSP, PIGEN enables targeted generation beyond known chemical spaces and supports out-of-distribution extrapolation. In our benchmarks, it produces physically plausible crystal candidates with a higher fraction of stable structures per batch, and with greater chemical and structural diversity, than frameworks such as DiffCSP or MatterGen.
```bash
conda env create -f environment.yml
conda activate pigen
```

All dependencies are managed via conda; `setup.py` is only for local package registration:

```bash
pip install -e .
```

The default dataset, Alex_MP_20_M_LED, is available at: https://huggingface.co/datasets/UoLiverpool/Alex_MP_20_M_LED/
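One way to fetch it into the location suggested by the repository layout below (this assumes the `huggingface_hub` CLI is installed; it is not part of the commands above):

```bash
# Hypothetical download path; adjust --local-dir if your configuration
# expects the dataset elsewhere.
huggingface-cli download UoLiverpool/Alex_MP_20_M_LED \
    --repo-type dataset \
    --local-dir data/Alex_MP_20_M_LED
```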
For re-training the model with the default dataset, Alex_MP_20_M_LED, run:

```bash
python pigen/train.py
```

This will use the default data and conditioning properties and is equivalent to:

```bash
python pigen/train.py --data_name Alex_MP_20_M_LED --prop ['entropy_sum', 'target_energy']
```
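The same flags can point at a different dataset or property subset. For example (a hypothetical invocation, assuming `--prop` accepts any subset of the featurised conditioning properties):

```bash
# Hypothetical: condition on a single property instead of the default pair.
python pigen/train.py --data_name Alex_MP_20_M_LED --prop ['target_energy']
```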
You can use your own trained model or download the model checkpoint from: https://huggingface.co/DeepDrew/PIGEN/

After downloading, place the checkpoint file in `checkpoints/` so that `pigen/generate.py` can locate it.
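If you use the hosted checkpoint, one way to fetch it straight into that directory (again assuming the `huggingface_hub` CLI is available):

```bash
# Downloads the published checkpoint files into checkpoints/.
huggingface-cli download DeepDrew/PIGEN --local-dir checkpoints
```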
Run:

```bash
cd pigen
python generate.py
```

This repository builds on DiffCSP, an open-source implementation of denoising diffusion probabilistic models for crystal structure prediction. We have further developed and extended it as described below.
- Physics-informed logic integrated into the sampling process
- Conditional generation with target-guided control via classifier-free guidance
- Featurised dataset with local chemical and structural environment features, enabling out-of-distribution extrapolation
- Chemistry-informed structure evaluation tools
- Modular refactoring for better reproducibility and configuration management.
- Support for PyTorch Distributed Data Parallel to accelerate large-scale training across multiple GPUs or nodes (see the launch sketch after this list)
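As a minimal sketch of a multi-GPU launch, assuming `pigen/train.py` reads the standard `torchrun` environment variables (the actual launcher used for DDP is not specified here):

```bash
# Hypothetical single-node launch on 4 GPUs; adjust --nproc_per_node to your hardware.
torchrun --nproc_per_node=4 pigen/train.py --data_name Alex_MP_20_M_LED
```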
Run:

```bash
pytest tests
```

Note: The Docker image is intentionally left with a flexible entry point (`/bin/bash`) to allow the user to either train or generate as needed, following the instructions below. This design choice supports both CPU and GPU environments.
```bash
docker build -t pigen .
docker run --rm pigen              # CPU
docker run --rm --gpus all pigen   # GPU
```

Repository structure:

```
├── checkpoints
├── data
│   └── Alex_MP_20_M_LED/
├── environment.yml
├── log
├── pigen
│   ├── __init__.py
│   ├── assets/
│   ├── common/
│   ├── eval/
│   ├── generate.py
│   ├── normalization
│   ├── partial_sample.py
│   ├── settings.py
│   └── train.py
├── README.md
├── setup.py
├── tests
│   ├── dummy_data/
│   ├── dummy_logs/
│   ├── fixtures/
│   ├── conftest.py
│   ├── test_dependecies.py
│   ├── test_dummy_training.py
│   ├── test_pd_structure_parsing.py
│   └── test_torch_installation.py
└── verify_environment_installs.py
```
The original DiffCSP repository is licensed under the MIT License. We retain this license and clearly mark any modified components. We gratefully acknowledge the authors of DiffCSP for their contribution to the research and open-source community.