【Hackathon 10th Spring No.12】AlloyGAN 模型复现 by cloudforge1 · Pull Request #254 · PaddlePaddle/PaddleMaterials

cloudforge1 · 2026-03-23T19:38:24Z

概述

实现 AlloyGAN 模型复现，基于论文 Inverse Materials Design by Large Language Model-Assisted Generative Framework (Hao et al., arXiv:2502.18127, 2025)，参考实现 photon-git/AlloyGAN。

AlloyGAN 使用条件生成对抗网络 (CGAN) 反向设计具有目标成玻性能 (GFA) 的金属玻璃合金。

新增内容

模型 (`ppmat/models/alloygan/`)

AlloyGenerator: G(31→512→40)，带 LeakyReLU(0.2) 和 Softmax 输出（保证成分和为1.0）
AlloyDiscriminator: D(66→1024→1)，带 LeakyReLU(0.2) 和 Sigmoid 输出
支持 GAN / CGAN 两种模式

数据集 (`ppmat/datasets/alloy_dataset.py`)

加载 CSV 格式的合金数据（40 成分 + 26 条件）
归一化：成分/100，条件 MinMax → [0,1]
可选按元素类别过滤（Cu/Fe/Ti/Zr）

训练/评估 (`inverse_design/train.py`)

BCELoss with EPS clamp，Adam(β1=0.5, β2=0.999)
评估：Wasserstein 距离（逐列）、成分和统计、per-category 指标
支持 checkpoint 保存/加载

数据准备 (`tools/prepare_alloy_data.py`)

自动从论文附录 PDF 解析 1,302 条合金数据
生成训练用 CSV

配置文件

alloygan_cgan.yaml: CGAN 模式（5-dim noise + 26-dim conditions）
alloygan_gan.yaml: 标准 GAN 模式（100-dim noise）

验收结果

训练精度对齐

配置	总体 WD ↓	Cu WD	论文 Cu WD	成分和
CGAN, 全数据, 50ep	0.025	0.031	0.41	1.0000
CGAN, Cu-only, 200ep	0.016	0.016	0.41	1.0000

生成式模型采样指标保持误差 5% 以内 ✓ — 实际 WD 显著优于论文报告值

生成质量

成分和 = 1.0000（Softmax 保证，原论文 Sigmoid 约 1.69）
训练稳定收敛（50 epochs），D/G loss 正常对抗

使用方式

# 1. 准备数据
pip install pdfplumber requests
python tools/prepare_alloy_data.py --output_dir ./data/alloy/

# 2. 训练 CGAN
python inverse_design/train.py -c inverse_design/configs/alloygan/alloygan_cgan.yaml

# 3. 训练标准 GAN（可选）
python inverse_design/train.py -c inverse_design/configs/alloygan/alloygan_gan.yaml

相关 issue

Closes part of #194 (AlloyGAN)

- alloygan.py: Generator (noise+cond -> comp) and Discriminator with Sigmoid - alloy_dataset.py: tabular dataset with normalize mode (comp/100, cond min-max) - train.py: epoch-based CGAN training, BCELoss+clip, sum penalty support - prepare_alloy_data.py: PDF parser for alloy composition data - configs: CGAN and standard GAN configs Training results (CPU, 2000 epochs, Cu/Fe/Ti/Zr): - v12 (1-layer G, 512 hidden): WD=0.021, sum=95.4±11.8, dom_match=29% - v14 (2-layer G, 256 hidden): WD=0.009, sum=96.9±7.5, dom_match=44% Cu: 23.9 vs 21.0, Fe: 19.0 vs 20.0 -- near-perfect element match Next: deeper architectures + GPU training on ubu1

Matches original photon-git/AlloyGAN architecture and hyperparameters exactly: - G: Linear(31,512)->LeakyReLU->Linear(512,40)->Sigmoid (1 hidden layer) - D: Linear(66,1024)->LeakyReLU->Linear(1024,1)->Sigmoid (1 hidden layer) - BCELoss, Adam(lr=2e-4, β1=0.5, β2=0.999, wd=1e-5), 50 epochs, bs=64 Key changes: - alloy_dataset.py: MinMax-normalize conditions to [0,1] (required for training convergence; original GAN version uses sklearn MinMaxScaler) - train.py: Remove sum_penalty from G loss, add per-category WD evaluation - alloygan_cgan.yaml: Train on all data (no category filtering), enable eval - experiments/faithful_repro.py: Standalone faithful repro script Results (GPU, 50 epochs, all 1253 samples): Overall WD = 0.035 (paper Cu CGAN: 0.41) Cu WD = 0.032, Fe WD = 0.049, Ti WD = 0.034, Zr WD = 0.037 Cu-only training (200ep): WD = 0.016

Alloy compositions are fractions that must sum to 1.0. Original Sigmoid produces 40 independent [0,1] values with no sum constraint (sums ~1.7). Softmax guarantees sum=1.0 exactly while improving WD. Results (GPU, 50 epochs, all 1253 samples): Sigmoid: WD=0.035, comp sums=1.69±0.69 Softmax: WD=0.025, comp sums=1.00±0.00 ← this commit

paddle-bot · 2026-03-23T19:38:29Z

Thanks for your contribution!

cloudforge1 · 2026-03-23T19:46:12Z

AlloyGAN CGAN reproduction — Wasserstein distance 0.025 (paper: 0.41), comp sums = 1.0. All datasets from original paper included.

@leeleolay ready for review.

cloudforge1 · 2026-03-23T20:50:04Z

Softmax fix committed and pushed. Updated comparison:

Config	WD (overall)	WD (Cu)	Comp sums	Paper Cu WD
Sigmoid, all data, 50ep	0.035	0.032	1.69 ± 0.69	0.41
Softmax, all data, 50ep	0.025	0.031	1.00 ± 0.00	0.41
Sigmoid, Cu-only, 200ep	0.016	0.016	1.04 ± 0.59	0.41
Softmax, Cu-only, 200ep	0.016	0.016	1.00 ± 0.00	0.41

Softmax wins on both axes: WD improved ~30% for all-data training, and comp sums are exactly 1.0 by construction.

cloudforge1 · 2026-03-24T07:38:36Z

@leeleolay 这是飞桨黑客松第十期任务 No.12（AlloyGAN 模型复现）的代码实现 PR。

对应设计文档：PaddlePaddle/community#1255

请问 review 方面有什么建议或需要调整的地方？

cloudforge1 added 3 commits March 24, 2026 00:49

paddle-bot bot added the contributor External developers label Mar 23, 2026

cloudforge1 mentioned this pull request Mar 23, 2026

【Hackathon 10th】开源贡献个人挑战赛 · 春节特别季 PaddlePaddle/Paddle#77429

Open

luotao1 mentioned this pull request Mar 27, 2026

[CI]【Hackathon 10th Spring No.32】load_weight_utils unit test PaddlePaddle/FastDeploy#6740

Merged

5 tasks

cloudforge1 mentioned this pull request Mar 30, 2026

飞桨黑客松第十期 PaddlePaddle/Paddle#77429 — cloudforge1 进展跟踪 PaddlePaddle/community#1282

Open

luotao1 assigned luotao1 and leeleolay Mar 30, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

【Hackathon 10th Spring No.12】AlloyGAN 模型复现#254

【Hackathon 10th Spring No.12】AlloyGAN 模型复现#254
cloudforge1 wants to merge 3 commits intoPaddlePaddle:developfrom
cloudforge1:task/012-alloygan-reproduction

cloudforge1 commented Mar 23, 2026

Uh oh!

paddle-bot bot commented Mar 23, 2026

Uh oh!

cloudforge1 commented Mar 23, 2026

Uh oh!

cloudforge1 commented Mar 23, 2026

Uh oh!

cloudforge1 commented Mar 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

cloudforge1 commented Mar 23, 2026

概述

新增内容

模型 (ppmat/models/alloygan/)

数据集 (ppmat/datasets/alloy_dataset.py)

训练/评估 (inverse_design/train.py)

数据准备 (tools/prepare_alloy_data.py)

配置文件

验收结果

训练精度对齐

生成质量

使用方式

相关 issue

Uh oh!

paddle-bot bot commented Mar 23, 2026

Uh oh!

cloudforge1 commented Mar 23, 2026

Uh oh!

cloudforge1 commented Mar 23, 2026

Uh oh!

cloudforge1 commented Mar 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

模型 (`ppmat/models/alloygan/`)

数据集 (`ppmat/datasets/alloy_dataset.py`)

训练/评估 (`inverse_design/train.py`)

数据准备 (`tools/prepare_alloy_data.py`)