ZXDiagramSimplification

Code for the paper: "Optimizing Quantum Circuits via ZX Diagrams using Reinforcement Learning and Graph Neural Networks"

This repository implements the Appendix C setting (compact 8-feature representation), including:

- Tree-search training with PPO (main_tree.py)
- Flat PPO baseline (ppo.py)
- BQSKit integration (bqskit_pass.py)
- Compiler benchmarking (bench_compilers.py)

1. How the pipeline works

High-level workflow (see WORKFLOW.md for the full walkthrough):

- Circuit -> ZX graph (pyzx_environment/zx_env/env.py)
- Observation wrapping -> GraphMask (utils.py):
  - Expanded graph representation
  - Action mask of applicable ZX rewrite rules per node
- Tree search over rewrite trajectories (TreePolicy.py)
- Agent scoring via BundleNet (models/bundle_net.py)
- PPO optimization (main_tree.py or ppo.py)
- Best ZX state -> extracted optimized circuit
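
To make the "action mask of applicable ZX rewrite rules per node" concrete, here is a minimal sketch in plain Python. It is not the GraphMask implementation from utils.py; the rule names, the `build_action_mask` and `masked_argmax` helpers, and the score values are all assumptions chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical rule names, for illustration only; the environment's actual
# rule set lives in pyzx_environment/ and may differ.
RULES: List[str] = ["spider_fusion", "identity_removal", "pivot"]


def build_action_mask(applicable: Dict[int, List[str]], num_nodes: int) -> List[List[bool]]:
    """Return a num_nodes x len(RULES) mask; mask[v][r] is True when
    rule RULES[r] may be applied at node v."""
    mask = [[False] * len(RULES) for _ in range(num_nodes)]
    for node, rules in applicable.items():
        for rule in rules:
            mask[node][RULES.index(rule)] = True
    return mask


def masked_argmax(scores: List[List[float]], mask: List[List[bool]]) -> Optional[Tuple[int, int]]:
    """Pick the best-scoring (node, rule) pair among the allowed actions,
    or None when no action is allowed."""
    best, best_score = None, float("-inf")
    for v, row in enumerate(scores):
        for r, score in enumerate(row):
            if mask[v][r] and score > best_score:
                best, best_score = (v, r), score
    return best


# Node 1 has no applicable rule, so its high scores are ignored.
mask = build_action_mask({0: ["spider_fusion"], 2: ["identity_removal", "pivot"]}, num_nodes=3)
scores = [[0.1, 0.9, 0.3], [0.8, 0.7, 0.6], [0.2, 0.4, 0.5]]
print(masked_argmax(scores, mask))
```

The point of the mask is that the agent only ever chooses among rewrites that are valid in the current graph, regardless of how the model scores the invalid ones.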

The main optimization target is reducing two-qubit gate count (CNOT/CZ).
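
As a rough illustration of this metric, the sketch below counts two-qubit gates in a circuit given as a plain list of (name, qubits) tuples. The gate-list format, the `CX` alias, and the `relative_reduction` helper are assumptions for this example; the repository itself works on PyZX circuits and may compute its reward differently.

```python
from typing import Iterable, Tuple

# Gate names treated as two-qubit gates. The README names CNOT and CZ;
# "CX" is included as a common alias for CNOT (an assumption).
TWO_QUBIT_GATES = {"CNOT", "CX", "CZ"}

Gate = Tuple[str, Tuple[int, ...]]  # (gate name, qubit indices)


def two_qubit_count(gates: Iterable[Gate]) -> int:
    """Count the gates whose name marks them as two-qubit gates."""
    return sum(1 for name, _ in gates if name.upper() in TWO_QUBIT_GATES)


def relative_reduction(before: Iterable[Gate], after: Iterable[Gate]) -> float:
    """Fraction of two-qubit gates removed; 0.0 if there were none to remove."""
    n_before = two_qubit_count(before)
    if n_before == 0:
        return 0.0
    return (n_before - two_qubit_count(after)) / n_before


# Two adjacent CNOTs on the same qubits cancel, so the optimized circuit drops them.
original = [("H", (0,)), ("CNOT", (0, 1)), ("CNOT", (0, 1)),
            ("CZ", (1, 2)), ("T", (2,)), ("CNOT", (1, 2))]
optimized = [("H", (0,)), ("CZ", (1, 2)), ("T", (2,)), ("CNOT", (1, 2))]

print(two_qubit_count(original), two_qubit_count(optimized))
print(relative_reduction(original, optimized))
```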

2. Setup

2.1 Python environment

Use Python 3.10+ (recommended) and create a fresh environment:

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip

2.2 Install dependencies

Install core dependencies (adjust CUDA-specific Torch wheels as needed):

pip install torch torchvision torchaudio
pip install torch-geometric
pip install hydra-core omegaconf gymnasium numpy tqdm tensorboard ray pyzx

Install the local ZX environment package:

pip install -e ./pyzx_environment

Optional dependencies for benchmarking / compiler integration:

pip install bqskit qiskit qiskit-ibm-transpiler pandas seaborn matplotlib

3. Running training

All training entry points are Hydra-based and accept config group overrides from conf/.

3.1 Quick smoke test (tree-search training)

python -u main_tree.py \
	+algorithm=PPO \
	+model=GATActionModel \
	+env=simple \
	exp_name="smoke_test" \
	env.num_envs=2 \
	algorithm.total_timesteps=50000 \
	algorithm.num_steps=32 \
	max_tree_size=64 \
	multi_range=2 \
	device="cpu"

3.2 Main training command (paper-style tree-search run)

python -u main_tree.py \
	+algorithm=PPO \
	exp_name="20MIO_32_envs_5qubit_128treesize" \
	+model=GATActionModel \
	+env=more_complex_more_rules_ranges \
	env.num_envs=32 \
	model.model_type="ActionAtt" \
	model.n_message_passing=4 \
	algorithm.total_timesteps=20_000_000 \
	algorithm.num_steps=129 \
	algorithm.learning_rate=3e-3 \
	max_tree_size=128 \
	multi_range=4 \
	env.n_qubits=5 \
	device="cpu"

3.3 Flat PPO variant

python -u ppo.py \
	+algorithm=PPO \
	+model=GATActionModel \
	+env=more_complex_more_rules_ranges \
	exp_name="flat_ppo_run"
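
The commands in this section share one pattern: `+group=option` adds a config group from conf/, while `key=value` overrides a single value. For scripted sweeps, a small helper along these lines can assemble the argument list. The `hydra_command` function is hypothetical and not part of the repository.

```python
import shlex
import sys
from typing import Dict, List


def hydra_command(script: str, groups: Dict[str, str], overrides: Dict[str, object]) -> List[str]:
    """Build an argv list for a Hydra entry point.

    groups    -> emitted as +group=option (config groups added at launch)
    overrides -> emitted as key=value (single-value overrides)
    """
    argv = [sys.executable, "-u", script]
    argv += [f"+{group}={option}" for group, option in groups.items()]
    argv += [f"{key}={value}" for key, value in overrides.items()]
    return argv


# Reproduces part of the smoke-test command from section 3.1.
cmd = hydra_command(
    "main_tree.py",
    groups={"algorithm": "PPO", "model": "GATActionModel", "env": "simple"},
    overrides={"exp_name": "smoke_test", "env.num_envs": 2,
               "max_tree_size": 64, "device": "cpu"},
)
print(shlex.join(cmd))
# To launch it: import subprocess; subprocess.run(cmd, check=True)
```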

4. Ray parallelism notes

Large-scale runs are intended to use Ray for distributed rollouts.
For multi-node deployment, configure your Ray cluster before launching training.
Reference: https://docs.ray.io/en/latest/ray-overview/getting-started.html

5. Checkpoints and logs

Training outputs are written under runs/.

- TensorBoard logs: runs/<run_name>/
- Saved models: runs/<run_name>/saves/model-<step>.pth
- Saved optimal paths: runs/<run_name>/saves/data-<step>.pkl

Launch TensorBoard with:

tensorboard --logdir runs
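
Given the layout above, the most recent checkpoint of a run can be located with a few lines of standard-library Python. This is a sketch that assumes `<step>` is a plain integer in the file name; `latest_checkpoint` is a hypothetical helper, not something the repository provides.

```python
import re
from pathlib import Path
from typing import Optional

# Matches runs/<run_name>/saves/model-<step>.pth, assuming <step> is an integer.
_MODEL_RE = re.compile(r"^model-(\d+)\.pth$")


def latest_checkpoint(run_dir: Path) -> Optional[Path]:
    """Return the model file with the highest step number, or None if the
    run has no saves/ directory or no matching model file."""
    saves = Path(run_dir) / "saves"
    if not saves.is_dir():
        return None
    best_step, best_path = -1, None
    for path in saves.iterdir():
        match = _MODEL_RE.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path


# "my_run" is a placeholder run name; prints None if the directory is missing.
print(latest_checkpoint(Path("runs") / "my_run"))
```

Comparing the step as an integer rather than sorting file names avoids picking model-900.pth over model-2000.pth.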

6. Benchmarking and inference

6.1 Benchmark compilers

bench_compilers.py CLI:

python bench_compilers.py <output_pickle> <searchdepth> <mq_ratio> <h_ratio> <t_ratio>

Example:

python bench_compilers.py results.pkl 4 1.0 0.0 0.0

Notes:

Some benchmark options depend on external optimizer code that is not distributed in this repository.
If you do not use those external methods, disable or comment out the corresponding benchmark paths in bench_compilers.py.
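
To run the benchmark for several search depths, the positional CLI above can be wrapped in a short script. The sketch below only builds the commands; the argument types (integer depth, float ratios) are inferred from the example invocation, and `bench_command` and the output file names are hypothetical.

```python
import shlex
import sys
from typing import List


def bench_command(output_pickle: str, searchdepth: int,
                  mq_ratio: float, h_ratio: float, t_ratio: float) -> List[str]:
    """Build the argv list for bench_compilers.py in its positional order."""
    return [sys.executable, "bench_compilers.py", output_pickle,
            str(searchdepth), str(mq_ratio), str(h_ratio), str(t_ratio)]


# One command per search depth, each writing to its own pickle file.
commands = [bench_command(f"results_d{depth}.pkl", depth, 1.0, 0.0, 0.0)
            for depth in (2, 4, 8)]

for cmd in commands:
    print(shlex.join(cmd))
    # To launch it: import subprocess; subprocess.run(cmd, check=True)
```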

6.2 Use a trained model in BQSKit flows

bqskit_pass.py loads a model path from ZX_MODEL_PATH (or falls back to a default path).

export ZX_MODEL_PATH="runs/<run_name>/saves/model-<step>.pth"

Then run your BQSKit-based workflow (for examples, see bench_compilers.py and bqskit_pass.py).
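
The environment-variable lookup described above follows a common pattern, sketched here with the standard library. This is not the code from bqskit_pass.py: `resolve_model_path` is a hypothetical helper, and `DEFAULT_MODEL_PATH` is a placeholder, since the actual default is defined in the repository.

```python
import os
from pathlib import Path
from typing import Mapping

# Placeholder fallback; the real default path is set inside bqskit_pass.py.
DEFAULT_MODEL_PATH = "model.pth"


def resolve_model_path(env: Mapping[str, str] = os.environ) -> Path:
    """Return the path from ZX_MODEL_PATH, or the fallback when the
    variable is unset or empty."""
    raw = env.get("ZX_MODEL_PATH") or DEFAULT_MODEL_PATH
    return Path(raw).expanduser()


path = resolve_model_path()
if not path.exists():
    # Warn early rather than failing later inside the compilation pass.
    print(f"warning: model file not found at {path}")
```

Passing the environment as an argument keeps the helper easy to test without modifying the real process environment.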

7. Useful files

- main_tree.py: Tree-search PPO training loop
- ppo.py: Flat PPO baseline
- TreePolicy.py: Tree data structure and batched policy/value forward pass
- models/: Model definitions (BundleNet, TreeNet, ActionModel, etc.)
- utils.py: Observation wrappers and GraphMask handling
- pyzx_environment/: ZX RL environment package
- WORKFLOW.md: Full end-to-end algorithmic walkthrough
