XLRON v1.0.0

03 May 23:48


The first major release of XLRON — a JAX-based simulation framework for resource allocation in optical networks. One library, every execution mode: a fast simulation engine that runs entirely on accelerator hardware, an integrated PPO trainer, classical heuristics, capacity bound estimators, an end-to-end differentiable physical layer, and a browser GUI.

Documentation: https://micdoh.github.io/XLRON/


What's in this release

Reproduces results for these papers

  • XLRON: Accelerated Reinforcement Learning Environments for Optical Networks. Doherty, Beghelli — OFC 2024.
    The original announcement of XLRON, focused on the JAX implementation, GPU parallelism, and the DeepRMSA reproduction.

  • Reinforcement Learning with Graph Attention for Routing and Wavelength Assignment with Lightpath Reuse. Doherty, Beghelli — ONDM 2025 (arXiv:2502.14741).
    A Graph-Attention RL agent for the RWA-with-Lightpath-Reuse formulation. → Reproduce all figures

  • Reinforcement Learning for Dynamic Resource Allocation in Optical Networks: Hype or Hope? Doherty, Beghelli — JOCN 17(9), D1 (2025), DOI 10.1364/JOCN.559990 (arXiv:2406.01919).
    A systematic survey of ~100 RL-for-RMSA papers, reproducing five highly-cited published RL solutions in matched simulation settings. → Reproduce all figures

  • Comparison of Dynamic Elastic Optical Network Capacity Bound Estimation Methods. Doherty*, Deng*, Beghelli, Toni, Bayvel — submitted to ECOC 2026.
    Systematic comparison of cut-set, resource-prioritized defragmentation, and KSP heuristic bounds across the 118 real-world topologies of TopologyBench. → Reproduce all figures

  • XLRON: A Framework for Hardware-Accelerated and Differentiable Simulation of Optical Networks. Doherty, Beghelli, Jarmolovičius, Deng, Killey, Bayvel, Toni — in preparation, targeted at JOCN.
    Architecture, cross-library benchmarks, ISRS GN + DRA validation against the Gerard et al. 2025 C+L-band experiment, differentiable-simulation case studies, and the GUI/CLI. → Reproduce all figures

  • Graph Transformers and Stabilized Reinforcement Learning for Large-Scale Dynamic Routing, Modulation and Spectrum Allocation in Elastic Optical Networks. Doherty, Beghelli, Toni — in preparation, targeted at JOCN.
    First transformer trained from scratch with RL for dynamic RMSA. Beats the strongest heuristic on every standard benchmark and scales to USA100 (100 nodes) and TataInd (143 nodes). → Reproduce all figures

Speed

  • 6 × 10⁶ steps/s for RMSA on a single A100 with 2,048 parallel envs.
  • 300× higher single-device throughput than Flex Net Sim (the fastest single-core CPU library) once GPU parallelism is engaged.
  • 222–1,494× wall-clock speedup over DeepRMSA / Optical-RL-Gym for end-to-end RL training on the canonical DeepRMSA benchmark, while also achieving lower blocking via invalid action masking.
  • Near-linear scaling with the number of parallel environments on GPU (see the sketch after this list).
  • See the Speed & comparisons page for the cross-library numbers.
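
The near-linear scaling comes from vectorising the environment itself rather than spawning OS processes: the environment step is a pure JAX function, so it can be batched with jax.vmap and compiled once with jax.jit. A minimal sketch of that pattern, using a toy stand-in environment (the step function, state layout, and sizes below are illustrative, not XLRON's API):

import jax
import jax.numpy as jnp

# Toy environment: the state is just a spectrum-occupancy vector.
def step(state, action):
    reward = 1.0 - state[action]          # reward 1 if the chosen slot was free
    new_state = state.at[action].set(1.0)
    return new_state, reward

def rollout(state, actions):
    # Advance one environment through a trajectory of actions.
    return jax.lax.scan(step, state, actions)

n_envs, n_slots, n_steps = 2048, 100, 10
states = jnp.zeros((n_envs, n_slots))
actions = jnp.zeros((n_envs, n_steps), dtype=jnp.int32)

# vmap batches the rollout across environments; jit compiles it once, so all
# 2,048 environments advance in lock-step on a single device.
final_states, rewards = jax.jit(jax.vmap(rollout))(states, actions)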

Browser GUI

  • New tabbed Streamlit GUI exposing every option, every execution mode, every preset.
  • Launch with a single command: xlron. Point your browser at the printed URL.
  • Tabs: Setup, Model & Training, Physical Layer, Logging & Output. Every flag has a descriptive tooltip.
  • All execution modes covered: RL training, heuristic evaluation, model evaluation, capacity bound estimation, differentiable optimization.
  • Save and reload complete configurations as presets (e.g. the gerard2025 C+L-band 90-channel system).
  • Live output stream with the constructed CLI command, training progress, blocking probability, and other metrics as they appear.
  • Live render visualisation of per-link spectrum allocation, network topology with the current request highlighted, blocking probability, utilisation, and request details.
  • See the GUI page.

Graph Transformer policy (--USE_TRANSFORMER)

  • Built-in Graph Transformer actor-critic with Wavelet-Induced Rotary Encodings (WiRE) for graph positional encoding.
  • To our knowledge, the first transformer trained from scratch with RL for dynamic RMSA, and the first RL method to consistently match or beat the strongest heuristics on the standard benchmarks.
  • Three training stabilisations make this work with PPO + invalid action masking:
    1. Off-policy invalid action masking (--OFF_POLICY_IAM) — importance ratios computed against the unmasked policy so gradients flow through masked actions (sketched after this section).
    2. Valid mass stabilisation (--VALID_MASS_LOSS_COEF, --VML_SCHEDULE, --IAM_GATING, --IAM_DAMPING) — log-barrier loss + per-step damping + hard gating that keep the unmasked policy from collapsing off the valid action set.
    3. Pre-LayerNorm — eliminates warm-up, keeps gradients well-behaved.
  • Separate value-function optimizer (--SEPARATE_VF_OPTIMIZER, --VF_LR) for stable critic training.
  • Scales to USA100 (100 nodes) and TataInd (143 nodes) — the largest dynamic RMSA instances ever attempted with RL.
  • See the Graph Transformer page.
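
For the off-policy invalid action masking mentioned above, the core idea is that actions are sampled from the masked policy at rollout time, but the PPO importance ratio is formed from unmasked log-probabilities, so masked logits still receive gradient through the softmax normaliser. A rough sketch of one way to compute such a ratio (function and variable names are illustrative, not XLRON's internals):

import jax
import jax.numpy as jnp

def unmasked_logprob(logits, actions):
    # Log-probability of each taken action under the *unmasked* policy.
    logp = jax.nn.log_softmax(logits, axis=-1)
    return jnp.take_along_axis(logp, actions[:, None], axis=-1).squeeze(-1)

def off_policy_iam_ratio(new_logits, old_logits, actions):
    # Actions were drawn from the masked policy during rollout, but the ratio
    # uses unmasked log-probs, so gradients also reach currently-masked logits
    # via the softmax normaliser.
    new_logp = unmasked_logprob(new_logits, actions)
    old_logp = jax.lax.stop_gradient(unmasked_logprob(old_logits, actions))
    return jnp.exp(new_logp - old_logp)

The ratio then enters the usual clipped PPO surrogate; the valid-mass terms described above add the penalty that keeps the unmasked policy's probability mass on the valid action set.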

Physical layer: ISRS GN model + Distributed Raman Amplification

  • Closed-form ISRS GN model with Nyquist subchannels and EGN correction.
  • Distributed Raman Amplification (DRA) support via isrs_gn_model_dra.py — Raman-pump-aware NLI and ASE calculations using a Friis-cascaded hybrid Raman+EDFA noise figure.
  • ODE-based pump fitting at env creation time (fit_dra_params_triangular() solves the Raman ODE via jax.experimental.ode.odeint, then fits semi-analytical profiles via jaxopt.LevenbergMarquardt). A toy version of this pattern is sketched below.
  • Validated to within 0.5 dB against the Gerard et al. 2025 record-throughput C+L-band experiment.
  • See the Physical layer page.
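
The pump-fitting step follows a solve-then-fit pattern: integrate the Raman power-evolution ODE along the span, then fit a closed-form profile to the solution, so the Levenberg-Marquardt residual stays analytic. Below is a deliberately over-simplified single-pump, single-signal sketch of that pattern; the constants, equations, and fit shape are illustrative and much cruder than XLRON's multi-pump C+L-band model:

import jax.numpy as jnp
from jax.experimental.ode import odeint
import jaxopt

alpha_s, alpha_p = 0.046, 0.069      # fibre attenuation, 1/km (~0.2 / 0.3 dB/km)
g_r = 0.4                            # Raman gain efficiency, 1/(W*km)
z = jnp.linspace(0.0, 80.0, 100)     # positions along an 80 km span

def rhs(powers, _z):
    p_s, p_p = powers
    return jnp.array([
        -alpha_s * p_s + g_r * p_p * p_s,   # signal: fibre loss + Raman gain
        -alpha_p * p_p - g_r * p_p * p_s,   # pump: fibre loss + depletion
    ])

# Signal power profile along the span from the ODE solve (1 mW signal, 250 mW pump).
profile = odeint(rhs, jnp.array([1e-3, 0.25]), z)[:, 0]

# Fit a semi-analytical profile P(z) = P0 * exp(-a*z + b*(1 - exp(-alpha_p*z)))
# to the ODE solution; the residual is analytic in the fit parameters (a, b).
def residuals(params):
    a, b = params
    model = 1e-3 * jnp.exp(-a * z + b * (1.0 - jnp.exp(-alpha_p * z)))
    return model - profile

solver = jaxopt.LevenbergMarquardt(residual_fun=residuals, maxiter=50)
a_fit, b_fit = solver.run(jnp.array([alpha_s, 1.0])).params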

End-to-end differentiable simulation

  • The first fully differentiable optical-network simulator. Gradients flow end-to-end through the physical layer and the resource-allocation logic.
  • Enables gradient-based pump power optimisation and direct gradient-based RSA optimisation.
  • diff_utils.py provides differentiable approximations using straight-through gradient estimators and temperature-controlled soft functions (the general pattern is sketched after this list): differentiable_where, differentiable_compare, differentiable_argmax, differentiable_round/ceil/floor, differentiable_index, differentiable_one_hot_index_update, differentiable_cond.
  • Toggle with --differentiable (default False); tune sharpness with --temperature.
  • See the Differentiable simulation page and the Differentiable DRA notes.
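
The straight-through trick behind these helpers is to return a hard, discrete value in the forward pass while routing gradients through a temperature-controlled soft surrogate. A generic sketch of that pattern for an argmax-like selection; it illustrates the idea only and is not copied from diff_utils.py:

import jax
import jax.numpy as jnp

def straight_through_argmax(scores, temperature=1.0):
    # Hard one-hot argmax in the forward pass, softmax gradient in the backward pass.
    soft = jax.nn.softmax(scores / temperature)                  # differentiable surrogate
    hard = jax.nn.one_hot(jnp.argmax(scores), scores.shape[-1])  # discrete selection
    return soft + jax.lax.stop_gradient(hard - soft)

# Gradients flow even though the forward value is an exact one-hot vector;
# lowering the temperature sharpens the surrogate (cf. --temperature).
scores = jnp.array([0.1, 2.0, -0.5])
selected_score = lambda s: jnp.sum(straight_through_argmax(s, temperature=0.5) * s)
print(jax.grad(selected_score)(scores))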

All 119 real-world topologies from TopologyBench

  • Every topology from TopologyBench (Matzner et al. 2024) bundled out of the box, in both directed and undirected variants.
  • Switch topologies with a single flag — from small academic networks up to USA100 (100 nodes, 342 directed links) and TataInd (143 nodes, 362 directed links).
  • The historically standard research topologies are also bundled: NSFNET, COST239, USNET, JPN48, German17, CONUS, 5-node.
  • See the Topologies page.

Documentation

  • Full documentation for this release: https://micdoh.github.io/XLRON/

Other improvements

  • Stabilised PPO with reward centering (--REWARD_CENTERING), prioritized experience replay (--PRIO_ALPHA, --PRIO_BETA0), VTrace-style importance ratio clipping (--RHO_CLIP, --C_CLIP; sketched after this list), and an optional separate value-function optimizer.
  • New cut-set and reconfigurable-routing capacity bound estimators (xlron.bounds.cutsets_bounds, xlron.bounds.reconfigurable_routing_bounds).
  • 13 path heuristics, including ksp_ff, ksp_lf, ksp_bf, ksp_mu, ff_ksp, lf_ksp, bf_ksp, mu_ksp, kmc_ff, kmf_ff, kme_ff, kca_ff.
  • Path-sort criteria: spectral_resources, hops, distance, hops_distance, capacity.
  • Pre-commit hooks and CI gating; ruff format + ty type checking adopted across the codebase.
  • Reorganised experimental/ directory to colocate code, results, and figures per paper.
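
For the VTrace-style clipping above, each per-step importance ratio is clipped twice: a rho-clipped ratio weights the temporal-difference error, and a c-clipped ratio controls how far off-policy corrections propagate along the trace. A generic sketch of those clipped weights (names are illustrative, not XLRON's exact implementation):

import jax.numpy as jnp

def vtrace_clipped_ratios(new_logp, behaviour_logp, rho_clip=1.0, c_clip=1.0):
    # Per-step importance ratio between the learner and behaviour policies.
    ratio = jnp.exp(new_logp - behaviour_logp)
    rho = jnp.minimum(ratio, rho_clip)   # weights the TD error (cf. --RHO_CLIP)
    c = jnp.minimum(ratio, c_clip)       # weights the bootstrap/trace term (cf. --C_CLIP)
    return rho, c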

Installation

git clone https://github.com/micdoh/XLRON.git
cd XLRON
uv sync           # base
uv sync --group dev   # dev dependencies
uv sync --group gpu   # if running on GPU
uv sync --group tpu   # if running on TPU

Citing

If you use XLRON in published work, please cite the relevant paper(s) listed at https://micdoh.github.io/XLRON/papers/. If you use the bundled TopologyBench topologies, please also cite TopologyBench.

Acknowledgements

This work was supported by the EPSRC grant EP/S022139/1 (Centre for Doctoral Training in Connected Electronic and Photonic Systems) and EPSRC Programme Gr...


Reinforcement Learning for Dynamic Resource Allocation in Optical Networks: Hype or Hope?

27 Dec 11:18


All code used to produce results in the paper:

"Reinforcement Learning for Dynamic Resource Allocation in Optical Networks: Hype or Hope?"

Prior to execution, it is recommended to set up a virtual environment using the pyproject.toml or requirements.txt in the repository. Instructions can be found in the online documentation: https://micdoh.github.io/XLRON/

To generate data, execute the Bash scripts in this folder, after replacing the PYTHON_PATH and SCRIPT_PATH variables with absolute paths to the virtual environment's Python interpreter and the generate_data script, respectively:
./data_processing/JOCN2024/generate_data

To generate plots, execute the Python scripts and Jupyter notebooks in this folder, replacing the file path variables to point to the newly generated data CSVs:
./data_processing/JOCN2024/generate_plots