chore: fast-forward releases/v0.1.0 to main by ivanbasov · Pull Request #58 · NVIDIA/Ising-Decoding

ivanbasov · 2026-04-08T23:26:13Z

Fast-forward merge of main into releases/v0.1.0 to pick up post-QA commits without cherry-picking.

Commits being added:

Fix export of fp8 ONNX files #52 Fix export of fp8 ONNX files
fix: find_best_model accepts named .pt files without epoch numbers #55 fix: find_best_model accepts named .pt files without epoch numbers
fix: suppress double measurement noise injection in MemoryCircuit #53 fix: suppress double measurement noise injection in MemoryCircuit
Add standalone script to generate evaluation test data #57 Add standalone script to generate evaluation test data

Since releases/v0.1.0 is a direct ancestor of main, merging with Rebase and merge will preserve the original commit SHAs.

…ility (#43) * fix(ci): disable torch.compile in orientation training to prevent segfault torch.compile=on combined with DataLoader spawn workers during LER validation causes a segfault (20 leaked semaphores, core dumped). Set PREDECODER_TORCH_COMPILE=0 for the Train all orientations step. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Revert "fix(ci): disable torch.compile in orientation training to prevent segfault" This reverts commit 7f0f6c8. * fix(mid): seed BitMatrixSampler explicitly to restore test reproducibility torch.manual_seed() does not control cuQuantum's BitMatrixSampler internal RNG, so the two mid-GPU tests that relied on it for reproducibility were non-deterministic and intermittently failing. Add an optional `seed` parameter to `dem_sampling()` and `MemoryCircuitTorch.generate_batch()`. When a seed is provided a fresh BitMatrixSampler is always created with `Options(seed=N)`, resetting its internal RNG and guaranteeing identical outputs on every call with the same seed. Production paths (seed=None) are unaffected — the cached sampler is reused as before. Update the two failing tests to use the explicit seed kwarg instead of torch.manual_seed(): - test_he_reduces_error_weight: seed=123 - test_full_pipeline_w2_reproducible: seed=100 Fixes: NVIDIA/Ising-Decoding CI run 23963347042 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * style: fix yapf line-break position in need_new condition Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: add dem_sampling reproducibility tests for seed= parameter Add TestDEMSamplingReproducibility to test_dem_sampling.py with four cases: - same seed on CPU produces bit-exact identical frames - different seeds produce different frames - unseeded calls still reuse the cached sampler (perf regression guard) - same seed on GPU produces bit-exact identical frames (GPU-only) These tests use stochastic p values (0.1–0.9) so they would have caught the original regression: before the seed= fix, BitMatrixSampler's internal RNG was not reset between calls, making "same seed" reproducibility impossible regardless of torch.manual_seed(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: use torch.Generator for seeded path; BitMatrixSampler RNG is not seedable Options.__init__() does not accept a 'seed' keyword — the cuST BitMatrixSampler's internal RNG is not exposed via the public API. Replace the attempted Options(seed=N) approach with a small pure-torch fallback (_torch_dem_sampling) that uses a local torch.Generator seeded to the requested value. This path is only taken when seed= is explicitly passed (tests); the production BitMatrixSampler cache path is unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: pass seed directly to BitMatrixSampler constructor BitMatrixSampler accepts seed as a constructor kwarg (not via Options). Replace the torch fallback workaround with the correct cuST API: pass seed= directly to BitMatrixSampler(..., seed=seed). A fresh sampler is created on every seeded call so its internal RNG is reset to the requested seed, guaranteeing identical outputs on repeated calls with the same value. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* DISTANCE and N_ROUNDS updates Signed-off-by: Ben Howe <bhowe@nvidia.com> * Formatting updates Signed-off-by: Ben Howe <bhowe@nvidia.com> * Revert "Formatting updates" This reverts commit 757f378. --------- Signed-off-by: Ben Howe <bhowe@nvidia.com>

add B200, H200 remove A100

reformat title and header, product positioning

* adding decode_batch path in failure_analysis and vectorizing observable projection Signed-off-by: Sachin Pisal <spisal@nvidia.com> * pass syndromes as list-of-lists to cudaq decode_batch Signed-off-by: Sachin Pisal <spisal@nvidia.com> * implementing feedback Signed-off-by: Sachin Pisal <spisal@nvidia.com> --------- Signed-off-by: Sachin Pisal <spisal@nvidia.com>

…ccurate} (#51) * fix(ci): disable torch.compile in orientation training to prevent segfault torch.compile=on combined with DataLoader spawn workers during LER validation causes a segfault (20 leaked semaphores, core dumped). Set PREDECODER_TORCH_COMPILE=0 for the Train all orientations step. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Revert "fix(ci): disable torch.compile in orientation training to prevent segfault" This reverts commit 7f0f6c8. * feat: rename pretrained models to Ising-Decoder-SurfaceCode-1-{Fast,Accurate} - Rename PreDecoderModelMemory_r9_v1.0.77.pt → Ising-Decoder-SurfaceCode-1-Fast.pt - Rename PreDecoderModelMemory_r13_v1.0.86.pt → Ising-Decoder-SurfaceCode-1-Accurate.pt - Models remain Git LFS-tracked via models/*.pt (no storage change) - Add model_checkpoint_file direct-path option to _load_model so named pretrained files (without epoch numbers) can be loaded without directory scanning - Update test_inference_public_model.py, README, and checkpoint_to_safetensors.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Update config_qec_decoder_r9_fp8.yaml change model 1 to model name * Update conf/config_qec_decoder_r9_fp8.yaml --------- Co-authored-by: Ben Howe <141149032+bmhowe23@users.noreply.github.com>

* Update config_qec_decoder_r13_fp8.yaml refer to model 4 as Ising-Decoder-SurfaceCode-1-Accurate * Update conf/config_qec_decoder_r13_fp8.yaml --------- Co-authored-by: Ben Howe <141149032+bmhowe23@users.noreply.github.com>

* Fix export of fp8 ONNX files Signed-off-by: Ben Howe <bhowe@nvidia.com> * test: add fp8 calibration dtype regression test for #52 `_collect_calibration_dets` returns uint8; casting to float32 before passing to mq.quantize triggered an INVALID_ARGUMENT error from the ONNX runtime ("expected: tensor(uint8), got: tensor(float)"). The new test mirrors the existing int8 variant and asserts that the fp8 path preserves the original uint8 dtype and forwards the FP8-specific kwargs (op_types_to_quantize, high_precision_dtype). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Signed-off-by: Ben Howe <bhowe@nvidia.com> Co-authored-by: Ivan Basov <ibasov@nvidia.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: find_best_model now accepts named .pt files without epoch numbers The old code required filenames to start with PreDecoderModelMemory_ and encode an epoch number. After the model rename to Ising-Decoder-SurfaceCode-1- {Fast,Accurate}.pt, copying one of these files into the models dir and running inference via local_run.sh would fail with "Found 0 model files". Fall back to any .pt file (sorted, last wins) when no epoch-numbered PreDecoderModelMemory_ checkpoints are found in the directory. Fixes regression reported in #51 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * style: fix yapf formatting in find_best_model Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: suppress double measurement noise injection in MemoryCircuit final data qubit measurement _add_stabilizer_round(logical_measurement=True) correctly injects time-reversed measurement noise then restores self.noise_model before returning. The subsequent add_measure() call sees noise_model is not None and injects the same p_meas noise a second time, creating phantom error channels in the DEM. Temporarily clear self.noise_model around the final add_measure() call, matching the pattern already used inside _add_stabilizer_round itself. * Update code/qec/surface_code/memory_circuit.py * fix: suppress double measurement noise injection in MemoryCircuit + tests Fixes double p_meas injection on data qubits in MemoryCircuit.__init__. _add_stabilizer_round(logical_measurement=True) injects the time-reversed "fake SPAM" error and then restores self.noise_model before returning. The subsequent add_measure(data_qubits) at the call site saw a non-None noise_model and injected the same p_meas channel a second time, creating phantom DEM error channels (7/21/43 extra entries at d=3/5/7) that distorted PyMatching's matching graph and biased LER estimates. Fix: temporarily suppress self.noise_model around add_measure(data_qubits), matching the pattern already used inside _add_stabilizer_round itself. Also adds: - Regression test in TestNoiseModel verifying exactly one measurement-error injection appears in the post-REPEAT circuit section (not two). - Updates to TestLERComparison in test_boundary_detectors.py: replaces strict ler_with_bd < ler_no_bd assertions with a 1.5x tolerance check. The strict assertions were accidentally passing because phantom DEM entries were artificially inflating no-BD LER; the true BD improvement is a marginal 1-3% effect below the statistical resolution of 10-20k samples. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: remove duplicate data-qubit measurement + yapf formatting - Remove duplicate orig_noise_model/add_measure/restore block introduced during cherry-pick conflict resolution (caused non-deterministic detectors) - Collapse assertLessEqual arguments onto single line for yapf compliance Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Ivan Basov <5455484+ivanbasov@users.noreply.github.com> Co-authored-by: Ivan Basov <ibasov@nvidia.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Add standalone script to generate evaluation test data Builds a Stim memory circuit with the 25-parameter noise model, samples syndrome data, extracts the DEM check matrices (H, O, priors) via beliefmatching, and runs a baseline PyMatching decode. Outputs are saved in a custom binary format for downstream pre-decoder benchmarking. Signed-off-by: Scott Thornton <wsttiger@gmail.com> * Formatting Signed-off-by: Scott Thornton <wsttiger@gmail.com> --------- Signed-off-by: Scott Thornton <wsttiger@gmail.com>

ivanbasov and others added 12 commits April 7, 2026 10:41

Update TRAINING.md (#45)

599c57d

add B200, H200 remove A100

Update README.md (#46)

8d01c82

reformat title and header, product positioning

Update config_qec_decoder_r9_fp8.yaml (#50)

0f73eea

* Update config_qec_decoder_r9_fp8.yaml change model 1 to model name * Update conf/config_qec_decoder_r9_fp8.yaml --------- Co-authored-by: Ben Howe <141149032+bmhowe23@users.noreply.github.com>

Update config_qec_decoder_r13_fp8.yaml (#49)

47a8622

* Update config_qec_decoder_r13_fp8.yaml refer to model 4 as Ising-Decoder-SurfaceCode-1-Accurate * Update conf/config_qec_decoder_r13_fp8.yaml --------- Co-authored-by: Ben Howe <141149032+bmhowe23@users.noreply.github.com>

bmhowe23 merged commit 0925137 into releases/v0.1.0 Apr 8, 2026
26 of 28 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore: fast-forward releases/v0.1.0 to main#58

chore: fast-forward releases/v0.1.0 to main#58
bmhowe23 merged 12 commits into
releases/v0.1.0from
main

ivanbasov commented Apr 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Conversation

ivanbasov commented Apr 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants