chore: fast-forward releases/v0.1.0 to main #58

Merged
bmhowe23 merged 12 commits into releases/v0.1.0 from main
Apr 8, 2026

@ivanbasov
Member

Fast-forward merge of main into releases/v0.1.0 to pick up post-QA commits without cherry-picking.

Commits being added:

Since releases/v0.1.0 is a direct ancestor of main, merging with Rebase and merge will preserve the original commit SHAs.

ivanbasov and others added 12 commits April 7, 2026 10:41
…ility (#43)

* fix(ci): disable torch.compile in orientation training to prevent segfault

torch.compile=on combined with DataLoader spawn workers during LER
validation causes a segfault (20 leaked semaphores, core dumped).
Set PREDECODER_TORCH_COMPILE=0 for the Train all orientations step.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Revert "fix(ci): disable torch.compile in orientation training to prevent segfault"

This reverts commit 7f0f6c8.

* fix(mid): seed BitMatrixSampler explicitly to restore test reproducibility

torch.manual_seed() does not control cuQuantum's BitMatrixSampler internal
RNG, so the two mid-GPU tests that relied on it for reproducibility were
non-deterministic and intermittently failing.

Add an optional `seed` parameter to `dem_sampling()` and
`MemoryCircuitTorch.generate_batch()`. When a seed is provided a fresh
BitMatrixSampler is always created with `Options(seed=N)`, resetting its
internal RNG and guaranteeing identical outputs on every call with the same
seed. Production paths (seed=None) are unaffected — the cached sampler is
reused as before.

Update the two failing tests to use the explicit seed kwarg instead of
torch.manual_seed():
- test_he_reduces_error_weight: seed=123
- test_full_pipeline_w2_reproducible: seed=100

Fixes: NVIDIA/Ising-Decoding CI run 23963347042

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix yapf line-break position in need_new condition

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add dem_sampling reproducibility tests for seed= parameter

Add TestDEMSamplingReproducibility to test_dem_sampling.py with four cases:
- same seed on CPU produces bit-exact identical frames
- different seeds produce different frames
- unseeded calls still reuse the cached sampler (perf regression guard)
- same seed on GPU produces bit-exact identical frames (GPU-only)

These tests use stochastic p values (0.1–0.9) so they would have caught
the original regression: before the seed= fix, BitMatrixSampler's internal
RNG was not reset between calls, making "same seed" reproducibility
impossible regardless of torch.manual_seed().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: use torch.Generator for seeded path; BitMatrixSampler RNG is not seedable

Options.__init__() does not accept a 'seed' keyword — the cuST
BitMatrixSampler's internal RNG is not exposed via the public API.

Replace the attempted Options(seed=N) approach with a small pure-torch
fallback (_torch_dem_sampling) that uses a local torch.Generator seeded
to the requested value.  This path is only taken when seed= is explicitly
passed (tests); the production BitMatrixSampler cache path is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pass seed directly to BitMatrixSampler constructor

BitMatrixSampler accepts seed as a constructor kwarg (not via Options).
Replace the torch fallback workaround with the correct cuST API:
pass seed= directly to BitMatrixSampler(..., seed=seed).

A fresh sampler is created on every seeded call so its internal RNG is
reset to the requested seed, guaranteeing identical outputs on repeated
calls with the same value.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* DISTANCE and N_ROUNDS updates

Signed-off-by: Ben Howe <bhowe@nvidia.com>

* Formatting updates

Signed-off-by: Ben Howe <bhowe@nvidia.com>

* Revert "Formatting updates"

This reverts commit 757f378.

---------

Signed-off-by: Ben Howe <bhowe@nvidia.com>
add B200, H200 remove A100
reformat title and header, product positioning
* adding decode_batch path in failure_analysis and vectorizing observable projection

Signed-off-by: Sachin Pisal <spisal@nvidia.com>

* pass syndromes as list-of-lists to cudaq decode_batch

Signed-off-by: Sachin Pisal <spisal@nvidia.com>

* implementing feedback

Signed-off-by: Sachin Pisal <spisal@nvidia.com>

---------

Signed-off-by: Sachin Pisal <spisal@nvidia.com>
…ccurate} (#51)

* fix(ci): disable torch.compile in orientation training to prevent segfault

torch.compile=on combined with DataLoader spawn workers during LER
validation causes a segfault (20 leaked semaphores, core dumped).
Set PREDECODER_TORCH_COMPILE=0 for the Train all orientations step.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Revert "fix(ci): disable torch.compile in orientation training to prevent segfault"

This reverts commit 7f0f6c8.

* feat: rename pretrained models to Ising-Decoder-SurfaceCode-1-{Fast,Accurate}

- Rename PreDecoderModelMemory_r9_v1.0.77.pt  → Ising-Decoder-SurfaceCode-1-Fast.pt
- Rename PreDecoderModelMemory_r13_v1.0.86.pt → Ising-Decoder-SurfaceCode-1-Accurate.pt
- Models remain Git LFS-tracked via models/*.pt (no storage change)
- Add model_checkpoint_file direct-path option to _load_model so named
  pretrained files (without epoch numbers) can be loaded without directory scanning
- Update test_inference_public_model.py, README, and checkpoint_to_safetensors.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Update config_qec_decoder_r9_fp8.yaml

change model 1 to model name

* Update conf/config_qec_decoder_r9_fp8.yaml

---------

Co-authored-by: Ben Howe <141149032+bmhowe23@users.noreply.github.com>
* Update config_qec_decoder_r13_fp8.yaml

refer to model 4 as Ising-Decoder-SurfaceCode-1-Accurate

* Update conf/config_qec_decoder_r13_fp8.yaml

---------

Co-authored-by: Ben Howe <141149032+bmhowe23@users.noreply.github.com>
* Fix export of fp8 ONNX files

Signed-off-by: Ben Howe <bhowe@nvidia.com>

* test: add fp8 calibration dtype regression test for #52

`_collect_calibration_dets` returns uint8; casting to float32 before
passing to mq.quantize triggered an INVALID_ARGUMENT error from the
ONNX runtime ("expected: tensor(uint8), got: tensor(float)").
The new test mirrors the existing int8 variant and asserts that the
fp8 path preserves the original uint8 dtype and forwards the
FP8-specific kwargs (op_types_to_quantize, high_precision_dtype).
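
The shape of such a regression test might look like the sketch below. Everything here is a simplified assumption: `quantize_fp8` is a hypothetical wrapper, the mock replaces the real quantizer, `array("B", ...)` stands in for a uint8 tensor, and the keyword values (`"fp8"`, `["Conv"]`, `"fp16"`) and the `calibration_data` / `quantize_mode` names are illustrative. Only `op_types_to_quantize` and `high_precision_dtype` are named in the commit message.

```python
from array import array
from unittest import mock


def quantize_fp8(calib, quantize_fn):
    """Hypothetical fp8 path: forward calibration data with its dtype intact."""
    # The bug described above amounted to converting `calib` to float here.
    return quantize_fn(
        calibration_data=calib,          # assumed keyword name
        quantize_mode="fp8",             # assumed keyword name and value
        op_types_to_quantize=["Conv"],   # illustrative value
        high_precision_dtype="fp16",     # illustrative value
    )


# Typecode "B" is an unsigned 8-bit integer, used as a uint8 stand-in.
calib = array("B", [0, 1, 1, 0])
fake_quantize = mock.Mock(return_value="quantized-model")
result = quantize_fp8(calib, fake_quantize)

# Inspect what the quantizer actually received.
kwargs = fake_quantize.call_args.kwargs
assert kwargs["calibration_data"].typecode == "B"
assert "op_types_to_quantize" in kwargs
assert "high_precision_dtype" in kwargs
```

Mocking the quantizer keeps the check fast and independent of the ONNX runtime; the assertion on the forwarded data type is what would catch a reintroduced float cast.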

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Signed-off-by: Ben Howe <bhowe@nvidia.com>
Co-authored-by: Ivan Basov <ibasov@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: find_best_model now accepts named .pt files without epoch numbers

The old code required filenames to start with PreDecoderModelMemory_ and
encode an epoch number. After the model rename to Ising-Decoder-SurfaceCode-1-
{Fast,Accurate}.pt, copying one of these files into the models dir and running
inference via local_run.sh would fail with "Found 0 model files".

Fall back to any .pt file (sorted, last wins) when no epoch-numbered
PreDecoderModelMemory_ checkpoints are found in the directory.
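
A minimal sketch of this fallback, under stated assumptions: the `_epoch<N>` filename scheme and the "highest epoch wins" rule are hypothetical (the PR does not show how epochs are encoded or ranked), and `find_best_model_sketch` is an illustrative name rather than the repository function.

```python
import re
from pathlib import Path

# Hypothetical naming scheme; the real epoch encoding is not shown in this PR.
_EPOCH_RE = re.compile(r"^PreDecoderModelMemory_epoch(\d+)\.pt$")


def find_best_model_sketch(model_dir):
    """Pick a checkpoint: epoch-numbered files first, any .pt file otherwise."""
    pt_files = sorted(Path(model_dir).glob("*.pt"))
    numbered = []
    for path in pt_files:
        match = _EPOCH_RE.match(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    if numbered:
        # Assumed preference: the highest epoch number.
        return max(numbered, key=lambda item: item[0])[1]
    if pt_files:
        # Fallback from the commit message: any .pt file, sorted, last wins.
        return pt_files[-1]
    raise FileNotFoundError(f"Found 0 model files in {model_dir}")
```

With only the two renamed pretrained files in a directory, name-sorted order would select `Ising-Decoder-SurfaceCode-1-Fast.pt` in this sketch, since it sorts after the `Accurate` file.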

Fixes regression reported in #51

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: fix yapf formatting in find_best_model

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: suppress double measurement noise injection in MemoryCircuit final data qubit measurement

_add_stabilizer_round(logical_measurement=True) correctly injects time-reversed
measurement noise then restores self.noise_model before returning. The subsequent
add_measure() call sees noise_model is not None and injects the same p_meas noise
a second time, creating phantom error channels in the DEM.

Temporarily clear self.noise_model around the final add_measure() call, matching
the pattern already used inside _add_stabilizer_round itself.
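
The double injection and its suppression can be illustrated with a toy builder. `ToyCircuit` and its methods are hypothetical and much simpler than `MemoryCircuit`; the `try`/`finally` restore is a choice made for this sketch, not necessarily how the repository code is written.

```python
class ToyCircuit:
    """Hypothetical stand-in for a circuit builder with an optional noise model."""

    def __init__(self, noise_model):
        self.noise_model = noise_model
        self.ops = []

    def add_measure(self, qubits):
        # Injects measurement noise whenever a noise model is attached.
        if self.noise_model is not None:
            self.ops.append(("MEAS_NOISE", tuple(qubits)))
        self.ops.append(("M", tuple(qubits)))

    def final_round(self, qubits):
        # Stand-in for the final stabilizer round, which has already
        # injected the measurement noise for the data qubits.
        self.ops.append(("MEAS_NOISE", tuple(qubits)))

    def measure_data_qubits(self, qubits):
        self.final_round(qubits)
        saved = self.noise_model
        self.noise_model = None       # suppress the second injection
        try:
            self.add_measure(qubits)
        finally:
            self.noise_model = saved  # restore even if add_measure raises


circuit = ToyCircuit({"p_meas": 0.01})
circuit.measure_data_qubits([0, 1, 2])
noise_ops = [op for op in circuit.ops if op[0] == "MEAS_NOISE"]
```

In this toy model `noise_ops` holds a single entry; without the temporary clear, `add_measure` would append a second one.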

* Update code/qec/surface_code/memory_circuit.py

* fix: suppress double measurement noise injection in MemoryCircuit + tests

Fixes double p_meas injection on data qubits in MemoryCircuit.__init__.
_add_stabilizer_round(logical_measurement=True) injects the time-reversed
"fake SPAM" error and then restores self.noise_model before returning.
The subsequent add_measure(data_qubits) at the call site saw a non-None
noise_model and injected the same p_meas channel a second time, creating
phantom DEM error channels (7/21/43 extra entries at d=3/5/7) that
distorted PyMatching's matching graph and biased LER estimates.

Fix: temporarily suppress self.noise_model around add_measure(data_qubits),
matching the pattern already used inside _add_stabilizer_round itself.

Also adds:
- Regression test in TestNoiseModel verifying exactly one measurement-error
  injection appears in the post-REPEAT circuit section (not two).
- Updates to TestLERComparison in test_boundary_detectors.py: replaces
  strict ler_with_bd < ler_no_bd assertions with a 1.5x tolerance check.
  The strict assertions were accidentally passing because phantom DEM
  entries were artificially inflating no-BD LER; the true BD improvement
  is a marginal 1-3% effect below the statistical resolution of 10-20k
  samples.
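
The statistical-resolution argument can be checked with a short binomial estimate. The logical error rate of 0.01 used here is a hypothetical value chosen only for illustration; the PR does not state the actual LER.

```python
import math


def relative_standard_error(ler, n_samples):
    """Relative standard error of a binomial rate estimated from n_samples shots."""
    return math.sqrt(ler * (1.0 - ler) / n_samples) / ler


assumed_ler = 0.01  # hypothetical value, not taken from the PR
for n_samples in (10_000, 20_000):
    rel = relative_standard_error(assumed_ler, n_samples)
    print(f"N={n_samples}: relative standard error ~ {rel:.1%}")
```

For this assumed rate the relative uncertainty is roughly 10% at 10k samples and 7% at 20k, several times larger than a 1-3% effect, which is consistent with replacing the strict inequality by a tolerance check.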

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: remove duplicate data-qubit measurement + yapf formatting

- Remove duplicate orig_noise_model/add_measure/restore block introduced
  during cherry-pick conflict resolution (caused non-deterministic detectors)
- Collapse assertLessEqual arguments onto single line for yapf compliance

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Ivan Basov <5455484+ivanbasov@users.noreply.github.com>
Co-authored-by: Ivan Basov <ibasov@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add standalone script to generate evaluation test data

Builds a Stim memory circuit with the 25-parameter noise model, samples
syndrome data, extracts the DEM check matrices (H, O, priors) via
beliefmatching, and runs a baseline PyMatching decode. Outputs are saved
in a custom binary format for downstream pre-decoder benchmarking.

Signed-off-by: Scott Thornton <wsttiger@gmail.com>

* Formatting

Signed-off-by: Scott Thornton <wsttiger@gmail.com>

---------

Signed-off-by: Scott Thornton <wsttiger@gmail.com>
@bmhowe23 bmhowe23 merged commit 0925137 into releases/v0.1.0 Apr 8, 2026
26 of 28 checks passed
