56 changes: 56 additions & 0 deletions README.md
@@ -173,6 +173,62 @@ Notes:
- TensorRT workflows (`ONNX_WORKFLOW=2` or `3`) require `tensorrt` and `modelopt`.
- FP8 quantization failure is fatal. INT8 failure falls back to the FP32 ONNX model silently.
- ONNX and engine files are written to the current working directory.
- `ONNX_WORKFLOW` is also honoured by the `decoder_ablation` workflow (see the decoder ablation section below).

### Decoder ablation study with cudaq-qec (optional)

The `decoder_ablation` workflow compares multiple global decoders on the residual syndromes left
by the neural pre-decoder. The pre-decoder can run on either the PyTorch or the TensorRT backend,
and the global decoders include GPU-accelerated ones from the `cudaq-qec` package (imported as
`cudaq_qec`).

**PyTorch pre-decoder + cudaq-qec global decoders:**

```bash
# Requires: cudaq-qec (cudaq_qec), ldpc, beliefmatching, scipy
WORKFLOW=decoder_ablation bash code/scripts/local_run.sh
```

**TensorRT (TRT) pre-decoder + cudaq-qec global decoders (full GPU pipeline):**

The same `ONNX_WORKFLOW` variable used for `inference` also applies here. When a TRT engine is
active, the neural pre-decoder runs via TensorRT (fast, quantised inference) while the `cudaq-qec`
decoders handle the residual syndromes on GPU, so both stages of the pipeline run on the GPU.

```bash
# Export ONNX, build TRT engine, run ablation (TRT pre-decoder + cudaq-qec)
ONNX_WORKFLOW=2 WORKFLOW=decoder_ablation bash code/scripts/local_run.sh

# INT8 quantized TRT pre-decoder + cudaq-qec
ONNX_WORKFLOW=2 QUANT_FORMAT=int8 WORKFLOW=decoder_ablation bash code/scripts/local_run.sh

# Load a previously built engine, then run ablation
ONNX_WORKFLOW=3 WORKFLOW=decoder_ablation bash code/scripts/local_run.sh
```
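
To run all of the configurations above in one go, a small driver along the following lines may help. This is only a sketch: the helper names (`ablation_env`, `run_all`, `CONFIGS`) are hypothetical and not part of the repository, and it assumes that leaving `ONNX_WORKFLOW` and `QUANT_FORMAT` unset selects the PyTorch pre-decoder, as in the first command. Only the script path and the variable names are taken from the commands shown above.

```python
import os
import subprocess

# Path taken from the commands above; everything else here is illustrative.
SCRIPT = "code/scripts/local_run.sh"


def ablation_env(onnx_workflow=None, quant_format=None, base_env=None):
    """Return a copy of the environment with the ablation variables set.

    Variables that are not requested are removed, so a stale ONNX_WORKFLOW
    or QUANT_FORMAT in the calling shell does not leak into the run.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["WORKFLOW"] = "decoder_ablation"
    if onnx_workflow is None:
        env.pop("ONNX_WORKFLOW", None)
    else:
        env["ONNX_WORKFLOW"] = str(onnx_workflow)
    if quant_format is None:
        env.pop("QUANT_FORMAT", None)
    else:
        env["QUANT_FORMAT"] = quant_format
    return env


# PyTorch, TRT (export + build), INT8 TRT, then a previously built engine.
CONFIGS = [
    {},
    {"onnx_workflow": 2},
    {"onnx_workflow": 2, "quant_format": "int8"},
    {"onnx_workflow": 3},
]


def run_all(dry_run=True):
    """Run each configuration in turn; with dry_run=True only print the plan."""
    for cfg in CONFIGS:
        env = ablation_env(**cfg)
        if dry_run:
            print("would run", SCRIPT, "with", cfg or "defaults")
        else:
            subprocess.run(["bash", SCRIPT], env=env, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

The `ONNX_WORKFLOW=3` entry is placed last because it loads an engine built by an earlier run. Call `run_all(dry_run=False)` from the repository root to actually launch the runs.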

The ablation study reports per-decoder logical error rates, convergence statistics for
`cudaq-qec` BP variants, residual syndrome weight distributions, and timing breakdowns.
Results are written to `outputs/<EXPERIMENT_NAME>/plots/`.
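
When comparing the reported logical error rates between decoders, it can help to attach an uncertainty to each estimate. The sketch below is not taken from the repository; it assumes a logical error rate is estimated as failures divided by shots and uses a standard Wilson score interval. The function name `logical_error_rate` is hypothetical.

```python
import math


def logical_error_rate(failures, shots, z=1.96):
    """Return (rate, low, high) for a binomial failure count.

    rate is failures / shots; (low, high) is the Wilson score interval,
    with z = 1.96 corresponding to roughly 95% confidence.
    """
    if shots <= 0:
        raise ValueError("shots must be positive")
    if not 0 <= failures <= shots:
        raise ValueError("failures must lie between 0 and shots")
    p = failures / shots
    denom = 1 + z * z / shots
    centre = (p + z * z / (2 * shots)) / denom
    half = z * math.sqrt(p * (1 - p) / shots + z * z / (4 * shots * shots)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    rate, low, high = logical_error_rate(failures=50, shots=1000)
    print(f"LER = {rate:.4f}  (95% interval {low:.4f} to {high:.4f})")
```

Two decoders whose intervals overlap heavily are hard to distinguish at the given shot count, which is worth keeping in mind when reading the plots.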

**Decoder variants benchmarked:**

| Decoder | Source | Notes |
|---|---|---|
| No-op | — | Pre-decoder output only, no global correction |
| Union-Find | `ldpc` | Fast, sub-optimal |
| BP-only | `ldpc` | Belief propagation, no OSD |
| BP+LSD-0 | `ldpc` | BP with localized statistics decoding |
| Uncorr-PM | PyMatching | Uncorrelated minimum-weight perfect matching |
| Corr-PM | PyMatching | Correlated MWPM (best classical baseline) |
| cudaq-BP | `cudaq-qec` | Sum-product BP on GPU |
| cudaq-MinSum | `cudaq-qec` | Min-sum BP on GPU |
| cudaq-BP+OSD-0/7 | `cudaq-qec` | BP + ordered statistics decoding |
| cudaq-MemBP | `cudaq-qec` | Memory-based min-sum BP |
| cudaq-MemBP+OSD | `cudaq-qec` | Memory BP + OSD |
| cudaq-RelayBP | `cudaq-qec` | Sequential relay composition |

`cudaq-qec` decoders are loaded automatically when `cudaq_qec` is importable; the study
degrades gracefully to the non-cudaq decoders if the package is absent.
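
The fallback described above follows a common optional-dependency pattern, sketched below. This is an illustration under stated assumptions rather than the repository's actual code: the names `cudaq_qec_available`, `select_decoders`, `BASELINE_DECODERS` and `CUDAQ_DECODERS` are hypothetical, and only the module name `cudaq_qec` and the decoder labels come from this README.

```python
import importlib

# Labels copied from the table above; the grouping is an assumption.
BASELINE_DECODERS = [
    "No-op", "Union-Find", "BP-only", "BP+LSD-0", "Uncorr-PM", "Corr-PM",
]
CUDAQ_DECODERS = [
    "cudaq-BP", "cudaq-MinSum", "cudaq-BP+OSD-0/7",
    "cudaq-MemBP", "cudaq-MemBP+OSD", "cudaq-RelayBP",
]


def cudaq_qec_available():
    """Return True when `cudaq_qec` can actually be imported."""
    try:
        importlib.import_module("cudaq_qec")
    except ImportError:
        return False
    return True


def select_decoders(have_cudaq=None):
    """Return the decoder labels to benchmark.

    If have_cudaq is None, availability is probed by importing the package;
    otherwise the caller's value is used (handy for testing).
    """
    if have_cudaq is None:
        have_cudaq = cudaq_qec_available()
    return BASELINE_DECODERS + (CUDAQ_DECODERS if have_cudaq else [])


if __name__ == "__main__":
    for name in select_decoders():
        print(name)
```

Probing with a real import, rather than only checking that the package is installed, also catches installations that are present but fail to load, for example because of a missing GPU runtime library that surfaces as an `ImportError`.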

### GPU selection
