fix: suppress double measurement noise injection in MemoryCircuit#53
Conversation
…al data qubit measurement _add_stabilizer_round(logical_measurement=True) correctly injects time-reversed measurement noise then restores self.noise_model before returning. The subsequent add_measure() call sees noise_model is not None and injects the same p_meas noise a second time, creating phantom error channels in the DEM. Temporarily clear self.noise_model around the final add_measure() call, matching the pattern already used inside _add_stabilizer_round itself.
|
/ok to test fa7076d |
ivanbasov
left a comment
There was a problem hiding this comment.
Two contributions on top of the fix:
1. Inline comment on the noise_model suppression (memory_circuit.py)
The three-line guard is non-obvious. Suggest adding a comment so future readers understand why it exists (see inline).
2. Regression test — test_noise_model.py
Add this method to TestNoiseModel. It parses the post-REPEAT circuit section and asserts exactly one measurement-error injection appears on data qubits — the legitimate fake-SPAM line — not two. Deterministic, CPU-only, would have caught this bug.
def test_no_double_measurement_noise_in_final_data_qubit_readout(self):
"""
Regression test for double measurement-noise injection on data qubits at the end of
MemoryCircuit.__init__ when using the 25-parameter NoiseModel.
_add_stabilizer_round(logical_measurement=True) injects a single "fake SPAM" error on
data qubits (time-reversed p_meas) and then restores self.noise_model before returning.
Without the fix the subsequent add_measure(data_qubits) call at the __init__ call site
would see a non-None noise_model and inject the same p_meas channel a *second* time,
creating phantom DEM error entries that bias LER/threshold estimates.
The fix suppresses noise_model around that add_measure call. This test verifies that
the post-REPEAT circuit section contains exactly ONE measurement-error injection on data
qubits (the legitimate fake-SPAM line), not two.
"""
D = 3
T = 3 # n_rounds must be >= 3 for the circuit to use a REPEAT block
nm = NoiseModel(
p_prep_X=0.01,
p_prep_Z=0.02,
p_meas_X=0.03, # non-zero: triggers double-injection if bug is present
p_meas_Z=0.04,
p_idle_cnot_X=0.002,
p_idle_cnot_Y=0.001,
p_idle_cnot_Z=0.003,
p_idle_spam_X=0.002,
p_idle_spam_Y=0.001,
p_idle_spam_Z=0.003,
**{f"p_cnot_{k}": 0.0005 for k in CNOT_ERROR_TYPES}
)
for basis in ("X", "Z"):
circ = MemoryCircuit(
distance=D,
idle_error=nm.get_max_probability(),
sqgate_error=nm.get_max_probability(),
tqgate_error=nm.get_max_probability(),
spam_error=nm.get_max_probability(),
n_rounds=T,
basis=basis,
noise_model=nm,
code_rotation="XV",
)
circ.set_error_rates()
# Isolate the circuit section that appears after the REPEAT block.
lines = circ.circuit.split("\n")
in_repeat = False
after_repeat = False
post_repeat_lines = []
for line in lines:
stripped = line.strip()
if stripped.startswith("REPEAT"):
in_repeat = True
continue
if in_repeat and stripped == "}":
in_repeat = False
after_repeat = True
continue
if after_repeat:
post_repeat_lines.append(stripped)
# Basis-labelled semantics for data-qubit readout failure:
# X-basis measurement error -> Z_ERROR(p_meas_X)
# Z-basis measurement error -> X_ERROR(p_meas_Z)
# The only legitimate occurrence in the post-REPEAT section is the single fake-SPAM
# injection inside _add_stabilizer_round(logical_measurement=True). A second line
# with the same instruction is the regression.
if basis == "X":
error_instr = "Z_ERROR"
p_meas = float(nm.p_meas_X)
else:
error_instr = "X_ERROR"
p_meas = float(nm.p_meas_Z)
meas_error_lines = [l for l in post_repeat_lines if l.startswith(error_instr)]
self.assertEqual(
len(meas_error_lines), 1,
f"basis={basis}: expected exactly 1 {error_instr} line in post-REPEAT section "
f"(fake-SPAM only), got {len(meas_error_lines)}. "
f"Double injection would indicate the noise_model suppression fix is missing. "
f"Lines: {meas_error_lines}"
)
# Confirm the single line carries the correct probability.
expected_prefix = f"{error_instr}({p_meas:.10f})"
self.assertTrue(
meas_error_lines[0].startswith(expected_prefix),
f"basis={basis}: expected {error_instr} with p={p_meas:.10f}, "
f"got: {meas_error_lines[0]}"
)3. Fix two flaky CI failures — test_boundary_detectors.py
test_ler_improves_with_bd_noise_model and test_ler_improves_with_bd_all_orientations fail in CI on this branch. Root cause: both tests assert ler_with_bd < ler_no_bd (strict improvement). Those assertions were passing on main only because the phantom DEM entries from the bug artificially inflated no-BD LER. With the fix the phantom entries are gone; the true BD improvement is ~1–3%, well within the statistical noise of 10–20k independent samples (σ ≈ 1–3 errors). Verified empirically with 100k–200k samples: the difference is not reliably detectable at these parameters.
Fix: replace the strict assertLess / assertLessEqual with a 1.5× tolerance check. That bound is ~3σ detectable at these sample sizes and catches any genuine regression in BD logic.
test_ler_improves_with_bd_noise_model — replace the final assertion and update the docstring:
def test_ler_improves_with_bd_noise_model(self):
"""Test that boundary detectors do not significantly degrade LER when using NoiseModel.
NOTE on assertion strength: the LER improvement from boundary detectors is a marginal
~1-3% effect at these parameters. Asserting strict improvement (ler_with_bd <
ler_no_bd) is unreliable with sample sizes of 10k-50k because the two circuits are
sampled independently and the difference is well within statistical noise.
Before the double-measurement-noise fix the no-BD LER was *artificially* inflated by
phantom DEM entries, which made the strict-less assertion pass coincidentally. With the
corrected DEM the true improvement is small and we instead verify the weaker property:
boundary detectors must not increase LER by more than a factor of 1.5.
"""
# ... (keep existing circuit setup unchanged) ...
ratio = (ler_with_bd / ler_no_bd) if ler_no_bd > 0 else float("inf")
print(f" BD/no-BD ratio: {ratio:.2f}x")
# Boundary detectors must not substantially degrade LER.
self.assertLessEqual(
ler_with_bd,
ler_no_bd * 1.5,
f"BD degraded LER by more than 1.5x: no_bd={ler_no_bd:.4e}, with_bd={ler_with_bd:.4e}"
)test_ler_improves_with_bd_all_orientations — same change to the inner assertion:
self.assertLessEqual(
ler_with_bd,
ler_no_bd * 1.5,
f"rotation={rotation}: BD degraded LER by more than 1.5x: "
f"no_bd={ler_no_bd:.4e}, with_bd={ler_with_bd:.4e}"
)…ests Fixes double p_meas injection on data qubits in MemoryCircuit.__init__. _add_stabilizer_round(logical_measurement=True) injects the time-reversed "fake SPAM" error and then restores self.noise_model before returning. The subsequent add_measure(data_qubits) at the call site saw a non-None noise_model and injected the same p_meas channel a second time, creating phantom DEM error channels (7/21/43 extra entries at d=3/5/7) that distorted PyMatching's matching graph and biased LER estimates. Fix: temporarily suppress self.noise_model around add_measure(data_qubits), matching the pattern already used inside _add_stabilizer_round itself. Also adds: - Regression test in TestNoiseModel verifying exactly one measurement-error injection appears in the post-REPEAT circuit section (not two). - Updates to TestLERComparison in test_boundary_detectors.py: replaces strict ler_with_bd < ler_no_bd assertions with a 1.5x tolerance check. The strict assertions were accidentally passing because phantom DEM entries were artificially inflating no-BD LER; the true BD improvement is a marginal 1-3% effect below the statistical resolution of 10-20k samples. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
/ok to test 28ad1db |
- Remove duplicate orig_noise_model/add_measure/restore block introduced during cherry-pick conflict resolution (caused non-deterministic detectors) - Collapse assertLessEqual arguments onto single line for yapf compliance Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
/ok to test 016ed5e |
* fix: suppress double measurement noise injection in MemoryCircuit final data qubit measurement _add_stabilizer_round(logical_measurement=True) correctly injects time-reversed measurement noise then restores self.noise_model before returning. The subsequent add_measure() call sees noise_model is not None and injects the same p_meas noise a second time, creating phantom error channels in the DEM. Temporarily clear self.noise_model around the final add_measure() call, matching the pattern already used inside _add_stabilizer_round itself. * Update code/qec/surface_code/memory_circuit.py * fix: suppress double measurement noise injection in MemoryCircuit + tests Fixes double p_meas injection on data qubits in MemoryCircuit.__init__. _add_stabilizer_round(logical_measurement=True) injects the time-reversed "fake SPAM" error and then restores self.noise_model before returning. The subsequent add_measure(data_qubits) at the call site saw a non-None noise_model and injected the same p_meas channel a second time, creating phantom DEM error channels (7/21/43 extra entries at d=3/5/7) that distorted PyMatching's matching graph and biased LER estimates. Fix: temporarily suppress self.noise_model around add_measure(data_qubits), matching the pattern already used inside _add_stabilizer_round itself. Also adds: - Regression test in TestNoiseModel verifying exactly one measurement-error injection appears in the post-REPEAT circuit section (not two). - Updates to TestLERComparison in test_boundary_detectors.py: replaces strict ler_with_bd < ler_no_bd assertions with a 1.5x tolerance check. The strict assertions were accidentally passing because phantom DEM entries were artificially inflating no-BD LER; the true BD improvement is a marginal 1-3% effect below the statistical resolution of 10-20k samples. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: remove duplicate data-qubit measurement + yapf formatting - Remove duplicate orig_noise_model/add_measure/restore block introduced during cherry-pick conflict resolution (caused non-deterministic detectors) - Collapse assertLessEqual arguments onto single line for yapf compliance Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Ivan Basov <5455484+ivanbasov@users.noreply.github.com> Co-authored-by: Ivan Basov <ibasov@nvidia.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
MemoryCircuitwhen using the 25-parameter noise model_add_stabilizer_round(logical_measurement=True)correctly injects time-reversed measurement noise, then restoresself.noise_modelbefore returning. The subsequentadd_measure()at line 1009 seesnoise_model is not Noneand injects the samep_measnoise a second time, creating phantom error channels in the DEMself.noise_modelaround the finaladd_measure()call (3 lines added), matching the pattern already used inside_add_stabilizer_rounditselfImpact
All inference users running with the default
config_public.yaml25-parameter noise model are affected. The bug is silent (no crash), but introduces phantom DEM error channels that do not correspond to any real physical noise process:These phantom entries include spurious links to the logical observable (
D20 L0,D21 L0), distorting PyMatching's matching graph weights and systematically biasing LER and threshold estimates.Verification
Circuit-level (deterministic, no GPU needed):
Before fix:
X_ERROR(0.002)/Z_ERROR(0.002)present on all d² data qubits before final measurement.After fix: No spurious noise lines before final measurement.
Test plan
WORKFLOW=inference bash code/scripts/local_run.shruns successfully with fix appliedNVBug
https://nvbugspro.nvidia.com/bug/6059125