diff --git a/artifacts/requirements.yaml b/artifacts/requirements.yaml index 78a9775..dadce16 100644 --- a/artifacts/requirements.yaml +++ b/artifacts/requirements.yaml @@ -2688,13 +2688,19 @@ artifacts: server — e.g. a converging tree handed in first-appearance order) is likewise rejected (PlpError::NonTopologicalSinkTree) rather than panicking inside the sink-tree projection; reordering such trees so - PLP can fire is REQ-NC-PLP-002. This ports - panco's PURE polynomial PLP LP (no TFA/SFA strengthening): - bounds are always ≤ TFA and strictly inside the soundness sandwich - EXACT ≤ plp ≤ TFA, matching panco's polynomial-PLP reference to - solver precision — exact on 2 of 3 validation fixtures, within - 5.6% of EXACT on the 3-hop line (72.22 vs 68.4 µs, still ≪ the - 134.7 µs TFA). PLP works beyond the tandem-only LUDB/PMOO: 72.22 + PLP can fire is the follow-on REQ-NC-PLP-CONVERGE-001 (which + repurposes that same guard as a post-relabel invariant). This ports + panco's PURE polynomial PLP LP (no TFA/SFA strengthening): bounds are + SOUND (≥ EXACT) and, on the validation fixtures, also ≤ TFA. But pure + PLP and TFA are INCOMPARABLE over-approximations — neither dominates; + under extreme cross-burst pure PLP can exceed TFA (panco's own + reference does so on the converging-bridge net: PLP 1148 > TFA 1048 ≥ + EXACT 1035 µs — the network_wide_tfa_index_aligns_on_converging_tree + counterexample). The authoritative per-flow bound is therefore + min(PLP, TFA) (tracked as REQ-NC-PLP-MIN-001). On the validation + fixtures PLP matches panco's polynomial-PLP reference to solver + precision — exact on 2 of 3, within 5.6% of EXACT on the 3-hop line + (72.22 vs 68.4 µs, still ≪ the 134.7 µs TFA). PLP works beyond the tandem-only LUDB/PMOO: 72.22 ≪ 134.7 on a 3-server LINE is the dominance the existing tools cannot show. NOTE: this is PORT-FIDELITY against panco, not independent method-soundness (the LP is transcribed from panco and @@ -2719,14 +2725,12 @@ artifacts: spar-network shall extend PLP (REQ-NC-PLP-001) from feed-forward sink-trees to GENERIC topology — including CYCLIC service dependencies — via panco's forest decomposition (edges_forest) and - the LP fixpoint over the decomposed trees. Includes the smaller - precursor of topologically REORDERING a single-successor converging - sink-tree whose indices arrive non-topological (today rejected by - REQ-NC-PLP-001 as PlpError::NonTopologicalSinkTree) so PLP fires on - the common many-talkers→one-listener TSN shape — a relabelling, sound - because TFA/PLP are invariant under server renumbering, but gated on - the same converging-tree panco fixtures below before emitting numbers. - This carries the + the LP fixpoint over the decomposed trees. The smaller single- + successor precursor (topologically RELABELLING a converging sink-tree + whose indices arrive non-topological so PLP fires on the common + many-talkers→one-listener TSN shape) is split out as the now-shipped + REQ-NC-PLP-CONVERGE-001; this requirement is the GENERIC/CYCLIC core + that remains. This carries the headline claim REQ-NC-PLP-001 cannot test: PLP CONVERGES where TFA diverges at high utilization. That divergence is a cyclic- dependency phenomenon (TFA's fixpoint terminates on any feed- @@ -2739,6 +2743,43 @@ artifacts: arXiv:2010.09263; Tabatabaee/Bouillard/Le Boudec arXiv:2208.11400] status: proposed tags: [network-calculus, plp, cyclic, substrate, tier1] + fields: + release: v0.20.0 + + - id: REQ-NC-PLP-CONVERGE-001 + type: requirement + title: "PLP on converging sink-trees via topological relabel" + description: > + spar-network shall fire PLP (REQ-NC-PLP-001) on single-successor + CONVERGING sink-trees whose server indices arrive in NON-topological + order — the common many-talkers→one-listener TSN shape handed in + first-appearance order, which REQ-NC-PLP-001 shipped REJECTING + (PlpError::NonTopologicalSinkTree) rather than panicking inside the + sink-tree projection. plp_bound computes a topological relabelling of + servers and applies it as a pure permutation BEFORE projection, so the + sink lands at the highest index — the invariant the per-flow sub-network + solve assumes (sink == num_servers-1). This is SOUND with no change to + the bound because TFA/PLP are invariant under server renumbering, and a + graph already validated single-successor + acyclic is a rooted in-forest + that ALWAYS admits a topological order. The former NonTopologicalSinkTree + guard is not deleted but REPURPOSED as a post-relabel invariant check + (after relabel no edge may point to a lower-numbered server; a violation + means the topo sort is buggy → Err, never the .expect panic this guard + was added to prevent). GATE: per-flow bounds (a) match panco's pinned + converging-tree reference to solver precision, sound (≥ EXACT) and — + on these benign converging nets — below TFA (PLP ≤ TFA holds per- + fixture here; it is NOT universal, see REQ-NC-PLP-001's incomparability + note and counterexample), and (b) equal — exactly — the bounds of the SAME + physical network presented already-topologically (permutation + invariance, the primary property). Pinned converging fixtures from + panco: converge_2to1 (PLP 48.89 ≪ TFA 60.39 µs), converge_2to1_tail2 + (60.0 ≪ 106.44, the 1.77× wedge), converge_3to1 (70.0 ≪ 74.59). This is + the single-successor precursor split out of REQ-NC-PLP-002; the + GENERIC/CYCLIC core and the "PLP converges where TFA diverges" headline + remain in REQ-NC-PLP-002 (gated on not-yet-generated cyclic fixtures). + [SOLID — Bouillard arXiv:2010.09263] + status: implemented + tags: [network-calculus, plp, converging, substrate, tier1] fields: release: v0.19.0 @@ -2764,6 +2805,31 @@ artifacts: fields: release: v0.19.0 + - id: REQ-NC-PLP-MIN-001 + type: requirement + title: "Authoritative per-flow NC bound = min(PLP, TFA)" + description: > + The network-wide NC bridge (REQ-NC-BRIDGE-001) shall report, per flow, + the TIGHTER of its PLP and TFA bounds as the authoritative end-to-end + delay. MOTIVATION: pure polynomial PLP (REQ-NC-PLP-001) and TFA are + INCOMPARABLE over-approximations — both sound (≥ exact), but neither + dominates. PLP is usually tighter, yet under extreme cross-burst it can + exceed TFA (the live counterexample + network_wide_tfa_index_aligns_on_converging_tree: panco's own PLP 1148 + > TFA 1048 ≥ EXACT 1035 µs, reproduced to the decimal by spar). Taking + min(PLP, TFA) makes the reported bound never worse than TFA while + keeping the PLP win wherever it helps — so "PLP > TFA sometimes" is a + non-issue, not a regression. SCOPE: pure presentation/selection over the + two already-computed bounds; does NOT change either engine. This is + DISTINCT from REQ-NC-PLP-003 (which tightens the LP itself via TFA- + strengthening so PLP ≤ TFA holds by construction) — MIN-001 is the + cheap, always-correct selection that holds even for the pure LP. + GATE: on the converging-bridge fixture the bridge's authoritative bound + equals TFA (1048 µs) for the cross-burst flow, not the looser PLP. + [SOLID — Bouillard arXiv:2010.09263] + status: proposed + tags: [network-calculus, plp, bridge, substrate, tier1] + - id: REQ-NC-BRIDGE-001 type: requirement title: "AADL instance -> network-wide NC solver inputs (TfaFlow/PlpFlow)" @@ -2796,8 +2862,11 @@ artifacts: tagged tandem): TFA/PLP take ALL flows + ALL services at once. Use pmoo_or_sfa as the pattern for HOW to bridge (Stream -> solver input, fall back on failure), not WHAT to compute. Output: the network-wide - TFA and PLP end-to-end delay bound, where PLP <= TFA is the - demonstrable competitor wedge on a real model. + TFA and PLP end-to-end delay bound. PLP is usually the tighter bound + (the demonstrable competitor wedge on a real model), but PLP and TFA + are incomparable in general (REQ-NC-PLP-001) — the authoritative + per-flow bound is min(PLP, TFA) (REQ-NC-PLP-MIN-001); the bridge emits + both. FABRIC: TSN-first and TSN-only. NC (FIFO/CBS) is the CORRECT model for TSN/AVB and extract_network_graph already yields the NetworkGraph. diff --git a/artifacts/verification.yaml b/artifacts/verification.yaml index 6ff4d7f..6be2912 100644 --- a/artifacts/verification.yaml +++ b/artifacts/verification.yaml @@ -1810,8 +1810,10 @@ artifacts: (optimistic) bound, which a self-consistent hand-golden cannot. panco is not vendored; the reference numbers are pinned in-test with provenance + regen recipe in tests/oracles/panco/README.md (Python is - not in spar's CI path). Establishes the soundness floor that - REQ-NC-PLP-001 is gated against (sandwich EXACT <= plp <= TFA). + not in spar's CI path). Establishes the EXACT soundness floor that + REQ-NC-PLP-001 is gated against (plp >= EXACT). PLP <= TFA holds on + these fixtures but is NOT a universal law (PLP and TFA are + incomparable — see TEST-NC-PLP's incomparability note). fields: method: automated-test steps: @@ -1837,22 +1839,26 @@ artifacts: the SOUNDNESS floor (plp >= panco EXACT ELP bound), PORT-FIDELITY (spar's HiGHS solve reproduces panco's lp_solve pure-PLP optimum to <= 0.05% — solver precision: 39.0/34.0, 27.0, and the discriminating - 72.22/61.11/57.98 µs), and the SANDWICH ceiling (plp <= spar's own - TFA bound). Pure PLP equals EXACT on the tandems and is strictly - inside the sandwich on the 3-hop line (72.22 vs 68.4 µs EXACT, - << 134.7 µs TFA) — the dominance LUDB/PMOO cannot show on a line. + 72.22/61.11/57.98 µs), and a PER-FIXTURE pin plp <= spar's own TFA + bound. That ≤ TFA pin is NOT a universal law: pure polynomial PLP and + TFA are INCOMPARABLE over-approximations (both sound, neither + dominates) — under extreme cross-burst PLP can exceed TFA (panco's own + PLP 1148 > TFA 1048 ≥ EXACT 1035 µs on the converging-bridge net; see + network_wide_tfa_index_aligns_on_converging_tree). The authoritative + bound is min(PLP, TFA) (REQ-NC-PLP-MIN-001). On these benign fixtures + pure PLP equals EXACT on the tandems and is strictly below TFA on the + 3-hop line (72.22 vs 68.4 µs EXACT, << 134.7 µs TFA) — the dominance + LUDB/PMOO cannot show on a line. Because the LP is transcribed from panco and the floor is also panco's, this is port-fidelity (a regression pin), NOT independent method-soundness; the latter is the deferred simulation floor REQ-NC-SIM-FLOOR-001. Cyclic / branching topology is rejected (NotFeedForward, REQ-NC-PLP-002); the tfa=True tightening to EXACT is - REQ-NC-PLP-003. Two guard tests pin the soundness boundary of the - sink-tree projection: non_topological_sink_tree_rejected asserts a - valid converging tree whose sink is NOT the highest-numbered server - returns PlpError::NonTopologicalSinkTree (no panic in solve), and - topological_sink_tree_accepted asserts the same shape relabelled with - the sink last solves cleanly — proving the guard rejects the labelling, - not the topology (reordering is REQ-NC-PLP-002). + REQ-NC-PLP-003. As of REQ-NC-PLP-CONVERGE-001 a non-topological + single-successor converging tree is no longer rejected but topologically + RELABELLED and solved (see TEST-NC-PLP-CONVERGE); the former + NonTopologicalSinkTree guard is retained as a post-relabel invariant + that still prevents the .expect panic if the topo sort is buggy. fields: method: automated-test steps: @@ -1866,6 +1872,45 @@ artifacts: - type: satisfies target: REQ-NC-VALIDATION-001 + - id: TEST-NC-PLP-CONVERGE + type: feature + title: PLP on converging sink-trees via topological relabel + description: > + Exercises the topological-relabel path added to plp_bound for + single-successor CONVERGING sink-trees handed in non-topological order + (many-talkers→one-listener). Two layers of oracle. (1) PERMUTATION + INVARIANCE — the primary property, needing no external numbers: the + plp.rs unit test non_topological_relabel_is_permutation_invariant builds + a converging tree whose sink is NOT the highest-numbered server, asserts plp_bound + now returns Ok (no PlpError::NonTopologicalSinkTree, no panic in solve), + and asserts its per-flow bound equals — exactly — the bound of the SAME + physical network presented already-topologically (sink last). This + proves the relabel is a sound pure permutation. topological_sink_tree_ + accepted is retained unchanged as the already-ordered baseline. (2) + SOUNDNESS FLOOR — the plp_converge_xval module in nc_validation.rs + (feature milp-solver) runs the three pinned converging panco fixtures in + non-topological first-appearance order and asserts per-flow PLP matches + panco's pure-PLP reference to solver precision, sound (≥ EXACT) and — + on these benign converging nets — below TFA (PLP ≤ TFA holds per-fixture + here; it is NOT universal, see TEST-NC-PLP's incomparability note): + converge_2to1 PLP 48.89 ≪ TFA 60.39, converge_2to1_ + tail2 60.0 ≪ 106.44 (the 1.77× wedge), converge_3to1 70.0 ≪ 74.59 µs. + Values regenerated fresh from the pinned panco venv (lp_solve backend). + The relabel closes the sink==num_servers-1 assumption the sub-network + solve depends on (lines that .expect a successor for every non-sink + server), so the converging shape — the dominant TSN topology — now + yields the PLP≪TFA dominance LUDB/PMOO cannot show on a tree. + fields: + method: automated-test + steps: + - run: "cargo test -p spar-network --lib plp::" + - run: "cargo test -p spar-network --test nc_validation plp_xval::panco_converge" + status: passing + tags: [v0.19.0, network, network-calculus, plp, converging, validation, oracle] + links: + - type: satisfies + target: REQ-NC-PLP-CONVERGE-001 + - id: TEST-NC-BRIDGE type: feature title: AADL instance → network-wide NC solver bridge @@ -1898,7 +1943,16 @@ artifacts: Non-FIFO hops, unstable servers, and PLP NotFeedForward skip the arm transparently, leaving the per-stream SFA bounds authoritative. The production caller is the CLI --pmoo path - (run_all_analyses_with_pmoo, package spar). + (run_all_analyses_with_pmoo, package spar). A fourth test, + network_wide_tfa_index_aligns_on_converging_tree, guards the global + index on a CONVERGING (many-talkers→one-sink) shape; as of + REQ-NC-PLP-CONVERGE-001 the PLP arm fires there (topological relabel) + rather than skipping. That fixture is a deliberately pathological + 10000-µs-slow-switch net and doubles as the live counterexample to a + universal PLP ≤ TFA: pure PLP exceeds TFA there (panco PLP 1148 > TFA + 1048 ≥ EXACT 1035 µs, reproduced to the decimal), so the test asserts + only index alignment (slow-stream PLP dominates fast), not a sandwich. + The authoritative bound min(PLP, TFA) is REQ-NC-PLP-MIN-001. fields: method: automated-test steps: diff --git a/crates/spar-analysis/src/wctt.rs b/crates/spar-analysis/src/wctt.rs index d10743a..865ddfc 100644 --- a/crates/spar-analysis/src/wctt.rs +++ b/crates/spar-analysis/src/wctt.rs @@ -1045,9 +1045,14 @@ fn network_wide_nc_bounds( }); } - // PLP: the tighter feed-forward bound (PLP ≤ TFA per flow on a - // sink-tree). HiGHS/good_lp-backed, so gated behind `milp-solver` - // (off for wasm, #259). A multi-successor / cyclic topology returns + // PLP: the polynomial-LP feed-forward bound (Bouillard arXiv:2010.09263). + // Usually tighter than TFA, but PLP and TFA are *incomparable* + // over-approximations — both sound (≥ exact), neither dominates: under + // extreme cross-burst pure PLP can exceed TFA (see the + // `network_wide_tfa_index_aligns_on_converging_tree` counterexample). We + // emit both; the authoritative per-flow bound is min(PLP, TFA). + // HiGHS/good_lp-backed, so gated behind `milp-solver` (off for wasm, + // #259). A multi-successor / cyclic topology returns // `PlpError::NotFeedForward`; we then omit the PLP lines and the TFA // numbers above remain the network-wide result. #[cfg(feature = "milp-solver")] @@ -1066,7 +1071,7 @@ fn network_wide_nc_bounds( severity: Severity::Info, message: format!( "WcttPlpBound: stream '{}' network-wide PLP end-to-end delay {} ps \ - (≤ TFA; {} hop{})", + ({} hop{})", s.display_name(instance), delay_ps, s.hops.len(), @@ -3830,13 +3835,14 @@ end NetDiv; // `sw_c`. Every server's out-degree is ≤ 1, so it is a genuine // feed-forward sink-tree — but first-appearance global indexing // (sw_a→0, sw_c→1, sw_b→2) places the sink (sw_c) OFF the highest index, - // which is not a topological labelling. `plp_bound` rejects that with - // `NonTopologicalSinkTree` (reordering converging trees so PLP can fire - // is REQ-NC-PLP-002), so the PLP arm cleanly skips here. The value of - // this fixture is twofold: (1) it guards the *TFA* global index on a - // converging shape (complementing the diverging fixture's), and (2) it - // proves the bridge's `if let Ok(plp)` contract survives a sink-tree the - // solver declines — no panic, TFA bounds still emitted. + // which is not a topological labelling. As of REQ-NC-PLP-CONVERGE-001 + // `plp_bound` relabels that into topological order internally and FIRES + // (the v0.18 `NonTopologicalSinkTree` reject path is now an internal + // post-relabel invariant, not a caller-visible skip). The value of this + // fixture is twofold: (1) it guards the *TFA* and *PLP* global index on a + // converging shape (complementing the diverging fixture's), and (2) it is + // a deliberately pathological 10000-µs-slow-switch network where pure PLP + // exceeds TFA — the live counterexample to a universal `PLP ≤ TFA`. fn bridge_converge_aadl() -> &'static str { r#" package NetConv @@ -3950,20 +3956,46 @@ end NetConv; index would alias its slow hop onto the fast curve. delays={delays:#?}", ); - // Soundness end-to-end: the converging tree's first-appearance index - // is non-topological (sink sw_c is slot 1, not the last slot), so - // `plp_bound` returns `NonTopologicalSinkTree` and the bridge skips - // the PLP arm WITHOUT panicking. Before the plp.rs entry guard this - // path panicked at `solve` (`successor[sink] = None` on a server it - // treated as non-sink). Converging PLP via reordering is - // REQ-NC-PLP-002. + // Index alignment end-to-end: the converging tree's first-appearance + // index is non-topological (sink sw_c is slot 1, not the last slot). As + // of REQ-NC-PLP-CONVERGE-001 the bridge no longer skips this — + // `plp_bound` relabels the tree into topological order and FIRES, + // emitting one bound per stream. Before the relabel this path skipped + // (the v0.18 `NonTopologicalSinkTree` guard); before that guard it + // panicked at `solve` (`successor[sink] = None`). + // + // NOTE — no `PLP ≤ TFA` sandwich is asserted here, and that is + // deliberate: pure polynomial PLP and TFA are *incomparable* + // over-approximations. Both are sound (≥ exact), but on this + // 10000-µs-slow-switch network the cross-burst at the shared sink + // inflates the fast stream's PLP *above* its TFA. panco's own + // reference confirms it (panco PLP dataA = 1148.33µs > panco TFA + // dataA = 1048.09µs ≥ panco EXACT 1035.2µs), and spar reproduces + // panco's PLP to the decimal — see `nc_validation::plp_xval`. The + // authoritative per-flow bound is therefore min(PLP, TFA), tracked as + // a follow-on NC item; this test asserts only the index alignment that + // is its purpose. #[cfg(feature = "milp-solver")] { let plp = collect_delays(&diags, "WcttPlpBound"); + assert_eq!( + plp.len(), + 2, + "PLP now fires on the relabelled converging tree \ + (REQ-NC-PLP-CONVERGE-001), one bound per stream: {plp:#?}", + ); + // Same index-alignment guard as the TFA arm: the stream through + // the slow switch must dominate. A local (per-stream) index would + // alias dataB's slow first hop onto sw_a's fast curve and the two + // PLP bounds would be near-equal. The real PLP gap here is ~8.7× + // (10032µs vs 1148µs), so ×5 is a safe, non-flaky margin. + let mut plp_sorted: Vec = plp.values().copied().collect(); + plp_sorted.sort_unstable(); + let (plp_fast, plp_slow) = (plp_sorted[0], plp_sorted[1]); assert!( - plp.is_empty(), - "PLP must cleanly skip a non-topological converging tree \ - (REQ-NC-PLP-002), not fire or panic: {plp:#?}", + plp_fast > 0 && plp_slow > plp_fast.saturating_mul(5), + "the PLP bound through the slow switch must dominate; a local \ + index would alias its slow hop onto the fast curve. plp={plp:#?}", ); } } diff --git a/crates/spar-network/src/plp.rs b/crates/spar-network/src/plp.rs index f9c5ea3..517783c 100644 --- a/crates/spar-network/src/plp.rs +++ b/crates/spar-network/src/plp.rs @@ -116,14 +116,13 @@ pub enum PlpError { /// relation is single-valued), so branching is rejected here too. /// Generic feed-forward (forks) and cyclic topology are REQ-NC-PLP-002. NotFeedForward, - /// The topology is a single-successor sink-tree, but its server indices - /// are not a topological order: some edge `j → s` has `s < j`, so the - /// sink is not the highest-numbered server. [`SubNetwork`] re-indexes - /// its per-flow projection by ascending global index and assumes the - /// sink is the last local index, so a non-topological labelling would - /// otherwise misplace the sink (and panic in `solve`). Reordering a - /// converging tree into topological order so PLP can fire on it is - /// REQ-NC-PLP-002; callers should reorder before retrying. + /// Internal invariant violation: after the topological relabel + /// (REQ-NC-PLP-CONVERGE-001) an edge `j → s` still has `s < j`, so the + /// labelling handed to the per-flow projection is not topological and + /// `solve` would misplace the sink (panic). `plp_bound` now reorders any + /// single-successor converging tree into topological order itself, so a + /// caller should never see this — it fires only if the relabel pass is + /// buggy, surfaced as an error rather than the `.expect` panic in `solve`. NonTopologicalSinkTree, /// Aggregate sustained rate `≥` service rate at some server: the /// necessary stability condition fails and no finite bound exists, so @@ -155,7 +154,7 @@ impl core::fmt::Display for PlpError { ), Self::NonTopologicalSinkTree => write!( f, - "PLP: sink-tree server indices are not in topological order (an edge points to a lower index, so the sink is not the highest-numbered server) — reorder before retrying; see REQ-NC-PLP-002" + "PLP: internal invariant: server labelling is still non-topological after the converging-tree relabel (REQ-NC-PLP-CONVERGE-001) — the relabel pass is buggy" ), Self::Unstable { server, @@ -196,19 +195,7 @@ pub fn plp_bound(flows: &[PlpFlow], services: &[ServiceCurve]) -> Result 1 means branching/cyclic → REQ-NC-PLP-002. - let mut successors: Vec> = vec![Vec::new(); n_servers]; - let mut predecessors: Vec> = vec![Vec::new(); n_servers]; - for flow in flows { - for w in flow.path.windows(2) { - let (a, b) = (w[0], w[1]); - if !successors[a].contains(&b) { - successors[a].push(b); - } - if !predecessors[b].contains(&a) { - predecessors[b].push(a); - } - } - } + let (successors, predecessors) = adjacency(flows, n_servers); for s in &successors { if s.len() > 1 { return Err(PlpError::NotFeedForward); @@ -219,16 +206,55 @@ pub fn plp_bound(flows: &[PlpFlow], services: &[ServiceCurve]) -> Result; + let relabelled_services: Vec; + let (flows, services, successors, predecessors) = + if new_of.iter().enumerate().all(|(old, &new)| old == new) { + (flows, services, successors, predecessors) + } else { + let mut svc = services.to_vec(); + for (old, s) in services.iter().enumerate() { + svc[new_of[old]] = *s; + } + relabelled_services = svc; + relabelled_flows = flows + .iter() + .map(|f| PlpFlow { + alpha: f.alpha, + path: f.path.iter().map(|&h| new_of[h]).collect(), + }) + .collect(); + let (succ, pred) = adjacency(&relabelled_flows, n_servers); + ( + relabelled_flows.as_slice(), + relabelled_services.as_slice(), + succ, + pred, + ) + }; + + // Post-relabel invariant (the guard REQ-NC-PLP-001 added, repurposed): + // the labelling is now topological, so every edge `j → s` has `s > j`. A + // violation means `topological_relabel` is buggy — return an error rather + // than hit the `.expect("non-sink has a successor")` panic in `solve`. for (j, succ) in successors.iter().enumerate() { if succ.first().is_some_and(|&s| s < j) { return Err(PlpError::NonTopologicalSinkTree); @@ -275,6 +301,59 @@ pub fn plp_bound(flows: &[PlpFlow], services: &[ServiceCurve]) -> Result (Vec>, Vec>) { + let mut successors: Vec> = vec![Vec::new(); n]; + let mut predecessors: Vec> = vec![Vec::new(); n]; + for flow in flows { + for w in flow.path.windows(2) { + let (a, b) = (w[0], w[1]); + if !successors[a].contains(&b) { + successors[a].push(b); + } + if !predecessors[b].contains(&a) { + predecessors[b].push(a); + } + } + } + (successors, predecessors) +} + +/// Compute a topological relabelling `new_of[old] = new_index` of an acyclic, +/// ≤1-successor graph (a rooted in-forest) via Kahn's algorithm, taking the +/// smallest available old index first so the result is deterministic. Sources +/// (talkers, in-degree 0) take the low indices and sinks the high ones, so +/// every edge `old → succ` ends up with `new_of[old] < new_of[succ]` — the +/// order `SubNetwork`/`solve` require. Returns the identity permutation when +/// the input is already topological. Caller must have ruled out cycles +/// (`has_cycle`); on an acyclic graph the Kahn loop always assigns all `n`. +fn topological_relabel(successors: &[Vec], n: usize) -> Vec { + let mut in_degree = vec![0usize; n]; + for succ in successors { + for &b in succ { + in_degree[b] += 1; + } + } + let mut new_of = vec![0usize; n]; + let mut assigned = vec![false; n]; + // On an acyclic graph each iteration places exactly one node, so the loop + // index `next` is the new index to assign (0 = first source … n-1 = sink). + for next in 0..n { + // Smallest-index ready node (in-degree 0, not yet placed). + let Some(u) = (0..n).find(|&v| !assigned[v] && in_degree[v] == 0) else { + break; // unreachable on an acyclic graph; a cycle would leave nodes + }; + assigned[u] = true; + new_of[u] = next; + for &w in &successors[u] { + in_degree[w] -= 1; + } + } + new_of +} + /// Does following the unique successor from any node revisit a node? On a /// feed-forward network it never does (servers form a rooted forest). fn has_cycle(successors: &[Vec], n: usize) -> bool { @@ -834,36 +913,56 @@ mod tests { ); } - /// A converging sink-tree (two talkers → one shared sink) is a valid - /// single-successor feed-forward shape, but here its server indices are - /// NOT topological: flow A `0→1`, flow B `2→1` — the sink (server 1) is - /// not the highest-numbered server (talker 2 outranks it). `SubNetwork` - /// assumes the sink is the last local index, so without the entry guard - /// `solve` would panic (`successor[sink] = None` on a server it treats as - /// non-sink). It must be REJECTED, not panic — reordering converging - /// trees so PLP can fire is REQ-NC-PLP-002. + /// A converging sink-tree handed in NON-topological order is no longer + /// rejected — `plp_bound` relabels it (REQ-NC-PLP-CONVERGE-001). This is a + /// pure permutation, so the per-flow bounds must be IDENTICAL to the same + /// physical network presented already-topologically. Distinct per-server + /// latencies make the check bite: a relabel that misassigns a service + /// curve to a server would change the numbers (and swap flow A vs B). #[test] - fn non_topological_sink_tree_rejected() { - let services = vec![ - ServiceCurve::rate_latency(GBPS, 10 * US_PS), - ServiceCurve::rate_latency(GBPS, 10 * US_PS), - ServiceCurve::rate_latency(GBPS, 10 * US_PS), + fn non_topological_relabel_is_permutation_invariant() { + // One physical converging net: two talkers (sA latency 5 µs, + // sB latency 8 µs) into one shared sink switch (latency 20 µs). + // (a) Topological labelling: sink is the highest index (2). + let topo_services = vec![ + ServiceCurve::rate_latency(GBPS, 5 * US_PS), // sA = 0 + ServiceCurve::rate_latency(GBPS, 8 * US_PS), // sB = 1 + ServiceCurve::rate_latency(GBPS, 20 * US_PS), // sink = 2 ]; - // sink is server 1; talker 2 has a higher index → edge 2→1 descends. - let flows = vec![ - flow(FRAME, HUNDRED_MBPS, &[0, 1]), - flow(FRAME, HUNDRED_MBPS, &[2, 1]), + let topo_flows = vec![ + flow(FRAME, HUNDRED_MBPS, &[0, 2]), // A: sA → sink + flow(FRAME, HUNDRED_MBPS, &[1, 2]), // B: sB → sink ]; - assert_eq!( - plp_bound(&flows, &services).unwrap_err(), - PlpError::NonTopologicalSinkTree + let topo = plp_bound(&topo_flows, &topo_services).expect("topological in scope"); + + // (b) Same physical net, NON-topological labelling: sink = 0 (not + // last), sA = 1, sB = 2 — edges 1→0 and 2→0 both descend. + let shuf_services = vec![ + ServiceCurve::rate_latency(GBPS, 20 * US_PS), // sink = 0 + ServiceCurve::rate_latency(GBPS, 5 * US_PS), // sA = 1 + ServiceCurve::rate_latency(GBPS, 8 * US_PS), // sB = 2 + ]; + let shuf_flows = vec![ + flow(FRAME, HUNDRED_MBPS, &[1, 0]), // A: sA → sink + flow(FRAME, HUNDRED_MBPS, &[2, 0]), // B: sB → sink + ]; + let shuf = plp_bound(&shuf_flows, &shuf_services) + .expect("non-topological converging tree is reordered, not rejected"); + + // Per-flow bounds match exactly (flow A == flow A, flow B == flow B); + // the relabel did not perturb the result. A and B differ (5 vs 8 µs + // talker), so the element-wise equality also catches a transpose bug. + assert_eq!(shuf.flow_delay_ps, topo.flow_delay_ps); + assert!(shuf.flow_delay_ps.iter().all(|&d| d > 0)); + assert_ne!( + topo.flow_delay_ps[0], topo.flow_delay_ps[1], + "asymmetric talkers should give distinct bounds" ); } - /// The same converging tree, but with the sink relabelled to the highest - /// index (flow A `0→2`, flow B `1→2`), IS in scope and solves — proving - /// the guard rejects only the *labelling*, not the *shape*. (REQ-NC-PLP-002 - /// will reorder automatically; until then callers pre-order.) + /// A converging tree already in topological order (sink at the highest + /// index: flow A `0→2`, flow B `1→2`) solves directly — the relabel is the + /// identity here, exercising the no-op path. (REQ-NC-PLP-CONVERGE-001) #[test] fn topological_sink_tree_accepted() { let services = vec![ diff --git a/crates/spar-network/tests/nc_validation.rs b/crates/spar-network/tests/nc_validation.rs index f4344c8..84b596c 100644 --- a/crates/spar-network/tests/nc_validation.rs +++ b/crates/spar-network/tests/nc_validation.rs @@ -182,8 +182,13 @@ fn panco_three_server_line_sound_and_agrees() { // LP and the exact floor is *also* panco's, so the agreement check here is // **port-fidelity** (a solver-precision regression pin), not independent // soundness — that is the deferred simulation floor REQ-NC-SIM-FLOOR-001. -// What stays independent is the sandwich: `EXACT ≤ plp ≤ TFA`, where the TFA -// ceiling is spar's own engine. +// What stays independent is the EXACT soundness floor (`plp ≥ panco exact`), +// where panco's exact ELP is the true worst case. We ALSO assert `plp ≤ TFA` +// here, but note that is a *per-fixture* observation on these benign nets, +// NOT a universal law: pure polynomial PLP and TFA are incomparable +// over-approximations (neither dominates), and under extreme cross-burst PLP +// can exceed TFA — see the `network_wide_tfa_index_aligns_on_converging_tree` +// counterexample in spar-analysis, where panco's own PLP > panco's own TFA. #[cfg(feature = "milp-solver")] mod plp_xval { use super::*; @@ -204,9 +209,12 @@ mod plp_xval { } } - /// For one fixture: spar PLP must (1) be `≥` panco EXACT (soundness - /// floor), (2) match the pinned pure-PLP value within `PLP_TOL`, and - /// (3) be `≤` spar's own TFA bound (the sandwich ceiling). + /// For one fixture: spar PLP must (1) be `≥` panco EXACT (the independent + /// soundness floor), (2) match the pinned pure-PLP value within `PLP_TOL`, + /// and (3) be `≤` spar's own TFA bound. Check (3) is a *per-fixture* pin — + /// PLP ≤ TFA holds on these benign nets but is NOT universal (the two are + /// incomparable; see the converging-tree counterexample), so it is a + /// regression pin on these inputs, not a soundness law. fn check_plp( name: &str, flows: &[PlpFlow], @@ -246,10 +254,13 @@ mod plp_xval { rel * 100.0, PLP_TOL * 100.0, ); - // (3) SANDWICH ceiling — plp ≤ spar TFA (the dominance claim). + // (3) PER-FIXTURE pin — plp ≤ spar TFA on these benign nets. NOT a + // universal law: PLP and TFA are incomparable (see the converging- + // tree counterexample); this catches regressions on these inputs. assert!( plp_ps <= tfa.flow_delay_ps[i], - "{name} flow {i}: plp {plp_ps} ps exceeds spar TFA {} ps (sandwich violated)", + "{name} flow {i}: plp {plp_ps} ps exceeds spar TFA {} ps \ + (per-fixture PLP≤TFA pin broke)", tfa.flow_delay_ps[i], ); } @@ -298,7 +309,8 @@ mod plp_xval { plp_flow(FRAME, HUNDRED_MBPS, &[1, 2]), ]; // Pure PLP `[72.22, 61.11, 57.98]` — looser than EXACT `[68.4, 58.4, - // 57.98]` but strictly inside the sandwich (TFA `[134.7, 86.8, 100.7]`). + // 57.98]` but, on this benign net, strictly below TFA `[134.7, 86.8, + // 100.7]` (PLP ≤ TFA is per-fixture here, not universal). check_plp( "three_server_line", &flows, @@ -307,4 +319,84 @@ mod plp_xval { &[68.4, 58.4, 57.98], ); } + + // ── Converging sink-trees (REQ-NC-PLP-CONVERGE-001) ─────────────────── + // The many-talkers→one-listener TSN shape. Each fixture is handed in + // NON-topological order (sink at index 0, talkers above it) so the path + // edges all *descend* — exercising `plp_bound`'s topological relabel. The + // pinned panco values were regenerated fresh from the offline panco venv + // (pure FifoLP(polynomial=True, tfa=False), lp_solve backend). Because the + // bound is invariant under the relabel, these equal what panco reports for + // the same physical net in topological order. The PLP ≪ TFA gap here is the + // converging-tree dominance LUDB/PMOO cannot show. + + /// 2 talkers → 1 sink (1 hop each). panco: PLP 48.89 ≪ TFA 60.39 µs. + #[test] + fn panco_converge_2to1_plp() { + // sink = 0; talkers = 1, 2. Edges 1→0 and 2→0 both descend. + let services = vec![ + ServiceCurve::rate_latency(GBPS, 10 * US_PS), // sink = 0 + ServiceCurve::rate_latency(GBPS, 10 * US_PS), // talker A = 1 + ServiceCurve::rate_latency(GBPS, 10 * US_PS), // talker B = 2 + ]; + let flows = vec![ + plp_flow(FRAME, HUNDRED_MBPS, &[1, 0]), + plp_flow(FRAME, HUNDRED_MBPS, &[2, 0]), + ]; + check_plp( + "converge_2to1", + &flows, + &services, + &[48.89, 48.89], + &[46.2, 46.2], + ); + } + + /// 2 talkers → shared tail switch → 1 sink (2 hops each). The 1.77× wedge: + /// panco PLP 60.0 ≪ TFA 106.44 µs. + #[test] + fn panco_converge_2to1_tail2_plp() { + // sink = 0; shared tail = 1; talkers = 2, 3. Paths 2→1→0 and 3→1→0. + let services = vec![ + ServiceCurve::rate_latency(GBPS, 10 * US_PS), // sink = 0 + ServiceCurve::rate_latency(GBPS, 10 * US_PS), // shared tail = 1 + ServiceCurve::rate_latency(GBPS, 10 * US_PS), // talker A = 2 + ServiceCurve::rate_latency(GBPS, 10 * US_PS), // talker B = 3 + ]; + let flows = vec![ + plp_flow(FRAME, HUNDRED_MBPS, &[2, 1, 0]), + plp_flow(FRAME, HUNDRED_MBPS, &[3, 1, 0]), + ]; + check_plp( + "converge_2to1_tail2", + &flows, + &services, + &[60.0, 60.0], + &[56.2, 56.2], + ); + } + + /// 3 talkers → 1 sink (1 hop each). panco: PLP 70.0 ≪ TFA 74.59 µs. + #[test] + fn panco_converge_3to1_plp() { + // sink = 0; talkers = 1, 2, 3. Edges 1→0, 2→0, 3→0 all descend. + let services = vec![ + ServiceCurve::rate_latency(GBPS, 10 * US_PS), // sink = 0 + ServiceCurve::rate_latency(GBPS, 10 * US_PS), // talker A = 1 + ServiceCurve::rate_latency(GBPS, 10 * US_PS), // talker B = 2 + ServiceCurve::rate_latency(GBPS, 10 * US_PS), // talker C = 3 + ]; + let flows = vec![ + plp_flow(FRAME, HUNDRED_MBPS, &[1, 0]), + plp_flow(FRAME, HUNDRED_MBPS, &[2, 0]), + plp_flow(FRAME, HUNDRED_MBPS, &[3, 0]), + ]; + check_plp( + "converge_3to1", + &flows, + &services, + &[70.0, 70.0, 70.0], + &[60.4, 60.4, 60.4], + ); + } }