Commit 7eec0f1

Make grouped AVG and ratio tests fan-out sensitive
Grouped AVG: replace the single-customer-per-group test with tier-based grouping, where the gold tier contains Alice (age 30, 3 orders) and Bob (age 25, 1 order). The correct AVG is 27.5; fan-out would weight Alice 3x, giving (30*3 + 25)/4 = 28.75.

Grouped ratio: remove the per-product grouped profit margin test. (SUM(rev)-SUM(cost))/SUM(rev) is scale-invariant under uniform row duplication within a group, so the grouped result is the same with or without fan-out and the test could never detect it. The ungrouped ratio test already validates fan-out prevention.
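The arithmetic in the grouped AVG note can be reproduced with plain SQL. The following is a minimal sketch using Python's built-in `sqlite3`, not the measures extension under test. The `fanout_tiered_custs` rows are taken from the diff; the contents of `fanout_orders` are an assumption inferred from the order counts in the commit message (3, 1, and 2 orders for customers 1, 2, and 3), and the `order_id` column is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fanout_tiered_custs (cust_id INT, tier TEXT, age INT);
INSERT INTO fanout_tiered_custs VALUES
  (1, 'gold', 30), (2, 'gold', 25), (3, 'silver', 40);

-- Assumed contents: 3 orders for customer 1, 1 for customer 2, 2 for customer 3.
CREATE TABLE fanout_orders (order_id INT, cust_id INT);
INSERT INTO fanout_orders VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 3), (6, 3);
""")

# Naive aggregate over the joined rows: each customer's age is repeated once
# per order, so customers with more orders are over-weighted.
naive = dict(conn.execute("""
    SELECT t.tier, AVG(t.age)
    FROM fanout_tiered_custs t
    JOIN fanout_orders o ON t.cust_id = o.cust_id
    GROUP BY t.tier
""").fetchall())

# Aggregate at the customer grain: each customer counts exactly once.
correct = dict(conn.execute("""
    SELECT tier, AVG(age)
    FROM fanout_tiered_custs
    WHERE cust_id IN (SELECT cust_id FROM fanout_orders)
    GROUP BY tier
""").fetchall())

print(naive)    # gold is skewed toward the customer with 3 orders
print(correct)  # gold is the plain mean of the two gold customers
```

With a single customer per group, as in the old test, both queries agree for every group, which is why the old test could not detect fan-out.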
1 parent 20b5fe3 commit 7eec0f1

1 file changed

+26
-22
lines changed

test/sql/measures.test

Lines changed: 26 additions & 22 deletions
@@ -1011,19 +1011,34 @@ JOIN fanout_orders o ON c.cust_id = o.cust_id;
 ----
 3
 
-# -- Test 4: grouped join preserves per-group measure correctness --
-# Each group has one customer. The measure should reflect the true per-customer
-# aggregate, not be distorted by order-count differences.
+# -- Test 4: grouped join with multiple customers per group --
+# Group by tier so each group has >1 customer with different order counts.
+# Gold: Alice(30, 3 orders) + Bob(25, 1 order). Correct AVG = 27.5.
+# Fan-out would weight Alice 3x: (30*3+25)/4 = 28.75 (WRONG).
+# Silver: Carol(40, 2 orders) alone. AVG = 40 either way.
 
-query TIIR rowsort
-SEMANTIC SELECT c.name, c.age, COUNT(*) AS order_rows, AGGREGATE(avg_cust_age)
-FROM fanout_customers_v c
-JOIN fanout_orders o ON c.cust_id = o.cust_id
-GROUP BY c.name, c.age;
+statement ok
+CREATE TABLE fanout_tiered_custs (cust_id INT, tier TEXT, age INT);
+
+statement ok
+INSERT INTO fanout_tiered_custs VALUES
+(1, 'gold', 30),
+(2, 'gold', 25),
+(3, 'silver', 40);
+
+statement ok
+CREATE VIEW fanout_tiered_custs_v AS
+SELECT *, AVG(age) AS MEASURE avg_tier_age
+FROM fanout_tiered_custs;
+
+query TIR rowsort
+SEMANTIC SELECT t.tier, COUNT(*) AS order_rows, AGGREGATE(avg_tier_age)
+FROM fanout_tiered_custs_v t
+JOIN fanout_orders o ON t.cust_id = o.cust_id
+GROUP BY t.tier;
 ----
-Alice 30 3 30.0
-Bob 25 1 25.0
-Carol 40 2 40.0
+gold 4 27.5
+silver 2 40.0
 
 # -- Test 5: WHERE filter with fan-out join --
 
@@ -1140,17 +1155,6 @@ JOIN fanout_product_regions pr ON p.product = pr.product;
 ----
 0.5222222222222223
 
-# Per-product group: each product's margin is its own (revenue-cost)/revenue
-query TRI rowsort
-SEMANTIC SELECT p.product, AGGREGATE(profit_margin), COUNT(*) AS region_count
-FROM fanout_products_v p
-JOIN fanout_product_regions pr ON p.product = pr.product
-GROUP BY p.product;
-----
-Doohickey 0.19999999999999998 1
-Gadget 0.5 1
-Widget 0.6 2
-
 # -- Test 10: COUNT DISTINCT measure immune to fan-out --
 # Join orders to a line-items table that fans out the order rows.
 # Each order has 1-3 line items, so orders are duplicated in the join.
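
The removal of the grouped profit-margin test rests on the scale-invariance argument in the commit message: duplicating a group's only row k times multiplies numerator and denominator by the same k. The sketch below checks that in plain Python with exact fractions. The revenue and cost figures and the fan-out factors are hypothetical, chosen only for illustration; they are not the fixture data from `test/sql/measures.test`.

```python
from fractions import Fraction

def margin(rows):
    """(SUM(rev) - SUM(cost)) / SUM(rev) over a list of (rev, cost) rows."""
    rev = sum(r for r, _ in rows)
    cost = sum(c for _, c in rows)
    return Fraction(rev - cost, rev)

# Hypothetical (revenue, cost) per product and join fan-out factor per product.
products = {"Widget": (100, 40), "Gadget": (50, 25)}
fanout = {"Widget": 2, "Gadget": 1}

# Grouped by product: duplicating one product's row k times scales the
# numerator and the denominator by k, so the ratio cannot change.
grouped_true = {name: margin([row]) for name, row in products.items()}
grouped_fanned = {
    name: margin([row] * fanout[name]) for name, row in products.items()
}

# Ungrouped: products are duplicated by different factors, so the weights
# shift and the fanned-out ratio differs from the true one.
true_total = margin(list(products.values()))
fanned_total = margin(
    [row for name, row in products.items() for _ in range(fanout[name])]
)

print(grouped_true == grouped_fanned)  # grouped ratio hides fan-out
print(true_total, fanned_total)        # ungrouped ratio exposes it
```

In the grouped case both computations agree for every product, so a fanned-out implementation would pass the test anyway; only the ungrouped ratio, where products carry different duplication factors, distinguishes the two.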
