2 changes: 1 addition & 1 deletion common/src/main/scala/org/apache/comet/CometConf.scala
@@ -302,7 +302,7 @@ object CometConf extends ShimCometConf {
"of the JVM implementation. This can improve performance for queries that need to " +
"convert between columnar and row formats. This is an experimental feature.")
.booleanConf
- .createWithDefault(false)
+ .createWithDefault(true)

val COMET_EXEC_SORT_MERGE_JOIN_WITH_JOIN_FILTER_ENABLED: ConfigEntry[Boolean] =
conf("spark.comet.exec.sortMergeJoinWithJoinFilter.enabled")
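The CometConf.scala hunk above flips the default of the experimental native columnar-to-row conversion from false to true, which lines up with the approved plans below now showing CometNativeColumnarToRow where CometColumnarToRow used to appear. A minimal sketch of overriding the setting per session follows; the exact config key is declared just above this hunk and is not visible here, so the key string below is a placeholder (an assumption), while the SparkSession calls are standard Spark APIs.

import org.apache.spark.sql.SparkSession

object CometNativeColumnarToRowToggle {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("comet-native-c2r-toggle")
      .getOrCreate()

    // Placeholder key: substitute the ConfigEntry name defined just above this
    // hunk in CometConf.scala. With the new default (true), an override like
    // this is only needed to opt back out of the native conversion.
    spark.conf.set("spark.comet.exec.nativeColumnarToRow.enabled", "false")

    // Inspecting a plan shows whether Comet(Native)ColumnarToRow is used.
    spark.sql("SELECT 1 AS id").explain(true)

    spark.stop()
  }
}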
@@ -8,7 +8,7 @@ TakeOrderedAndProject (44)
: : +- * BroadcastHashJoin Inner BuildRight (28)
: : :- * Filter (11)
: : : +- * HashAggregate (10)
- : : : +- * CometColumnarToRow (9)
+ : : : +- CometNativeColumnarToRow (9)
: : : +- CometColumnarExchange (8)
: : : +- * HashAggregate (7)
: : : +- * Project (6)
@@ -20,11 +20,11 @@ TakeOrderedAndProject (44)
: : +- BroadcastExchange (27)
: : +- * Filter (26)
: : +- * HashAggregate (25)
- : : +- * CometColumnarToRow (24)
+ : : +- CometNativeColumnarToRow (24)
: : +- CometColumnarExchange (23)
: : +- * HashAggregate (22)
: : +- * HashAggregate (21)
- : : +- * CometColumnarToRow (20)
+ : : +- CometNativeColumnarToRow (20)
: : +- CometColumnarExchange (19)
: : +- * HashAggregate (18)
: : +- * Project (17)
@@ -34,12 +34,12 @@ TakeOrderedAndProject (44)
: : : +- Scan parquet spark_catalog.default.store_returns (12)
: : +- ReusedExchange (15)
: +- BroadcastExchange (34)
- : +- * CometColumnarToRow (33)
+ : +- CometNativeColumnarToRow (33)
: +- CometProject (32)
: +- CometFilter (31)
: +- CometNativeScan parquet spark_catalog.default.store (30)
+- BroadcastExchange (41)
- +- * CometColumnarToRow (40)
+ +- CometNativeColumnarToRow (40)
+- CometProject (39)
+- CometFilter (38)
+- CometNativeScan parquet spark_catalog.default.customer (37)
@@ -53,27 +53,27 @@ PartitionFilters: [isnotnull(sr_returned_date_sk#4), dynamicpruningexpression(sr
PushedFilters: [IsNotNull(sr_store_sk), IsNotNull(sr_customer_sk)]
ReadSchema: struct<sr_customer_sk:int,sr_store_sk:int,sr_return_amt:decimal(7,2)>

- (2) ColumnarToRow [codegen id : 2]
+ (2) ColumnarToRow [codegen id : 1]
Input [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]

- (3) Filter [codegen id : 2]
+ (3) Filter [codegen id : 1]
Input [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
Condition : (isnotnull(sr_store_sk#2) AND isnotnull(sr_customer_sk#1))

(4) ReusedExchange [Reuses operator id: 49]
Output [1]: [d_date_sk#6]

- (5) BroadcastHashJoin [codegen id : 2]
+ (5) BroadcastHashJoin [codegen id : 1]
Left keys [1]: [sr_returned_date_sk#4]
Right keys [1]: [d_date_sk#6]
Join type: Inner
Join condition: None

- (6) Project [codegen id : 2]
+ (6) Project [codegen id : 1]
Output [3]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]
Input [5]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4, d_date_sk#6]

- (7) HashAggregate [codegen id : 2]
+ (7) HashAggregate [codegen id : 1]
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]
Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#3))]
@@ -84,17 +84,17 @@ Results [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]
Arguments: hashpartitioning(sr_customer_sk#1, sr_store_sk#2, 5), ENSURE_REQUIREMENTS, CometColumnarShuffle, [plan_id=1]

- (9) CometColumnarToRow [codegen id : 9]
+ (9) CometNativeColumnarToRow
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]

- (10) HashAggregate [codegen id : 9]
+ (10) HashAggregate [codegen id : 5]
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]
Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
Functions [1]: [sum(UnscaledValue(sr_return_amt#3))]
Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#3))#9]
Results [3]: [sr_customer_sk#1 AS ctr_customer_sk#10, sr_store_sk#2 AS ctr_store_sk#11, MakeDecimal(sum(UnscaledValue(sr_return_amt#3))#9,17,2) AS ctr_total_return#12]

- (11) Filter [codegen id : 9]
+ (11) Filter [codegen id : 5]
Input [3]: [ctr_customer_sk#10, ctr_store_sk#11, ctr_total_return#12]
Condition : isnotnull(ctr_total_return#12)

@@ -106,27 +106,27 @@ PartitionFilters: [isnotnull(sr_returned_date_sk#16), dynamicpruningexpression(s
PushedFilters: [IsNotNull(sr_store_sk)]
ReadSchema: struct<sr_customer_sk:int,sr_store_sk:int,sr_return_amt:decimal(7,2)>

- (13) ColumnarToRow [codegen id : 4]
+ (13) ColumnarToRow [codegen id : 2]
Input [4]: [sr_customer_sk#13, sr_store_sk#14, sr_return_amt#15, sr_returned_date_sk#16]

- (14) Filter [codegen id : 4]
+ (14) Filter [codegen id : 2]
Input [4]: [sr_customer_sk#13, sr_store_sk#14, sr_return_amt#15, sr_returned_date_sk#16]
Condition : isnotnull(sr_store_sk#14)

(15) ReusedExchange [Reuses operator id: 49]
Output [1]: [d_date_sk#17]

- (16) BroadcastHashJoin [codegen id : 4]
+ (16) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [sr_returned_date_sk#16]
Right keys [1]: [d_date_sk#17]
Join type: Inner
Join condition: None

- (17) Project [codegen id : 4]
+ (17) Project [codegen id : 2]
Output [3]: [sr_customer_sk#13, sr_store_sk#14, sr_return_amt#15]
Input [5]: [sr_customer_sk#13, sr_store_sk#14, sr_return_amt#15, sr_returned_date_sk#16, d_date_sk#17]

- (18) HashAggregate [codegen id : 4]
+ (18) HashAggregate [codegen id : 2]
Input [3]: [sr_customer_sk#13, sr_store_sk#14, sr_return_amt#15]
Keys [2]: [sr_customer_sk#13, sr_store_sk#14]
Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#15))]
@@ -137,17 +137,17 @@ Results [3]: [sr_customer_sk#13, sr_store_sk#14, sum#19]
Input [3]: [sr_customer_sk#13, sr_store_sk#14, sum#19]
Arguments: hashpartitioning(sr_customer_sk#13, sr_store_sk#14, 5), ENSURE_REQUIREMENTS, CometColumnarShuffle, [plan_id=2]

- (20) CometColumnarToRow [codegen id : 5]
+ (20) CometNativeColumnarToRow
Input [3]: [sr_customer_sk#13, sr_store_sk#14, sum#19]

- (21) HashAggregate [codegen id : 5]
+ (21) HashAggregate [codegen id : 3]
Input [3]: [sr_customer_sk#13, sr_store_sk#14, sum#19]
Keys [2]: [sr_customer_sk#13, sr_store_sk#14]
Functions [1]: [sum(UnscaledValue(sr_return_amt#15))]
Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#15))#9]
Results [2]: [sr_store_sk#14 AS ctr_store_sk#20, MakeDecimal(sum(UnscaledValue(sr_return_amt#15))#9,17,2) AS ctr_total_return#21]

- (22) HashAggregate [codegen id : 5]
+ (22) HashAggregate [codegen id : 3]
Input [2]: [ctr_store_sk#20, ctr_total_return#21]
Keys [1]: [ctr_store_sk#20]
Functions [1]: [partial_avg(ctr_total_return#21)]
@@ -158,31 +158,31 @@ Results [3]: [ctr_store_sk#20, sum#24, count#25]
Input [3]: [ctr_store_sk#20, sum#24, count#25]
Arguments: hashpartitioning(ctr_store_sk#20, 5), ENSURE_REQUIREMENTS, CometColumnarShuffle, [plan_id=3]

- (24) CometColumnarToRow [codegen id : 6]
+ (24) CometNativeColumnarToRow
Input [3]: [ctr_store_sk#20, sum#24, count#25]

- (25) HashAggregate [codegen id : 6]
+ (25) HashAggregate [codegen id : 4]
Input [3]: [ctr_store_sk#20, sum#24, count#25]
Keys [1]: [ctr_store_sk#20]
Functions [1]: [avg(ctr_total_return#21)]
Aggregate Attributes [1]: [avg(ctr_total_return#21)#26]
Results [2]: [(avg(ctr_total_return#21)#26 * 1.2) AS (avg(ctr_total_return) * 1.2)#27, ctr_store_sk#20]

- (26) Filter [codegen id : 6]
+ (26) Filter [codegen id : 4]
Input [2]: [(avg(ctr_total_return) * 1.2)#27, ctr_store_sk#20]
Condition : isnotnull((avg(ctr_total_return) * 1.2)#27)

(27) BroadcastExchange
Input [2]: [(avg(ctr_total_return) * 1.2)#27, ctr_store_sk#20]
Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, true] as bigint)),false), [plan_id=4]

- (28) BroadcastHashJoin [codegen id : 9]
+ (28) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ctr_store_sk#11]
Right keys [1]: [ctr_store_sk#20]
Join type: Inner
Join condition: (cast(ctr_total_return#12 as decimal(24,7)) > (avg(ctr_total_return) * 1.2)#27)

- (29) Project [codegen id : 9]
+ (29) Project [codegen id : 5]
Output [2]: [ctr_customer_sk#10, ctr_store_sk#11]
Input [5]: [ctr_customer_sk#10, ctr_store_sk#11, ctr_total_return#12, (avg(ctr_total_return) * 1.2)#27, ctr_store_sk#20]

@@ -201,20 +201,20 @@ Condition : ((staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharC
Input [2]: [s_store_sk#28, s_state#29]
Arguments: [s_store_sk#28], [s_store_sk#28]

- (33) CometColumnarToRow [codegen id : 7]
+ (33) CometNativeColumnarToRow
Input [1]: [s_store_sk#28]

(34) BroadcastExchange
Input [1]: [s_store_sk#28]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=5]

- (35) BroadcastHashJoin [codegen id : 9]
+ (35) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ctr_store_sk#11]
Right keys [1]: [s_store_sk#28]
Join type: Inner
Join condition: None

- (36) Project [codegen id : 9]
+ (36) Project [codegen id : 5]
Output [1]: [ctr_customer_sk#10]
Input [3]: [ctr_customer_sk#10, ctr_store_sk#11, s_store_sk#28]

@@ -233,20 +233,20 @@ Condition : isnotnull(c_customer_sk#30)
Input [2]: [c_customer_sk#30, c_customer_id#31]
Arguments: [c_customer_sk#30, c_customer_id#32], [c_customer_sk#30, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c_customer_id#31, 16, true, false, true) AS c_customer_id#32]

- (40) CometColumnarToRow [codegen id : 8]
+ (40) CometNativeColumnarToRow
Input [2]: [c_customer_sk#30, c_customer_id#32]

(41) BroadcastExchange
Input [2]: [c_customer_sk#30, c_customer_id#32]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=6]

- (42) BroadcastHashJoin [codegen id : 9]
+ (42) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ctr_customer_sk#10]
Right keys [1]: [c_customer_sk#30]
Join type: Inner
Join condition: None

- (43) Project [codegen id : 9]
+ (43) Project [codegen id : 5]
Output [1]: [c_customer_id#32]
Input [3]: [ctr_customer_sk#10, c_customer_sk#30, c_customer_id#32]

@@ -258,7 +258,7 @@ Arguments: 100, [c_customer_id#32 ASC NULLS FIRST], [c_customer_id#32]

Subquery:1 Hosting operator id = 1 Hosting Expression = sr_returned_date_sk#4 IN dynamicpruning#5
BroadcastExchange (49)
- +- * CometColumnarToRow (48)
+ +- CometNativeColumnarToRow (48)
+- CometProject (47)
+- CometFilter (46)
+- CometNativeScan parquet spark_catalog.default.date_dim (45)
@@ -279,7 +279,7 @@ Condition : ((isnotnull(d_year#33) AND (d_year#33 = 2000)) AND isnotnull(d_date_
Input [2]: [d_date_sk#6, d_year#33]
Arguments: [d_date_sk#6], [d_date_sk#6]

- (48) CometColumnarToRow [codegen id : 1]
+ (48) CometNativeColumnarToRow
Input [1]: [d_date_sk#6]

(49) BroadcastExchange
@@ -7,7 +7,7 @@ TakeOrderedAndProject
: : +- BroadcastHashJoin
: : :- Filter
: : : +- HashAggregate
- : : : +- CometColumnarToRow
+ : : : +- CometNativeColumnarToRow
: : : +- CometColumnarExchange
: : : +- HashAggregate
: : : +- Project
@@ -17,23 +17,23 @@ TakeOrderedAndProject
: : : : +- Scan parquet spark_catalog.default.store_returns [COMET: Native DataFusion scan does not support subqueries/dynamic pruning]
: : : : +- SubqueryBroadcast
: : : : +- BroadcastExchange
- : : : : +- CometColumnarToRow
+ : : : : +- CometNativeColumnarToRow
: : : : +- CometProject
: : : : +- CometFilter
: : : : +- CometNativeScan parquet spark_catalog.default.date_dim
: : : +- BroadcastExchange
- : : : +- CometColumnarToRow
+ : : : +- CometNativeColumnarToRow
: : : +- CometProject
: : : +- CometFilter
: : : +- CometNativeScan parquet spark_catalog.default.date_dim
: : +- BroadcastExchange
: : +- Filter
: : +- HashAggregate
- : : +- CometColumnarToRow
+ : : +- CometNativeColumnarToRow
: : +- CometColumnarExchange
: : +- HashAggregate
: : +- HashAggregate
- : : +- CometColumnarToRow
+ : : +- CometNativeColumnarToRow
: : +- CometColumnarExchange
: : +- HashAggregate
: : +- Project
@@ -43,17 +43,17 @@ TakeOrderedAndProject
: : : +- Scan parquet spark_catalog.default.store_returns [COMET: Native DataFusion scan does not support subqueries/dynamic pruning]
: : : +- ReusedSubquery
: : +- BroadcastExchange
- : : +- CometColumnarToRow
+ : : +- CometNativeColumnarToRow
: : +- CometProject
: : +- CometFilter
: : +- CometNativeScan parquet spark_catalog.default.date_dim
: +- BroadcastExchange
- : +- CometColumnarToRow
+ : +- CometNativeColumnarToRow
: +- CometProject
: +- CometFilter
: +- CometNativeScan parquet spark_catalog.default.store
+- BroadcastExchange
- +- CometColumnarToRow
+ +- CometNativeColumnarToRow
+- CometProject
+- CometFilter
+- CometNativeScan parquet spark_catalog.default.customer