feat: Add spark-compat mode to integrate datafusion-spark features au…#1416
milenkovicm merged 2 commits into apache:main
I'm not certain that we have a well-defined place to document compile-time features, just docs for runtime configurations. I'd like to follow this up with a markdown file to describe our features.
Maybe a section in the README file?
b1cef8d to caa8f6e
caa8f6e to 92793e6
Threw up a separate PR: https://github.com/apache/datafusion-ballista/pull/1418/changes
milenkovicm
left a comment
thanks @mattcuento, great addition
I'm not sure if we have to add table functions to the registry. Also, if there are table functions to be supported, we may need to create an encoder for them. So I'd suggest ignoring table functions for now and taking them as a follow-up, if you agree.
Adding Spark compatibility is a pretty major feature(!), so I think there should be a documentation update as part of this PR. I would be careful to explain it as "using Spark-compatible expressions" where available, rather than full compatibility.
Yup agreed, happy to take the easy wins for scalar/window/aggregate for now and can see what we can do with table functions as a follow up!
Good point, I can write something up as a part of this PR 👍 |
93bb6a5 to 8df2329
Sorry for the delay @mattcuento, I'm behind on reviews (and everything else :) ). I had a quick look and it looks OK to me. I will try to do a proper review tomorrow or over the weekend. Maybe @andygrove has some more ideas on what should be addressed.
No worries! Sounds good to me. If there are any PRs I can help review as a new contributor to help alleviate a bit, let me know!
milenkovicm
left a comment
thanks @mattcuento
just one comment to address, otherwise it's good in my opinion
[features]
default = ["substrait", "standalone"]
spark-compat = ["ballista-core/spark-compat"]
we need to add
[[example]]
name = "remote-spark-functions"
required-features = ["spark-compat"]
at the bottom of the file, so it triggers a build with spark-compat
please add the same for the substrait example and remove it from default
ah good call, added for both, did substrait in a separate commit. Ran both locally. Thanks!
…tomatically add spark compat example add spark compat example
8df2329 to 86606bd
Thanks @mattcuento
…tomatically
Which issue does this PR close?
Closes #1397.
Rationale for this change
Exposing datafusion-spark functions as a 'bundled' feature with Ballista. This simplifies augmenting Ballista with Spark features for quicker and simpler experimentation/adoption. Documentation has been added in the user guide to describe how to enable these functions in builds.
What changes are included in this PR?
- spark-compat feature to register all Spark scalar/agg/window/table functions by default
- datafusion_x_functions to register default (and Spark) functions if applicable in the SessionState for scheduler/client usage
- BallistaFunctionRegistry::default()

I've rendered the docs locally to ensure all looks well.
Are there any user-facing changes?
spark-compat to automatically register Spark-compatible scalar, aggregate, and window functions from datafusion-spark
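
For readers unfamiliar with compile-time feature gating, here is a rough sketch of the general idea behind registering extra functions only when a Cargo feature such as spark-compat is enabled. It is not the code from this PR: `SketchRegistry`, `build_registry`, and all function names in it are hypothetical placeholders, and the real registration goes through Ballista and DataFusion types that are not reproduced here.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a function registry (not Ballista's actual type).
#[derive(Default)]
struct SketchRegistry {
    scalar_functions: BTreeSet<&'static str>,
}

impl SketchRegistry {
    /// Records a function name; a real registry would store the function itself.
    fn register(&mut self, name: &'static str) {
        self.scalar_functions.insert(name);
    }
}

/// Registers the default functions, plus the Spark-compatible ones when
/// `spark_compat` is true. All names below are illustrative placeholders.
fn build_registry(spark_compat: bool) -> SketchRegistry {
    let mut registry = SketchRegistry::default();
    for name in ["default_fn_a", "default_fn_b"] {
        registry.register(name);
    }
    if spark_compat {
        for name in ["spark_fn_a", "spark_fn_b"] {
            registry.register(name);
        }
    }
    registry
}

#[allow(unexpected_cfgs)]
fn main() {
    // `cfg!` is evaluated at compile time: it is true only when the crate is
    // built with `--features spark-compat` (assuming the feature is declared).
    let enabled = cfg!(feature = "spark-compat");
    let registry = build_registry(enabled);
    println!(
        "spark-compat enabled: {enabled}, registered functions: {}",
        registry.scalar_functions.len()
    );
}
```

Passing the flag as a plain `bool` keeps the sketch testable without Cargo; in a real crate the branch would more likely sit behind `#[cfg(feature = "spark-compat")]` so the Spark dependency is not compiled at all when the feature is off.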