Add calcite and express operators in terms of it #58

Open
kyprifog opened this issue May 6, 2021 · 0 comments

Collaborator

kyprifog commented May 6, 2021

This is the first step of the following very ambitious three-step process:

Step 1.
Re-express as many mason operators as possible in Calcite. In terms of execution, those jobs would all become QueryJob instances. Validate the Calcite SQL, so that a serialized job always carries Calcite SQL (as opposed to SparkSQL, Hive, or PrestoSQL).
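To make Step 1 concrete, here is a rough sketch of what a serialized job could look like once every operator carries a single dialect. The `QueryJob` fields, the `"calcite"` dialect tag, the `table_summary` operator name, and the `deserialize` helper are all assumptions for illustration, not mason's actual API. Real validation would go through Calcite's own parser/validator on the JVM, which this sketch does not attempt; it only checks the dialect tag.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class QueryJob:
    """Hypothetical stand-in for a mason QueryJob; field names are assumptions."""
    operator: str            # name of the operator that produced this job
    sql: str                 # query text, intended to be Calcite SQL
    dialect: str = "calcite"

    def serialize(self) -> str:
        # The payload carries one dialect regardless of the target engine.
        return json.dumps(asdict(self), sort_keys=True)


def deserialize(payload: str) -> QueryJob:
    """Rebuild a job, rejecting payloads that are not tagged as Calcite SQL."""
    data = json.loads(payload)
    if data.get("dialect") != "calcite":
        raise ValueError(f"expected calcite SQL, got {data.get('dialect')!r}")
    return QueryJob(**data)


# Hypothetical operator re-expressed as a plain query.
job = QueryJob(operator="table_summary",
               sql="SELECT COUNT(*) AS row_count FROM events")
payload = job.serialize()
```

The point of the dialect tag is that an execution engine receiving the payload can refuse anything that is already engine-specific SQL.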

Step 2.
In mason-spark, use Coral to translate the Calcite representation to SparkSQL (RelNode -> Spark Catalyst). Use Coral likewise to build mason-hive and mason-presto (and get a lot of operator support for free from Coral).

Step 3.
Mason operators and workflows are now a curated collection of Calcite SQL pipelines, plus some additional connective tissue that goes beyond what SQL (really) should express. Look into the effort to add Logica (a Datalog-based query language) as a view language for Coral. This could allow constraints to address the additional connective tissue (types for governance, authentication, I/O formatting, workflow specification?). Mason operators and workflows would then be expressed entirely in Datalog. This could be particularly interesting for expressing things like job fan-out and aggregation.
