Add calcite and express operators in terms of it #58

kyprifog · 2021-05-06T13:56:41Z

This is the first step in the following very ambitious 3-step process:

Step 1.
Re-express as many mason operators in Calcite as possible. In terms of execution, those jobs would all become QueryJob. Validate the Calcite SQL, so that a serialized job sends Calcite SQL across (as opposed to SparkSQL, Hive, or PrestoSQL).
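As a rough illustration of Step 1, here is a minimal sketch of a job that carries dialect-neutral SQL and round-trips through serialization. Everything in it is an assumption made for illustration: the field names (`operator`, `sql`, `dialect`), the JSON wire format, and the example operator name are hypothetical, and this `QueryJob` is a stand-in rather than mason's actual class. Validation is left out because it would need Calcite's parser on the JVM.

```python
import json
from dataclasses import dataclass, asdict


# Hypothetical stand-in for a mason job; the real QueryJob may look different.
@dataclass(frozen=True)
class QueryJob:
    operator: str             # illustrative operator name
    sql: str                  # Calcite SQL, not an engine-specific dialect
    dialect: str = "calcite"  # assumed marker telling the receiver what it got

    def serialize(self) -> str:
        """Encode the job as JSON so it can be sent to an execution engine."""
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def deserialize(payload: str) -> "QueryJob":
        """Rebuild the job on the receiving side."""
        return QueryJob(**json.loads(payload))


job = QueryJob(
    operator="table_row_count",
    sql="SELECT COUNT(*) AS row_count FROM my_table",
)
payload = job.serialize()
restored = QueryJob.deserialize(payload)
```

The point of the `dialect` field is that the receiving engine (Spark, Hive, Presto) knows it has been handed Calcite SQL and must translate it before running it.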

Step 2.
In mason-spark, use Coral to translate the Calcite SQL to SparkSQL (RelNode -> Spark Catalyst). Use the same approach to build mason-hive and mason-presto (and get a lot of operator support for free from Coral).
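One way to picture the per-engine split in Step 2 is a small registry that maps an engine name to a translation function. This is only a sketch of the dispatch shape: the registry, the `register` decorator, and the `translate` helper are hypothetical names, and the Spark translator is a pass-through placeholder where a real mason-spark would delegate to Coral. No Coral API is shown or implied here.

```python
from typing import Callable, Dict

# Hypothetical signature: Calcite SQL in, engine-specific SQL out.
Translator = Callable[[str], str]

_TRANSLATORS: Dict[str, Translator] = {}


def register(engine: str) -> Callable[[Translator], Translator]:
    """Decorator that records a translator under an engine name."""
    def wrap(fn: Translator) -> Translator:
        _TRANSLATORS[engine] = fn
        return fn
    return wrap


@register("spark")
def to_spark(calcite_sql: str) -> str:
    # Placeholder only: returns the input unchanged.
    # A real implementation would hand the query to Coral for translation.
    return calcite_sql


def translate(engine: str, calcite_sql: str) -> str:
    """Look up the translator for an engine and apply it."""
    try:
        return _TRANSLATORS[engine](calcite_sql)
    except KeyError:
        raise ValueError(f"no translator registered for engine {engine!r}") from None
```

Under this shape, adding mason-hive or mason-presto would mean registering one more translator, while the operators themselves stay expressed once in Calcite SQL.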

Step 3.
Mason operators and workflows are now a curated collection of Calcite SQL pipelines with some additional connecting tissue that goes beyond what SQL can, or really should, express. Look into the effort to add Logica (a Datalog-based query language) as a view language for Coral. This could allow constraints to address the additional connecting tissue (types for governance, authentication, IO formatting, workflow specification?). Mason operators/workflows would then be expressed completely within Datalog. This could be particularly interesting for expressing things like job fan-out and aggregation.

kyprifog added the big picture label May 6, 2021
