-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: add a SQL IR and factor out optimizations. #80
Conversation
04dd788
to
91d0f26
Compare
Getting a compilation error for this query: WITH bids as (SELECT bid.auction as auction, bid.datetime as datetime
FROM (select bid from nexmark) where bid is not null)
SELECT AuctionBids.auction as auction, AuctionBids.num as count
FROM (
SELECT
B1.auction,
HOP(INTERVAL '2' SECOND, INTERVAL '10' SECOND) as window,
count(*) AS num
FROM bids B1
GROUP BY 1,2) AS AuctionBids
JOIN (
SELECT max(num) AS maxn, window
FROM (
SELECT count(*) AS num, HOP(INTERVAL '2' SECOND, INTERVAL '10' SECOND) AS window
FROM bids B2
GROUP BY B2.auction, 2
) AS CountBids
GROUP BY 2
) AS MaxBids
ON
AuctionBids.num = MaxBids.maxn
and AuctionBids.window = MaxBids.window; |
This query compiles (interestingly enough, it fails on master) but doesn't produce any output: SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (
PARTITION BY window
ORDER BY bids DESC) as row_num
FROM (SELECT count(*) as bids,
bid.auction as auction_id,
hop(interval '2 seconds', interval '1 minute') as window
FROM nexmark
WHERE bid is not null
group by auction_id, window)) WHERE row_num <= 3``` |
Looking into those. |
This is working now. |
filter: None, | ||
}) => { | ||
if args.len() != 1 { | ||
bail!("unexpected arg length"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this an error that would be sent back to the user? or does it imply a bug in our code? if it's the latter, then it should probably be a panic instead.
}, | ||
JoinListFlatten(JoinType, StructPair), | ||
JoinPairFlatten(JoinType, StructPair), | ||
// TODO: figure out naming of various things called 'window' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SqlWindowFunction?
This introduces arroyo-sql/src/plan_graph.rs, which is a new intermediate representation for our SQL planning. Prior to this change we were converting from DataFusion's logical plan to a SqlOperator and then the SqlOperator produced an
arroyo_datastream::Program
. However, because the SqlOperator is not in a graph it was challenging to perform optimization passes. In order to capture some optimizations they would happen in-line with the conversions from LogicalPlan to SqlOperator. As we added more logic like this it was getting unwieldy. The new sequence for compiling SQL to a pipeline is.PostgresSqlDialect
.SqlOperator
to aPlanGraph
.arroyo_datastream::Program
.It's a lot of steps, but firming up what happens where should allow us to continue to improve our engine.