We could make the compiler smarter and have it compute some data transformations at compile time. But that would reduce the similarity between the input PRQL and the output SQL, so I'm not sure it's worth it.
There is much we can do with compile-time evaluation (we could basically write a Turing-complete interpreter), but simple cases start with:
derive [5 * 15 - 3]
SELECT 72
My use case would be:
let fuel_consumption = 4.8 # L per 100km
let fuel_price = 1.95 # EUR per L
from trips
derive fuel_cost = distance_km * fuel_consumption / 100 * fuel_price
This would be possible for any values that are known at compile-time; thus mostly constants or function parameters.
Now, I know that the amount of work we could offload this way would be minuscule, because the majority of the work happens when iterating over rows, which we cannot compute in advance since we don't know the contents of the rows. Furthermore, there would probably be no performance improvement on real databases, because they (probably) already evaluate scalar expressions once before applying them to all the rows.
But in the compiler, this would be very easy to do. We already have to infer the types of binary operations, so we would only have to:
check if an operation uses only constant operands (they don't reference any columns or s-strings)
evaluate the operation
As PRQL functions are already materialized during compilation, calls to them would be covered as well.
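The two steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the PRQL compiler's actual code: the `Literal`, `Column`, and `Binary` node types and the `fold` function are hypothetical names invented for this example, s-strings are not modeled, and the `enabled` parameter only stands in for the idea of a switch that turns evaluation off.

```python
import operator
from dataclasses import dataclass
from typing import Union


# Hypothetical, simplified expression tree (not PRQL's real AST).
@dataclass(frozen=True)
class Literal:
    value: Union[int, float]


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Binary:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, Column, Binary]

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def fold(expr: Expr, enabled: bool = True) -> Expr:
    """Fold constant sub-expressions bottom-up.

    With enabled=False the expression is returned untouched, so the
    generated SQL would stay as close to the input as possible.
    """
    if not enabled or not isinstance(expr, Binary):
        return expr
    left, right = fold(expr.left), fold(expr.right)
    # Step 1: check that both operands are constants (no column references).
    if isinstance(left, Literal) and isinstance(right, Literal):
        # Leave division by zero for the database to report.
        if expr.op == "/" and right.value == 0:
            return Binary(expr.op, left, right)
        # Step 2: evaluate the operation at compile time.
        return Literal(OPS[expr.op](left.value, right.value))
    return Binary(expr.op, left, right)


# (5 * 15) - 3 contains only constants, so it collapses to one literal.
constant = Binary("-", Binary("*", Literal(5), Literal(15)), Literal(3))

# x + (2 * 3): only the constant right-hand side can be folded.
mixed = Binary("+", Column("x"), Binary("*", Literal(2), Literal(3)))

# ((distance_km * 4.8) / 100) * 1.95, assuming left-to-right parsing.
use_case = Binary(
    "*",
    Binary("/", Binary("*", Column("distance_km"), Literal(4.8)), Literal(100)),
    Literal(1.95),
)
```

Here `fold(constant)` yields `Literal(72)` and `fold(mixed)` yields `x + 6`. Note that `fold(use_case)` returns the expression unchanged: if the operators are parsed left to right, every operation in it has the column on its left-hand side, so this simple check never finds two constant operands. Folding that case would additionally require reordering the operands, which is beyond this sketch.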
But the question is not whether we can, but whether we should implement it. Arguably, having the output be similar to the input is a big help with debugging. If implemented, it must come with a --no-eval flag to disable it.
The only upside I can think of is that some databases may not support all operations PRQL does, for example date arithmetic:
So, as I'm done writing this, I realize that this is quite an unnecessary feature and there is little point investing time into it. Let's just leave it here for future generations.
I agree with both the question and conclusion — I think we should try and do the minimum possible, and offload everything else to the underlying DB. As you say — it's easier to debug — and we also compile on every keystroke, but only execute the query once, so we want to do as little as possible when compiling.
The only upside I can think of is that some databases may not support all operations PRQL does, for example date arithmetic
Yes, possibly. Though compiling into DATEDIFF is generally possible for most DBs, I think? And doing it regardless of whether it's a scalar or column means the compiler doesn't need to differentiate between those two
Yes, we could put this sort of principle somewhere — i.e. delegate as much to the DB as possible. Maybe we start an ARCHITECTURE.md doc — I find those quite useful, and our architecture is becoming much larger!
Also there is this issue.