Should compiler strive to evaluate expressions? #643

Open
aljazerzen opened this issue Jun 23, 2022 · 3 comments
Labels
compiler language-design Changes to PRQL-the-language

Comments

@aljazerzen
Member

aljazerzen commented Jun 23, 2022

We could make the compiler smarter and have it compute some data transformations at compile time. But that would reduce the similarity between the input PRQL and the output SQL, so I'm not sure it's worth it.

There is a lot we could do with compile-time evaluation (we could basically write a Turing-complete interpreter), but the simple cases start with:

derive [5 * 15 - 3]
SELECT 72

My use case would be:

let fuel_consumption = 4.8 # L per 100km
let fuel_price = 1.95 # EUR per L

from trips
derive fuel_cost = distance_km * fuel_consumption / 100 * fuel_price
SELECT distance_km * 0.0936 as fuel_cost FROM trips
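As a quick check of the arithmetic in this example, here is a minimal Python sketch (not part of the PRQL compiler; the function names are made up for illustration). It also shows a subtlety: assuming the usual left-to-right associativity, the expression parses as `((distance_km * 4.8) / 100) * 1.95`, so every sub-expression involves the column. Folding it down to a single factor therefore needs reassociation in addition to evaluating all-constant sub-expressions.

```python
from decimal import Decimal
import math

# Constants from the example, kept as decimals so the folded factor is exact.
FUEL_CONSUMPTION = Decimal("4.8")  # L per 100 km
FUEL_PRICE = Decimal("1.95")       # EUR per L

# Reassociated form: distance_km * (fuel_consumption / 100 * fuel_price)
FOLDED_FACTOR = FUEL_CONSUMPTION / Decimal(100) * FUEL_PRICE


def fuel_cost_unfolded(distance_km: float) -> float:
    """Evaluate the expression as written, left to right."""
    return distance_km * 4.8 / 100 * 1.95


def fuel_cost_folded(distance_km: float) -> float:
    """Evaluate with the constants pre-multiplied into one factor."""
    return distance_km * float(FOLDED_FACTOR)


# Floating-point results can differ in the last bits after reassociation,
# so the two forms are compared with a tolerance rather than with ==.
same_result = math.isclose(fuel_cost_unfolded(120.0), fuel_cost_folded(120.0))
```

Because reassociating floating-point operations can change rounding, the folded query may not be bit-for-bit identical to the unfolded one.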

Also there is this issue.

This would be possible for any values that are known at compile-time; thus mostly constants or function parameters.

Now, I know that the amount of work we could offload this way would be minuscule, because the majority of the work happens when iterating over rows, which we cannot compute in advance since we don't know their contents. Furthermore, there would probably be no performance improvement on real databases, because they (probably) already evaluate scalar expressions once before applying them to all the rows.

But in the compiler, this would be very easy to do. We already have to infer the types of binary operations, so we would only have to:

  • check if an operation uses only constant operands (they don't reference any columns or s-strings)
  • evaluate the operation

As PRQL functions are already materialized during the compilation, that would also be included.
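The two steps listed above could be sketched roughly as follows. This is an illustrative Python toy, not the actual compiler code: the `Const`, `Column`, and `BinOp` node types and the `fold` function are hypothetical names, and the `evaluate` parameter only mimics the idea of a switch for turning the feature off.

```python
from dataclasses import dataclass
import operator
from typing import Union


@dataclass(frozen=True)
class Const:
    value: Union[int, float]


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Column, BinOp]

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def fold(expr: Expr, evaluate: bool = True) -> Expr:
    """Replace sub-expressions whose operands are all constants by their value."""
    if not evaluate or not isinstance(expr, BinOp):
        return expr  # folding disabled, or a leaf node: nothing to do
    left = fold(expr.left)
    right = fold(expr.right)
    # Step 1: check that the operation uses only constant operands.
    if isinstance(left, Const) and isinstance(right, Const):
        try:
            # Step 2: evaluate the operation at compile time.
            return Const(OPS[expr.op](left.value, right.value))
        except ZeroDivisionError:
            pass  # leave the error for the database to report
    return BinOp(expr.op, left, right)


folded = fold(BinOp("-", BinOp("*", Const(5), Const(15)), Const(3)))
partial = fold(BinOp("*", Column("distance_km"), BinOp("*", Const(2), Const(3))))
```

Here `folded` collapses to a single constant, while `partial` keeps the column reference and folds only the constant sub-expression next to it.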

But the question is not whether we can, but whether we should implement it. Arguably, having the output be similar to the input is a big help with debugging. If implemented, it must have a --no-eval flag to disable it.

The only upside I can think of is that some databases may not support all operations PRQL does, for example date arithmetic:

let today_started = @2022-06-23T04:36+02
let today_finished = @2022-06-23T11:44+01

from hikes
filter duration_min < (to_min today_finished - today_started)
SELECT * FROM hikes WHERE duration_min < 488 
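For reference, the constant in the example above can be reproduced with a short Python sketch using only the standard library. The helper name `minutes_between` is made up for illustration, and the timestamps are written with full `+HH:MM` offsets so that `datetime.fromisoformat` accepts them.

```python
from datetime import datetime


def minutes_between(start_iso: str, end_iso: str) -> int:
    """Whole minutes between two ISO 8601 timestamps with UTC offsets."""
    start = datetime.fromisoformat(start_iso)
    end = datetime.fromisoformat(end_iso)
    # Subtracting timezone-aware datetimes accounts for the differing offsets.
    return int((end - start).total_seconds() // 60)


today_started = "2022-06-23T04:36+02:00"   # 02:36 UTC
today_finished = "2022-06-23T11:44+01:00"  # 10:44 UTC

duration_limit = minutes_between(today_started, today_finished)
```

The two timestamps are 8 hours and 8 minutes apart once the offsets are taken into account, which gives the 488 used in the WHERE clause.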

So, as I'm done writing this, I realize that this is quite an unnecessary feature and there is little point investing time into it. Let's just leave it here for future generations.

@max-sixty
Member

I agree with both the question and conclusion — I think we should try and do the minimum possible, and offload everything else to the underlying DB. As you say — it's easier to debug — and we also compile on every keystroke, but only execute the query once, so we want to do as little as possible when compiling.

> The only upside I can think of is that some databases may not support all operations PRQL does, for example date arithmetic

Yes, possibly. Though compiling into DATEDIFF is generally possible for most DBs, I think? And doing it regardless of whether it's a scalar or a column means the compiler doesn't need to differentiate between the two.

@max-sixty
Member

> Let's just leave it here for future generations.

Yes, we could put this sort of principle somewhere — i.e. delegate as much to the DB as possible. Maybe we start an ARCHITECTURE.md doc — I find those quite useful, and our architecture is becoming much larger!
