Should the compiler strive to evaluate expressions? #643

aljazerzen · 2022-06-23T13:14:01Z

We could make the compiler smarter and have it compute some of the data transformations at compile time. But that would reduce the similarity between the input PRQL and the output SQL, so I'm not sure it's worth it.

There is a lot we could do with compile-time evaluation (we could basically write a Turing-complete interpreter), but the simple cases start with:

derive [5 * 15 - 3]

SELECT 72

My use case would be:

let fuel_consumption = 4.8 # L per 100km
let fuel_price = 1.95 # EUR per L

from trips
derive fuel_cost = distance_km * fuel_consumption / 100 * fuel_price

SELECT distance_km * 0.0936 as fuel_cost FROM trips
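A side note on the fuel example: assuming the usual left-to-right grouping of * and /, the expression parses as ((distance_km * fuel_consumption) / 100) * fuel_price, so every operation has a column somewhere in its left operand and a plain bottom-up fold finds nothing to evaluate. To end up with a single coefficient, the compiler would have to regroup the constant factors. Below is a minimal Python sketch of that regrouping. The node types (Const, Column, BinOp) and the collect helper are made up for illustration and do not come from the PRQL codebase.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # only "*" and "/" are handled in this sketch
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Column, BinOp]


def collect(expr: Expr) -> Tuple[float, List[str]]:
    """Split a chain of * and / into (constant coefficient, column factors).

    Division is only supported when the divisor is a constant.
    """
    if isinstance(expr, Const):
        return expr.value, []
    if isinstance(expr, Column):
        return 1.0, [expr.name]
    if expr.op == "*":
        left_coef, left_cols = collect(expr.left)
        right_coef, right_cols = collect(expr.right)
        return left_coef * right_coef, left_cols + right_cols
    if expr.op == "/" and isinstance(expr.right, Const):
        left_coef, left_cols = collect(expr.left)
        return left_coef / expr.right.value, left_cols
    raise ValueError(f"unsupported expression: {expr!r}")


# ((distance_km * 4.8) / 100) * 1.95
fuel_cost = BinOp(
    "*",
    BinOp("/", BinOp("*", Column("distance_km"), Const(4.8)), Const(100)),
    Const(1.95),
)

coefficient, columns = collect(fuel_cost)
print(f"SELECT {' * '.join(columns)} * {coefficient:.4f} AS fuel_cost FROM trips")
```

Regrouping floating-point operations like this can change the last bits of the result compared to what the database would compute, which is one more argument for keeping such a feature optional.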

Also there is this issue.

This would be possible for any value that is known at compile time, which mostly means constants and function parameters.

Now, I know that the amount of work we could offload with this would be minuscule, because the majority of the work happens when iterating over rows, which we cannot compute in advance since we don't know the contents of the rows. Furthermore, there would probably be no performance improvement on real databases, because they (probably) already evaluate scalar expressions before applying them to all the rows.

But in the compiler, this would be very easy to do. We already have to infer the types of binary operations, so we would only have to:

check if an operation uses only constant operands (they don't reference any columns or s-strings)
evaluate the operation
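The two steps above can be sketched as a small bottom-up fold. This is only an illustration in Python, not the compiler's actual code: the node types (Literal, Ident, SString, Binary) and the fold function are hypothetical names, and only four arithmetic operators are covered.

```python
import operator
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Literal:
    value: float


@dataclass(frozen=True)
class Ident:
    """A column reference; its value is unknown at compile time."""
    name: str


@dataclass(frozen=True)
class SString:
    """Raw SQL passed through as-is; opaque to the compiler."""
    sql: str


@dataclass(frozen=True)
class Binary:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Literal, Ident, SString, Binary]

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def fold(node: Node) -> Node:
    """Evaluate every binary operation whose operands are both constants."""
    if not isinstance(node, Binary):
        return node
    left, right = fold(node.left), fold(node.right)
    # Step 1: check that the operation uses only constant operands.
    if isinstance(left, Literal) and isinstance(right, Literal):
        # Leave division by zero for the database to report.
        if not (node.op == "/" and right.value == 0):
            # Step 2: evaluate the operation.
            return Literal(OPS[node.op](left.value, right.value))
    return Binary(node.op, left, right)


# a + 2 * 3: only the constant subtree gets evaluated.
mixed = fold(Binary("+", Ident("a"), Binary("*", Literal(2), Literal(3))))

# An s-string operand blocks folding of the operation that uses it.
opaque = fold(Binary("+", SString("now()"), Literal(1)))
```

Here mixed becomes a + 6, while opaque is returned unchanged because one of its operands is an s-string.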

Since PRQL functions are already materialized during compilation, calls to them would be covered as well.

But the question is not whether we can, but whether we should implement it. Arguably, having the output be similar to the input is a big help with debugging. If implemented, it must come with a --no-eval flag to disable it.

The only upside I can think of is that some databases may not support all operations PRQL does, for example date arithmetic:

let today_started = @2022-06-23T04:36+02
let today_finished = @2022-06-23T11:44+01

from hikes
filter duration_min < (to_min today_finished - today_started)

SELECT * FROM hikes WHERE duration_min < 488
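To double-check the arithmetic in this example, here is the same computation done with Python's standard datetime module. The to_min helper only mirrors the name used in the PRQL snippet; how such a function would behave in PRQL is an assumption here.

```python
from datetime import datetime, timedelta, timezone

# The two timestamps from the example, each with its own UTC offset.
today_started = datetime(2022, 6, 23, 4, 36, tzinfo=timezone(timedelta(hours=2)))
today_finished = datetime(2022, 6, 23, 11, 44, tzinfo=timezone(timedelta(hours=1)))


def to_min(delta: timedelta) -> int:
    """Convert a duration to whole minutes."""
    return int(delta.total_seconds() // 60)


# Subtracting timezone-aware datetimes accounts for the differing offsets.
threshold = to_min(today_finished - today_started)
sql = f"SELECT * FROM hikes WHERE duration_min < {threshold}"
print(sql)
```

In UTC the timestamps are 02:36 and 10:44, so the difference is 8 hours and 8 minutes, which matches the 488 in the SQL above.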

So, as I'm done writing this, I realize that this is quite an unnecessary feature and there is little point investing time into it. Let's just leave it here for future generations.

max-sixty · 2022-06-23T17:56:14Z

I agree with both the question and conclusion — I think we should try and do the minimum possible, and offload everything else to the underlying DB. As you say — it's easier to debug — and we also compile on every keystroke, but only execute the query once, so we want to do as little as possible when compiling.

> The only upside I can think of is that some databases may not support all operations PRQL does, for example date arithmetic

Yes, possibly. Though compiling into DATEDIFF is generally possible for most DBs, I think? And doing it regardless of whether it's a scalar or a column means the compiler doesn't need to differentiate between the two.

max-sixty · 2022-06-23T17:58:36Z

> Let's just leave it here for future generations.

Yes, we could put this sort of principle somewhere — i.e. delegate as much to the DB as possible. Maybe we start an ARCHITECTURE.md doc — I find those quite useful, and our architecture is becoming much larger!

aljazerzen added the language-design (Changes to PRQL-the-language) and compiler labels Jun 23, 2022

aljazerzen mentioned this issue Aug 4, 2022

nulls in expressions #905

Open

aljazerzen mentioned this issue Dec 22, 2022

feat: static analysis #1324

Merged

vanillajonathan mentioned this issue Feb 12, 2023

Allow no main pipeline? #1803

Closed
