Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Approach for expressions in from #919

Closed
max-sixty opened this issue Aug 8, 2022 · 7 comments
Closed

Approach for expressions in from #919

max-sixty opened this issue Aug 8, 2022 · 7 comments
Labels
language-design Changes to PRQL-the-language

Comments

@max-sixty
Copy link
Member

Currently from s"<s-string>" is not allowed. We'd like to make this possible, both for #918, and because it's a general & orthogonal language feature.

As part of that we can also allow broader expressions, such as from (from y). While I don't see this being that useful, it would be useful to generally extend expressions to everywhere; e.g. in sort & group; so sort (foo * 2) works.

There's some tradeoff between this and Semantic — if we have an s-string there, then we can't really understand where columns come from.

Any thoughts on this tradeoff? Do we have a mode in Semantic that basically shrugs and says "I dunno"?

@max-sixty max-sixty added language-design Changes to PRQL-the-language semantic labels Aug 8, 2022
@aljazerzen
Copy link
Member

Unfortunately, I don't think this is possible.

From requires an ident, because it is later used as prefix for all column names. I.e.

from employees
select age

When age is encountered, it is resolved as employees.age. And if later another age is encountered, it is also translated to SQL as such:

table buildings = (from buildings | select [age, building_id])

from employees
select [age, building_id]
join buildings [building_id]

Even though employees.age was referenced only as age, it will translate to employees.age. That's why we need table name as a single ident.

Just a peek behind the scenes: From is actually an identity function. It's only function is that it resolves its argument in "table namespace" - so it cannot resolve into a column or a function.

@max-sixty
Copy link
Member Author

Right, very good point.

One approach that is a) awkward and b) only covers some cases — is to make a CTE out of the s-string:

from s"SELECT * FROM employees WHERE foo > 5"

would compile to:

with _31 as (
    SELECT * FROM employees WHERE foo > 5
)
SELECT * FROM _31

...and then the columns are on _31, like _31.age not on the s-string.


But it only covers some cases — it doesn't cover the dbt case:

from s'"foo"."employees"'

...which needs to be treated like the literal "foo"."employees", so we get:

SELECT * from "foo"."employees".age

We could make a guess (i.e. does it have a SELECT word), but then it feels like we're tripping over ourselves because of a bad abstraction.


I still think it's worth doing this for sort / group / etc — there are far fewer tradeoffs there, AFAIK

@aljazerzen
Copy link
Member

Yes, sort and group are a differnet story. Similar, but possible to implement.

from employees
group age % 10 (aggregate count)

... would resolve to:

from employees
derive _tmp = age % 10
group _tmp (aggregate count)

... which would translate to:

with _table_0 = (select age % 10 as tmp, * from employees)
select _tmp, count(*)
from _table_0
group by _tmp

@aljazerzen
Copy link
Member

Oh wait, SQL does support expressions in GROUP BY! It's even easier.

@max-sixty
Copy link
Member Author

Ah great! Though ORDER BY needs the adjustment above (I have been trying to get this to work, but in short bursts again...)

@aljazerzen
Copy link
Member

This was closed by #1197

Both:

from employees
group age % 10 (aggregate count)

... and ...

from s"SELECT * FROM employees WHERE foo > 5"

... now compile correctly.

@max-sixty
Copy link
Member Author

That's so awesome, again, @aljazerzen !

Maybe I add some examples to the book?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language-design Changes to PRQL-the-language
Projects
None yet
Development

No branches or pull requests

2 participants