Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allowing dbt-prql to work with Postgres #918

Closed
max-sixty opened this issue Aug 7, 2022 · 3 comments
Closed

Allowing dbt-prql to work with Postgres #918

max-sixty opened this issue Aug 7, 2022 · 3 comments
Labels
bug Invalid compiler output or panic integrations language-design Changes to PRQL-the-language

Comments

@max-sixty
Copy link
Member

I've been adding some integration tests to dbt-prql, and unfortunately it doesn't work with the dialects that use double-quotes.

That's because:

from in_process = {{ source('salesforce', 'in_process') }}

is first parsed to:

from "salesforce"."in_process"

...and then passed to the prql compiler. That's not valid PRQL. I had been expecting from in_process = {{ source('salesforce', 'in_process') }} to be passed to the prql compiler first — which is valid prql — and then for the compiled sql to be formatted by dbt1.

Some options for remedying it:

  • Ask people to use an s-string for these — i.e. s"{{ source('salesforce', 'in_process') }}" — and allow s-strings in from
  • Ask people to use backticks — i.e. `{{ source('salesforce', 'in_process') }}` — and adjust how we handle backticks such that `"foo"` compiles to "foo" rather than """foo""".
    • Currently we take the literal inside backticks, and then escape it, such that the DB receives the identifier within the backticks. That means it escapes the "s — as though we've called a column "foo", with the quotes.

I spent a lot of time trying to get dbt to pass the raw string to prql, and then parse the jinja after. But I couldn't manage a way, because we monkey patch into dbt's jinja, and jinja passes extensions an AST. Ofc if anyone has any ideas, then very open to exploring them.

Footnotes

  1. With BQ, this is fine, since this is valid PRQL:

    from `salesforce`.`in_process`
    
@max-sixty max-sixty added bug Invalid compiler output or panic language-design Changes to PRQL-the-language integrations labels Aug 7, 2022
@aljazerzen
Copy link
Member

aljazerzen commented Aug 10, 2022

The problem here is that DBT emits result of {{ }} as an SQL identifier, which is not the same as PRQL identifier.

The ideal solution would be for DBT to emit a PRQL ident which would be salesforce.in_process.

I don't think that s-strings are an option, as I wrote in #919

@max-sixty
Copy link
Member Author

The problem here is that DBT emits result of {{ }} as an SQL identifier, which is not the same as PRQL identifier.

Yes, well summarized

The ideal solution would be for DBT to emit a PRQL ident which would be salesforce.in_process.

I agree — though that would require a much bigger change. Either we need dbt to support PRQL (we're in discussions with them now), or we'd need to do very invasive monkeypatching, a whole extra level, and we're already quite extreme there...


I'll continue the discussion #919.

Otherwise we could just not support postgres / redshift / snowflake until we're better integrated into dbt.. Maybe we work on increasing adoption in BQ first, before we sink more time into making it work with those.

@aljazerzen
Copy link
Member

If I remember correctly, we decided not to parse jinja expressions with PRQL and instead create a dbt plugin that would allow to run dbt first, resolve jinja expressions and only then run PRQL.

The plugin is here dbt-core#5982.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Invalid compiler output or panic integrations language-design Changes to PRQL-the-language
Projects
None yet
Development

No branches or pull requests

2 participants