Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PRQL code generation & formatting #1420

Closed
aljazerzen opened this issue Jan 3, 2023 · 5 comments · Fixed by #2669
Closed

PRQL code generation & formatting #1420

aljazerzen opened this issue Jan 3, 2023 · 5 comments · Fixed by #2669
Labels
compiler good first issue Tasks suited for learning the compiler codebase

Comments

@aljazerzen
Copy link
Member

aljazerzen commented Jan 3, 2023

This is an onboarding issue, see instructions on how to get started.

For debugging purposes and for the prqlc fmt command, we have implemented a basic PRQL code generation from PL AST. It is done via impl std::fmt::Display for X, which is convenient, but lacks a few key features:

  • indentation can only be done ad-hoc, when outputting nested pipelines or lists,
  • we have no control over maximum line width,
  • we have no control on tab size.

Instead, I suggest we follow implementation of rustfmt. I've started work on codegen branch.

  • Start by implementing WriteSource for pl::StmtKind.

  • You should be able to reuse majority of impl Display for StmtKind and adapt it to use string concatenation, similar to existing code in src/codegen.rs. Whit this I mean to change things like this:

    write!(f, "prql")?;
    if let Some(version) = &query.version {
        write!(f, " version:{}", version)?;
    }
    Ok(())
    

    ... into:

    let mut res = String::new();
    res += "prql";
    if let Some(version) = &query.version {
      res += &format!(" version:{}", version);
    }
    Some(res)
    
  • When you use write!(f, "{some_var}") or format!("{some_var}"), Rust will call std::fmt::Display on some_var. We used this to do formatting and this issue is about migrating away from that to our own trait WriteSource. So when reusing the code from current impl, it's fine to use write! and format! for now, but the goal is to replace this with calls to WriteSource::write().

  • To test it, uncomment r += &stmt.kind.write(opt.clone()).unwrap(); in codegen.rs and cargo run fmt a_test_file.prql should be working.

Here are a few test files:

prql    version:   "^0.4"    dialect:  sql.postgres

prql

from       my_table
select   [    some_column
]
prql    version:   "^0.4"    dialect:  sql.postgres

let my_table   =    (from base_table | select a)

from my_table | select [some_column]

The contents of pipelines are obviously not yet implemented and should output <todo> or something.

This is the gist of it. Similarly, we have adapt basically all Display impls in ast::pl to WriteSource. I suggest starting with ExprKind, which is already partially done and can easily be tested.

Because this is quite a lot of work, we can merge smaller commits implementing any small chunk of the codegen.

This issue is a prerequisite for #130.

@aljazerzen aljazerzen added the good first issue Tasks suited for learning the compiler codebase label Jan 3, 2023
@aljazerzen aljazerzen added this to the 0.5 milestone Jan 3, 2023
@aljazerzen
Copy link
Member Author

@Rishav1707 this is the followup on #1418

@max-sixty max-sixty changed the title PRQL code generation PRQL code generation & formatting Jan 10, 2023
@max-sixty
Copy link
Member

max-sixty commented Mar 6, 2023

FYI — in #2016 we test that the examples, after being formatted with the existing formatter, can still compile.

Where they can't, the language in the markdown in the book is prql_no_fmt rather than prql.

@PRQL PRQL deleted a comment from aljazerzen Mar 23, 2023
@max-sixty
Copy link
Member

FYI I deleted a couple of messages to keep the issue clean, I confused things.


@aljazerzen I think for a Good "Actually First" Issue, I think this needs an initial effort for the framework. I feel like it's at my level at the moment, but not trivially so, and so for a completely new person it might be quite confusing.

  • (Not saying you should necessarily do it, tbc. Instead just raising the issue — I've updated my view on the amount of context that a good "actually first" issue can require, and I think it should ideally be limited to a couple of files, if possible with a similar PR that's already been merged. So this could work really well after the first one)

@aljazerzen
Copy link
Member Author

I did the framework on codegen branch, as noted in the description.

It may need to be rebased to latest main branch. I didn't t merge it because we'd have to maintain it along with existing codegen impl.

@max-sixty
Copy link
Member

(Thanks a lot. I added some thoughts to the broader point in #1840. Hope this isn't interpreted as criticism, am still trying to work out why we haven't been so successful here)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler good first issue Tasks suited for learning the compiler codebase
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants