Example for simple Expr --> SQL conversion #10528

Merged
merged 7 commits into from
May 17, 2024

edmondop
Contributor

Which issue does this PR close?

Closes #10524.

At the time of opening this PR, the example fails with:

thread 'main' panicked at datafusion-examples/examples/plan_to_sql.rs:43:5:
assertion `left == right` failed
  left: "a < Int32(5) OR a = Int32(8)"
 right: "a < 5 OR a = 8"

@edmondop
Contributor Author

Using the expr_to_sql api, we get the following error:

assertion `left == right` failed
  left: "((\"a\" < 5) OR (\"a\" = 8))"
 right: "a < 5 OR a = 8"

@alamb alamb changed the title Example for simple conversion Example for simple Expr --> SQL conversion May 15, 2024
@alamb
Contributor

alamb commented May 15, 2024

> Using the expr_to_sql api, we get the following error:

For anyone else following along, this conversation is happening on the ticket: #10524 (comment)

@alamb alamb marked this pull request as draft May 15, 2024 19:05
Contributor

@alamb alamb left a comment

Thanks @edmondop -- this is pretty sweet

It would also be good to add plan_to_sql.rs to the list of examples in https://github.com/apache/datafusion/tree/main/datafusion-examples#single-process

Ok(())
}

/// DataFusion can convert expressions to SQL
Contributor

Love it!

@edmondop edmondop marked this pull request as ready for review May 16, 2024 01:33
@edmondop edmondop requested a review from alamb May 16, 2024 01:33
@edmondop
Contributor Author

> Thanks @edmondop -- this is pretty sweet
>
> It would also be good to add plan_to_sql.rs to the list of examples in main/datafusion-examples#single-process

Examples are now more complete and readme is updated

edmondop added 2 commits May 16, 2024 06:42
Update file header with more details about the examples
@@ -63,6 +63,7 @@ cargo run --example csv_sql
- [`parquet_sql.rs`](examples/parquet_sql.rs): Build and run a query plan from a SQL statement against a local Parquet file
- [`parquet_sql_multiple_files.rs`](examples/parquet_sql_multiple_files.rs): Build and run a query plan from a SQL statement against multiple local Parquet files
- ['parquet_exec_visitor.rs'](examples/parquet_exec_visitor.rs): Extract statistics by visiting an ExecutionPlan after execution
- - [`plan_to_sql.rs`](examples/plan_to_sql.rs): Generate SQL from Datafusion `Expr` and `LogicalPlan`
Contributor

There seems to be an additional `-`?

Contributor

@alamb alamb left a comment

Thank you @edmondop -- this looks like a great start of these examples. I think @yyy1000's comments about the extra - should be addressed but it isn't required.

cc @devinjdangelo and @backkem (given you started this project)

@alamb
Contributor

alamb commented May 16, 2024

I filed #10550 for the logical plan version too

Fixing extra -
@backkem
Contributor

backkem commented May 16, 2024

> Using the expr_to_sql api, we get the following error:
>
> assertion `left == right` failed
>   left: "((\"a\" < 5) OR (\"a\" = 8))"
>  right: "a < 5 OR a = 8"

The current unparser is somewhat conservative for correctness' sake: it always quotes identifiers and adds every possible set of parentheses. If we want the generated SQL to be more succinct, both steps will have to be made "smarter". We'll have to add precedence rules to avoid unneeded parentheses and (likely dialect-specific) rules for determining whether quoting is needed. Note that the latter likely involves listing the reserved keywords for each dialect.
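
To make the trade-off concrete, here is a minimal, self-contained sketch of the two strategies. It is a toy model, not DataFusion's unparser: the `Expr` and `Op` types, the `RESERVED` list, the precedence numbers, and the rule that only lowercase identifiers may go unquoted are all assumptions made up for illustration. A real implementation would need per-dialect keyword lists and case-folding rules.

```rust
/// Toy expression tree (hypothetical; NOT DataFusion's `Expr`).
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    Int(i64),
    Binary(Box<Expr>, Op, Box<Expr>),
}

#[derive(Debug, Clone, Copy)]
enum Op { Or, And, Eq, Lt }

impl Op {
    fn symbol(self) -> &'static str {
        match self { Op::Or => "OR", Op::And => "AND", Op::Eq => "=", Op::Lt => "<" }
    }
    /// Higher numbers bind tighter (assumed ordering: OR < AND < comparison).
    fn precedence(self) -> u8 {
        match self { Op::Or => 1, Op::And => 2, Op::Eq | Op::Lt => 3 }
    }
}

/// Illustrative keyword list only; a real dialect has many more entries.
const RESERVED: &[&str] = &["select", "from", "where", "and", "or", "not"];

/// Assumption: an identifier is safe unquoted only if it is lowercase
/// `[a-z_][a-z0-9_]*` and not a reserved keyword.
fn needs_quotes(ident: &str) -> bool {
    let mut chars = ident.chars();
    let plain = match chars.next() {
        Some(c) if c.is_ascii_lowercase() || c == '_' => {
            chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
        }
        _ => false,
    };
    !plain || RESERVED.contains(&ident)
}

/// Wrap in double quotes, doubling any embedded double quote.
fn quote(ident: &str) -> String {
    format!("\"{}\"", ident.replace('"', "\"\""))
}

/// Conservative strategy: quote every identifier, parenthesize every binary node.
fn conservative(e: &Expr) -> String {
    match e {
        Expr::Column(name) => quote(name),
        Expr::Int(v) => v.to_string(),
        Expr::Binary(l, op, r) => {
            format!("({} {} {})", conservative(l), op.symbol(), conservative(r))
        }
    }
}

/// "Smarter" strategy: quote only when needed, and parenthesize a node only
/// when it binds less tightly than its context requires.
fn minimal(e: &Expr, parent_prec: u8) -> String {
    match e {
        Expr::Column(name) => {
            if needs_quotes(name) { quote(name) } else { name.clone() }
        }
        Expr::Int(v) => v.to_string(),
        Expr::Binary(l, op, r) => {
            let p = op.precedence();
            // The right operand is rendered at p + 1 so that a right-nested
            // node of equal precedence keeps its parentheses.
            let s = format!("{} {} {}", minimal(l, p), op.symbol(), minimal(r, p + 1));
            if p < parent_prec { format!("({})", s) } else { s }
        }
    }
}

fn col(name: &str) -> Expr { Expr::Column(name.to_string()) }
fn bin(l: Expr, op: Op, r: Expr) -> Expr { Expr::Binary(Box::new(l), op, Box::new(r)) }

fn main() {
    let e = bin(bin(col("a"), Op::Lt, Expr::Int(5)), Op::Or, bin(col("a"), Op::Eq, Expr::Int(8)));
    println!("{}", conservative(&e)); // (("a" < 5) OR ("a" = 8))
    println!("{}", minimal(&e, 0));   // a < 5 OR a = 8

    let f = bin(bin(col("select"), Op::Eq, Expr::Int(1)), Op::And, bin(col("b"), Op::Or, col("c")));
    println!("{}", minimal(&f, 0));   // "select" = 1 AND (b OR c)
}
```

The conservative version is trivially correct because it never has to reason about precedence or keywords. The minimal version produces the string the example's assertion expected, at the cost of encoding precedence and quoting rules that can differ between dialects.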

Contributor

@alamb alamb left a comment

Thanks again @edmondop

@alamb alamb merged commit eb3817a into apache:main May 17, 2024
23 checks passed
@alamb
Contributor

alamb commented May 17, 2024

> Using the expr_to_sql api, we get the following error:
>
> assertion `left == right` failed
>   left: "((\"a\" < 5) OR (\"a\" = 8))"
>  right: "a < 5 OR a = 8"
>
> The current unparser is somewhat conservative for correctness' sake: it always quotes identifiers and adds every possible set of parentheses. If we want the generated SQL to be more succinct, both steps will have to be made "smarter". We'll have to add precedence rules to avoid unneeded parentheses and (likely dialect-specific) rules for determining whether quoting is needed. Note that the latter likely involves listing the reserved keywords for each dialect.

Thank you for that excellent explanation, @backkem. Since this has now come up several times, I filed #10557 to track it.

findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
* Example for simple conversion

* Fixing example

* Updating README

* Update plan_to_sql.rs

Update file header with more details about the examples

* Fixing formatting

* Update README.md

Fixing extra -
Successfully merging this pull request may close these issues.

Add an example of how to use the SQL parser/unparser API
4 participants