Skip to content

Conversation

qstommyshu
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

This is the part 5 of #15484 breakdown, as the code changes in #15484 is too large. Part1: #15497, Part2: #15499, Part3: #15533, Part4: 15548

Are these changes tested?

Yes, I manually tested the before/after changes.

Are there any user-facing changes?

No

@github-actions github-actions bot added the sql SQL Planner label Apr 3, 2025
@alamb
Copy link
Contributor

alamb commented Apr 3, 2025

is this the last one?

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @qstommyshu

FYI @blaginin

"#
.trim();

let expected_dt = "[Int32]";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like this lost the datatype somehow
Maybe we can make a function that returns the plan string as well as the logical datatype?

Copy link
Contributor Author

@qstommyshu qstommyshu Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function fn generate_prepare_stmt_and_data_types(sql: &str) -> (LogicalPlan, String) already returns what we need — both the plan and the inferred logical data types. However, the test named test_prepare_statement_infer_types_from_between_predicate() is a bit misleading, as it suggests we're testing a PREPARE statement and its associated data types, but it is not.

In the original prepare_stmt_quick_test(), the expected_dt argument isn't actually used in this specific test case — because the SQL being tested is not a PREPARE statement. It doesn’t generate a Statement::Prepare variant in the logical plan, so it never enters the matching arm in prepare_stmt_quick_test():

// verify data types
if let LogicalPlan::Statement(Statement::Prepare(Prepare { data_types, .. })) = assert_plan {
    let dt = format!("{data_types:?}");
    assert_eq!(dt, expected_data_types);
}

This is exactly why I chose to panic! in generate_prepare_stmt_and_data_types() — to ensure we're only using it with valid PREPARE statements. The original test was essentially relying on a semantic bug, and what I did was make that behaviour explicit and correct.

I've updated other similar cases as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense -- thank you

I actually looks to me like it is a mistake in the test 🤔

Perhaps the test should be something like

    let sql = "PREPARE SELECT id, age FROM person WHERE age BETWEEN $1 AND $2";

Instead of

    let sql = "SELECT id, age FROM person WHERE age BETWEEN $1 AND $2";

(maybe we can fix this in a follow on PR -- I will file a ticket)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected_plan: &str,
expected_data_types: &str,
) -> LogicalPlan {
fn generate_prepare_stmt_and_data_types(sql: &str) -> (LogicalPlan, String) {
Copy link
Contributor

@blaginin blaginin Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to make output arguments consistent, do you think you could you return data_types directly and compare them via assert_debug_snapshot?

Copy link
Contributor Author

@qstommyshu qstommyshu Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just took a closer look at this method 👀. It turns out that assert_debug_snapshot! doesn’t support inline snapshots — it always writes to an external .snap file under the snapshots/ directory (which I assume we’re trying to avoid?).

As for the output arguments — if we want to return just the LogicalPlan, I’d need to extract the data_types separately using pattern matching or a helper function, and then snapshot that field explicitly.

So the updated test flow for a PREPARE statement would look like this:

let plan = logical_plan(sql).unwrap();
if let LogicalPlan::Statement(Statement::Prepare(Prepare { data_types, .. })) = plan {
	assert_snapshot!(data_types, @r"SOME DATA TYPE");
}
assert_snapshot!(
        plan,
        @r#"
    Prepare: "my_plan" [Int32] 
      Projection: person.id, person.age
        Filter: person.age = Int64(10)
          TableScan: person
"#
    );

instead of what we have now:

let (plan, dt) = generate_prepare_stmt_and_data_types(sql);
    assert_snapshot!(
        plan,
        @r#"
    Prepare: "my_plan" [Int32] 
      Projection: person.id, person.age
        Filter: person.age = Int64(10)
          TableScan: person
    "#
    );
    assert_snapshot!(dt, @r#"[Int32]"#);

I’m happy to switch to the first version if that’s preferred — just let me know what works best for you! @blaginin @alamb

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like what you have here 👍

@qstommyshu
Copy link
Contributor Author

qstommyshu commented Apr 3, 2025

is this the last one?

This is the last one for migrating tests in tests/sql_integration.rs.

There are still some cases in tests/cases/plan_to_sql.rs and tests/cases/diagnostic.rs. I suppose this is the second last one (that's my guess) as the cases in those two files are much less and simpler. The last PR would be migrating tests in tests/cases/plan_to_sql.rs and tests/cases/diagnostic.rs to insta.

@qstommyshu qstommyshu requested review from alamb and blaginin April 4, 2025 13:08
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you again @qstommyshu -- this PR is very well explained, coded and so nice 👌 👨‍🍳

Thanks @blaginin for helping make this happen

expected_plan: &str,
expected_data_types: &str,
) -> LogicalPlan {
fn generate_prepare_stmt_and_data_types(sql: &str) -> (LogicalPlan, String) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like what you have here 👍

"#
.trim();

let expected_dt = "[Int32]";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense -- thank you

I actually looks to me like it is a mistake in the test 🤔

Perhaps the test should be something like

    let sql = "PREPARE SELECT id, age FROM person WHERE age BETWEEN $1 AND $2";

Instead of

    let sql = "SELECT id, age FROM person WHERE age BETWEEN $1 AND $2";

(maybe we can fix this in a follow on PR -- I will file a ticket)

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

Let's keep the project and momentum -- merging this one in!

@alamb alamb merged commit 2cd6ed9 into apache:main Apr 4, 2025
27 checks passed
nirnayroy pushed a commit to nirnayroy/datafusion that referenced this pull request May 2, 2025
* migrate all tests to `insta` if possible

* clean up

* remove unused import from sql integration tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants