irenjj
Contributor

@irenjj irenjj commented Mar 8, 2025

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Mar 8, 2025
Contributor

@2010YOUY01 2010YOUY01 left a comment

Thank you so much for the effort. I think the only thing missing is a test for BoundedWindowAggExec.

DuckDB also prints the whole window expression in its pretty explain, so this is good enough:

┌─────────────┴─────────────┐
│           WINDOW          │
│    ────────────────────   │
│        Projections:       │
│ sum(v1) OVER (ORDER BY v1 │
│     ASC NULLS LAST ROWS   │
│      BETWEEN UNBOUNDED    │
│ PRECEDING AND CURRENT ROW)│
└─────────────┬─────────────┘

But I think it would look even better if we split the window expression, partition-by, order-by, and window frame into separate rows:

window expression:    SUM(v1) OVER (...)
window frame:         ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
partition by:         v1 % 10
order by:             v1 ASC NULLS LAST

I think this should be left to a separate follow-up ticket. I saw that BoundedWindowAggExec can have multiple window expressions; do they all share the same partition and order? That should be checked first, and then we can figure out how to format it better.
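
A minimal sketch of the row-per-component layout proposed above. It assumes a hypothetical `WindowSpec` struct holding plain strings (not a DataFusion type) and a hypothetical `render_rows` helper; an actual implementation would have to extract these pieces from the plan's window expressions, and whether all expressions in one node share a partition and order is still the open question raised in this comment.

```rust
use std::fmt::Write;

/// Hypothetical, simplified description of one window expression.
/// The fields are plain strings for illustration, not DataFusion types.
struct WindowSpec {
    expr: String,
    frame: String,
    partition_by: Vec<String>,
    order_by: Vec<String>,
}

/// Render one `WindowSpec` as aligned `label: value` rows.
/// Empty PARTITION BY / ORDER BY lists are skipped.
fn render_rows(spec: &WindowSpec) -> String {
    let mut rows: Vec<(&str, String)> = vec![
        ("window expression", spec.expr.clone()),
        ("window frame", spec.frame.clone()),
    ];
    if !spec.partition_by.is_empty() {
        rows.push(("partition by", spec.partition_by.join(", ")));
    }
    if !spec.order_by.is_empty() {
        rows.push(("order by", spec.order_by.join(", ")));
    }

    // Pad every label (including its colon) to the width of the longest one.
    let width = rows.iter().map(|(label, _)| label.len()).max().unwrap_or(0) + 1;

    let mut out = String::new();
    for (label, value) in rows {
        let label = format!("{label}:");
        writeln!(out, "{label:<width$} {value}").unwrap();
    }
    out
}
```

Calling `render_rows` with the four values from the example above yields one row per component, with the values starting in the same column; a window with no PARTITION BY or ORDER BY produces only the expression and frame rows.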

01)┌───────────────────────────┐
02)│       ProjectionExec      │
03)└─────────────┬─────────────┘
04)┌─────────────┴─────────────┐
Contributor

To also test BoundedWindowAggExec, we can set a bounded window frame (e.g. `rows between 1 preceding and current row`):

EXPLAIN SELECT
    v1,
    SUM(v1) OVER (ORDER BY v1 ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS rolling_sum
FROM generate_series(1, 1000) AS t1(v1);

Contributor

Can you

  1. Update this test to use two expressions that share the same window definition? For example:

     select
       count(*) over (),
       row_number() over ()
     from table1

  2. Add a query that has two different window definitions? For example:

     select
       rank() over (ORDER BY int_col DESC),
       row_number() over (ORDER BY int_col ASC)
     from table1

@irenjj
Contributor Author

irenjj commented Mar 8, 2025

Thanks for your excellent advice!❤️

Contributor

@alamb alamb left a comment

Thank you @irenjj -- other than a few more tests this one is looking good to me!

DisplayFormatType::TreeRender => {
    // TODO: collect info
    write!(f, "")?;
    let g: Vec<String> = self
Contributor

I think it might look better if we printed one list of window expressions and then the window definition, for example:

expr=count()

window = OVER ORDER BY ...

Not sure if this is possible. We can also explore it in a follow-on PR (no need to change it here).
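
One way this grouping could work, as a rough sketch only: collect the expressions that share a window definition, then print each list once followed by its window. The helpers `group_by_window` and `render_groups`, the `(expression, window)` string pairs, and the exact `expr=` / `window=` output format are all assumptions for illustration and are not taken from this PR or from DataFusion.

```rust
/// Group (expression, window definition) pairs by window definition,
/// keeping the order in which each definition first appears.
/// Both elements are plain strings here; this is a hypothetical stand-in
/// for whatever the physical plan node actually exposes.
fn group_by_window<'a>(exprs: &[(&'a str, &'a str)]) -> Vec<(&'a str, Vec<&'a str>)> {
    let mut groups: Vec<(&'a str, Vec<&'a str>)> = Vec::new();
    for &(expr, window) in exprs {
        // Reuse the existing group for this window definition, if any.
        if let Some(pos) = groups.iter().position(|(w, _)| *w == window) {
            groups[pos].1.push(expr);
        } else {
            groups.push((window, vec![expr]));
        }
    }
    groups
}

/// Print each group as one `expr=` line followed by one `window=` line.
fn render_groups(groups: &[(&str, Vec<&str>)]) -> String {
    let mut out = String::new();
    for (window, exprs) in groups {
        out.push_str(&format!("expr={}\n", exprs.join(", ")));
        out.push_str(&format!("window=OVER ({})\n", window));
    }
    out
}
```

With the two test queries suggested earlier in this review, `count(*)` and `row_number()` over the same empty window would fall into a single group, while `rank()` and `row_number()` with different ORDER BY clauses would produce two groups.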

Contributor

@alamb alamb left a comment

Thanks @irenjj -- this looks like a good improvement to me

Thanks @2010YOUY01 for the reviews

@alamb alamb changed the title from "Implement tree explain for BoundedWindowAggExec and WindowAggExec`" to "Implement tree explain for BoundedWindowAggExec and WindowAggExec" Mar 9, 2025
@2010YOUY01 2010YOUY01 merged commit f0b86fc into apache:main Mar 10, 2025
24 checks passed
@alamb
Contributor

alamb commented Mar 10, 2025

🎉

@alamb
Contributor

alamb commented Mar 10, 2025

I am pretty excited about this tree feature -- it is coming along quite well I think. I bet by the end of this week it will be looking quite nice 👌

@irenjj irenjj deleted the explain_winagg branch March 10, 2025 10:59

Labels

sqllogictest SQL Logic Tests (.slt)

Development

Successfully merging this pull request may close these issues.

Implement tree explain for BoundedWindowAggExec and WindowAggExec

3 participants