Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: EXCEPT in SELECT #10438

Merged
merged 20 commits into from
Jun 25, 2023
Merged

feat: EXCEPT in SELECT #10438

merged 20 commits into from
Jun 25, 2023

Conversation

wugouzi
Copy link
Contributor

@wugouzi wugouzi commented Jun 20, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Add support to following syntax:

SELECT * EXCEPT (column1, column2, ...)
...

close #9092.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)

Documentation

  • My PR contains user-facing changes.
Click here for Documentation

Types of user-facing changes

Please keep the types that apply to your changes, and remove the others.

  • Installation and deployment
  • Connector (sources & sinks)
  • SQL commands, functions, and operators
  • RisingWave cluster configuration changes
  • Other (please specify in the release note below)

Release note

@github-actions github-actions bot added type/feature user-facing-changes Contains changes that are visible to users labels Jun 20, 2023
@wugouzi wugouzi marked this pull request as ready for review June 20, 2023 09:08
@wugouzi wugouzi requested a review from st1page June 20, 2023 09:08
@lmatz
Copy link
Contributor

lmatz commented Jun 20, 2023

Can we add more tests, e.g.:

  1. except in an inner subquery
  2. except in the outer query when there is an inner subquery
  3. a combination of 1 + 2

Copy link
Contributor

@st1page st1page left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. we can not bind it in the query level, we should treat it as a modifier on the * or WildCard.
dev=# select * from t;
 a | b | c 
---+---+---
 2 | 1 | 3
(1 row)

dev=# select *, * from t;
 a | b | c | a | b | c 
---+---+---+---+---+---
 2 | 1 | 3 | 2 | 1 | 3
(1 row)
  1. we need to support multiple value in the except. SELECT * EXCEPT (i, j, k);

In conclusion, we might need to add SelectItem::ExceptWithExcept(Vec<Expr>) even use it to replace the original WildCard

@@ -37,6 +37,8 @@
formatted_sql: SELECT id, fname, lname FROM customer WHERE salary <> 'Not Provided' AND salary <> ''
- input: SELECT id FROM customer WHERE NOT salary = ''
formatted_sql: SELECT id FROM customer WHERE NOT salary = ''
- input: SELECT * EXCEPT v1 FROM foo
formatted_sql: SELECT EXCEPT v1 FROM foo
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this valid?

Copy link
Contributor Author

@wugouzi wugouzi Jun 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is valid in current implementation and there are many things to be fixed and there will be possible change later.

@xxchan
Copy link
Member

xxchan commented Jun 20, 2023

except in the outer query when there is an inner subquery

e.g., the TopN pattern


BTW, is the syntax decided?

  • Whether to use EXCEPT (clickhouse, spanner) or EXCLUDE (snowflakes, duckdb)
  • Whether () is required (all 4 DBs above seem to require that).

Besides, multiple columns are not tested and things like select a,b except are not tested.

@kwannoel
Copy link
Contributor

Hi could you add some sqlsmith fuzzing tests? Feel free to ask me if any questions.

You may check out the developer docs here: https://github.com/risingwavelabs/risingwave/blob/main/src/tests/sqlsmith/develop.md

@st1page
Copy link
Contributor

st1page commented Jun 20, 2023

except in the outer query when there is an inner subquery

e.g., the TopN pattern

BTW, is the syntax decided?

  • Whether to use EXCEPT (clickhouse, spanner) or EXCLUDE (snowflakes, duckdb)
  • Whether () is required (all 4 DBs above seem to require that).

Besides, multiple columns are not tested and things like select a,b except are not tested.

  1. EXCEPT
  2. () is required

https://duckdb.org/docs/sql/query_syntax/select.html

image

@codecov
Copy link

codecov bot commented Jun 20, 2023

Codecov Report

Merging #10438 (0bfaa3b) into main (f885259) will decrease coverage by 0.01%.
The diff coverage is 61.53%.

@@            Coverage Diff             @@
##             main   #10438      +/-   ##
==========================================
- Coverage   70.38%   70.37%   -0.01%     
==========================================
  Files        1269     1269              
  Lines      217167   217216      +49     
==========================================
+ Hits       152843   152868      +25     
- Misses      64324    64348      +24     
Flag Coverage Δ
rust 70.37% <61.53%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/meta/src/manager/catalog/utils.rs 49.47% <0.00%> (-1.96%) ⬇️
src/sqlparser/src/ast/mod.rs 85.02% <16.66%> (-0.67%) ⬇️
src/frontend/src/binder/expr/function.rs 86.97% <50.00%> (-0.11%) ⬇️
src/sqlparser/src/parser.rs 86.68% <76.92%> (-0.07%) ⬇️
src/frontend/src/binder/select.rs 91.85% <93.33%> (-0.01%) ⬇️
src/frontend/src/handler/create_sink.rs 94.93% <100.00%> (ø)
src/sqlparser/src/ast/query.rs 87.58% <100.00%> (+0.50%) ⬆️

... and 2 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@xxchan
Copy link
Member

xxchan commented Jun 20, 2023

You shared an EXCLUDE doc 😄

image

@st1page
Copy link
Contributor

st1page commented Jun 20, 2023

You shared an EXCLUDE doc smile

image

yes and just to say that all the grammer need ()

@@ -42,6 +42,15 @@
StreamMaterialize { columns: [v1, t._row_id(hidden)], stream_key: [t._row_id], pk_columns: [t._row_id], pk_conflict: "NoCheck" }
└─StreamFilter { predicate: (t.v1 < 1:Int32) }
└─StreamTableScan { table: t, columns: [t.v1, t._row_id], pk: [t._row_id], dist: UpstreamHashShard(t._row_id) }
- sql: |
create table t (v1 int, v2 int, v3 int);
select * except (v1, v2) from t;
Copy link
Contributor

@kwannoel kwannoel Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add more tests which explicitly select except columns?
Something like:

select v3 from (select * except (v1, v2) from t

Can we also add tests which have selected the excepted row, and re-select already included rows?

select * except (v1, v2), v1 from t; -- v1 excluded from except.
select * except (v1, v2), v3 from t; -- v3 included from except.

Comment on lines 283 to 288
if self.context.range_of.is_empty() {
return Err(ErrorCode::BindError(
"SELECT * with no tables specified is not valid".into(),
)
.into());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract it out of the match?

Comment on lines 292 to 306
let (exprs, names) = self.iter_column_groups();
select_list.extend(exprs);
aliases.extend(names);

// Bind columns that are not in groups
let (exprs, names) = Self::iter_bound_columns(
self.context.columns[..].iter().filter(|c| {
!c.is_hidden
&& !self
.context
.column_group_context
.mapping
.contains_key(&c.index)
}),
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think some common logic about the column_groups is lost

Copy link
Contributor

@st1page st1page left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wugouzi wugouzi added this pull request to the merge queue Jun 25, 2023
@wugouzi wugouzi removed this pull request from the merge queue due to a manual request Jun 25, 2023
github-merge-queue bot pushed a commit that referenced this pull request Jun 25, 2023
@wugouzi wugouzi added this pull request to the merge queue Jun 25, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 25, 2023
@wugouzi wugouzi enabled auto-merge June 25, 2023 07:27
@wugouzi wugouzi disabled auto-merge June 25, 2023 07:29
@wugouzi wugouzi enabled auto-merge June 25, 2023 07:49
@wugouzi wugouzi added this pull request to the merge queue Jun 25, 2023
Merged via the queue into main with commit 12c145c Jun 25, 2023
@wugouzi wugouzi deleted the qiao/select-exclude branch June 25, 2023 08:14
Copy link
Member

@xxchan xxchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There aren't any tests for syntax errors 😟

formatted_sql: SELECT * EXCEPT (v1, v2) FROM foo
formatted_ast: 'Query(Query { with: None, body: Select(Select { distinct: All, projection: [WildcardOrWithExcept(Some([Identifier(Ident { value: "v1", quote_style: None }), Identifier(Ident { value: "v2", quote_style: None })]))], from: [TableWithJoins { relation: Table { name: ObjectName([Ident { value: "foo", quote_style: None }]), alias: None, for_system_time_as_of_proctime: false }, joins: [] }], lateral_views: [], selection: None, group_by: [], having: None }), order_by: [], limit: None, offset: None, fetch: None })'
- input: SELECT v3 EXCEPT (v1, v2) FROM foo
error_msg: |-
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a test for syntax. But it is not sufficient indeed. I will add some.

@emile-00 emile-00 added the 📖✓ Covered or will be covered in the user docs. label Jul 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
experimental type/feature user-facing-changes Contains changes that are visible to users 📖✓ Covered or will be covered in the user docs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: SELECT * EXCEPT [columnA,columnB,...]
6 participants