Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support array_sort(list_sort) #8279

Merged
merged 50 commits into from
Dec 6, 2023
Merged

feat: Support array_sort(list_sort) #8279

merged 50 commits into from
Dec 6, 2023

Conversation

Asura7969
Copy link
Contributor

Which issue does this PR close?

Closes #109 .

Rationale for this change

Support append_sort(list_sort) function

What changes are included in this PR?

Four array functions now start to support column data:
array_sort
list_sort

Are these changes tested?

Yes
test_array_sort

Are there any user-facing changes?

Yes

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt) labels Nov 20, 2023
Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since DuckDB has array_sort already, and I think there is no good reason that we should not follow theirs. Can you have the similar style like theirs?

https://duckdb.org/docs/sql/functions/nested#sorting-lists

Especially, I think array_sort(list, 'asc') or array_sort(list, 'desc') is more much straightforward then true, false. We need to check one of the result to know whether true is asc or desc. So as null_first

@@ -2669,6 +2698,48 @@ mod tests {
);
}

#[test]
fn test_array_sort() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont add test here, just add them in slt file

@Asura7969
Copy link
Contributor Author

Since DuckDB has array_sort already, and I think there is no good reason that we should not follow theirs. Can you have the similar style like theirs?

https://duckdb.org/docs/sql/functions/nested#sorting-lists

Especially, I think array_sort(list, 'asc') or array_sort(list, 'desc') is more much straightforward then true, false. We need to check one of the result to know whether true is asc or desc. So as null_first

I'll try

@@ -1044,6 +1044,20 @@ select make_array(['a','b'], null);
----
[[a, b], ]

## array_sort (aliases: `list_sort`)
query ???
select array_sort(make_array(4, 2, 3, 1)), array_sort(make_array(1.0, 4.0, null, 3.0), true, true), array_sort(make_array('a', 'd', 'c', null, 'b'), false, false);
Copy link
Contributor

@jayzhan211 jayzhan211 Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, you can also try column-wise support, pass the test with table, you can refer other array_function that run with table. But it is fine for me to support that in follow on PR

[, 11, 12, 13, 14, 15, 16, 17, 18, 20]
[, 21, 22, 23, 25, 26, 27, 28, 29, 30]
[, 31, 32, 33, 34, 35, 37, 38, 39, 40]
[]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

array_sort(null) should return null, not []

Copy link
Contributor

@jayzhan211 jayzhan211 Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

# Conflicts:
#	datafusion/proto/proto/datafusion.proto
#	datafusion/proto/src/generated/pbjson.rs
#	datafusion/proto/src/generated/prost.rs
@Asura7969
Copy link
Contributor Author

cc @alamb

Copy link
Member

@Weijun-H Weijun-H left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @Asura7969

docs/source/user-guide/sql/scalar_functions.md Outdated Show resolved Hide resolved
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Asura7969 (and @jayzhan211 and @Weijun-H for the reviews), and I am really sorry for taking so long to look at this PR

I think all that is needed now is to merge up from main, and resolve some conflicts, and this PR will be ready to merge

Thanks again

# Conflicts:
#	datafusion/expr/src/built_in_function.rs
#	datafusion/proto/proto/datafusion.proto
#	datafusion/proto/src/generated/pbjson.rs
#	datafusion/proto/src/generated/prost.rs
@alamb alamb merged commit d9d8ddd into apache:main Dec 6, 2023
23 checks passed
@alamb
Copy link
Contributor

alamb commented Dec 6, 2023

Thanks again 🚀

@Asura7969 Asura7969 deleted the list_sort branch December 7, 2023 00:44
appletreeisyellow pushed a commit to appletreeisyellow/datafusion that referenced this pull request Dec 15, 2023
* Minor: Improve the document format of JoinHashMap

* list sort

* fix: example doc

* fix: ci

* fix: doc error

* fix pb

* like DuckDB function semantics

* fix ci

* fix pb

* fix: doc

* add table test

* fix: not as expected

* fix: return null

* resolve conflicts

* doc

* merge
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for list_sort
4 participants