@miland-db
Contributor

@miland-db miland-db commented Mar 21, 2024

What changes were proposed in this pull request?

Extend built-in string functions to support non-binary, non-lowercase collation for: instr & find_in_set.

Why are the changes needed?

Update collation support for built-in string functions in Spark.

Does this PR introduce any user-facing change?

Yes, users should now be able to use COLLATE within arguments for built-in string functions INSTR and FIND_IN_SET in Spark SQL queries, using non-binary collations such as UNICODE_CI.
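
For readers who want a feel for the behaviour described above, here is a minimal plain-Python sketch of what a case-insensitive INSTR and FIND_IN_SET would be expected to return. It is not the code from this patch: the helper names `instr_ci` and `find_in_set_ci` are hypothetical, and `str.casefold()` is only a rough stand-in for a real UNICODE_CI collator, so results can differ for characters whose folded form changes length.

```python
def instr_ci(s: str, substr: str) -> int:
    """Return the 1-based position of the first case-insensitive match, or 0.

    Assumption: casefold() approximates a case-insensitive collation.
    Positions are computed on the folded strings.
    """
    return s.casefold().find(substr.casefold()) + 1


def find_in_set_ci(word: str, str_list: str) -> int:
    """Return the 1-based index of word in a comma-delimited list, or 0.

    Mirrors the usual FIND_IN_SET rule that a search word containing
    a comma never matches.
    """
    if "," in word:
        return 0
    target = word.casefold()
    for index, item in enumerate(str_list.split(","), start=1):
        if item.casefold() == target:
            return index
    return 0


if __name__ == "__main__":
    print(instr_ci("SparkSQL", "sql"))
    print(find_in_set_ci("AB", "abc,b,ab,c,def"))
```

In Spark SQL the corresponding queries would attach COLLATE to the string arguments instead of folding case by hand.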

How was this patch tested?

Unit tests for queries using "collate" (CollationSuite).

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Mar 21, 2024
@miland-db miland-db force-pushed the miland-db/substr-functions branch from 6a52d92 to eb2d7c5 March 21, 2024 15:55
@miland-db
Contributor Author

@cloud-fan @uros-db @mihailom-db can you take a look at these changes, please?

@miland-db miland-db changed the title from "[SPARK-47477][SQL][COLLATION] String function support: instr" to "[SPARK-47477][SQL][COLLATION] String function support: instr & find_in_set" Mar 21, 2024
@miland-db miland-db changed the title from "[SPARK-47477][SQL][COLLATION] String function support: instr & find_in_set" to "[SPARK-47411][SQL][COLLATION] String function support: instr & find_in_set" Mar 25, 2024
Contributor

@uros-db uros-db left a comment

LGTM @cloud-fan please review

@miland-db miland-db requested a review from cloud-fan April 16, 2024 14:36
# Conflicts:
#	common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationSupport.java
#	sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala
Contributor

@mihailomilosevic2001 mihailomilosevic2001 left a comment

LGTM, please change the PR title to match the functions you implemented and the JIRA ticket.

@miland-db miland-db changed the title from "[SPARK-47411][SQL] Support INSTR & FIND_IN_SET functions to work with collated strings" to "[SPARK-47411][SQL] Support StringInstr & FindInSet functions to work with collated strings" Apr 22, 2024
@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 256fc51 Apr 22, 2024