-
Notifications
You must be signed in to change notification settings - Fork 28.3k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[SPARK-49505][SQL] Create new SQL functions "randstr" and "uniform" t…
…o generate random strings or numbers within ranges ### What changes were proposed in this pull request? This PR introduces two new SQL functions "randstr" and "uniform" to generate random strings or numbers within ranges. * The "randstr" function returns a string of the specified length whose characters are chosen uniformly at random from the following pool of characters: 0-9, a-z, A-Z. The random seed is optional. The string length must be a constant two-byte or four-byte integer (SMALLINT or INT, respectively). * The "uniform" function returns a random value with independent and identically distributed values with the specified range of numbers. The random seed is optional. The provided numbers specifying the minimum and maximum values of the range must be constant. If both of these numbers are integers, then the result will also be an integer. Otherwise if one or both of these are floating-point numbers, then the result will also be a floating-point number. For example: ``` SELECT randstr(5); > ceV0P SELECT randstr(10, 0) FROM VALUES (0), (1), (2) tab(col); > ceV0PXaR2I fYxVfArnv7 iSIv0VT2XL SELECT uniform(10, 20.0F); > 17.604954 SELECT uniform(10, 20, 0) FROM VALUES (0), (1), (2) tab(col); > 15 16 17 ``` ### Why are the changes needed? This improves the SQL functionality of Apache Spark and improves its parity with other systems: * https://clickhouse.com/docs/en/sql-reference/functions/random-functions#randuniform * https://docs.snowflake.com/en/sql-reference/functions/uniform * https://www.microfocus.com/documentation/silk-test/21.0.2/en/silktestclassic-help-en/STCLASSIC-8BFE8661-RANDSTRFUNCTION-REF.html * https://docs.snowflake.com/en/sql-reference/functions/randstr ### Does this PR introduce _any_ user-facing change? Yes, see above. ### How was this patch tested? This PR adds golden file based test coverage. ### Was this patch authored or co-authored using generative AI tooling? Not this time. Closes #48004 from dtenedor/uniform-randstr-functions. Authored-by: Daniel Tenedorio <daniel.tenedorio@databricks.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>
- Loading branch information
Showing
7 changed files
with
1,191 additions
and
13 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.