@Jefffrey
Contributor

@Jefffrey Jefffrey commented Oct 8, 2025

Which issue does this PR close?

Part of #17964

Rationale for this change

SparkAscii and the DataFusion ascii function share some internals, so this PR reuses that code.

What changes are included in this PR?

Refactor SparkAscii to reuse the DataFusion Ascii internal code, and add doc comments to SparkAscii explaining how it differs from the DataFusion version (and hence why they are separate functions). Also update some of the Spark function documentation to clarify behaviour where a Spark function shares its name with a DataFusion function.

Are these changes tested?

Existing tests

Are there any user-facing changes?

No

@github-actions github-actions bot added the functions (Changes to functions implementation) and spark labels Oct 8, 2025
Comment on lines +187 to +188
test_ascii!(Some(String::from("\n")), Ok(Some(10)));
test_ascii!(Some(String::from("\t")), Ok(Some(9)));
Contributor Author

Consolidating the tests here and removing them from the Spark unit tests (the Spark SLT tests remain).

/// [ascii]: https://spark.apache.org/docs/latest/api/sql/index.html#ascii
/// [default ascii function]: datafusion_functions::string::ascii::AsciiFunc
#[derive(Debug, PartialEq, Eq, Hash)]
pub struct SparkAscii {
Contributor Author

I considered removing this entirely and giving AsciiFunc a toggleable behaviour instead, but keeping it as a separate struct was easier to work with, given the existing macros used to export the UDFs.
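A rough sketch of the "separate types, shared internals" layout described above, not the actual DataFusion or datafusion-spark code. Every name here (`Value`, `ascii_core`, `DefaultAscii`, `SparkLikeAscii`) is hypothetical, the real functions operate on Arrow arrays rather than single values, and the sketch assumes the default variant accepts only strings, which this thread implies but does not state. The difference shown (integer input being stringified first) is the one discussed later in this conversation.

```rust
/// Input values for this sketch only; the real functions work on Arrow arrays.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Utf8(String),
    Int(i64),
}

/// Shared core (hypothetical name): code point of the first character,
/// assumed to be 0 for an empty string.
fn ascii_core(s: &str) -> i32 {
    s.chars().next().map_or(0, |c| c as i32)
}

/// Stand-in for the default function: assumed to accept strings only.
struct DefaultAscii;

impl DefaultAscii {
    fn eval(&self, v: &Value) -> Result<i32, String> {
        match v {
            Value::Utf8(s) => Ok(ascii_core(s)),
            other => Err(format!("unsupported input: {other:?}")),
        }
    }
}

/// Stand-in for the Spark variant: stringifies integers, then reuses the core.
struct SparkLikeAscii;

impl SparkLikeAscii {
    fn eval(&self, v: &Value) -> i32 {
        match v {
            Value::Utf8(s) => ascii_core(s),
            Value::Int(n) => ascii_core(&n.to_string()),
        }
    }
}
```

Both types delegate to the same helper, so only the input handling differs. A single type with a boolean toggle would also work in this sketch; the comment above explains why the separate struct was kept in the PR.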

@Jefffrey Jefffrey marked this pull request as ready for review October 8, 2025 13:02
Contributor

@comphead comphead left a comment

Thanks @Jefffrey. Please clarify whether ascii can support numeric input, as in

https://spark.apache.org/docs/latest/api/sql/#ascii

> SELECT ascii(2);
 50

@Jefffrey
Contributor Author

Jefffrey commented Oct 9, 2025

> Thanks @Jefffrey. Please clarify whether ascii can support numeric input, as in
>
> https://spark.apache.org/docs/latest/api/sql/#ascii
>
> > SELECT ascii(2);
>  50

Sorry, do you mean in the DF version function doc or the Spark version function doc?

@comphead
Contributor

comphead commented Oct 9, 2025

> Sorry, do you mean in the DF version function doc or the Spark version function doc?

Slightly confused. Can ascii from datafusion-spark support the query SELECT ascii(2)?

@Jefffrey
Contributor Author

Jefffrey commented Oct 9, 2025

> > Sorry, do you mean in the DF version function doc or the Spark version function doc?
>
> Slightly confused. Can ascii from datafusion-spark support the query SELECT ascii(2)?

Yes it can; it seems it casts the 2 to '2' and returns ascii of that. I added a note about that in the SparkAscii doc.
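To illustrate the cast described here, a minimal sketch rather than the actual SparkAscii code: the helper names `ascii_of_str` and `spark_like_ascii_of_int` are made up for this example, and returning 0 for an empty string is an assumption.

```rust
/// Code point of the first character; assumed to be 0 for an empty string.
fn ascii_of_str(s: &str) -> i32 {
    s.chars().next().map_or(0, |c| c as i32)
}

/// Hypothetical stand-in for the Spark-style path: render the integer as
/// text first, then take the code point of its first character.
fn spark_like_ascii_of_int(n: i64) -> i32 {
    ascii_of_str(&n.to_string())
}
```

With these helpers, `spark_like_ascii_of_int(2)` stringifies 2 to "2" and yields 50, matching the Spark docs example, and `ascii_of_str("\n")` yields 10, matching the consolidated unit test.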

Contributor

@comphead comphead left a comment

Thanks @Jefffrey just double checked there is no regression

@Jefffrey
Contributor Author

Jefffrey commented Oct 9, 2025

> Thanks @Jefffrey just double checked there is no regression

No problem. I only realized this myself when the existing Spark ascii SLT test caught it as I tried to make the two functions exactly the same:

query I
SELECT ascii(2::INT);
----
50

@Jefffrey Jefffrey added this pull request to the merge queue Oct 10, 2025
Merged via the queue into apache:main with commit dd4e228 Oct 10, 2025
28 checks passed
@Jefffrey Jefffrey deleted the spark-ascii-dedupe branch October 10, 2025 02:09