Skip to content

Conversation

@hvanhovell
Copy link
Contributor

What changes were proposed in this pull request?

This PR adds the following functions to Spark Connect Scala Client:

  • Sort Functions
  • Aggregate Functions
  • Misc Functions
  • Math Functions

Why are the changes needed?

We want to the Spark Connect Scala Client to reach parity with the original functions API.

Does this PR introduce any user-facing change?

Yes, it adds a lot of functions.

How was this patch tested?

Added test for all functions and their significant variations.

# Conflicts:
#	connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala
#	connector/connect/common/src/test/resources/query-tests/explain-results/function_max.explain
#	connector/connect/common/src/test/resources/query-tests/queries/function_max.json
#	connector/connect/common/src/test/resources/query-tests/queries/function_max.proto.bin
@hvanhovell
Copy link
Contributor Author

hvanhovell commented Feb 16, 2023

Note for the reviewer. I still want to add a couple of tests for duplicate functions.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you verify this with Scala 2.13?

To reviewers, please hold on approving this PR. This module has been broken already with Scala 2.13. The broken branch blocks other community member's development severely.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, the previous PR leaves 9 UT failures.

     [info] *** 9 TESTS FAILED ***
     [error] Failed tests:
     [error] 	org.apache.spark.sql.connect.ProtoToParsedPlanTestSuite
     [error] (connect / Test / test) sbt.TestsFailedException: Tests unsuccessful
     [error] Total time: 41 s, completed Feb 16, 2023, 7:06:12 AM

@hvanhovell
Copy link
Contributor Author

@dongjoon-hyun I will fix those today. I do think we should have a discussion about this. Currently we have both maven and scala-2.13 that are not tested during CI. That seems wrong if both are apparently supported. The mental overhead of testing these manually is very high.

@hvanhovell
Copy link
Contributor Author

As for blocking other community members severely, the same applies to the lack of testing of scala-2.13.

# Conflicts:
#	connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala
@hvanhovell
Copy link
Contributor Author

Fix for 2.13 has merged. I am going to hold this off until #40056 is merged.

@dongjoon-hyun dongjoon-hyun dismissed their stale review February 16, 2023 22:11

Scala 2.13 is recovered.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM (Pending CIs).
I assume that this is tested with Scala 2.13 in the same way, @hvanhovell .

@hvanhovell
Copy link
Contributor Author

merging

hvanhovell added a commit that referenced this pull request Feb 17, 2023
### What changes were proposed in this pull request?
This PR adds the following functions to Spark Connect Scala Client:
- Sort Functions
- Aggregate Functions
- Misc Functions
- Math Functions

### Why are the changes needed?
We want to the Spark Connect Scala Client to reach parity with the original functions API.

### Does this PR introduce _any_ user-facing change?
Yes, it adds a lot of functions.

### How was this patch tested?
Added test for all functions and their significant variations.

Closes #40050 from hvanhovell/SPARK-42461.

Authored-by: Herman van Hovell <herman@databricks.com>
Signed-off-by: Herman van Hovell <herman@databricks.com>
(cherry picked from commit e6a84fe)
Signed-off-by: Herman van Hovell <herman@databricks.com>
@hvanhovell
Copy link
Contributor Author

thanks all

snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
### What changes were proposed in this pull request?
This PR adds the following functions to Spark Connect Scala Client:
- Sort Functions
- Aggregate Functions
- Misc Functions
- Math Functions

### Why are the changes needed?
We want to the Spark Connect Scala Client to reach parity with the original functions API.

### Does this PR introduce _any_ user-facing change?
Yes, it adds a lot of functions.

### How was this patch tested?
Added test for all functions and their significant variations.

Closes apache#40050 from hvanhovell/SPARK-42461.

Authored-by: Herman van Hovell <herman@databricks.com>
Signed-off-by: Herman van Hovell <herman@databricks.com>
(cherry picked from commit e6a84fe)
Signed-off-by: Herman van Hovell <herman@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants