@LuciferYang LuciferYang commented Sep 28, 2022

What changes were proposed in this pull request?

#37894 changed the preconditions for the following two tests from assume(shouldTestGroupedAggPandasUDFs) to assume(shouldTestPythonUDFs):

  • SPARK-39962: Global aggregation of Pandas UDF should respect the column order in PythonUDFSuite
  • continuous mode with various UDFs - Scalar Pandas UDF in ContinuousSuite

However, that change causes test failures when pandas is not installed, because both tests create Pandas UDFs. This PR therefore changes the test preconditions from assume(shouldTestPythonUDFs) to assume(shouldTestPandasUDFs).
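
To illustrate why the precondition matters, here is a minimal, dependency-free sketch of the behavior. It is not Spark's or ScalaTest's actual code: `TestCanceled`, `assumeOrCancel`, `runTest`, and `buildPandasUdf` are hypothetical stand-ins, and the two availability flags are hard-coded to simulate a machine with Python but without pandas. In ScalaTest, a failed `assume` cancels the test rather than failing it, which is what the sketch mimics.

```scala
object PreconditionSketch {
  // Hypothetical stand-in for the exception ScalaTest raises when assume(...) is false.
  final class TestCanceled(msg: String) extends RuntimeException(msg)

  sealed trait Outcome
  case object Succeeded extends Outcome
  case object Canceled extends Outcome
  final case class Failed(message: String) extends Outcome

  // Hypothetical helper: cancel (not fail) the test when the precondition is false.
  def assumeOrCancel(condition: Boolean, clue: String): Unit =
    if (!condition) throw new TestCanceled(clue)

  // Hypothetical mini test runner that classifies the result of a test body.
  def runTest(body: => Unit): Outcome =
    try { body; Succeeded }
    catch {
      case _: TestCanceled     => Canceled
      case e: RuntimeException => Failed(e.getMessage)
    }

  // Simulated environment (assumption): Python is present, pandas is not.
  val pythonAvailable: Boolean = true
  val pandasAvailable: Boolean = false

  // Simplified versions of the flags named in this PR.
  val shouldTestPythonUDFs: Boolean = pythonAvailable
  val shouldTestPandasUDFs: Boolean = pythonAvailable && pandasAvailable

  // Stand-in for creating a Pandas UDF, which needs pandas at creation time.
  def buildPandasUdf(): Unit =
    if (!pandasAvailable) throw new RuntimeException("pandas is unavailable")

  // Too-weak precondition: the body runs and fails.
  val before: Outcome = runTest {
    assumeOrCancel(shouldTestPythonUDFs, "Python UDFs are not testable")
    buildPandasUdf()
  }

  // Matching precondition: the test is canceled before the body runs.
  val after: Outcome = runTest {
    assumeOrCancel(shouldTestPandasUDFs, "Pandas UDFs are not testable")
    buildPandasUdf()
  }
}
```

Under these assumptions, `before` ends up as a failure and `after` as a cancellation, which corresponds to the FAILED and canceled results reported in the test logs of this PR.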

Why are the changes needed?

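To fix the test preconditions of PythonUDFSuite and ContinuousSuite, so that the Pandas UDF tests are canceled instead of failing when pandas is not installed.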

Does this PR introduce any user-facing change?

No

How was this patch tested?

  • Pass GitHub Actions
  • Manual test with pandas not installed:
build/sbt clean "sql/testOnly org.apache.spark.sql.execution.python.PythonUDFSuite"
build/sbt clean "sql/testOnly org.apache.spark.sql.streaming.continuous.ContinuousSuite"
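
The manual test only reproduces the problem when pandas really is missing from the Python environment. The following is a minimal sketch of one way to probe for that from Scala. `PandasProbe` and `isModuleAvailable` are hypothetical names, and this is not necessarily how Spark's IntegratedUDFTestUtils performs its check; it simply runs the interpreter with an `import` statement and inspects the exit code.

```scala
import scala.sys.process._
import scala.util.Try

object PandasProbe {
  // Hypothetical helper: returns true only if `pythonExec -c "import <module>"` exits with 0.
  // A missing interpreter raises an exception, which Try turns into false.
  def isModuleAvailable(pythonExec: String, module: String): Boolean =
    Try {
      // ProcessLogger(_ => ()) discards stdout and stderr so a failed import stays quiet.
      val exitCode = Process(Seq(pythonExec, "-c", s"import $module")).!(ProcessLogger(_ => ()))
      exitCode == 0
    }.getOrElse(false)
}
```

For example, `PandasProbe.isModuleAvailable("python3", "pandas")` should return false in the environment used for the manual test above.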

Before

PythonUDFSuite

[info] - SPARK-39962: Global aggregation of Pandas UDF should respect the column order *** FAILED *** (799 milliseconds)
[info]   java.lang.RuntimeException: Python executable [python3] and/or pyspark are unavailable.
[info]   at org.apache.spark.sql.IntegratedUDFTestUtils$.pandasGroupedAggFunc$lzycompute(IntegratedUDFTestUtils.scala:236)
[info]   at org.apache.spark.sql.IntegratedUDFTestUtils$.org$apache$spark$sql$IntegratedUDFTestUtils$$pandasGroupedAggFunc(IntegratedUDFTestUtils.scala:217)
[info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestGroupedAggPandasUDF.udf$lzycompute(IntegratedUDFTestUtils.scala:433)
[info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestGroupedAggPandasUDF.udf(IntegratedUDFTestUtils.scala:430)
[info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestGroupedAggPandasUDF.apply(IntegratedUDFTestUtils.scala:444)
[info]   at org.apache.spark.sql.execution.python.PythonUDFSuite.$anonfun$new$9(PythonUDFSuite.scala:82)
[info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)

and

ContinuousSuite

[info] - continuous mode with various UDFs - Scalar Pandas UDF *** FAILED *** (715 milliseconds)
[info]   java.lang.RuntimeException: Python executable [python3] and/or pyspark are unavailable.
[info]   at org.apache.spark.sql.IntegratedUDFTestUtils$.pandasFunc$lzycompute(IntegratedUDFTestUtils.scala:214)
[info]   at org.apache.spark.sql.IntegratedUDFTestUtils$.org$apache$spark$sql$IntegratedUDFTestUtils$$pandasFunc(IntegratedUDFTestUtils.scala:194)
[info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestScalarPandasUDF$$anon$2.<init>(IntegratedUDFTestUtils.scala:382)
[info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestScalarPandasUDF.udf$lzycompute(IntegratedUDFTestUtils.scala:379)
[info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestScalarPandasUDF.udf(IntegratedUDFTestUtils.scala:379)
[info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestScalarPandasUDF.apply(IntegratedUDFTestUtils.scala:404)
[info]   at org.apache.spark.sql.streaming.continuous.ContinuousSuite.$anonfun$new$24(ContinuousSuite.scala:289)

After

PythonUDFSuite

[info] Run completed in 11 seconds, 278 milliseconds.
[info] Total number of tests run: 4
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 4, failed 0, canceled 1, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 72 s (01:12), completed 2022-9-28 15:46:40

and

ContinuousSuite

[info] Run completed in 33 seconds, 197 milliseconds.
[info] Total number of tests run: 13
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 13, failed 0, canceled 1, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 64 s (01:04), completed 2022-9-28 15:49:45

@LuciferYang
Contributor Author

cc @HeartSaVioR and @HyukjinKwon

@LuciferYang LuciferYang changed the title from "[SPARK-40435][SQL][TESTS][FOLLOWUP] Correct test precondition of PythonUDFSuite and ContinuousSuite" to "[SPARK-40435][SQL][SS][TESTS][FOLLOWUP] Correct test precondition of PythonUDFSuite and ContinuousSuite" on Sep 28, 2022
Contributor

@HeartSaVioR HeartSaVioR left a comment

+1 Nice catch! Let's wait for @HyukjinKwon to double-check.

@HyukjinKwon
Member

Merged to master.

@LuciferYang
Contributor Author

thanks @HyukjinKwon @HeartSaVioR
