Three unit tests newly failed on master branch #493

rui-mo · 2021-08-30T08:58:59Z

Describe the bug
The following tests fail on the master branch:

columnar arrow_udf test *** FAILED ***
continuous mode with various UDFs - Scalar Pandas UDF *** FAILED ***
fallback arrow_udf test *** FAILED ***

The error is:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 2) (sr404 executor driver): java.lang.IllegalArgumentException: Could not load buffers for field _0: Utf8. error message: A buffer can only be associated between two allocators that share the same root
at org.apache.arrow.vector.VectorLoader.loadBuffers(VectorLoader.java:117)
at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:81)
at org.apache.spark.sql.execution.python.ColumnarArrowPythonRunner$$anon$2.$anonfun$writeIteratorToStream$1(ColumnarArrowPythonRunner.scala:163)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
at org.apache.spark.sql.execution.python.ColumnarArrowPythonRunner$$anon$2.writeIteratorToStream(ColumnarArrowPythonRunner.scala:171)
at org.apache.spark.api.python.BasePythonRunner$WriterThread.$anonfun$run$1(PythonRunner.scala:397)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1996)
at org.apache.spark.api.python.BasePythonRunner$WriterThread.run(PythonRunner.scala:232)
Caused by: java.lang.IllegalStateException: A buffer can only be associated between two allocators that share the same root
at org.apache.arrow.util.Preconditions.checkState(Preconditions.java:458)
at org.apache.arrow.memory.AllocationManager.associate(AllocationManager.java:96)
at org.apache.arrow.memory.AllocationManager.associate(AllocationManager.java:91)
at org.apache.arrow.memory.BufferLedger.retain(BufferLedger.java:320)
at org.apache.arrow.vector.BaseVariableWidthVector.loadFieldBuffers(BaseVariableWidthVector.java:320)
at org.apache.arrow.vector.VectorLoader.loadBuffers(VectorLoader.java:109)
... 8 more

To Reproduce
Steps to reproduce the behavior:
mvn clean test -P full-scala-compiler -Dbuild_arrow=OFF -Dbuild_protobuf=OFF -DfailIfNoTests=false -Dexec.skip=true -Dmaven.test.failure.ignore=true -Dtest=none -DwildcardSuites="org.apache.spark.sql.execution.python.ArrowEvalPythonExecSuite"

Additional context
We traced the failure to this commit: d6bc791
Before this commit, these tests passed.
It looks like a memory allocation issue.


rui-mo · 2021-08-30T09:01:42Z

@zhztheplayer @xuechendi We found an issue on the master branch.

zhztheplayer · 2021-08-31T07:18:50Z

I expected that something might break this way because of d6bc791, but I failed to reproduce it. Is this test included in CI now?

A valid solution may be to change the allocation in the Python runner

gazelle_plugin/native-sql-engine/core/src/main/scala/org/apache/spark/sql/execution/python/ColumnarArrowPythonRunner.scala

Lines 71 to 72 in d6bc791

    private val allocator = ArrowUtils.rootAllocator.newChildAllocator(
      s"stdin reader for $pythonExec", 0, Long.MaxValue)

to a task-restricted context allocator. @xuechendi Hi Chendi, do you remember why a global allocator was used here? Do you think there is any risk in changing to a local one?
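
To make the "same root" constraint from the stack trace concrete, here is a minimal, stdlib-only sketch. It is a simplified model, not Arrow's implementation: `Allocator`, `Buffer`, `canAssociate`, and `retainFor` are hypothetical stand-ins for the check performed in `AllocationManager.associate`. It also assumes that, after d6bc791, the incoming batches are owned by an allocator under a different root than `ArrowUtils.rootAllocator`, which this thread suggests but does not state outright.

```java
public class SameRootSketch {

    /** Hypothetical stand-in for an Arrow allocator; a root has no parent. */
    static final class Allocator {
        final String name;
        final Allocator parent;

        Allocator(String name, Allocator parent) {
            this.name = name;
            this.parent = parent;
        }

        Allocator newChild(String childName) {
            return new Allocator(childName, this);
        }

        /** Walks up the parent chain until it reaches the root. */
        Allocator root() {
            Allocator current = this;
            while (current.parent != null) {
                current = current.parent;
            }
            return current;
        }
    }

    /** Hypothetical stand-in for a buffer owned by one allocator. */
    static final class Buffer {
        final Allocator owner;

        Buffer(Allocator owner) {
            this.owner = owner;
        }

        /** Models sharing the buffer with another allocator. */
        Buffer retainFor(Allocator target) {
            if (!canAssociate(this, target)) {
                throw new IllegalStateException(
                    "A buffer can only be associated between two allocators that share the same root");
            }
            return new Buffer(target);
        }
    }

    /** True only when the buffer's owner and the target share a root. */
    static boolean canAssociate(Buffer buffer, Allocator target) {
        return buffer.owner.root() == target.root();
    }

    public static void main(String[] args) {
        Allocator globalRoot = new Allocator("global root", null);
        Allocator taskRoot = new Allocator("task root", null);

        // Reader allocator created under the global root, as in the runner.
        Allocator stdinReader = globalRoot.newChild("stdin reader");
        // Assumed: the input batch is owned by a task-level allocator.
        Buffer batch = new Buffer(taskRoot.newChild("task allocator"));

        System.out.println(canAssociate(batch, stdinReader)); // false
        System.out.println(canAssociate(batch, taskRoot.newChild("task-local reader"))); // true

        try {
            batch.retainFor(stdinReader);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, creating the reader's allocator under the same root as the batch's owner is what makes the association legal, which is the idea behind switching the runner to a task-level allocator.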

rui-mo · 2021-08-31T09:13:52Z

We are testing them on Jenkins, and the error can be reproduced by running the command below from the gazelle plugin home directory:
mvn clean test -P full-scala-compiler -Dbuild_arrow=OFF -Dbuild_protobuf=OFF -DfailIfNoTests=false -Dexec.skip=true -Dmaven.test.failure.ignore=true -Dtest=none -DwildcardSuites="org.apache.spark.sql.execution.python.ArrowEvalPythonExecSuite"

zhztheplayer · 2021-09-07T02:20:51Z

Fixed in f07e6fb

rui-mo added the bug Something isn't working label Aug 30, 2021

zhztheplayer closed this as completed Sep 7, 2021

weiting-chen mentioned this issue Jan 11, 2022

[NSE-493] Three unit tests newly failed on master branch (Python UDF Unit Tests) #497

Merged
