
Conversation

@wesm commented Dec 12, 2016

@BryanCutler @icexelloss started this patch but is out for the holidays for a couple of weeks. If this is useful for starting a test suite for record batch conversion, feel free to pull it into the integration branch.

This ideally needs ARROW-411; for now it adds arrow-tools as a dependency to get access to functions in the integration tester.

cc @leifwalsh

@wesm wesm force-pushed the arrow-unit-test-proto branch from c12a3a6 to cfc578f Compare December 12, 2016 21:58
@BryanCutler (Owner)

Great, I think this will help. I'll try it out.
cc @yinxusen

@BryanCutler (Owner) commented on pom.xml (outdated):

Can this be scoped to test?

@wesm (Author) replied:

Sure, this is only here temporarily pending ARROW-411. Feel free to cherry-pick this commit and modify it to suit.
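For reference, a test-scoped dependency in the pom would look roughly like this (a sketch; the version property is illustrative, not necessarily what the branch actually pins):

```xml
<dependency>
  <groupId>org.apache.arrow</groupId>
  <artifactId>arrow-tools</artifactId>
  <!-- version is illustrative; match whatever the branch builds against -->
  <version>${arrow.version}</version>
  <scope>test</scope>
</dependency>
```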

@BryanCutler (Owner) replied:

I see this is done so the test can use the same root allocator; would it fail if the tests made a different instance? I'm just wondering about the broader usage in Spark: should Spark manage a single root allocator, or create one per operation and allow the user to override it like this?

@wesm (Author) commented Dec 14, 2016

I don't think it would make the tests fail. It might make sense to create a child allocator for each operation in Spark.
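The per-operation pattern being discussed can be sketched in plain Scala (a hedged toy model: the `Allocator` class here is hypothetical, standing in for Arrow's real `org.apache.arrow.memory.BufferAllocator`, whose children are created via `newChildAllocator(name, initReservation, maxAllocation)`):

```scala
// Toy allocator hierarchy illustrating "one root, one child per operation".
// `Allocator` is hypothetical; it only models the accounting behavior.
class Allocator(val name: String, parent: Option[Allocator] = None) {
  private var allocated = 0L

  def newChildAllocator(childName: String): Allocator =
    new Allocator(s"$name/$childName", Some(this))

  def allocate(bytes: Long): Unit = {
    allocated += bytes
    parent.foreach(_.allocate(bytes)) // account against ancestors too
  }

  def release(bytes: Long): Unit = {
    allocated -= bytes
    parent.foreach(_.release(bytes))
  }

  def allocatedBytes: Long = allocated

  // Closing a child checks the operation released everything it took.
  def close(): Unit =
    require(allocated == 0, s"leak in $name: $allocated bytes")
}

val root = new Allocator("spark-root")

// One child per operation, closed when the operation finishes.
val child = root.newChildAllocator("toArrowBatch")
child.allocate(1024)   // ... build record batches with `child` ...
child.release(1024)
child.close()

println(root.allocatedBytes)  // 0: nothing leaked at the root
```

Closing the child at the end of each operation gives a natural leak check per operation, while the single root keeps a global view of memory, which seems to be the trade-off the comment above is weighing.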

@wesm wesm force-pushed the arrow-unit-test-proto branch from cfc578f to b2975a3 Compare December 15, 2016 18:23
@wesm (Author) commented Dec 15, 2016

@BryanCutler I just rebased this on arrow-integration.

@BryanCutler (Owner)

Thanks @wesm, I'll merge this so we can start using the ArrowSuite too. I'll just change the scope on the dependency and comment out the lines that require modification to Arrow, so it doesn't break compilation.

BryanCutler pushed a commit that referenced this pull request Dec 15, 2016
Changed scope of arrow-tools dependency to test

commented out lines to Integration.compareXX that are private to arrow

closes #10
@BryanCutler (Owner): closed with 7127b32

@wesm wesm deleted the arrow-unit-test-proto branch December 22, 2016 18:52
BryanCutler pushed a commit that referenced this pull request Jan 24, 2017
BryanCutler pushed a commit that referenced this pull request Feb 23, 2017
BryanCutler pushed a commit that referenced this pull request Oct 7, 2019
…nput of UDF as double in the failed test in udf-aggregate_part1.sql

## What changes were proposed in this pull request?

It still can be flaky on certain environments due to the float limitation described at apache#25110. See apache#25110 (comment)

- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/6584/testReport/org.apache.spark.sql/SQLQueryTestSuite/udf_pgSQL_udf_aggregates_part1_sql___Regular_Python_UDF/

```
Expected "700000000000[6] 1", but got "700000000000[5] 1" Result did not match for query apache#33
SELECT CAST(avg(udf(CAST(x AS DOUBLE))) AS long), CAST(udf(var_pop(CAST(x AS DOUBLE))) AS decimal(10,3))
FROM (VALUES (7000000000005), (7000000000007)) v(x)
```

Here's what's going on: apache#25110 (comment)

```
scala> Seq("7000000000004.999", "7000000000006.999").toDF().selectExpr("CAST(avg(value) AS long)").show()
+--------------------------+
|CAST(avg(value) AS BIGINT)|
+--------------------------+
|             7000000000005|
+--------------------------+
```

Therefore, this PR just avoids the cast in this specific test.

This is a temporary fix; we need a more robust way to avoid such cases.
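The truncation behind the flakiness can be reproduced in plain Scala without Spark (a standalone sketch; the literals mirror the values from the linked comment):

```scala
// 7000000000004.999 and 7000000000006.999 are not exactly representable
// as doubles; the nearest doubles end in .9990234375. Their average is
// ...5.999..., and casting to Long truncates the fraction instead of
// rounding, giving ...5 rather than ...6.
val a = 7000000000004.999
val b = 7000000000006.999
val avg = (a + b) / 2

println(avg.toLong)       // truncates: 7000000000005
println(math.round(avg))  // rounds:    7000000000006
```

Which of the two results a test sees can depend on where in the pipeline the value is rounded versus truncated, which is why the outcome varies across environments.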

## How was this patch tested?

It passes with Maven locally before and after this PR. I believe the problem is related to the Python version or OS installed on the machine. I should test this against the PR builder with `test-maven` to be sure.

Closes apache#25128 from HyukjinKwon/SPARK-28270-2.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
BryanCutler pushed a commit that referenced this pull request Jan 6, 2020
… Arrow on JDK9+

### What changes were proposed in this pull request?

This PR aims to add `io.netty.tryReflectionSetAccessible=true` to the testing configuration for JDK11 because this is an officially documented requirement of Apache Arrow.

Apache Arrow community documented this requirement at `0.15.0` ([ARROW-6206](apache/arrow#5078)).
> #### For java 9 or later, should set "-Dio.netty.tryReflectionSetAccessible=true".
> This fixes `java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.(long, int) not available`. thrown by netty.
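On the build side, the flag typically ends up in the forked test JVM's arguments; a hedged sketch of a surefire-plugin fragment (the actual `argLine` in Spark's pom carries other flags as well):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- Required by Arrow >= 0.15.0 on JDK 9+ so Netty can create
         direct byte buffers reflectively -->
    <argLine>-Dio.netty.tryReflectionSetAccessible=true</argLine>
  </configuration>
</plugin>
```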

### Why are the changes needed?

After ARROW-3191, the Arrow Java library requires the property `io.netty.tryReflectionSetAccessible` to be set to true for JDK >= 9. After apache#26133, the JDK11 Jenkins jobs seem to fail.

- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-3.2-jdk-11/676/
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-3.2-jdk-11/677/
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-3.2-jdk-11/678/

```
Previous exception in task:
sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available
    io.netty.util.internal.PlatformDependent.directBuffer(PlatformDependent.java:473)
    io.netty.buffer.NettyArrowBuf.getDirectBuffer(NettyArrowBuf.java:243)
    io.netty.buffer.NettyArrowBuf.nioBuffer(NettyArrowBuf.java:233)
    io.netty.buffer.ArrowBuf.nioBuffer(ArrowBuf.java:245)
    org.apache.arrow.vector.ipc.message.ArrowRecordBatch.computeBodyLength(ArrowRecordBatch.java:222)
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins with JDK11.

Closes apache#26552 from dongjoon-hyun/SPARK-ARROW-JDK11.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>