[SPARK-18118][SQL] fix a compilation error due to nested JavaBeans #16032
Test build #69240 has finished for PR 16032 at commit
LGTM - merging to master/2.1/2.0. Thanks!
## What changes were proposed in this pull request?
This PR avoids a compilation error caused by generated Java bytecode exceeding the 64 KB per-method limit. The error occurs because the Java code generated for `SpecificSafeProjection.apply()` becomes too large when JavaBeans are nested. The PR avoids it by calling `CodegenContext.splitExpression` in `InitializeJavaBean.doGenCode`, which splits one big code chunk into multiple methods.
The object reference for each JavaBean is stored in an instance variable `javaBean...`, which the split methods then reference.
Generated code with this PR:
````
private void apply130_0(InternalRow i) {
  ...
  boolean isNull238 = i.isNullAt(2);
  InternalRow value238 = isNull238 ? null : (i.getStruct(2, 3));
  boolean isNull236 = false;
  test.org.apache.spark.sql.JavaDatasetSuite$Nesting1 value236 = null;
  if (!false && isNull238) {
    final test.org.apache.spark.sql.JavaDatasetSuite$Nesting1 value239 = null;
    isNull236 = true;
    value236 = value239;
  } else {
    final test.org.apache.spark.sql.JavaDatasetSuite$Nesting1 value241 = false ? null : new test.org.apache.spark.sql.JavaDatasetSuite$Nesting1();
    this.javaBean14 = value241;
    if (!false) {
      apply25_0(i);
      apply25_1(i);
      apply25_2(i);
    }
    isNull236 = false;
    value236 = value241;
  }
  this.javaBean.setField2(value236);
}
...
public java.lang.Object apply(java.lang.Object _i) {
  InternalRow i = (InternalRow) _i;

  final test.org.apache.spark.sql.JavaDatasetSuite$NestedComplicatedJavaBean value1 = false ? null : new test.org.apache.spark.sql.JavaDatasetSuite$NestedComplicatedJavaBean();
  this.javaBean = value1;
  if (!false) {
    apply130_0(i);
    apply130_1(i);
    apply130_2(i);
    apply130_3(i);
    apply130_4(i);
  }
  if (false) {
    mutableRow.setNullAt(0);
  } else {
    mutableRow.update(0, value1);
  }

  return mutableRow;
}
````
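The shape of the generated code above can be reduced to a small hand-written sketch. This is only an illustration of the pattern, not Spark's code generator: the class `SplitProjectionSketch`, the beans `Outer` and `Inner`, the `int[]` row, and the method names are all hypothetical. It shows why the bean reference has to live in an instance field: once the setter calls are moved into separate methods, they can no longer see a local variable declared in `apply`.

```java
// Hypothetical sketch; none of these names come from Spark.
class SplitProjectionSketch {
    // Stand-in for a nested JavaBean.
    static class Inner {
        private int x;
        public void setX(int x) { this.x = x; }
        public int getX() { return x; }
    }

    // Stand-in for the top-level JavaBean.
    static class Outer {
        private int a;
        private Inner inner;
        public void setA(int a) { this.a = a; }
        public int getA() { return a; }
        public void setInner(Inner inner) { this.inner = inner; }
        public Inner getInner() { return inner; }
    }

    // Instance fields play the role of javaBean / javaBean14 in the
    // generated code: split methods share the bean through the object.
    private Outer javaBean;
    private Inner javaBean1;

    // Each helper stays small, so no single method grows without bound.
    private void apply_0(int[] row) {
        javaBean.setA(row[0]);
    }

    private void apply_1(int[] row) {
        javaBean1 = new Inner();
        applyInner_0(row);          // nested bean gets its own split methods
        javaBean.setInner(javaBean1);
    }

    private void applyInner_0(int[] row) {
        javaBean1.setX(row[1]);
    }

    // Entry point: create the bean, store it in the field, call the helpers.
    Outer apply(int[] row) {
        javaBean = new Outer();
        apply_0(row);
        apply_1(row);
        return javaBean;
    }

    public static void main(String[] args) {
        Outer o = new SplitProjectionSketch().apply(new int[] {7, 42});
        System.out.println(o.getA() + " " + o.getInner().getX());
    }
}
```

One consequence of this design is that the object holding the fields is not safe to share between threads while `apply` runs, since the helpers communicate through mutable instance state.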
## How was this patch tested?
Added a test suite to `JavaDatasetSuite.java`.
Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>
Closes #16032 from kiszk/SPARK-18118.
(cherry picked from commit f075cd9)
Signed-off-by: Herman van Hovell <hvanhovell@databricks.com>
  s"""
     ${fieldGen.code}
-    ${instanceGen.value}.$setterMethod(${fieldGen.value});
+    this.${javaBeanInstance}.$setterMethod(${fieldGen.value});
I think it's safer to avoid using `this` in generated code, as the code snippet may be used in an inner class, and we already guarantee the variable name is globally unique.
Ok, I'll touch that up.
Done.
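The concern in this thread can be illustrated with plain Java. The sketch below is an assumption-laden toy, not generated Spark code: `ThisInInnerClassSketch`, `Nested`, and the field value are hypothetical. Inside an inner class, `this` refers to the inner instance, so `this.javaBean` would fail to compile when the field lives on the enclosing class, whereas the unqualified name still resolves to the outer field.

```java
// Hypothetical sketch; names are illustrative only.
class ThisInInnerClassSketch {
    private String javaBean = "outer bean";

    class Nested {
        // Inside Nested, `this` is the Nested instance, which has no field
        // named javaBean, so `this.javaBean` would not compile here.
        // The unqualified name resolves to the enclosing instance's field.
        String viaUnqualifiedName() {
            return javaBean;
        }

        // A qualified `this` also works, but it requires knowing the name
        // of the enclosing class.
        String viaQualifiedThis() {
            return ThisInInnerClassSketch.this.javaBean;
        }
    }

    public static void main(String[] args) {
        Nested n = new ThisInInnerClassSketch().new Nested();
        System.out.println(n.viaUnqualifiedName());
        System.out.println(n.viaQualifiedThis());
    }
}
```

Because the variable name is already unique, the unqualified form is the one that keeps working regardless of where the snippet ends up.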