Skip to content

Conversation

@kiszk
Copy link
Member

@kiszk kiszk commented Dec 5, 2017

What changes were proposed in this pull request?

This PR accomplishes the following two items.

  1. Reduce # of global variables from two to one for generated code of Case and Coalesce and remove global variables for generated code of In.
  2. Make lifetime of global variable local within an operation

Item 1. reduces # of constant pool entries in a Java class. Item 2. ensures that an variable is not passed to arguments in a method split by CodegenContext.splitExpressions(), which is addressed by #19865.

How was this patch tested?

Added new tests into PredicateSuite, NullExpressionsSuite, and ConditionalExpressionSuite.

@SparkQA
Copy link

SparkQA commented Dec 5, 2017

Test build #84497 has finished for PR 19901 at commit ae50d97.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 5, 2017

Test build #84504 has finished for PR 19901 at commit 8420c02.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 5, 2017

Test build #84505 has finished for PR 19901 at commit bcbe82c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member Author

kiszk commented Dec 6, 2017

@cloud-fan @viirya @mgaido91 could you please review this?

val conditionMet = ctx.freshName("caseWhenConditionMet")
ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)
ctx.addMutableState(ctx.javaType(dataType), ev.value)
val value = ctx.freshName("value")
Copy link
Contributor

@cloud-fan cloud-fan Dec 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

caseWhenTmpResult?

// It is initialized to `false` and it is set to `true` when the first condition which
// evaluates to `true` is met and therefore is not needed to go on anymore on the computation
// This variable represents whether the condition is met with true/false or not met.
// It is initialized to -1 and it is set to 0 or 1 when the condition which evaluates to
Copy link
Contributor

@cloud-fan cloud-fan Dec 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable represents whether the evaluated result is null or not. It's a byte value instead
of boolean because it carries an extra information about if the case-when condition is met or not.
It is initialized to `-1`, which means the condition is not met yet and the result is unknown.
When the first condition is met, it is set to `1` if result is null, or `0` if result is not null.
We won't go on anymore on the computation if it's set to `1` or `0`.

// It is initialized to -1 and it is set to 0 or 1 when the condition which evaluates to
// `false` or `true` is met and therefore is not needed to go on anymore on the computation
// of the following conditions.
val conditionMet = ctx.freshName("caseWhenConditionMet")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: resultIsNull

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

"""
},
foldFunctions = { funcCalls =>
funcCalls.map { funcCall =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you keep the previous code style?

| $func
|} while (false);
|return $conditionMet;
""".stripMargin,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this multiline style is now the preferred one.

override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)
ctx.addMutableState(ctx.javaType(dataType), ev.value)
val isNull = ctx.freshName("isNull")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coalesceTmpIsNull

${ev.isNull} = true;
${ev.value} = ${ctx.defaultValue(dataType)};
${ctx.JAVA_BOOLEAN} $conditionMet = false;
${ctx.JAVA_BYTE} $conditionMet = -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why turn it from boolean to byte?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nvm.

""".stripMargin
}.mkString)

extraArguments = (ctx.javaType(dataType), ev.value) :: Nil,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need to pass it as a parameter, just create a local variable at the beginning of the generated method:

${ctx.javaType(dataType)} ${ev.value} = default...
do {
  $func
} while (false);
return ${ev.value};

""".stripMargin
},
foldFunctions = { funcCalls =>
funcCalls.map { funcCall =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, please keep the previous code style.

ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.value)
ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)
// inValue -1:isNull, 0:false, 1:true
val inValue = ctx.freshName("value")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: inTmpResult?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// inValue -1 indicates at lease one expr in list is evaluated to null. 0 means no matches found. 1 means the expr in list matches the given value expr.

$codes
} while (false);""")
} while (false);
boolean ${ev.isNull} = ($conditionMet != 0); // TRUE if -1 or 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks weird that we have two possible values for TRUE.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. nvm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe // TRUE if any condition is met and the result is not null, or no any condition is met.

|if (!${ev.isNull}) {
|byte $inValue = (byte)(${valueGen.isNull} ? -1 : 0); // isNull or FALSE
|if ($inValue != -1) {
| $javaDataType $valueArg = ${valueGen.value};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just pass ${valueGen.value}? Why we need another local variable for it?

Copy link
Member Author

@kiszk kiszk Dec 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

${valueGen.value} may have an constant 1.0D. It is not easy to pass it to an argument by using spilitExpressions().

@SparkQA
Copy link

SparkQA commented Dec 6, 2017

Test build #84534 has finished for PR 19901 at commit c81d795.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member

viirya commented Dec 6, 2017

retest this please.

""".stripMargin
},
foldFunctions = { funcCalls =>
funcCalls.map { funcCall =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you keep the previous code style? i.e. _.map { funcCall =>

""".stripMargin
}.mkString)
returnType = ctx.JAVA_BYTE,
makeSplitFunction = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, makeSplitFunction = func =>

|${ev.value} = false;
|${ev.isNull} = ${valueGen.isNull};
|if (!${ev.isNull}) {
|// TRUE if any condition is met and the result is not null, or no any condition is met.
Copy link
Member

@viirya viirya Dec 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggested comment I think is for CaseWhen?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh , sorry for stupid mistake.

| $codes
|} while (false);
|boolean ${ev.isNull} = ($resultIsNull != 0); // TRUE if -1 or 1
|${ctx.javaType(dataType)} ${ev.value} = $tmpResult;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: final

override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)
ctx.addMutableState(ctx.javaType(dataType), ev.value)
val coalesceTmpIsNull = ctx.freshName("coalesceTmpIsNull")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: val tmpIsNull = ctx.freshName("coalesceTmpIsNull"), to be consistent with https://github.com/apache/spark/pull/19901/files#diff-a966ee88604a834221e82916ec051d7dR190


returnType = ctx.javaType(dataType),
makeSplitFunction = {
func =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, keep previous code style.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it about makeSplitFunction = func =>. I thought about multline style. Sorry for my misunderstanding.

|do {
| $codes
|} while (false);
|boolean ${ev.isNull} = $coalesceTmpIsNull;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: final

ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)
// inTmpResult -1 indicates at lease one expr in list is evaluated to null.
// 0 means no matches found. 1 means the expr in list matches the given value expr.
val inTmpResult = ctx.freshName("inTmpResult")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: val tmpResult = ctx.freshName("inTmpResult")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here I think there is no need for this and we can use the same approach used in coalesce.

val conditionMet = ctx.freshName("caseWhenConditionMet")
ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)
ctx.addMutableState(ctx.javaType(dataType), ev.value)
// This variable represents whether the evaluated result is null or not. It's a byte value
Copy link
Contributor

@cloud-fan cloud-fan Dec 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also update the comment:

This variable holds the state of the result, and has 3 possible values:
  -1 means the condition is not met yet and the result is unknown.
  0 means the condition is met and result is not null.
  1 means the condition is met and result is null.
It is initialized to `-1`, and if it's set to `1` or `0`, We won't go on anymore on the computation.

// This generates code like:
// conditionMet = caseWhen_1(i);
// if(conditionMet) {
// caseWhenResultIsNull = caseWhen_1(i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these need to be updated

val listGen = list.map(_.genCode(ctx))
ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.value)
ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)
// inTmpResult -1 indicates at lease one expr in list is evaluated to null.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be more clear:

inTmpResult has 3 possible values:
  -1 means no matches found and there is at least one value in the list evaluated to null
  0 means no matches found and all values in the list are not null
  1 means one value in the list is matched, and the value is saved to `tmpResult`.

// The evaluation of variables can be stopped when we find a matching value.
val listCode = listGen.map(x =>
s"""
|$tmpResult = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, this against what we've discussed before...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this is my mistake for debugging.

|${ev.isNull} = ${valueGen.isNull};
|if (!${ev.isNull}) {
|byte $tmpResult = (byte)(${valueGen.isNull} ? -1 : 0);
|if ($tmpResult != -1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it's more readable to write

if (!{ev.isNull}) {
  byte $tmpResult = 0;
  ...
}

@cloud-fan
Copy link
Contributor

LGTM except some minor comments

@mgaido91
Copy link
Contributor

mgaido91 commented Dec 6, 2017

LGTM too

@SparkQA
Copy link

SparkQA commented Dec 6, 2017

Test build #84557 has finished for PR 19901 at commit 6a14e16.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 6, 2017

Test build #84561 has finished for PR 19901 at commit 740f1a0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// inTmpResult has 3 possible values:
// -1 means no matches found and there is at least one value in the list evaluated to null
// 0 means no matches found and all values in the list are not null
// 1 means one value in the list is matched
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we define some constants with meaningful names in doGenCode?

  // -1 means no matches found and there is at least one value in the list evaluated 
  final val HAS_NULL = -1
  // 0 means no matches found and all values in the list are not null
  final val NOT_MATCHED = 0
  // 1 means one value in the list is matched
  final val MATCHED = 1

// This variable holds the state of the result:
// -1 means the condition is not met yet and the result is unknown.
// 0 means the condition is met and result is not null.
// 1 means the condition is met and result is null.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same here.

|${x.code}
|if (${x.isNull}) {
| ${ev.isNull} = true;
| $tmpResult = -1; // isNull = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about updating the comments to

// ${ev.isNull} = true;

|} else if (${ctx.genEqual(value.dataType, valueArg, x.value)}) {
| ${ev.isNull} = false;
| ${ev.value} = true;
| $tmpResult = 1; // value = TRUE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about updating the comments to

// ${ev.isNull} = false; ${ev.value} = true;

@gatorsmile
Copy link
Member

LGTM except a few comments about code readability

val HAS_NONNULL = 0
// 1 means the condition is met and result is null.
val HAS_NULL = 1
// It is initialized to `-1`, and if it's set to `1` or `0`, We won't go on anymore on the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since you define the constants above, why not use them in this comment?

It is initialized to `NOT_MATCHED`, and if it's set to `HAS_NULL` or `HAS_NONNULL`...

|do {
| $codes
|} while (false);
|// TRUE if any condition is met and the result is not null, or no any condition is met.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems wrong here. TRUE if any condition is met and the result is null...

ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.value)
ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)
// inTmpResult has 3 possible values:
// -1 means no matches found and there is at least one value in the list evaluated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... evaluated to null

|${x.code}
|if (${x.isNull}) {
| ${ev.isNull} = true;
| $tmpResult = $HAS_NULL; // ${ev.isNull} = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// ${ev.isNull} = true; looks not necessary comment for me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, seems suggested by other reviewer. Then fine for me.

|} else if (${ctx.genEqual(value.dataType, valueArg, x.value)}) {
| ${ev.isNull} = false;
| ${ev.value} = true;
| $tmpResult = $MATCHED; // ${ev.isNull} = false; ${ev.value} = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

@viirya
Copy link
Member

viirya commented Dec 7, 2017

Few minor comments for code comment, LGTM.

@SparkQA
Copy link

SparkQA commented Dec 7, 2017

Test build #84584 has finished for PR 19901 at commit 31ab853.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 7, 2017

Test build #84588 has finished for PR 19901 at commit c4691b6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val conditionMet = ctx.freshName("caseWhenConditionMet")
ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)
ctx.addMutableState(ctx.javaType(dataType), ev.value)
// This variable holds the state of the result:
Copy link
Contributor

@cloud-fan cloud-fan Dec 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resultState holds the state of the result, which has 3 possible values:

@cloud-fan
Copy link
Contributor

thanks, merging to master! you can address the minor comment in your next PR.

@asfgit asfgit closed this in ea2fbf4 Dec 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants