Skip to content

Conversation

@HyukjinKwon
Copy link
Member

@HyukjinKwon HyukjinKwon commented Sep 14, 2016

What changes were proposed in this pull request?

This PR fixes all the instances which was fixed in the previous PR.

To make sure, I manually debugged and also checked the Scala source. length in LinearSeqOptimized.scala#L49-L57 is O(n). Also, size calls length via SeqLike.scala#L106.

For debugging, I have created these as below:

ArrayBuffer(1, 2, 3)
Array(1, 2, 3)
List(1, 2, 3)
Seq(1, 2, 3)

and then called size and length for each to debug.

How was this patch tested?

I ran the bash as below on Mac

find . -name *.scala -type f -exec grep -il "while (.*\\.length)" {} \; | grep "src/main"
find . -name *.scala -type f -exec grep -il "while (.*\\.size)" {} \; | grep "src/main"

and then checked each.

@HyukjinKwon
Copy link
Member Author

@srowen Could you please review this?

gamma: Double): CoordinateMatrix = {
require(gamma > 1.0, s"Oversampling should be greater than 1: $gamma")
require(colMags.size == this.numCols(), "Number of magnitudes didn't match column dimension")
require(colMags.length == this.numCols(), "Number of magnitudes didn't match column dimension")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is different, yeah. You're just avoiding the implicit conversion. While that's sort of good I think it's separate and maybe something to change or not change everywhere at once.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! I will revert this change.

@HyukjinKwon
Copy link
Member Author

@srowen Please let me check this out again tomorrow.

@SparkQA
Copy link

SparkQA commented Sep 14, 2016

Test build #65366 has finished for PR 15093 at commit ecd3dde.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 14, 2016

Test build #65369 has finished for PR 15093 at commit 996e292.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

var i = 0
while (i < groupedWindowExpressions.size) {
val ((partitionSpec, orderSpec), windowExpressions) = groupedWindowExpressions(i)
groupedWindowExpressions.foreach { case ((partitionSpec, orderSpec), windowExpressions) =>
Copy link
Member Author

@HyukjinKwon HyukjinKwon Sep 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @srowen , It seems groupedWindowExpressions is Seq[((Seq[Expression], Seq[SortOrder]), ArrayBuffer[NamedExpression])] but the desired output is LogicalPlan. So, it seems I can't directly use fold or reduce simply. So, could this just use foreach rather than others? Otherwise, please teach me for a better expression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well currentChild is a Window, and that's the only thing whose computation is changed here. How about:

      val currentChild =
        groupedWindowExpressions.foldLeft(child) {
          case (last, ((partitionSpec, orderSpec), windowExpressions)) =>
            Window(windowExpressions, partitionSpec, orderSpec, last)
        }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's clear and better. I will try. Thanks!

@SparkQA
Copy link

SparkQA commented Sep 15, 2016

Test build #65440 has finished for PR 15093 at commit cb536a4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Sep 16, 2016

OK, looks good.

Some of these are .size -> .length changes, but I suppose those are related and positive. It avoids an extra method invocation, which might matter in a very tight loop.

val currentChild =
groupedWindowExpressions.foldLeft(child) {
case (last, ((partitionSpec, orderSpec), windowExpressions)) =>
// Set currentChild to the newly created Window operator.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, one last thing, can you remove this comment? and update the comment above and below this expression? they're no longer applicable. Actually currentChild isn't a great name now anyway. eh, windowOps or something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, I should've checked this closely. I will update soon.

@SparkQA
Copy link

SparkQA commented Sep 17, 2016

Test build #65535 has finished for PR 15093 at commit 8a3d293.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member Author

retest this please

@SparkQA
Copy link

SparkQA commented Sep 17, 2016

Test build #65534 has finished for PR 15093 at commit f5e8131.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Sep 17, 2016

Merging to master, and to 2.0 to match the parent JIRA, even though this is probably less important, yet still a clean small win

@asfgit asfgit closed this in 86c2d39 Sep 17, 2016
asfgit pushed a commit that referenced this pull request Sep 17, 2016
…th/size which is O(n)

This PR fixes all the instances which was fixed in the previous PR.

To make sure, I manually debugged and also checked the Scala source. `length` in [LinearSeqOptimized.scala#L49-L57](https://github.com/scala/scala/blob/2.11.x/src/library/scala/collection/LinearSeqOptimized.scala#L49-L57) is O(n). Also, `size` calls `length` via [SeqLike.scala#L106](https://github.com/scala/scala/blob/2.11.x/src/library/scala/collection/SeqLike.scala#L106).

For debugging, I have created these as below:

```scala
ArrayBuffer(1, 2, 3)
Array(1, 2, 3)
List(1, 2, 3)
Seq(1, 2, 3)
```

and then called `size` and `length` for each to debug.

I ran the bash as below on Mac

```bash
find . -name *.scala -type f -exec grep -il "while (.*\\.length)" {} \; | grep "src/main"
find . -name *.scala -type f -exec grep -il "while (.*\\.size)" {} \; | grep "src/main"
```

and then checked each.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #15093 from HyukjinKwon/SPARK-17480-followup.

(cherry picked from commit 86c2d39)
Signed-off-by: Sean Owen <sowen@cloudera.com>
@SparkQA
Copy link

SparkQA commented Sep 17, 2016

Test build #65536 has finished for PR 15093 at commit 8a3d293.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tdas
Copy link
Contributor

tdas commented Sep 17, 2016

@HyukjinKwon @srowen
This PR when merged into branch-2.0 somehow created an empty file QuantileSummaries.scala that is failing the lint test as the Apache license header does not exist -
Commit - a3bba37
Empty file - https://github.com/apache/spark/blob/a3bba372abce926351335d0a2936b70988f19b23/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala

I am not sure how exactly backporting a patch led to an empty file, but this does not seem right. I am reverting this commit in branch 2.0. Please make a new PR to fix this in branch 2.0 correctly.

@srowen
Copy link
Member

srowen commented Sep 17, 2016

Oh weird! no idea why that happened. Yeah I'll take care of it from here.

@tdas
Copy link
Contributor

tdas commented Sep 17, 2016

I reverted already!

asfgit pushed a commit that referenced this pull request Sep 17, 2016
…th/size which is O(n)

This PR fixes all the instances which was fixed in the previous PR.

To make sure, I manually debugged and also checked the Scala source. `length` in [LinearSeqOptimized.scala#L49-L57](https://github.com/scala/scala/blob/2.11.x/src/library/scala/collection/LinearSeqOptimized.scala#L49-L57) is O(n). Also, `size` calls `length` via [SeqLike.scala#L106](https://github.com/scala/scala/blob/2.11.x/src/library/scala/collection/SeqLike.scala#L106).

For debugging, I have created these as below:

```scala
ArrayBuffer(1, 2, 3)
Array(1, 2, 3)
List(1, 2, 3)
Seq(1, 2, 3)
```

and then called `size` and `length` for each to debug.

I ran the bash as below on Mac

```bash
find . -name *.scala -type f -exec grep -il "while (.*\\.length)" {} \; | grep "src/main"
find . -name *.scala -type f -exec grep -il "while (.*\\.size)" {} \; | grep "src/main"
```

and then checked each.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #15093 from HyukjinKwon/SPARK-17480-followup.

(cherry picked from commit 86c2d39)
Signed-off-by: Sean Owen <sowen@cloudera.com>
@srowen
Copy link
Member

srowen commented Sep 17, 2016

Yep, saw that. I re-merged this, and yes during conflict resolution QuantileSummaries.scala comes up as a file added only in the master branch, but when I choose to not take the change in the IDE, I see it actually resulted in adding an empty file. I made sure that was not part of the commit and pushed again. Looks as intended now: 5fd354b

@HyukjinKwon HyukjinKwon deleted the SPARK-17480-followup branch September 18, 2016 14:05
@HyukjinKwon HyukjinKwon restored the SPARK-17480-followup branch September 18, 2016 14:05
@HyukjinKwon HyukjinKwon deleted the SPARK-17480-followup branch September 18, 2016 14:08
wgtmac pushed a commit to wgtmac/spark that referenced this pull request Sep 19, 2016
…th/size which is O(n)

## What changes were proposed in this pull request?

This PR fixes all the instances which was fixed in the previous PR.

To make sure, I manually debugged and also checked the Scala source. `length` in [LinearSeqOptimized.scala#L49-L57](https://github.com/scala/scala/blob/2.11.x/src/library/scala/collection/LinearSeqOptimized.scala#L49-L57) is O(n). Also, `size` calls `length` via [SeqLike.scala#L106](https://github.com/scala/scala/blob/2.11.x/src/library/scala/collection/SeqLike.scala#L106).

For debugging, I have created these as below:

```scala
ArrayBuffer(1, 2, 3)
Array(1, 2, 3)
List(1, 2, 3)
Seq(1, 2, 3)
```

and then called `size` and `length` for each to debug.

## How was this patch tested?

I ran the bash as below on Mac

```bash
find . -name *.scala -type f -exec grep -il "while (.*\\.length)" {} \; | grep "src/main"
find . -name *.scala -type f -exec grep -il "while (.*\\.size)" {} \; | grep "src/main"
```

and then checked each.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes apache#15093 from HyukjinKwon/SPARK-17480-followup.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants