@nongli nongli commented Mar 31, 2016

What changes were proposed in this pull request?

Currently, we determine whether the RDD will produce batches or rows by looking at the first
value. After other cleanups, this is no longer necessary. This simplifies the code
and lets us measure the time spent in the first batch.

… runtime check.

|if ($input.hasNext()) {
| $scanRows((InternalRow) $input.next());
|}
| while (!shouldStop() && $input.hasNext()) {
Member

Should we move the !shouldStop() check to the end of the while loop to avoid a performance degradation?

Contributor Author

I tried a few variants and didn't see a difference. I think only the batched scan is fast enough to be sensitive to this.

Member

I see, thank you for the information. The results of those experiments are useful; it may be good to record them in a code comment.
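
For reference, the two loop shapes discussed in this thread can be sketched in plain Java as below. This is only an illustration under stated assumptions, not the generated code from this patch: the class `ScanLoopSketch`, the `limit` field, the `buffer` list, and the bodies of `shouldStop()` and `scanRow()` are hypothetical stand-ins. The sketch shows control flow only and makes no claim about relative performance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of the two loop shapes; not Spark's generated code.
class ScanLoopSketch {
    // Assumed stand-in for the operator's output buffer.
    private final List<Integer> buffer = new ArrayList<>();
    private final int limit;

    ScanLoopSketch(int limit) {
        this.limit = limit;
    }

    // Assumed stand-in for shouldStop(): true once the buffer holds `limit` rows.
    private boolean shouldStop() {
        return buffer.size() >= limit;
    }

    // Assumed stand-in for the per-row scan: just appends the row.
    private void scanRow(int row) {
        buffer.add(row);
    }

    // Variant A: the check sits in the loop condition, as in the diff above.
    List<Integer> checkAtHead(Iterator<Integer> input) {
        buffer.clear();
        while (!shouldStop() && input.hasNext()) {
            scanRow(input.next());
        }
        return new ArrayList<>(buffer);
    }

    // Variant B: the check sits at the end of the loop body, as suggested in review.
    List<Integer> checkAtEnd(Iterator<Integer> input) {
        buffer.clear();
        while (input.hasNext()) {
            scanRow(input.next());
            if (shouldStop()) {
                break;
            }
        }
        return new ArrayList<>(buffer);
    }

    public static void main(String[] args) {
        ScanLoopSketch sketch = new ScanLoopSketch(2);
        System.out.println(sketch.checkAtHead(Arrays.asList(1, 2, 3, 4, 5).iterator()));
        System.out.println(sketch.checkAtEnd(Arrays.asList(1, 2, 3, 4, 5).iterator()));
    }
}
```

In this sketch the two variants consume the same rows whenever the loop is entered with `shouldStop()` returning false. They differ only when `shouldStop()` is already true on entry: variant A then consumes nothing, while variant B still consumes one row before checking.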

SparkQA commented Mar 31, 2016

Test build #54666 has finished for PR 12098 at commit 303312c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
