@viirya
Member

@viirya viirya commented Jul 7, 2021

What changes were proposed in this pull request?

The interface for DS v2 metrics was added in SPARK-34366, but only for the reading path. This patch extends the metrics interface to the writing path.
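As a rough illustration of the idea (not code from this patch), the sketch below shows how a writer might report a per-task metric and how a driver-side metric might aggregate the values from several tasks. The interfaces here are simplified stand-ins written for this example; they only mirror the general shape of the connector metric interfaces, and `RowsWrittenMetric`, `CountingWriter`, and `WriteMetricsSketch` are hypothetical names.

```java
import java.util.Arrays;

// Simplified stand-ins for illustration only. They are NOT the Spark
// interfaces; they only mimic the general shape of a task-level metric
// and a driver-side metric that aggregates task values.
interface CustomTaskMetric {
    String name();
    long value();
}

interface CustomMetric {
    String name();
    String description();
    String aggregateTaskMetrics(long[] taskMetrics);
}

// Hypothetical driver-side metric that sums the per-task values.
final class RowsWrittenMetric implements CustomMetric {
    public String name() { return "rows_written"; }
    public String description() { return "number of rows written"; }
    public String aggregateTaskMetrics(long[] taskMetrics) {
        return String.valueOf(Arrays.stream(taskMetrics).sum());
    }
}

// Hypothetical writer that counts rows and exposes the count as a task metric.
final class CountingWriter {
    private long rows = 0;

    void write(String row) { rows++; }

    CustomTaskMetric[] currentMetricsValues() {
        final long snapshot = rows; // value at the time of the call
        return new CustomTaskMetric[] {
            new CustomTaskMetric() {
                public String name() { return "rows_written"; }
                public long value() { return snapshot; }
            }
        };
    }
}

final class WriteMetricsSketch {
    // Simulates two write tasks and aggregates their metric values.
    static String run() {
        CountingWriter a = new CountingWriter();
        CountingWriter b = new CountingWriter();
        a.write("r1");
        a.write("r2");
        b.write("r3");
        long[] perTask = {
            a.currentMetricsValues()[0].value(),
            b.currentMetricsValues()[0].value()
        };
        return new RowsWrittenMetric().aggregateTaskMetrics(perTask);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The task-level metric and the driver-side metric are matched by name in this sketch, which is why both return the same `name()`.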

Why are the changes needed?

To complete DS v2 metrics interface support in the writing path.

Does this PR introduce any user-facing change?

No. For developers, yes, as this adds metrics support to the DS v2 writing path.

How was this patch tested?

Added test.

@viirya viirya marked this pull request as draft July 7, 2021 02:52
@viirya
Member Author

viirya commented Jul 7, 2021

cc @cloud-fan

@SparkQA

SparkQA commented Jul 7, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45238/

@SparkQA

SparkQA commented Jul 7, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45238/

@viirya viirya force-pushed the v2-write-metrics branch from 245db51 to 05800b4 Compare July 7, 2021 04:47
@SparkQA

SparkQA commented Jul 7, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45250/

@SparkQA

SparkQA commented Jul 7, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45250/

@github-actions github-actions bot added the WEB UI label Jul 7, 2021
@viirya viirya changed the title from "[WIP][SPARK-36030][SQL] Support DS v2 metrics at writing path" to "[SPARK-36030][SQL] Support DS v2 metrics at writing path" Jul 7, 2021
@viirya viirya marked this pull request as ready for review July 7, 2021 06:41
@viirya
Member Author

viirya commented Jul 7, 2021

cc @maropu too

@SparkQA

SparkQA commented Jul 7, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45254/

@SparkQA

SparkQA commented Jul 7, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45254/

@SparkQA

SparkQA commented Jul 7, 2021

Test build #140740 has finished for PR 33239 at commit 05800b4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class ContinuousWriteRDD(var prev: RDD[InternalRow], writerFactory: StreamingDataWriterFactory,
  • case class WriteToContinuousDataSource(write: StreamingWrite, query: LogicalPlan,
  • case class WriteToContinuousDataSourceExec(write: StreamingWrite, query: SparkPlan,

@SparkQA

SparkQA commented Jul 7, 2021

Test build #140743 has finished for PR 33239 at commit f656335.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Jul 8, 2021

I also ran a query using the CustomMetricsDataSource added in the test. The UI looks like:

[Image: screenshot of the UI, "Screen Shot 2021-07-07 at 6 45 45 PM"]

@viirya
Member Author

viirya commented Jul 9, 2021

@cloud-fan Should we consider this for 3.2, to make the API complete in this release?

@viirya
Member Author

viirya commented Jul 15, 2021

gentle ping @cloud-fan

Member

@dongjoon-hyun dongjoon-hyun left a comment

This PR looks reasonable to me, @viirya. I left a few comments about test cases.

It would be great if we can get your advice, @cloud-fan, @gengliangwang, @HeartSaVioR, @Ngone51, @maropu, @HyukjinKwon, @sunchao, @huaxingao.

@viirya
Member Author

viirya commented Jul 19, 2021

Thank you @dongjoon-hyun! I will address the comments.

     taskAttemptContext: TaskAttemptContext,
-    committer: FileCommitProtocol) extends DataWriter[InternalRow] {
+    committer: FileCommitProtocol,
+    customMetrics: Map[String, SQLMetric]) extends DataWriter[InternalRow] {
Member

Shall we add tests for the changes in this file?

Member Author

Let me try to add one.

Member Author

I added a custom metric for writing to the InMemory table for testing purposes. The tests are in FileFormatDataWriterMetricSuite.
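For readers following this thread, here is a minimal sketch of how a name-keyed map of metrics, like the `customMetrics` parameter in the diff above, could be updated from the values a writer reports. This is an assumption about the general pattern, not the code in this PR: `TaskMetricValue`, `SimpleMetric`, and `MetricUpdateSketch` are hypothetical stand-ins, and `SimpleMetric` is not Spark's `SQLMetric`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-task metric value (name plus current value).
final class TaskMetricValue {
    final String name;
    final long value;

    TaskMetricValue(String name, long value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical stand-in for an accumulator-like metric; not Spark's SQLMetric.
final class SimpleMetric {
    private long value = 0;

    void set(long v) { value = v; }
    long get() { return value; }
}

final class MetricUpdateSketch {
    // Copies each reported value into the metric registered under the same
    // name. Reported names without a registered metric are skipped.
    static void update(TaskMetricValue[] current, Map<String, SimpleMetric> byName) {
        for (TaskMetricValue m : current) {
            SimpleMetric target = byName.get(m.name);
            if (target != null) {
                target.set(m.value);
            }
        }
    }

    // Registers one metric, reports two values, and returns the registered one.
    static long demo() {
        Map<String, SimpleMetric> byName = new HashMap<>();
        byName.put("bytes_written", new SimpleMetric());
        update(new TaskMetricValue[] {
            new TaskMetricValue("bytes_written", 4096L),
            new TaskMetricValue("unregistered", 1L)
        }, byName);
        return byName.get("bytes_written").get();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Skipping unknown names is a choice made for this sketch; an implementation could just as reasonably treat an unregistered name as an error.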

@gengliangwang
Member

@viirya Thanks for the work! Overall this LGTM

@SparkQA

SparkQA commented Jul 20, 2021

Test build #141294 has finished for PR 33239 at commit fe9cc4e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class InMemorySimpleCustomMetric extends CustomMetric

@SparkQA

SparkQA commented Jul 20, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45827/

@SparkQA

SparkQA commented Jul 20, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45827/

@SparkQA

SparkQA commented Jul 20, 2021

Test build #141298 has finished for PR 33239 at commit bccc98b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@gengliangwang gengliangwang left a comment

LGTM

@SparkQA

SparkQA commented Jul 20, 2021

Test build #141313 has finished for PR 33239 at commit f62f057.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Jul 20, 2021

retest this please

@viirya
Member Author

viirya commented Jul 20, 2021

Thanks @dongjoon-hyun @gengliangwang @sunchao. I will merge this after tests pass.

cc @cloud-fan

@SparkQA

SparkQA commented Jul 20, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45864/

@SparkQA

SparkQA commented Jul 20, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45864/

@SparkQA

SparkQA commented Jul 20, 2021

Test build #141350 has finished for PR 33239 at commit 25cc546.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Jul 20, 2021

retest this please

@SparkQA

SparkQA commented Jul 20, 2021

Test build #141359 has finished for PR 33239 at commit 25cc546.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Jul 20, 2021

retest this please

@SparkQA

SparkQA commented Jul 20, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45873/

@SparkQA

SparkQA commented Jul 20, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45876/

@SparkQA

SparkQA commented Jul 20, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45873/

@SparkQA

SparkQA commented Jul 20, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45876/

@SparkQA

SparkQA commented Jul 20, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45878/

@SparkQA

SparkQA commented Jul 20, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45878/

@SparkQA

SparkQA commented Jul 21, 2021

Test build #141362 has finished for PR 33239 at commit 25cc546.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Jul 21, 2021

Thanks. Merging to master/3.2.

@viirya viirya closed this in 2653201 Jul 21, 2021
viirya added a commit that referenced this pull request Jul 21, 2021
Closes #33239 from viirya/v2-write-metrics.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
(cherry picked from commit 2653201)
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
@cloud-fan
Contributor

late LGTM
