[SPARK-39635][SQL] Support driver metrics in DS v2 custom metric API #37205
Seems there are some style issues:
We should keep this blank line.
Done, this also fixes the checkstyle issue.
Could you add the JIRA number? I.e. "SPARK-39635: Report driver metrics ..."
done
Besides BatchScanExec, there seem to be other places where we need to do this too, e.g., MicroBatchScanExec and ContinuousScanExec, which are based on DataSourceV2ScanExecBase.
For the write path, there are V2TableWriteExec, V2ExistingTableWriteExec, and WriteToContinuousDataSourceExec.
+1
@viirya @cloud-fan I will create a follow-up PR for the write paths.
As for the read paths, I have updated ContinuousScanExec and MicroBatchScanExec in the same way as BatchScanExec.
Can you please point me to a test case that I can run to verify that the behavior is as expected?
viirya left a comment
For the API, it looks good to me. There are just a few places in the read/write paths where we need to update the driver metrics too.
cc @cloud-fan
Apart from that, what about the SupportsMetadata interface in the DSV2 base implementation?
seems this is redundant?
return new CustomTaskMetric[]{};
Changed as per suggestion
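For readers following the suggestion above, here is a minimal sketch of the pattern it points at: a default interface method that returns an empty array directly, so existing implementations keep working without changes. The names `TaskMetric`, `ScanLike`, and `filesPruned` are hypothetical stand-ins used only for illustration; they are not Spark's actual `CustomTaskMetric` or `Scan` types.

```java
public class DefaultMetricsSketch {
    /** Hypothetical stand-in for a name/value metric; not Spark's CustomTaskMetric. */
    interface TaskMetric {
        String name();
        long value();
    }

    /** Hypothetical stand-in for a scan; not Spark's Scan interface. */
    interface ScanLike {
        // Default: report nothing, returning the empty array directly
        // instead of going through an intermediate variable.
        default TaskMetric[] reportDriverMetrics() {
            return new TaskMetric[]{};
        }
    }

    /** Small helper that builds an immutable name/value metric. */
    static TaskMetric metric(String name, long value) {
        return new TaskMetric() {
            public String name() { return name; }
            public long value() { return value; }
        };
    }

    public static void main(String[] args) {
        // An existing implementation that knows nothing about driver metrics.
        ScanLike legacy = new ScanLike() {};

        // An implementation that opts in by overriding the default.
        ScanLike reporting = new ScanLike() {
            @Override
            public TaskMetric[] reportDriverMetrics() {
                return new TaskMetric[]{ metric("filesPruned", 3L) };
            }
        };

        System.out.println(legacy.reportDriverMetrics().length);
        System.out.println(reporting.reportDriverMetrics()[0].name());
    }
}
```

The empty default keeps the new method source-compatible for implementers that never report driver-side values.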
nit: Returns
done
hmm what if the driver metric is not in metrics? this will throw NoSuchElementException?
Yeah, I think so. Since this is implemented by developers of DS v2 data sources and not by end users, I don't think this would happen in practice. Otherwise it would be caught during development.
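The failure mode discussed in this exchange can be sketched as follows. This is a hypothetical Java analog, not the code under review: `declared` stands in for the metrics an operator registered up front, `post` for the step that copies reported driver values into them, and the metric names are made up. Looking up an undeclared name fails fast with `NoSuchElementException`, which is the behavior the thread accepts as a development-time error.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

public class MetricLookupSketch {
    /** Copies reported values into declared metric slots; throws on an undeclared name. */
    static void post(Map<String, Long> declared, Map<String, Long> reported) {
        for (Map.Entry<String, Long> e : reported.entrySet()) {
            if (!declared.containsKey(e.getKey())) {
                // Fail fast: the data source reported a metric it never declared.
                throw new NoSuchElementException("key not found: " + e.getKey());
            }
            declared.put(e.getKey(), e.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, Long> declared = new HashMap<>();
        declared.put("filesPruned", 0L);

        // A declared metric is updated normally.
        post(declared, Map.of("filesPruned", 3L));
        System.out.println(declared.get("filesPruned"));

        // An undeclared metric surfaces as an exception.
        try {
            post(declared, Map.of("undeclaredMetric", 1L));
        } catch (NoSuchElementException ex) {
            System.out.println("caught: " + ex.getMessage());
        }
    }
}
```

A more lenient design would skip unknown names silently; the trade-off is that a typo in a metric name would then go unnoticed.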
Maybe reportDriverMetrics? There is also no "custom" in PartitionReader.currentMetricsValues.
sounds good.
renamed.
ditto
I have reverted this file, as I plan to cover the write paths in a separate PR.
viirya left a comment
Looks good to me. I'm okay to have a follow-up for write paths.
@karuppayya Can you rebase with master to retrigger CI? Thanks.
sunchao left a comment
LGTM too
I will merge this in the next few days if there are no more comments. cc @cloud-fan
Thanks. Merging to master.
cloud-fan left a comment
late LGTM
dongjoon-hyun left a comment
+1, late LGTM. Thank you all.
### What changes were proposed in this pull request?
Add `reportDriverMetrics` method to `Write` API and post custom metrics from driver after v2 write commits.

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?
#37205 supported reporting custom driver metrics when reading from v2 table. This is to support that when writing to v2 table.

### How was this patch tested?
UT.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #48573 from manuzhang/v2write-metrics.

Lead-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Co-authored-by: Zhang, Manu <tianlzhang@ebay.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
Expose custom metrics available on driver from DS v2 data sources on SQL UI.
Why are the changes needed?
115ed89 introduces a mechanism to add custom metrics for DS v2 data sources. But it only supports executor metrics and there is currently no mechanism to expose driver metrics from the API.
Does this PR introduce any user-facing change?
Yes

How was this patch tested?
Added unit tests
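
To make the executor/driver distinction in the description concrete, here is a rough sketch under stated assumptions. Executor-side values arrive one per task and have to be combined, while a driver-side value is produced once. The class, the helper names, and the metric names `bytesRead` and `planningMillis` are all hypothetical, and summation is only one possible way to aggregate task values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DriverVsTaskMetricsSketch {
    /** Combines per-task values into one number (sum is one possible aggregation). */
    static long sumTaskValues(long[] perTask) {
        long total = 0L;
        for (long v : perTask) {
            total += v;
        }
        return total;
    }

    /** Builds name -> value entries roughly as a UI might list them (hypothetical helper). */
    static Map<String, Long> collect(long[] bytesReadPerTask, long planningMillisOnDriver) {
        Map<String, Long> shown = new LinkedHashMap<>();
        // Executor side: one value per task, aggregated before display.
        shown.put("bytesRead", sumTaskValues(bytesReadPerTask));
        // Driver side: a single value, shown as reported.
        shown.put("planningMillis", planningMillisOnDriver);
        return shown;
    }

    public static void main(String[] args) {
        System.out.println(collect(new long[]{100L, 250L, 50L}, 12L));
    }
}
```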