Why is GpuCoalesceBatches performance bad sometimes? #6107
Comments
GpuCoalesceBatches is not the slow part here. The actual concatenation of the batches, measured by the Since the image only shows this specific node I can't say offhand which node was the slowest, but it wasn't this one.
Should this issue be moved to https://github.com/nvidia/spark-rapids?
Yes, this is a question about the Spark plugin, not cudf.
I am sorry, I carelessly filed it in the wrong place.
Here is the CPU Spark SQL history UI; the whole query took only 1.3 min.
That's the wrong direction in the query plan to look. The I can't readily explain why the worst-case coalesce batch time collect time was only 19.3s but the worst-case build time was 59.6s, and the join itself doesn't account for most of the difference. Tasks waiting for their turn on the GPU can factor into these metrics, but I would expect any time spent waiting for the GPU to be accounted for in both nodes' metrics. I'll spend some time digging into the details you uploaded, but we may need an Nsight Systems profile trace to get more details on where the time is being spent in the join vs. the collect.
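To make the size of that gap concrete, here is a minimal Python sketch of the arithmetic on the two worst-case figures quoted above. The variable names are illustrative only and are not metric names from the plugin.

```python
# Worst-case task metrics quoted in this thread, in seconds.
# Variable names are hypothetical labels, not plugin metric names.
coalesce_collect_time_s = 19.3  # worst-case collect time on the coalesce node
join_build_time_s = 59.6        # worst-case build time on the join node

# Portion of the build time that the coalesce collect time does not cover.
unexplained_s = round(join_build_time_s - coalesce_collect_time_s, 1)
print(f"unaccounted build time: {unexplained_s} s")
```

Roughly two thirds of the worst-case build time is therefore not covered by the coalesce step, which is the part a profile trace would need to explain.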
@chenrui17 can you file this issue in https://github.com/NVIDIA/spark-rapids? This issue needs to be tracked there.
What is your question?
I tested TPC-DS query 93 and found that GpuCoalesceBatches has a long tail. I want to know why, and how to find out what is happening.