What went wrong?
There's an enormous task deserialization time during optimization, specifically in the last `collect` from `RollupDataWriter.compact()`. The `IndexStatus.cubeStatuses` is packaged within each task, and its size grows as the metadata size increases.
How to reproduce?
Try to optimize a relatively large table and compare the `Task Deserialization Time` of the second `collect` with that of the `execute`; the values from `execute` should be an order of magnitude smaller. A minimal reproduction sketch is included below.
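A minimal reproduction sketch, assuming the `qbeast` write format with `columnsToIndex` and a `QbeastTable.forPath(...).optimize()` entry point (the import path, table path, row count, and indexed columns below are illustrative, not taken from this issue):

```scala
import io.qbeast.spark.QbeastTable
import org.apache.spark.sql.SparkSession

object OptimizeRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("qbeast-optimize-repro")
      .master("local[*]")
      .getOrCreate()

    val path = "/tmp/qbeast-optimize-repro"

    // Write a reasonably large qbeast table so that the index metadata
    // (and therefore cubeStatuses) grows to a noticeable size.
    spark.range(10000000L)
      .selectExpr("id", "id % 1000 AS bucket")
      .write
      .format("qbeast")
      .option("columnsToIndex", "id,bucket")
      .save(path)

    // Assumed entry point; adjust the import/API to the QbeastTable version
    // of the commit under test. This triggers the optimization that ends in
    // RollupDataWriter.compact().
    QbeastTable.forPath(spark, path).optimize()

    // While optimize() runs, open the Spark UI (http://localhost:4040) and
    // compare Task Deserialization Time between the stage of the final
    // collect and the earlier execute stage.
    spark.stop()
  }
}
```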
2. Branch and commit id: main, b7f1906
3. Spark version: 3.5.0
4. Hadoop version: 3.3.4
5. How are you running Spark? locally, distributed