[SPARK-29055][CORE] Update driver/executors' storage memory when block is removed from BlockManager #25973
Test build #111601 has finished for PR 25973 at commit
That won't be true when #25779 is pushed. Does that fix this issue too?
Actually, my fix is for cached blocks; broadcast blocks would also need something similar for everything to be fixed. Anyway, I think fixing
Sorry, I forgot about #25779 while working on this. The investigation started from hunting a memory leak on the driver, and I was too focused on fixing the issue to look back. I'll take a look at the change proposed in #25779 (it looks close to finished and is expected to be merged soon, though).
Got it. Would you like to handle it yourself, or should I deal with this?
After investigating more things around my original change, this LGTM. It's good to have all block updates contain consistent information (delta vs. actual values). There's still SPARK-29319 but that's a separate issue. Merging to master / 2.4.
…k is removed from BlockManager

This patch fixes an issue where the reported storage memory does not decrease even after a block is removed from BlockManager. The issue was originally found when removing a broadcast was not reflected in the storage memory of the driver/executors.

AppStatusListener expects the memory value in block update events to be a "delta", and adjusts the driver/executors' storage memory by that delta. But when removing a block, BlockManager reports the delta as 0, so the storage memory is not decreased.

`BlockManager.dropFromMemory` deals with this correctly, so some of the paths that free memory were already being reported correctly.

The storage memory metric in AppStatusListener is out of sync, which can easily lead end users to believe that a memory leak is happening.

No.

Modified UTs. Also manually tested by running a simple query repeatedly and observing the executor page of the Spark UI to confirm that the storage memory value decreases as well.

Please refer to the description of [SPARK-29055](https://issues.apache.org/jira/browse/SPARK-29055) for a simple reproducer.

Closes #25973 from HeartSaVioR/SPARK-29055.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
(cherry picked from commit a4601cb)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
Thanks for reviewing and merging!
Is there gonna be a 2.4.5 release soon?
so maybe not. You may want to ask about this on the mailing list if you are affected by the original bug and fixing it is critical for you.
Will post to the mailing list. Thanks.
What changes were proposed in this pull request?
This patch fixes an issue where the reported storage memory does not decrease even after a block is removed from BlockManager. The issue was originally found when removing a broadcast was not reflected in the storage memory of the driver/executors.
AppStatusListener expects the memory value in block update events to be a "delta", and adjusts the driver/executors' storage memory by that delta. But when removing a block, BlockManager reports the delta as 0, so the storage memory is not decreased.
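To make the delta accounting above concrete, here is a minimal sketch in Python. It is a toy model and not Spark's code: `BlockUpdate`, `StorageTracker`, `removal_event` and `replay` are hypothetical names, and the signed `mem_delta` field is a simplification of however the real events encode the change. It only shows why a removal reported with a delta of 0 leaves the tracked total stuck.

```python
from dataclasses import dataclass, field


@dataclass
class BlockUpdate:
    """Hypothetical stand-in for a block update event."""
    executor_id: str
    block_id: str
    mem_delta: int  # change in bytes: positive when stored, negative when freed


@dataclass
class StorageTracker:
    """Hypothetical stand-in for a listener that accumulates deltas."""
    memory_used: dict = field(default_factory=dict)

    def on_block_update(self, event: BlockUpdate) -> None:
        # The tracker never sees absolute totals, only deltas to add up.
        current = self.memory_used.get(event.executor_id, 0)
        self.memory_used[event.executor_id] = current + event.mem_delta


def removal_event(executor_id: str, block_id: str, size: int, fixed: bool) -> BlockUpdate:
    # Assumption for illustration: the buggy path reports 0 on removal,
    # the fixed path reports the size that was actually freed.
    delta = -size if fixed else 0
    return BlockUpdate(executor_id, block_id, delta)


def replay(fixed: bool) -> int:
    """Store one 1024-byte block, remove it, and return the tracked total."""
    tracker = StorageTracker()
    tracker.on_block_update(BlockUpdate("driver", "broadcast_0", 1024))
    tracker.on_block_update(removal_event("driver", "broadcast_0", 1024, fixed))
    return tracker.memory_used["driver"]


if __name__ == "__main__":
    print("tracked bytes with zero delta on removal:", replay(fixed=False))
    print("tracked bytes with real delta on removal:", replay(fixed=True))
```

With the zero delta the tracker still shows the block's bytes after removal, which is the "looks like a leak" symptom described in this PR.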
BlockManager.dropFromMemory deals with this correctly, so some of the paths that free memory were already being reported correctly.
Why are the changes needed?
The storage memory metric in AppStatusListener is out of sync, which can easily lead end users to believe that a memory leak is happening.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Modified UTs. Also manually tested by running a simple query repeatedly and observing the executor page of the Spark UI to confirm that the storage memory value decreases as well.
Please refer to the description of SPARK-29055 for a simple reproducer.
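For anyone who wants to repeat the manual check without watching the UI, a rough sketch follows. It is not the reproducer from the JIRA and was not part of this patch. It assumes the executors listing of Spark's monitoring REST API (`/api/v1/applications/[app-id]/executors`) with its `id` and `memoryUsed` fields; the helper names, `base_url` and `app_id` are placeholders chosen for illustration.

```python
import json
from urllib.request import urlopen


def fetch_executors(base_url: str, app_id: str) -> list:
    # Assumed endpoint: the executors listing of the monitoring REST API.
    # base_url (e.g. the driver UI address) and app_id come from the caller.
    with urlopen(f"{base_url}/api/v1/applications/{app_id}/executors") as resp:
        return json.load(resp)


def storage_memory_by_executor(executors: list) -> dict:
    """Map executor id to its reported memoryUsed (bytes)."""
    return {e["id"]: e["memoryUsed"] for e in executors}


def still_growing(snapshots: list, executor_id: str) -> bool:
    """True if the value strictly increased across every consecutive snapshot."""
    values = [s.get(executor_id, 0) for s in snapshots]
    return len(values) > 1 and all(b > a for a, b in zip(values, values[1:]))
```

Taking one snapshot after each run of the query and passing the list to `still_growing` gives a quick signal: a value that only ever increases for the driver or an executor matches the symptom described here, while a value that drops back after blocks are removed matches the fixed behavior.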