-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-20994] Remove redundant characters in OpenBlocks to save memory for shuffle service. #18231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…for shuffle service.
|
n my cluster, we are suffering from OOM of shuffle-service. |
srowen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can that really save much memory? Seems trivial
|
Test build #77795 has finished for PR 18231 at commit
|
| mapIdAndReduceIds = new byte[blockIds.length][]; | ||
| if (blockIds.length > 0) { | ||
| for (int i = 0; i< blockIds.length; i++) { | ||
| mapIdAndReduceIds[i] = (blockIdParts[2] + "_" + blockIdParts[3]).getBytes(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of storing this as a byte array, how about storing them as ints or longs (depending on what's the actual data type of the id)?
e.g., instead of:
private byte[][] mapIdAndReduceIds;
Which results in blockIds.length + 1 arrays in total, you could have a single one where for each block id you have two entries, one for map id and one for reduce id, or something along those lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think this is a good idea. In current change, I make it to be int[blockIds.length][2]. I'm not sure if I understand your comment correctly. Please take another look :)
| private byte[][] mapIdAndReduceIds; | ||
|
|
||
| ManagedBufferIterator(String appId, String execId, String[] blockIds) { | ||
| this.appId = appId; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder if you see a lot of these in your heap dump too? You could potentially intern appId and execId for some extra memory savings, if you see a lot of those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vanzin
There's one appId and execId per stream. I don't see a lot in my heap dump. Do you have any thoughts for interning this? :)
…d and reduceId pairs.
|
Test build #77806 has finished for PR 18231 at commit
|
|
@srowen |
|
@vanzin |
|
That's 12 bytes. Are there millions of these? |
|
Actually it's more than 12 bytes. |
srowen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pardon, I'm missing how this saves memory somewhere -- where is a string stored that's now a shorter string?
| } | ||
| this.shuffleId = blockId0Parts[1]; | ||
| mapIdAndReduceIds = new int[blockIds.length][2]; | ||
| if (blockIds.length > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is superfluous
| String[] blockId0Parts = blockIds[0].split("_"); | ||
| if (blockId0Parts.length < 4) { | ||
| throw new IllegalArgumentException("Unexpected block id format: " + blockIds[0]); | ||
| } else if (!blockId0Parts[0].equals("shuffle")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need the 'else' here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have some kinds of BlockId, I guess it's better to have a check here and we can parse the blockId correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Sean means that since you're throwing in the previous block, else is redundant.
| } | ||
| } | ||
|
|
||
| private class ManagedBufferIterator implements Iterator<ManagedBuffer> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why break this out -- it's not necessary for the change right? just for clarity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the iterator is becoming a little bit complicated. So I break this out and give a constructor.
|
@srowen Sorry, I didn't make it clear.
|
|
The current iterator doesn't have any state except for an int. What are you referring to? |
|
I mean the blockIds in |
|
I get it. But that doesn't make the reference in OpenBlocks go away. This only helps anything is msg/msgObj can be garbage collected earlier. Is that the case? right now this is allocating additional memory, not instead of the existing memory. |
|
The blockIds cannot be freed because they are referenced in the iterator. In current change they are not. We reference the mapIdAndReduceIds instead. Thus the blockIds in OpenBlocks can be garbage collected. |
|
That's not the question though. The question is whether they could be freed even after this change. msg still references it. That's what you need to establish, if only by some empirical testing. |
|
there is no where referencing |
|
I'm not clear that's true, no. Not, at least, in the lifetime of the iterator. That's what has to be true for this to help anything. Do you have evidence this is true? for example if you have tests that clearly show the memory is released earlier, that would be good evidence. |
|
Yes, I think it's great to do some tests and give a good evidence. |
|
@srowen |
|
Test build #77811 has finished for PR 18231 at commit
|
| private String execId; | ||
| private String shuffleId; | ||
| // An array containing mapId and reduceId pairs. | ||
| private int[][] mapIdAndReduceIds; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I mean a single array. e.g.
int[] mapIdAndReduceIds;
mapIdAndReduceIds = new int[blockIds.length * 2];
mapIdAndReduceIds[0] = mapId1;
mapIdAndReduceIds[1] = reduceId1;
mapIdAndReduceIds[2] = mapId2;
mapIdAndReduceIds[3] = reduceId2;
etc etc etc
Reason being that if you really have millions of these, each "child" array in your two-dimensional array wastes 16 (or 20?) bytes (16 bytes of object overhead + 4 bytes for the array length). Looking in jvisualvm, an empty array actually consumes 24 bytes, so it seems the JVM is aligning things and wasting an extra 4 bytes per array...
|
Test build #77831 has finished for PR 18231 at commit
|
|
@srowen I don't see any references to the original |
|
There isn't a reference here anymore; there could be elsewhere. It sounds like there's good reason to believe there is not another reference hanging around though. |
Only if there was a bug in the RPC layer, since this is an RPC handler and the message should not be referenced by the RPC code after the method returns. |
|
Test build #77863 has finished for PR 18231 at commit
|
|
Test build #77862 has finished for PR 18231 at commit
|
| * Obtains a FileSegmentManagedBuffer from (shuffleId, mapId, reduceId). We make assumptions | ||
| * about how the hash and sort based shuffles store their data. | ||
| */ | ||
| public ManagedBuffer getBlockData(String appId, String execId, int shuffleId, int mapId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: style. See constructor at top of file for the style when param lists are long.
| * assumptions about how the hash and sort based shuffles store their data. | ||
| * format "shuffle_ShuffleId_MapId_ReduceId" (from ShuffleBlockId). | ||
| */ | ||
| public ManagedBuffer getBlockData(String appId, String execId, String blockId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this method used anywhere else? I only see ExternalShuffleBlockHandler using this class, and it now uses the new method. If only unit tests use this, then remove this method and fix the unit tests.
|
Test build #78017 has finished for PR 18231 at commit
|
3239653 to
a2af617
Compare
|
Test build #78022 has finished for PR 18231 at commit
|
vanzin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nit otherwise LGTM. I'll leave it overnight in case others want to take a look.
| } | ||
| this.shuffleId = Integer.parseInt(blockId0Parts[1]); | ||
| mapIdAndReduceIds = new int[2 * blockIds.length]; | ||
| for (int i = 0; i< blockIds.length; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: space before <
|
(also PR title has a typo, should be "redundant") |
|
Test build #78079 has finished for PR 18231 at commit
|
jiangxb1987
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change seems reasonable, but the code style still need to be improved. Also cc @cloud-fan to make a pass.
| this.appId = appId; | ||
| this.execId = execId; | ||
| String[] blockId0Parts = blockIds[0].split("_"); | ||
| if (blockId0Parts.length < 4) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about use require(blockId0Parts.length < 4, "Unexpected block id format: " + blockIds[0]) instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking to throw the IllegalArgumentException.
Pardon, I'm not sure how to use require in java.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm, didn't notice they are java code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we be more strict and use blockId0Parts.length != 4?
| if (blockId0Parts.length < 4) { | ||
| throw new IllegalArgumentException("Unexpected block id format: " + blockIds[0]); | ||
| } | ||
| if (!blockId0Parts[0].equals("shuffle")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
| mapIdAndReduceIds = new int[2 * blockIds.length]; | ||
| for (int i = 0; i < blockIds.length; i++) { | ||
| String[] blockIdParts = blockIds[i].split("_"); | ||
| if (Integer.parseInt(blockIdParts[1]) != shuffleId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blockIdParts[1].toInt != shuffleId ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's hard to do that in java, right?
| } | ||
| this.shuffleId = Integer.parseInt(blockId0Parts[1]); | ||
| mapIdAndReduceIds = new int[2 * blockIds.length]; | ||
| for (int i = 0; i < blockIds.length; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about rewrite this to be imperative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pardon, could you give an example?
| throw new RuntimeException( | ||
| String.format("Executor is not registered (appId=%s, execId=%s)", appId, execId)); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we should keep the original format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two blank lines originally. I guess it's appropriate to remove one?
| }; | ||
|
|
||
| String[] blockIds = { "shuffle_2_3_4", "shuffle_6_7_8" }; | ||
| String[] blockIds = { "shuffle_0_1_2", "shuffle_0_3_4" }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this change ,we cannot shuffle blocks with multiple shuffleIds
|
@jiangxb1987 |
| this.shuffleId = Integer.parseInt(blockId0Parts[1]); | ||
| mapIdAndReduceIds = new int[2 * blockIds.length]; | ||
| for (int i = 0; i < blockIds.length; i++) { | ||
| String[] blockIdParts = blockIds[i].split("_"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we check blockIdParts[0] == "shufle"?
|
|
||
| @Override | ||
| public boolean hasNext() { | ||
| return index < mapIdAndReduceIds.length / 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we can keep a pos, and increase it by 2 in next, so here we can just write pos < mapIdAndReduceIds.length to save a division.
|
LGTM except some minor comments |
|
@cloud-fan |
|
Test build #78155 has started for PR 18231 at commit |
| this.appId = appId; | ||
| this.execId = execId; | ||
| String[] blockId0Parts = blockIds[0].split("_"); | ||
| if (blockId0Parts.length < 4 || !blockId0Parts[0].equals("shuffle")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use blockId0Parts.length != 4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
|
Test build #78157 has started for PR 18231 at commit |
|
Jenkins, retest this please |
|
Test build #78166 has finished for PR 18231 at commit
|
|
thanks, merging to master! |
|
@cloud–fan |
What changes were proposed in this pull request?
In current code, blockIds in
OpenBlocksare stored in the iterator on shuffle service.There are some redundant characters in blockId(
"shuffle_" + shuffleId + "_" + mapId + "_" + reduceId). This pr proposes to improve the footprint and alleviate the memory pressure on shuffle service.