-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-26169 Fix MapReduceBackupCopyJob.BackupDistCp.getKey() concatenates strings #3562
base: master
Are you sure you want to change the base?
Conversation
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@@ -331,12 +331,12 @@ protected Path createInputFileListing(Job job) throws IOException { | |||
private Text getKey(Path path) { | |||
int level = conf.getInt(NUMBER_OF_LEVELS_TO_PRESERVE_KEY, 1); | |||
int count = 0; | |||
String relPath = ""; | |||
StringBuilder relPath = new StringBuilder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not that obvious to me that using StringBuilder in this way is faster (looking at the jira, I assumed this is the purpose of the change) than simple string concatenation, because StringBuilder.insert() may involve multiple array copy.
How about instantiating StringBuilder with a capacity (=level) as parameter so that we don't need to re-allocate & copy arrays?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the length of path.getName()
is not fixed, how to determine the initial capacity?
Unless the Path is traversed again and the length of the name is calculated, the performance will obviously not improve. Splicing strings in a loop will generate a new string each time. In my opinion, using StringBuilder
will improve performance, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a possible way is to store the path name in a list and then concat the list of strings from tail to head?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But is it acceptable to apply for the space of the collection and the time cost of traversing the collection again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Apache9 I changed to use a collection to store strings, PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me. Thank you.
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
MapReduceBackupCopyJob.BackupDistCp.getKey()
concatenates strings in a loop, should consider usingStringBuilder
to concatenate strings, thanks