-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-24791 Improve HFileOutputFormat2 to avoid always call getTableRelativePath method #2167
Conversation
This can imporve a lot A simple test case before improve cost: 8566118066 7954087533 7990385173 8121662923 8094065619 8107938198 8045528125 8116824329 8122912440 8108117395 before all cost: 81227.639801 ms before vs after: 29198.60908098455:1 The test code in |
why don't trigger the related link to HBASE JIRA ? |
mind have a look ? thanks @Apache9 @infraio @guangxuCheng @pankaj72981 |
@@ -32,8 +32,6 @@ | |||
public final static LongAdder tot_mgr_log_split_batch_start = new LongAdder(); | |||
public final static LongAdder tot_mgr_log_split_batch_success = new LongAdder(); | |||
public final static LongAdder tot_mgr_log_split_batch_err = new LongAdder(); | |||
public final static LongAdder tot_mgr_new_unexpected_wals = new LongAdder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove HBASE-24790 related change in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for view @pankaj72981
I will remove the HBASE-24790 related change in this PR
🎊 +1 overall
This message was automatically generated. |
@@ -248,7 +248,6 @@ public void write(ImmutableBytesWritable row, V cell) throws IOException { | |||
tableNameBytes = Bytes.toBytes(writeTableNames); | |||
} | |||
String tableName = Bytes.toString(tableNameBytes); | |||
Path tableRelPath = getTableRelativePath(tableNameBytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a nice find.
BTW reading this part of code, I think so much of optimization we can do. It might not be relevant as this change. Though good to do IMO
...
} else {
tableNameBytes = Bytes.toBytes(writeTableNames);
}
String tableName = Bytes.toString(tableNameBytes);
...
Only in case of multiTable, we need create tableName and tableNameBytes again and again for every write. So we can make sure this is created at 1st and do not create every time for not multiTable case.
private Path getTableRelativePath(byte[] tableNameBytes) {
String tableName = Bytes.toString(tableNameBytes);
String[] tableNameParts = tableName.split(":");
Path tableRelPath = new Path(tableName.split(":")[0]);
if (tableNameParts.length > 1) {
tableRelPath = new Path(tableRelPath, tableName.split(":")[1]);
}
return tableRelPath;
}
split is done 3 times. We can just refer tableNameParts[0], tableNameParts[1] and other places.
Can u pls include these kind of fixes also in PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tableName split 3 times int getTableRelativePath method has fixed in this PR
i will do some other fixes in next commit , the code there is a little mess
…elativePath method
remove the HBASE-24790 code and do more code clean by review @anoopsjohn @pankaj72981 please have a look, thanks |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@@ -222,6 +222,7 @@ public RegionLocator getRegionLocator() { | |||
private final Map<byte[], WriterLength> writers = new TreeMap<>(Bytes.BYTES_COMPARATOR); | |||
private final Map<byte[], byte[]> previousRows = new TreeMap<>(Bytes.BYTES_COMPARATOR); | |||
private final long now = EnvironmentEdgeManager.currentTime(); | |||
private byte[] tableNameBytes = Bytes.toBytes(writeTableNames);; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra semicolon at end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok,thanks for ponit it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has finished
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
} else { | ||
LOG.debug("Use favored nodes writer: {}", initialIsa.getHostString()); | ||
wl = getNewWriter(tableNameBytes, family, conf, new InetSocketAddress[] { initialIsa | ||
}); | ||
favoredNodes = new InetSocketAddress[] { initialIsa}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this apply when LOCALITY_SENSITIVE_CONF_KEY check on line 274 is false ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if the LOCALITY_SENSITIVE_CONF_KEY
is false or get locality failed,the favoredNodes
will be null
if LOCALITY_SENSITIVE_CONF_KEY
= true and get localicy success ,the favoredNodes
will be not null
same logic with before , just code clean
before this pr, too much wl = getNewWriter
in the code
Can you see why javadoc for hbase-mapreduce didn't pass ? Thanks |
thanks ,@tedyu in my local env
all passed how to trigger a retry ? |
rery qa |
The failure of javadoc for jdk11 is expected, we still have problems building javadoc with java11, that's why we just issue a -0, not -1. |
+1 |
The code hash finished , mind have a look, thanks @anoopsjohn @pankaj72981 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Nice work
Added few minor comments.
if (conf.getBoolean(LOCALITY_SENSITIVE_CONF_KEY, DEFAULT_LOCALITY_SENSITIVE)) { | ||
HRegionLocation loc = null; | ||
|
||
String tableName = Bytes.toString(tableNameBytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a format issue here?
} } | ||
|
||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here also.
.withCompression(compression).withChecksumType(HStore.getChecksumType(conf)) | ||
.withBytesPerCheckSum(HStore.getBytesPerChecksum(conf)).withBlockSize(blockSize) | ||
.withColumnFamily(family).withTableName(tableName); | ||
HFileContextBuilder contextBuilder = new HFileContextBuilder().withCompression(compression) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls check format issue at these changed/added lines once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i move the compression set config to here and has been use hbase-eclipse-format format the code here
@@ -222,6 +222,7 @@ public RegionLocator getRegionLocator() { | |||
private final Map<byte[], WriterLength> writers = new TreeMap<>(Bytes.BYTES_COMPARATOR); | |||
private final Map<byte[], byte[]> previousRows = new TreeMap<>(Bytes.BYTES_COMPARATOR); | |||
private final long now = EnvironmentEdgeManager.currentTime(); | |||
private byte[] tableNameBytes = Bytes.toBytes(writeTableNames); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we do this under writeMultipleTables check?
private byte[] tableNameBytes = (writeMultipleTables)? null: Bytes.toBytes(writeTableNames);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes,i want to make code simple, so just keep init no matter multiple or not
It is a good ponit to use use writeMultipleTables check,will address it
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
…elativePath method (apache#2167) Signed-off-by: Anoop <anoopsamjohn@apache.org> Signed-off-by: Ted Yu <tyu@apache.org> Change-Id: I684e0b79d2eb7d8cc10d2cdd3ef9ececc1cfd49b
…elativePath method (apache#2167) Signed-off-by: Anoop <anoopsamjohn@apache.org> Signed-off-by: Ted Yu <tyu@apache.org> Change-Id: Ibf946930c20b15595a846d7b62e8b59cee42be07
…elativePath method (apache#2167) Signed-off-by: Anoop <anoopsamjohn@apache.org> Signed-off-by: Ted Yu <tyu@apache.org> Change-Id: Ief23ecd94cba02af177a8187ad27a1d13fd0243e
…elativePath method (apache#2167) Signed-off-by: Anoop <anoopsamjohn@apache.org> Signed-off-by: Ted Yu <tyu@apache.org> Change-Id: Ic1b21413b73c43d5591ae37800a2432b8a2ca688
HBASE-24791 Improve HFileOutputFormat2 to avoid always call getTableRelativePath method