Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HBASE-24791 Improve HFileOutputFormat2 to avoid always call getTableRelativePath method #2167

Merged
merged 3 commits into from
Aug 3, 2020
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -222,6 +222,7 @@ static <V extends Cell> RecordWriter<ImmutableBytesWritable, V> createRecordWrit
private final Map<byte[], WriterLength> writers = new TreeMap<>(Bytes.BYTES_COMPARATOR);
private final Map<byte[], byte[]> previousRows = new TreeMap<>(Bytes.BYTES_COMPARATOR);
private final long now = EnvironmentEdgeManager.currentTime();
private byte[] tableNameBytes = Bytes.toBytes(writeTableNames);;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove extra semicolon at end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok,thanks for ponit it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has finished


@Override
public void write(ImmutableBytesWritable row, V cell) throws IOException {
Expand All @@ -235,7 +236,6 @@ public void write(ImmutableBytesWritable row, V cell) throws IOException {
byte[] rowKey = CellUtil.cloneRow(kv);
int length = (PrivateCellUtil.estimatedSerializedSizeOf(kv)) - Bytes.SIZEOF_INT;
byte[] family = CellUtil.cloneFamily(kv);
byte[] tableNameBytes = null;
if (writeMultipleTables) {
tableNameBytes = MultiTableHFileOutputFormat.getTableName(row.get());
tableNameBytes = TableName.valueOf(tableNameBytes).getNameWithNamespaceInclAsString()
Expand All @@ -244,11 +244,7 @@ public void write(ImmutableBytesWritable row, V cell) throws IOException {
throw new IllegalArgumentException("TableName " + Bytes.toString(tableNameBytes) +
" not expected");
}
} else {
tableNameBytes = Bytes.toBytes(writeTableNames);
}
String tableName = Bytes.toString(tableNameBytes);
Path tableRelPath = getTableRelativePath(tableNameBytes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a nice find.
BTW reading this part of code, I think so much of optimization we can do. It might not be relevant as this change. Though good to do IMO
...
} else {
tableNameBytes = Bytes.toBytes(writeTableNames);
}
String tableName = Bytes.toString(tableNameBytes);
...
Only in case of multiTable, we need create tableName and tableNameBytes again and again for every write. So we can make sure this is created at 1st and do not create every time for not multiTable case.

private Path getTableRelativePath(byte[] tableNameBytes) {
String tableName = Bytes.toString(tableNameBytes);
String[] tableNameParts = tableName.split(":");
Path tableRelPath = new Path(tableName.split(":")[0]);
if (tableNameParts.length > 1) {
tableRelPath = new Path(tableRelPath, tableName.split(":")[1]);
}
return tableRelPath;
}
split is done 3 times. We can just refer tableNameParts[0], tableNameParts[1] and other places.
Can u pls include these kind of fixes also in PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tableName split 3 times int getTableRelativePath method has fixed in this PR

i will do some other fixes in next commit , the code there is a little mess

byte[] tableAndFamily = getTableNameSuffixedWithFamily(tableNameBytes, family);

WriterLength wl = this.writers.get(tableAndFamily);
Expand All @@ -257,9 +253,9 @@ public void write(ImmutableBytesWritable row, V cell) throws IOException {
if (wl == null) {
Path writerPath = null;
if (writeMultipleTables) {
Path tableRelPath = getTableRelativePath(tableNameBytes);
writerPath = new Path(outputDir,new Path(tableRelPath, Bytes.toString(family)));
}
else {
} else {
writerPath = new Path(outputDir, Bytes.toString(family));
}
fs.mkdirs(writerPath);
Expand All @@ -274,39 +270,36 @@ public void write(ImmutableBytesWritable row, V cell) throws IOException {

// create a new WAL writer, if necessary
if (wl == null || wl.writer == null) {
InetSocketAddress[] favoredNodes = null;
if (conf.getBoolean(LOCALITY_SENSITIVE_CONF_KEY, DEFAULT_LOCALITY_SENSITIVE)) {
HRegionLocation loc = null;

String tableName = Bytes.toString(tableNameBytes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a format issue here?

if (tableName != null) {
try (Connection connection = ConnectionFactory.createConnection(conf);
RegionLocator locator =
connection.getRegionLocator(TableName.valueOf(tableName))) {
RegionLocator locator = connection.getRegionLocator(TableName.valueOf(tableName))) {
loc = locator.getRegionLocation(rowKey);
} catch (Throwable e) {
LOG.warn("Something wrong locating rowkey {} in {}",
Bytes.toString(rowKey), tableName, e);
LOG.warn("Something wrong locating rowkey {} in {}", Bytes.toString(rowKey),
tableName, e);
loc = null;
} }

}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here also.

if (null == loc) {
LOG.trace("Failed get of location, use default writer {}", Bytes.toString(rowKey));
wl = getNewWriter(tableNameBytes, family, conf, null);
} else {
LOG.debug("First rowkey: [{}]", Bytes.toString(rowKey));
InetSocketAddress initialIsa =
new InetSocketAddress(loc.getHostname(), loc.getPort());
if (initialIsa.isUnresolved()) {
LOG.trace("Failed resolve address {}, use default writer", loc.getHostnamePort());
wl = getNewWriter(tableNameBytes, family, conf, null);
} else {
LOG.debug("Use favored nodes writer: {}", initialIsa.getHostString());
wl = getNewWriter(tableNameBytes, family, conf, new InetSocketAddress[] { initialIsa
});
favoredNodes = new InetSocketAddress[] { initialIsa};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this apply when LOCALITY_SENSITIVE_CONF_KEY check on line 274 is false ?

Copy link
Contributor Author

@utf7 utf7 Jul 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if the LOCALITY_SENSITIVE_CONF_KEY is false or get locality failed,the favoredNodes will be null
if LOCALITY_SENSITIVE_CONF_KEY = true and get localicy success ,the favoredNodes will be not null

same logic with before , just code clean

before this pr, too much wl = getNewWriter in the code

}
}
} else {
wl = getNewWriter(tableNameBytes, family, conf, null);
}
wl = getNewWriter(tableNameBytes, family, conf, favoredNodes);

}

// we now have the proper WAL writer. full steam ahead
Expand All @@ -321,9 +314,9 @@ public void write(ImmutableBytesWritable row, V cell) throws IOException {
private Path getTableRelativePath(byte[] tableNameBytes) {
String tableName = Bytes.toString(tableNameBytes);
String[] tableNameParts = tableName.split(":");
Path tableRelPath = new Path(tableName.split(":")[0]);
Path tableRelPath = new Path(tableNameParts[0]);
if (tableNameParts.length > 1) {
tableRelPath = new Path(tableRelPath, tableName.split(":")[1]);
tableRelPath = new Path(tableRelPath, tableNameParts[1]);
}
return tableRelPath;
}
Expand Down Expand Up @@ -377,15 +370,14 @@ private WriterLength getNewWriter(byte[] tableName, byte[] family, Configuration
encoding = encoding == null ? datablockEncodingMap.get(tableAndFamily) : encoding;
encoding = encoding == null ? DataBlockEncoding.NONE : encoding;
HFileContextBuilder contextBuilder = new HFileContextBuilder()
.withCompression(compression).withChecksumType(HStore.getChecksumType(conf))
.withCompression(compression).withDataBlockEncoding(encoding).withChecksumType(HStore.getChecksumType(conf))
.withBytesPerCheckSum(HStore.getBytesPerChecksum(conf)).withBlockSize(blockSize)
.withColumnFamily(family).withTableName(tableName);

if (HFile.getFormatVersion(conf) >= HFile.MIN_FORMAT_VERSION_WITH_TAGS) {
contextBuilder.withIncludesTags(true);
}

contextBuilder.withDataBlockEncoding(encoding);
HFileContext hFileContext = contextBuilder.build();
if (null == favoredNodes) {
wl.writer = new StoreFileWriter.Builder(conf, CacheConfig.DISABLED, fs)
Expand Down