First of all, and a bit of a sidebar: I have a growing suspicion that this fork doesn't need to exist. Its main purpose seems to be matching dependency versions with EMR, but it's possible to do that in our own pom files using the same properties that we're modifying in this fork. I'm going to experiment with that later.
The purpose of this PR is to match our hbase-server version with the jar provided on EMR. While our HBase is listed as 2.2.6, the library code we have from Amazon is for 2.4.1+. There was a breaking change introduced in HBase 2.4.1 which is present in the jars labeled 2.2.6, and this mismatch prevents `bulkLoadThinRows()` from working.
Specifically, what is going on is that a couple of helpers moved from `HStore` to `StoreUtils`, e.g. `getChecksumType()`. This change was introduced in HBase 2.4.1 via this PR: apache/hbase#2800. If you search that PR's files, you'll see that `getChecksumType()` moved.
hbase-connectors relies on `getChecksumType()` and a few other helpers in `bulkLoadThinRows()`, which we use for the Competitor Historical Migration jobs. With these being out of sync, we can't run the bulk load.
To fix that, this fast forwards and brings in apache#88.
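For anyone who wants to check which side of the move a given classpath is on, a reflection probe is one option. This is only a rough sketch and not part of this PR: the class `ChecksumHelperProbe` and the helper `declaresStaticMethod` are made-up names for illustration, and it assumes the helper is a static method in both `org.apache.hadoop.hbase.regionserver.HStore` (before the move) and `org.apache.hadoop.hbase.regionserver.StoreUtils` (after it).

```java
import java.lang.reflect.Modifier;
import java.util.Arrays;

// Hypothetical diagnostic, not part of hbase-connectors or this fork.
public class ChecksumHelperProbe {

    // Returns true if the named class is loadable and declares a static
    // method with the given name (parameter types are not checked).
    static boolean declaresStaticMethod(String className, String methodName) {
        try {
            // initialize=false: inspect the class without running its static initializers.
            Class<?> cls = Class.forName(
                    className, false, ChecksumHelperProbe.class.getClassLoader());
            return Arrays.stream(cls.getDeclaredMethods())
                    .anyMatch(m -> m.getName().equals(methodName)
                            && Modifier.isStatic(m.getModifiers()));
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or one of the types it references) is not on this classpath.
            return false;
        }
    }

    public static void main(String[] args) {
        String pkg = "org.apache.hadoop.hbase.regionserver.";
        // Assumed layout: helper lives on HStore before the move, on StoreUtils after it.
        System.out.println("HStore.getChecksumType present:     "
                + declaresStaticMethod(pkg + "HStore", "getChecksumType"));
        System.out.println("StoreUtils.getChecksumType present: "
                + declaresStaticMethod(pkg + "StoreUtils", "getChecksumType"));
    }
}
```

Run with the EMR-provided hbase-server jar on the classpath, this should show whether the jar labeled 2.2.6 actually carries the post-move layout. Without any HBase jar on the classpath, both lines print `false`.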
This also includes reverting a few things in our local fork that I don't think make much sense:
`${revision}` and setting it once in the parent properties