Requires: https://github.com/jpdna/bdghbase commit 809c554
This PR is a stub for the changes ADAM needs in order to use the HBase code from a separate repo, and for discussion of that approach.
I decided to try splitting the HBase code out into a different repo, the bdghbase repo linked above.
The motivations I see for this are:
We may end up with several data backends in addition to HBase, Kudu for one. It may be better not to put all of these into the main ADAM repo.
I haven't yet found a way to get a "provided" dependency to work for the HBase library code, so for now it will bloat the compiled hbase module. Better to keep that out of main ADAM, even if it's working fine.
Testing with CI is going to be a pain for HBase. For the moment the tests require that an HBase instance is accessible, as I can't seem to get the mock "mini" HBase cluster to work. This complicates testing and CI, so it is better to keep that complexity in a separate repo.
It is nice to have fast compilation times for the HBase code while it is under development.
The choice of a separate repo or not seems irrelevant to how adam-shell or downstream applications using ADAM as a library would be built. I can interact with ADAM and HBase from adam-shell just as before using this PR.
I plan to add an "hbase" profile that will turn on the HBase dependencies in ADAM. Sound good?
I will clearly have to publish bdghbase to Maven, as we currently do for bdg-utils and the other separate bdg repos.
I'm not sure how to handle the CLI for HBase in this instance. We could include the vcf2Hbase CLI code as it currently is in this PR, and perhaps it would only work if the -hbase profile was turned on. We'd want the command and the CLI help to also differ based on the profile.
This doesn't seem ideal to me though. If bdghbase (or bdgkudu) is to be a separate repo, I feel the CLI code should live in that repo as well, but I am not sure how best to integrate it with the current ADAM CLI. Thoughts?
For now I am looking for general comments on this approach. I'll ask later for review of the HBase code, as I am nearly done addressing the comments in #1246.