HBASE-25972 Dual File Compaction #5545
Conversation
* older put cells and delete markers).
*/
@InterfaceAudience.Private
public class DualFileStoreEngine extends StoreEngine<DefaultStoreFlusher,
Can't we extend DefaultStoreEngine instead? There are a few methods that are being duplicated (entirely or partially) from there (needsCompaction, createComponents, createCompactor).
I tried it, but it did not work. The signature of the class requires DefaultCompactor, which means I would need to extend DefaultCompactor to implement DualFileCompactor. However, DefaultCompactor supports only StoreFileWriter, a single file writer. So I ended up with a new engine.
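For context, a rough sketch of the signature constraint being described; the classes below are placeholders for illustration only, not the actual HBase types:

abstract class StoreEngineSketch<C> {
  // The compactor type is fixed by the generic parameter chosen by the subclass.
  protected C compactor;
}
class SingleFileCompactorSketch {
  // Stands in for DefaultCompactor, which drives a single StoreFileWriter.
}
class DefaultStoreEngineSketch extends StoreEngineSketch<SingleFileCompactorSketch> {
  // A dual-file engine extending this class would be pinned to the single-file
  // compactor; a multi-file writer cannot be substituted without changing the signature.
}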
These are all IA.Private classes, you could do a small refactor to make it work?
Or perhaps it's fine as is, because it would be great to get this ported to the 2.5 release line?
Like @bbeaudreault explained, we can avoid the code duplication with a little refactoring on DefaultStoreEngine, since we would already need to change it a bit anyway.
Or perhaps it's fine as is, because it would be great to get this ported to the 2.5 release line?
We may allow that on the backport for 2.5 only, but I don't think this would justify adding such technical debt to the master branch.
I am still trying, but I have not found a good solution yet. The reason is that DefaultCompactor is a single-file-writer compactor, while DualFileCompactor is a multi-file writer like DateTieredCompactor and StripeCompactor. Neither DateTieredStoreEngine nor StripeStoreEngine inherits from DefaultStoreEngine. That is why I think DualFileStoreEngine should not inherit from DefaultStoreEngine either.
In order to reduce code duplication, I refactored StoreEngine instead.
Is it possible to refactor this so DefaultCompactor becomes a dual file compactor that only compacts a single file? Can we try that instead? I do not think we need a new Store Engine for this. That introduces some configuration trouble for operators. Ideally operators do not need to change their store engine in order to take advantage of this and other improvements that are generally applicable and are in other respects backwards compatible.
Related comment on design doc: https://docs.google.com/document/d/1Ea42tEBh2X2fCq0_tXSe1BgEqBz58oswJULEbA8-MfI/edit?disco=AAABAgKl--o
HAS_LATEST_VERSION handling could be introduced into the default store engine, and compatibility is assured given how you handle HFiles that lack this metadata. Older versions that don't know about HAS_LATEST_VERSION and ignore it will also function correctly, because all HFiles will be examined as before.
It seems that we can incrementally upgrade or downgrade from a store engine that understands HAS_LATEST_VERSION and one that does not, unless I am missing something, which is certainly possible. Is my understanding correct? If so I am wondering if we really need a new StoreEngine implementation.
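A minimal sketch of the compatibility rule being argued here; the HAS_LATEST_VERSION key comes from the design doc, while the helper and the map shape are assumptions for illustration only:

import java.util.Map;

final class LatestVersionMetadataSketch {
  static final String HAS_LATEST_VERSION = "HAS_LATEST_VERSION"; // key name taken from the design doc

  // Files written by older HBase versions lack the flag, so they default to "must scan",
  // which preserves the pre-upgrade behavior of examining every HFile.
  static boolean mustScanForLatestVersions(Map<String, String> fileInfo) {
    String value = fileInfo.get(HAS_LATEST_VERSION);
    return value == null || Boolean.parseBoolean(value);
  }
}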
DualFileCompactor is a multi-file writer like DateTieredCompactor and StripeCompactor. Both DateTieredStoreEngine and StripeStoreEngine do not inherit from DefaultStoreEngine
StoreEngine and related interfaces have evolved organically and the current state is maybe not ideal.
If we take the above approach and refactor the default compactor interface, perhaps these called-out compactors and engines can be refactored to take a cleaner approach (imho). If we look at the pattern that other recent refactors have taken, there might be cause to consolidate common logic into an appropriately named abstract class and have the store engines inherit from that. This could be follow-up work; I am not asking for it to be performed in this PR.
Please see the updated version where I refactored the existing code without introducing a new compactor or store engine.
}
boolean succ = false;
try {
for (int i = 0, n = files.size(); i < n; i++) {
for (int i = 0, n = files.size(); i < n && !sortedFiles.isEmpty(); i++) {
Do we really need this extra check?
Yes, because we skip the files with older cell versions and delete markers for the regular scans with max versions = 1.
Then iterate the sortedFiles directly. What happens if sortedFiles is not empty, but files > sortedFiles? Won't you get an NPE on lines #153/155?
I did not want to iterate over sorted files, for the same reason the existing code does not iterate over them: the sorted files are modified within the loop. I do not see the NPE issue here.
I eliminated the extra check.
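To make the skip rule mentioned above concrete, here is a hedged sketch of the selection logic; the types and method names are illustrative only, not the PR's code:

import java.util.List;
import java.util.stream.Collectors;

final class ScanFileSelectionSketch {
  static class FileSketch {
    boolean hasLatestVersion; // false for files holding only older versions and delete markers
  }

  // Regular scans with max versions = 1 only need the files that contain latest versions;
  // raw scans and multi-version scans still need every file.
  static List<FileSketch> select(List<FileSketch> files, boolean rawScan, int maxVersions) {
    if (rawScan || maxVersions > 1) {
      return files;
    }
    return files.stream().filter(f -> f.hasLatestVersion).collect(Collectors.toList());
  }
}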
@@ -135,11 +136,17 @@ public static List<StoreFileScanner> getScannersForStoreFiles(Collection<HStoreF
for (HStoreFile file : files) {
// The sort function needs metadata so we need to open reader first before sorting the list.
file.initReader();
I wonder if we could have a DualStoreFileManager as well. That way, we could already keep separate lists of store files there, and it would contain the logic about which files should be returned for the scanner. It would also avoid having to open a reader on files we may not be interested in.
Good suggestion! I will add DualStoreFileManager.
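For illustration, a rough sketch of what such a manager could track; the class and its shape are hypothetical (the PR later folded this functionality into DefaultStoreFileManager instead):

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class DualStoreFileManagerSketch<F> {
  private final List<F> liveFiles = new ArrayList<>();
  private final List<F> historicalFiles = new ArrayList<>();

  // Keeping the two lists separate means latest-version scans never need to open
  // readers on historical files.
  void add(F file, boolean hasLatestVersion) {
    (hasLatestVersion ? liveFiles : historicalFiles).add(file);
  }

  Collection<F> filesForScan(boolean rawScan) {
    if (!rawScan) {
      return liveFiles; // regular latest-version scans only touch live files
    }
    List<F> all = new ArrayList<>(liveFiles);
    all.addAll(historicalFiles);
    return all;
  }
}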
Just to be clear, based on my understanding of the design (see doc here: https://docs.google.com/document/d/1Ea42tEBh2X2fCq0_tXSe1BgEqBz58oswJULEbA8-MfI/edit ), we can integrate this change into the default store engine without requiring opt-in, and everyone will benefit from the optimization. Compatibility is assured given how we handle HFiles that lack the new metadata. Older HBase versions that don't know about HAS_LATEST_VERSION and ignore it will also function correctly, because all HFiles will be examined as before. Upgrading to and downgrading from a HAS_LATEST_VERSION-capable version does not pose a correctness problem. Operation with mixed HFiles from different versions is also fine. It's simply that the performance benefit is fully realized once we have upgraded to a HAS_LATEST_VERSION-capable version and compaction has run on all live regions.
However, in case someone is concerned about potential impacts, please be prepared to make the new behavior opt-in via a site configuration setting. Hopefully we can achieve a consensus and avoid that.
@apurtell, as per your feedback, I eliminated DualFileStoreEngine, DualFileCompactor, and DualFileStoreFileManager. Their functionality is now integrated into DefaultStoreEngine, DefaultCompactor, and DefaultStoreFileManager respectively, with some refactoring. Dual file compaction can be turned on/off using a config parameter; by default, it is turned on now. I did that to make sure that the existing tests exercise the new code. We can change the default value before we merge the PR.
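For operators, toggling the feature would look roughly like the snippet below; the property name here is a placeholder, since the actual key is the config parameter defined in this PR:

import org.apache.hadoop.conf.Configuration;

public class DualFileCompactionToggleExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Placeholder key; use the config parameter introduced by this PR instead.
    conf.setBoolean("hbase.store.dual.file.compaction.enabled", false);
    System.out.println("dual file compaction enabled: "
      + conf.getBoolean("hbase.store.dual.file.compaction.enabled", true));
  }
}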
+1, took another look after quite some time, reasonable to get this merged soon.
for (int i = 0; i < size; i++) {
Cell firstCell = firstRowList.get(i);
Cell secondCell = secondRowList.get(i);
assert (CellUtil.matchingRowColumn(firstCell, secondCell));
nit: These asserts could be assertTrue or assertEquals, but this is not a blocker for merging the PR; it can be done later.
I made the change for this comment.
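The change amounts to replacing the Java assert keyword with a JUnit assertion, roughly:

// import static org.junit.Assert.assertTrue;
// A JUnit assertion always runs, whereas the assert keyword is skipped unless -ea is passed to the JVM.
assertTrue(CellUtil.matchingRowColumn(firstCell, secondCell));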
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
// the delete-family marker. In this case, there is no need to add the
// delete-family-version marker to the live version file. This case happens only with
// the new version behavior.
liveFileWriter.append(cell);
Hi @kadirozde. Is there any inconsistency between the code and the comment here? The comment says "In this case, there is no need to add the delete-family-version marker to the live version file", but the actual code still writes to the liveFileWriter.
Also I would like to ask, in this scenario, why don't we need to add the delete-family-version marker to the live version file? Thx!
Yes, the cell should have been added to the historical file. Please feel free to file a jira and fix it. It is not a data integrity issue, but it should be fixed. If you prefer me to do it, please let me know. Good catch!
Thank you. I'm glad to fix this issue. However, I would like to ask why there is no need to add the delete-family-version marker to the live version file here? Won't there be any correctness issues? The deletion order in the new version behavior itself takes the sequence id into account. Here, the sequence id of the delete-family-version marker is larger than that of the delete-family marker. Won't directly putting the delete-family-version marker into the historical file lead to the loss of deletion information? Thank you.
If we have delete-family and delete-family-version markers with the same timestamp, then the set of cells deleted by the delete-family-version marker is also covered (that is, deleted) by the delete-family marker. The reason we need delete markers in the live version file is to make sure that cells previously stored in live files but deleted later are not returned by regular (not raw) scans that read the latest cell versions. There might be more than one delete marker that deletes a given cell, but we need only one of them to mask the deleted cell during scans. Please note that between the live and historical files we store all delete markers. Raw scans read both live and historical files, whereas regular scans for the latest cell versions read only live files.
Thank you. Yes, under the default version behavior, I'm sure that the delete-family marker can overwrite the delete-family-version marker. However, under the new version behavior, I'm not sure whether such a situation will occur: there is a delete-family marker with a timestamp (ts) of 10 and a sequence ID (seqId) of 100, a Put with a ts of 10 and a seqId of 101, and a delete-family-version marker with a ts of 10 and a seqId of 102. At this time, the Put is invisible. If during compaction, the delete-family-version marker is placed into the historical file, but the delete-family marker and the Put are in the live version file, then there will be a problem. Under the new version behavior, the Put will become visible. I would like to ask how this situation is avoided. During compaction, if the delete-family marker and the delete-family-version marker appear simultaneously, will the Put also definitely appear (so that the Put will definitely be placed into the historical file)? What kind of mechanism guarantees this? Thx!
Thank you for being diligent! I think there is no bug here. Now it is clear to me that there is no inconsistency between the comment and the code, as opposed to what I thought initially. The reason we write this delete-family-version marker is to cover the case you mentioned above. I agree my comment could have been phrased better, as follows:
// This means both the delete-family and delete-family-version markers have the same
// timestamp but the sequence id of delete-family-version marker is higher than that of
// the delete-family marker. With the new version behavior, the delete-family-version
// marker is not deleted (masked) by the delete-family marker with the same timestamp
// as in the case here. That is why we need to write this delete marker to the live version
// file.
Thank you! But I think the handling of the Delete type is similar to this case, yet there the Delete is placed into the historical file. The code is here:
hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileWriter.java (line 432 in 2531f93):
getHistoricalFileWriter().append(cell);
You are right! Let us fix that and rephrase the comments to make them clear. Also please check other cases too. I will review your PR.
Okay, I'm glad to fix this problem. However, I want to confirm that the correct behavior should be adding the Delete operation here to the live version file, right?