-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-26969:Eliminate MOB renames when SFT is enabled #4418
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have gone through a first review round, got a few nits, plus some clarification questions on some of the changes.
hbase-server/src/main/java/org/apache/hadoop/hbase/mob/MobFileCleanerChore.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/mob/MobFileCleanerChore.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/mob/MobFileCleanerChore.java
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/mob/RSMobFileCleanerChore.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/mob/RSMobFileCleanerChore.java
Outdated
Show resolved
Hide resolved
@@ -1867,6 +1870,10 @@ executorService.new ExecutorConfig().setExecutorType(ExecutorType.RS_SNAPSHOT_OP | |||
choreService.scheduleChore(brokenStoreFileCleaner); | |||
} | |||
|
|||
if (this.rsMobFileCleanerChore != null) { | |||
choreService.scheduleChore(rsMobFileCleanerChore); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now we will always have this running at RSes, plus the old MobFileCleanerChore running on Master?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new MobFileCleanerChore is not the same as the old one. The main cleanup is done by the RSMobFileCleanerChore running on each RS, cleaning up mob files created by regions hosted on that RS. This should cover most of the usecases.
But I still had to run a restricted version of the old MobFileCleanerChore to clean up after mobs made by now-archived regions, because to make sure those are not referenced anymore every hfile have to be checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I still had to run a restricted version of the old MobFileCleanerChore to clean up after mobs made by now-archived regions, because to make sure those are not referenced anymore every hfile have to be checked.
So, MobFileCleanerChore seems to be checking all existing hfiles. Wouldn't that supersede the RSMobFileCleanerChore work? And what if both run at the same time, could both try to archive a same subset of files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me clarify the differences between the 2 cleaners.
RSMobFileCleanerChore:
- runs on RS to have access to currently written files and active storefile list
- can only archive mob files created by regions hosted on the current RS
- only reads hfiles belonging to regions hosted by the current RS when looking for references
This allow it to do the majority of the cleanup necessary as efficiently as possible
MobFileCleanerChore:
- runs on Master
- can only archive mob files created by archived regions (regions no longer existing in the /data folder). Thanks to this these mob files can no longer be "currently written" so we do not need the data only available on the RS
- reads every single hfile in /data with a mob enabled CF. This is necessary, because this is the only way if any of these mobs has active references
So yes, there is an overlap, the same hfiles could be read by both cleansers but the a mob file could be archived with either one of them. The cleaner on the master is wasteful and not especially elegant, but thanks to the lack of centralized mob tracking we do not have a better way of collecting the references.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can only archive mob files created by archived regions (regions no longer existing in the /data folder). Thanks to this these mob files can no longer be "currently written" so we do not need the data only available on the RS
Got it.
The cleaner on the master is wasteful and not especially elegant, but thanks to the lack of centralized mob tracking we do not have a better way of collecting the references.
Can we trust META here to list SPLIT/MERGE parents? We could then iterate through these list and check for mob files for these regions. Just brainstorming here, ain't sure if the region dir is left lingering in the FS when SPLIT/MERGE parents are cleared from META (in which case, would make such alternative unfeasible).
💔 -1 overall
This message was automatically generated. |
} | ||
//collecting files, MOB included currently being written | ||
regionMobs.addAll( | ||
store.getStoreFilesBeingWritten().stream().map(path -> path.getName()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this here is because we need to add all currently being written files because we have no means to know if those have mob refs or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is here because mob compaction/flush runs as part of non-mob store compaction/flush that created it. And because of how these work the most convenient way to keep track of compacting/flushing mob files was to add them to the list of compacting/flushing non-mob files. So this list has everything. But only part of the list is actually interesting for us at any given time.
LOG.debug("Found: {} active mob refs for table={}", | ||
referencedMOBs.values().stream().map(inner -> inner.values()) | ||
.flatMap(lists -> lists.stream()).mapToInt(lists -> lists.size()).sum(), | ||
htd.getTableName().getNameAsString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: check if debug is enabled, otherwise we'll always be computing the debug param.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a very nice catch, thanks!
referencedMOBs.values().stream().forEach(innerMap -> innerMap.values().stream() | ||
.forEach(mobFileList -> mobFileList.stream().forEach(LOG::trace))); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above :)
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Have been going back and forth with @BukrosSzabolcs on this, discussing what would be alternatives for a more efficient solution. Unfortunately, our conclusion is that unless we can implement some sort of "global" registry of referenced mob files, there's not much left we can do. Such refactoring would be more suitable under a whole MOB redesign. So for now, I think we are good to go with this PR, which main objective is to avoid usage of temp dirs/renames when FILE SFT is enabled, making MOB behaviour aligned with the conventional regions behaviour. Regarding the latest UT failures, these should actually be addressed by HBASE-27017 commit. I'm good to approve this once we rebase it and get a green UT run. Thanks for being super patient, @BukrosSzabolcs ! |
Get rid of renames in the MOB compaction and flush. Also refactor the MOB cleaner to be compatible with FileBased StoreFileTracker
improve logging
5b1267f
to
79e15bf
Compare
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
recheck |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
recheck |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
The reported test failures seems unrelated. |
HBASE-25970:MOB data loss - incorrect concatenation of MOB_FILE_REFS Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Get rid of renames in the MOB compaction and flush. Also refactor the
MOB cleaner to be compatible with FileBased StoreFileTracker