-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-21070 Fix SnapshotFileCache for HBase backed by S3 #209
Conversation
💔 -1 overall
This message was automatically generated. |
hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotFileCache.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotFileCache.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotFileCache.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotFileCache.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotFileCache.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotFileCache.java
Outdated
Show resolved
Hide resolved
Updated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
A simple question here, does S3 FileSystem support atomic rename? If not, what if we list and read the snapshot directories during a rename processing? |
@Apache9 No S3 Filesystems don't support a true atomic rename (at least none that I know of :) ). I think I have changed my mind and prefer removing the modification time since I think the performance gain is marginal. I think I will remove SnapshotFileCache changes, but leave the test so we don't regress. |
SnapshotFileCache depends on getting the last modified time of the snapshot directory, however, S3 FileSystem's do not update the last modified time of the top 'folder' when objects are added/removed. This commit adds a test for the previously fixed SnapshotFileCache.
Updated to only include the test. Verified that it fails without this patch and passes with it. |
💔 -1 overall
This message was automatically generated. |
@z-york If S3 does not support atomic rename, then I think there are problems here... Maybe we could see an empty directory when refreshing cache? |
Anyway, I think the test here is good. Maybe we can commit the test first, and open new issue to address the atomic rename related problems. In general, I think we'd better not rely on any non-trivial behavior of filesystem so we can build HBase on any possible filesystems. Of course the WAL file system is an exception. |
@Apache9 I took a look at the code base and do not think there is an issue with atomic rename here. The issue would be: The snapshotDir exists, but the snapshot info hasn't been copied over which will fail here: https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotDescriptionUtils.java#L361 which will throw. refresh cache is called by the RefreshCacheTask (which will clear on IOException), trigger refresh for testing, and getUnreferencedFiles. GetUnreferencedFiles will return no files for deletion if a corruptSnapshotException is thrown and it will be picked up the next time it is ran: https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotHFileCleaner.java#L70 So I think it is safe. |
Anyways, I will merge this for now. Feel free to open a JIRA and tag me if you feel there are other cases this doesn't cover. |
SnapshotFileCache depends on getting the last modified time of the snapshot directory, however, S3 FileSystem's do not update the last modified time of the top 'folder' when objects are added/removed. This commit adds a test for the previously fixed SnapshotFileCache.
SnapshotFileCache depends on getting the last modified time of the snapshot directory, however, S3 FileSystem's do not update the last modified time of the top 'folder' when objects are added/removed. This commit adds a test for the previously fixed SnapshotFileCache.
…ache#209) SnapshotFileCache depends on getting the last modified time of the snapshot directory, however, S3 FileSystem's do not update the last modified time of the top 'folder' when objects are added/removed. This commit adds a test for the previously fixed SnapshotFileCache.
SnapshotFileCache depends on getting the last modified time of the
snapshot directory, however, S3 FileSystem's do not update the
last modified time of the top 'folder' when objects are added/removed.