HBASE-21070 Fix SnapshotFileCache for HBase backed by S3 #209

Merged
merged 1 commit into from
May 6, 2019

Conversation

z-york
Contributor

@z-york z-york commented Apr 30, 2019

SnapshotFileCache depends on getting the last modified time of the
snapshot directory; however, S3 FileSystems do not update the
last modified time of the top 'folder' when objects are added/removed.
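
To illustrate the failure mode, here is a minimal, self-contained sketch. It does not use the HBase or Hadoop APIs: `FakeDir`, `MtimeGatedCache`, and their methods are hypothetical names invented for this illustration. It models a cache that skips its refresh when the directory's modification time is unchanged, and a 'folder' that, like an S3 prefix, keeps the same modification time when children are added.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a snapshot directory. When updatesMtime is false it
// behaves like an object-store prefix: adding children leaves the mtime unchanged.
class FakeDir {
  private final boolean updatesMtime;
  private final List<String> children = new ArrayList<>();
  private long mtime = 1L;

  FakeDir(boolean updatesMtime) {
    this.updatesMtime = updatesMtime;
  }

  void add(String child) {
    children.add(child);
    if (updatesMtime) {
      mtime++;
    }
  }

  long getModificationTime() {
    return mtime;
  }

  List<String> list() {
    return new ArrayList<>(children);
  }
}

// Hypothetical cache that re-lists the directory only when its mtime has changed.
class MtimeGatedCache {
  private final FakeDir dir;
  private final Set<String> cached = new HashSet<>();
  private long lastSeenMtime = -1L;

  MtimeGatedCache(FakeDir dir) {
    this.dir = dir;
  }

  boolean contains(String name) {
    long current = dir.getModificationTime();
    if (current != lastSeenMtime) {
      // The directory looks modified: drop the old view and list it again.
      cached.clear();
      cached.addAll(dir.list());
      lastSeenMtime = current;
    }
    return cached.contains(name);
  }
}
```

With `new FakeDir(true)` a snapshot added after the first lookup is picked up on the next lookup. With `new FakeDir(false)` the cache never notices it, which is the kind of staleness described above.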

@Apache-HBase

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 291 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | hbaseanti | 0 | Patch does not have any anti-patterns. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
| | | | _ master Compile Tests _ |
| +1 | mvninstall | 271 | master passed |
| +1 | compile | 57 | master passed |
| +1 | checkstyle | 71 | master passed |
| +1 | shadedjars | 285 | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 333 | master passed |
| +1 | javadoc | 59 | master passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 233 | the patch passed |
| +1 | compile | 49 | the patch passed |
| +1 | javac | 49 | the patch passed |
| -1 | checkstyle | 65 | hbase-server: The patch generated 1 new + 1 unchanged - 1 fixed = 2 total (was 2) |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedjars | 262 | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 747 | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 226 | the patch passed |
| +1 | javadoc | 31 | the patch passed |
| | | | _ Other Tests _ |
| -1 | unit | 16017 | hbase-server in the patch failed. |
| +1 | asflicense | 33 | The patch does not generate ASF License warnings. |
| | | 19113 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hbase.master.TestAssignmentManagerMetrics |
| | hadoop.hbase.client.TestFromClientSide3 |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/1/artifact/out/Dockerfile |
| GITHUB PR | #209 |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux a83d97447668 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / 70296a2 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.11 |
| checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/1/artifact/out/diff-checkstyle-hbase-server.txt |
| unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/1/artifact/out/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/1/testReport/ |
| Max. process+thread count | 4669 (vs. ulimit of 10000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/1/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

@z-york
Contributor Author

z-york commented May 1, 2019

Updated.

@Apache-HBase

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 265 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | hbaseanti | 0 | Patch does not have any anti-patterns. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
| | | | _ master Compile Tests _ |
| +1 | mvninstall | 263 | master passed |
| +1 | compile | 53 | master passed |
| +1 | checkstyle | 68 | master passed |
| +1 | shadedjars | 264 | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 185 | master passed |
| +1 | javadoc | 33 | master passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 240 | the patch passed |
| +1 | compile | 52 | the patch passed |
| +1 | javac | 52 | the patch passed |
| -1 | checkstyle | 72 | hbase-server: The patch generated 2 new + 1 unchanged - 1 fixed = 3 total (was 2) |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedjars | 254 | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 516 | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 222 | the patch passed |
| +1 | javadoc | 30 | the patch passed |
| | | | _ Other Tests _ |
| -1 | unit | 16425 | hbase-server in the patch failed. |
| +1 | asflicense | 28 | The patch does not generate ASF License warnings. |
| | | 19043 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hbase.quotas.TestSpaceQuotas |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/2/artifact/out/Dockerfile |
| GITHUB PR | #209 |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 8ce1b8757b3e 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / 4379fe4 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.11 |
| checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/2/artifact/out/diff-checkstyle-hbase-server.txt |
| unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/2/artifact/out/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/2/testReport/ |
| Max. process+thread count | 5055 (vs. ulimit of 10000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/2/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

Contributor

@apurtell apurtell left a comment

+1

@Apache9
Contributor

Apache9 commented May 3, 2019

A simple question: does the S3 FileSystem support atomic rename? If not, what happens if we list and read the snapshot directories while a rename is in progress?

@z-york
Contributor Author

z-york commented May 3, 2019

@Apache9 No, S3 FileSystems don't support a true atomic rename (at least none that I know of :) ).

I have changed my mind and now prefer removing the modification time check, since the performance gain from it is marginal. I will remove the SnapshotFileCache changes but leave the test so we don't regress.

SnapshotFileCache depends on getting the last modified time of the
snapshot directory; however, S3 FileSystems do not update the
last modified time of the top 'folder' when objects are added/removed.
This commit adds a test for the previously fixed SnapshotFileCache.
@z-york
Contributor Author

z-york commented May 3, 2019

Updated to only include the test. Verified that it fails without this patch and passes with it.

@Apache-HBase

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 246 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | hbaseanti | 0 | Patch does not have any anti-patterns. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
| | | | _ master Compile Tests _ |
| +1 | mvninstall | 269 | master passed |
| +1 | compile | 51 | master passed |
| +1 | checkstyle | 67 | master passed |
| +1 | shadedjars | 258 | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 170 | master passed |
| +1 | javadoc | 32 | master passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 234 | the patch passed |
| +1 | compile | 55 | the patch passed |
| +1 | javac | 55 | the patch passed |
| -1 | checkstyle | 69 | hbase-server: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedjars | 269 | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 521 | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 217 | the patch passed |
| +1 | javadoc | 32 | the patch passed |
| | | | _ Other Tests _ |
| -1 | unit | 17218 | hbase-server in the patch failed. |
| +1 | asflicense | 29 | The patch does not generate ASF License warnings. |
| | | 19819 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hbase.regionserver.TestRegionReplicaFailover |
| | hadoop.hbase.client.TestFromClientSideWithCoprocessor |
| | hadoop.hbase.client.TestSnapshotDFSTemporaryDirectory |
| | hadoop.hbase.client.TestFromClientSide |
| | hadoop.hbase.util.TestFromClientSide3WoUnsafe |
| | hadoop.hbase.master.TestAssignmentManagerMetrics |
| | hadoop.hbase.client.TestFromClientSide3 |
| | hadoop.hbase.client.TestAdmin1 |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/3/artifact/out/Dockerfile |
| GITHUB PR | #209 |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 8cd95c908978 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | master / 68f14c1 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.11 |
| checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/3/artifact/out/diff-checkstyle-hbase-server.txt |
| unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/3/artifact/out/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/3/testReport/ |
| Max. process+thread count | 4921 (vs. ulimit of 10000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-209/3/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

@Apache9
Contributor

Apache9 commented May 4, 2019

@z-york If S3 does not support atomic rename, then I think there are problems here... Could we see an empty directory when refreshing the cache?
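
The concern can be made concrete with a small sketch. It assumes, purely for illustration, that a "rename" on an object store is carried out as a per-object copy followed by a delete; real S3 connectors differ in the details. `FakeObjectStore` and its methods are hypothetical names, not Hadoop or HBase APIs. The `observer` callback runs after each copied object and stands in for a concurrent reader listing the destination mid-rename.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical flat key/value store standing in for an object store.
class FakeObjectStore {
  private final TreeMap<String, String> objects = new TreeMap<>();

  void put(String key, String value) {
    objects.put(key, value);
  }

  // Returns a snapshot of all keys that start with the given prefix.
  List<String> list(String prefix) {
    List<String> keys = new ArrayList<>();
    for (String key : objects.keySet()) {
      if (key.startsWith(prefix)) {
        keys.add(key);
      }
    }
    return keys;
  }

  // Non-atomic "rename": copy each object to the new prefix, then delete the
  // source object. The observer runs after every copy, before the delete.
  void rename(String srcPrefix, String dstPrefix, Runnable observer) {
    for (String key : list(srcPrefix)) {
      String dstKey = dstPrefix + key.substring(srcPrefix.length());
      objects.put(dstKey, objects.get(key));
      observer.run();
      objects.remove(key);
    }
  }
}
```

If two objects are renamed this way, an observer that lists the destination prefix sees one object after the first copy and two after the second, so a reader can catch the destination directory in a partially populated state.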

@Apache9
Contributor

Apache9 commented May 4, 2019

Anyway, I think the test here is good. Maybe we can commit the test first and open a new issue to address the atomic rename related problems. In general, I think we'd better not rely on any non-trivial behavior of the filesystem, so that we can build HBase on any possible filesystem. Of course, the WAL file system is an exception.

@z-york
Contributor Author

z-york commented May 6, 2019

@Apache9 I took a look at the code base and do not think there is an issue with atomic rename here. The problematic case would be that the snapshotDir exists but the snapshot info hasn't been copied over yet. In that case reading the snapshot description fails and throws here: https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotDescriptionUtils.java#L361

The cache refresh is called from three places: the RefreshCacheTask (which clears the cache on IOException), the refresh trigger used for testing, and getUnreferencedFiles. If a CorruptedSnapshotException is thrown from getUnreferencedFiles, no files are returned for deletion, and they will be picked up the next time the cleaner runs: https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotHFileCleaner.java#L70

So I think it is safe.
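
The "delete nothing on failure" behavior described above can be sketched as follows. This is an illustration of the pattern only, not the actual SnapshotHFileCleaner code: `ReferenceLookup` and `ConservativeCleaner` are hypothetical names, and a plain `IOException` stands in for the snapshot-specific exception.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

// Hypothetical lookup that may fail while a snapshot directory is half-written.
interface ReferenceLookup {
  List<String> unreferenced(List<String> candidates) throws IOException;
}

// Hypothetical cleaner that treats any lookup failure as "nothing is deletable".
class ConservativeCleaner {
  private final ReferenceLookup lookup;

  ConservativeCleaner(ReferenceLookup lookup) {
    this.lookup = lookup;
  }

  // Returns the files that are safe to delete. On a lookup failure it returns
  // an empty list, so the same files are simply reconsidered on the next run.
  List<String> deletable(List<String> candidates) {
    try {
      return lookup.unreferenced(candidates);
    } catch (IOException e) {
      return Collections.emptyList();
    }
  }
}
```

The design choice is to fail closed: a transiently unreadable snapshot delays cleanup by one run but never causes a referenced file to be deleted.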

@z-york
Contributor Author

z-york commented May 6, 2019

Anyways, I will merge this for now. Feel free to open a JIRA and tag me if you feel there are other cases this doesn't cover.

@z-york z-york merged commit 67c937f into apache:master May 6, 2019
@z-york z-york deleted the snapshot branch May 6, 2019 18:52
asfgit pushed a commit that referenced this pull request May 6, 2019
asfgit pushed a commit that referenced this pull request May 7, 2019
infraio pushed a commit to infraio/hbase that referenced this pull request Aug 17, 2020
…ache#209)