HBASE-23355 Bypass the prefetch operation if HFiles are generated through flush or compaction #909

Closed
wants to merge 1 commit into from

Conversation

chenxu14
Contributor

@chenxu14 chenxu14 commented Dec 6, 2019

No description provided.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 8s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ master Compile Tests _
+1 💚 mvninstall 5m 29s master passed
+1 💚 compile 0m 58s master passed
+1 💚 checkstyle 1m 21s master passed
+1 💚 shadedjars 4m 37s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 45s master passed
+0 🆗 spotbugs 4m 12s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 4m 11s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 4m 57s the patch passed
+1 💚 compile 0m 55s the patch passed
+1 💚 javac 0m 55s the patch passed
+1 💚 checkstyle 1m 19s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedjars 4m 36s patch has no errors when building our shaded downstream artifacts.
+1 💚 hadoopcheck 15m 50s Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2.
+1 💚 javadoc 0m 35s the patch passed
+1 💚 findbugs 4m 16s the patch passed
_ Other Tests _
-1 ❌ unit 269m 40s hbase-server in the patch failed.
+1 💚 asflicense 0m 33s The patch does not generate ASF License warnings.
327m 37s
Reason Tests
Failed junit tests hadoop.hbase.client.TestAdmin2
Subsystem Report/Notes
Docker Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-909/1/artifact/out/Dockerfile
GITHUB PR #909
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile
uname Linux 5c8623d7c56a 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-909/out/precommit/personality/provided.sh
git revision master / 9c82a65
Default Java 1.8.0_181
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-909/1/artifact/out/patch-unit-hbase-server.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-909/1/testReport/
Max. process+thread count 5040 (vs. ulimit of 10000)
modules C: hbase-server U: hbase-server
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-909/1/console
versions git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@@ -107,6 +107,8 @@

private RegionCoprocessorHost coprocessorHost;

private boolean prefetchOnOpen = false;
Contributor

The trick is here. By default it will be false, and you set it to true only for region open and bulk load. Good. LGTM.


public ReaderContext(Path filePath, FSDataInputStreamWrapper fsdis, long fileSize,
HFileSystem hfs, boolean primaryReplicaReader, ReaderType type) {
Contributor

Just because the PreadReader uses the Context, you are adding it here and in StoreFileInfo?

Contributor Author

Yes, it is used to determine whether the prefetch operation is needed.
Thanks for your review, @ramkrish86.

@saintstack
Contributor

Patch looks good. Where is the new addition being exploited though? I don't see it in here. Also, while we have tests to prove the new additions work, what about the original supposition by Anoop -- double cache. Are we NOT double caching after this fix?

Thanks.

@chenxu14
Contributor Author

> Where is the new addition being exploited though?

We declared a prefetchOnOpen variable in ReaderContext. Its default value is false, meaning prefetch is not performed by default. When a region opens (see HStore#openStoreFiles) or a bulk load happens, we set prefetchOnOpen according to CacheConfig#shouldPrefetchOnOpen().

> what about the original supposition by Anoop -- double cache. Are we NOT double caching after this fix?

As I understand it, double caching means we may cache the same block twice: once through cacheOnWrite and again through prefetchOnFlush (pardon the name) or prefetchOnCompaction.
So here we skip the flush and compaction cases when doing prefetch.
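
A minimal sketch of the decision described in this comment, for readers who want to see it in code. It is not the patch itself: the class, enum, and method names below (PrefetchDecisionSketch, OpenReason, Context, contextFor) are hypothetical stand-ins, and the boolean argument stands in for whatever the cache configuration reports for prefetch-on-open.

```java
import java.util.EnumSet;

// Illustrative only: all names here are hypothetical, not HBase classes.
final class PrefetchDecisionSketch {

  /** Why a store file reader is being opened. */
  enum OpenReason { REGION_OPEN, BULK_LOAD, FLUSH, COMPACTION }

  /** Stand-in for the reader context: the flag defaults to false. */
  static final class Context {
    private boolean prefetchOnOpen = false;

    boolean isPrefetchOnOpen() {
      return prefetchOnOpen;
    }

    void setPrefetchOnOpen(boolean value) {
      this.prefetchOnOpen = value;
    }
  }

  /** The only open paths that are allowed to honor the prefetch setting. */
  private static final EnumSet<OpenReason> PREFETCH_ELIGIBLE =
      EnumSet.of(OpenReason.REGION_OPEN, OpenReason.BULK_LOAD);

  /**
   * Builds a context. The configured prefetch setting is copied in only for
   * region open and bulk load; flush and compaction keep the default (false).
   */
  static Context contextFor(OpenReason reason, boolean prefetchConfigured) {
    Context ctx = new Context();
    if (PREFETCH_ELIGIBLE.contains(reason)) {
      ctx.setPrefetchOnOpen(prefetchConfigured);
    }
    return ctx;
  }

  public static void main(String[] args) {
    // Print the resulting flag for each open path with prefetch configured on.
    for (OpenReason reason : OpenReason.values()) {
      System.out.println(reason + " -> " + contextFor(reason, true).isPrefetchOnOpen());
    }
  }
}
```

Defaulting the flag to false means any new code path that opens a reader stays out of prefetch unless it opts in explicitly, which is the property the reviewers call out above.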

@ramkrish86
Contributor

As I said in the other JIRA, double caching was already not happening at the code level: an already cached block is never cached again by the HFileReaderImpl#readBlock() call. But this patch, by design, prevents that caching from happening during compactions and flushes.
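
The "never cached twice" point can be illustrated with a generic check-then-load cache. This is only a sketch of the pattern, not the HFileReaderImpl code: ReadBlockSketch, its string keys, and its counters are hypothetical, and the class is deliberately single-threaded.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a toy block cache, not HBase's BlockCache or reader.
final class ReadBlockSketch {
  private final Map<String, byte[]> cache = new HashMap<>();
  private int diskReads = 0;
  private int cacheInserts = 0;

  /** Simulates cache-on-write: the writer puts the block in the cache itself. */
  void cacheOnWrite(String key, byte[] block) {
    cache.put(key, block);
    cacheInserts++;
  }

  /**
   * Returns the block for the key. A cache hit returns immediately, so a block
   * that is already cached is neither re-read nor inserted a second time.
   */
  byte[] readBlock(String key, boolean cacheBlock, Function<String, byte[]> loader) {
    byte[] cached = cache.get(key);
    if (cached != null) {
      return cached;
    }
    diskReads++;
    byte[] block = loader.apply(key);
    if (cacheBlock) {
      cache.put(key, block);
      cacheInserts++;
    }
    return block;
  }

  int diskReads() {
    return diskReads;
  }

  int cacheInserts() {
    return cacheInserts;
  }

  public static void main(String[] args) {
    ReadBlockSketch reader = new ReadBlockSketch();
    reader.cacheOnWrite("file1@0", new byte[] {1});
    // A later prefetch-style read of the same block hits the cache.
    reader.readBlock("file1@0", true, key -> new byte[] {9});
    System.out.println("inserts=" + reader.cacheInserts() + " diskReads=" + reader.diskReads());
  }
}
```

Even with this guard in place, a prefetch pass over a freshly flushed or compacted file still has to walk every block and look it up, which is the work the patch skips by not starting prefetch on those paths.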

@saintstack
Contributor

How do we progress here? We do prefetch on open but not anywhere else, which seems good. Do you like this patch, @ramkrish86?

@anoopsjohn
Contributor

So after this patch, if the prefetch config is ON, it will be honored only at region open time and for bulk loaded files. Correct? Say cache on write (flush) and cache on compaction are turned off: will we then NOT do eager caching at all? Sorry, it's been some time since I looked at this, so I have totally forgotten.

@anoopsjohn
Contributor

Another thing, not directly related to this item: when we open a replica region, that will also open the HFiles there and do the prefetch. Should we skip it in that case? Anyway, that is a separate topic for discussion, and so another JIRA. cc @saintstack

@chenxu14
Contributor Author

chenxu14 commented Jan 6, 2020

> So after this patch if prefetch config is ON, that will be honored at region open time alone. And also for bulk loaded files. correct?

Yes, that is correct.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 35s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ master Compile Tests _
+1 💚 mvninstall 8m 19s master passed
+1 💚 compile 1m 27s master passed
+1 💚 checkstyle 1m 27s master passed
+1 💚 shadedjars 5m 11s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 37s master passed
+0 🆗 spotbugs 4m 39s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 4m 38s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 5m 26s the patch passed
+1 💚 compile 1m 0s the patch passed
+1 💚 javac 1m 0s the patch passed
+1 💚 checkstyle 1m 14s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedjars 5m 4s patch has no errors when building our shaded downstream artifacts.
+1 💚 hadoopcheck 17m 24s Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2.
+1 💚 javadoc 0m 36s the patch passed
+1 💚 findbugs 4m 46s the patch passed
_ Other Tests _
+1 💚 unit 98m 46s hbase-server in the patch passed.
+1 💚 asflicense 0m 29s The patch does not generate ASF License warnings.
163m 48s
Subsystem Report/Notes
Docker Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-909/1/artifact/out/Dockerfile
GITHUB PR #909
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile
uname Linux bca9b2143a46 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-909/out/precommit/personality/provided.sh
git revision master / 5b4545d
Default Java 1.8.0_181
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-909/1/testReport/
Max. process+thread count 6576 (vs. ulimit of 10000)
modules C: hbase-server U: hbase-server
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-909/1/console
versions git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@ndimiduk
Member

This one seems good for block cache efficiency. Can we get a refresh on the patch, and maybe a PR for branch-2? Reviewers are happy?

@saintstack
Contributor

Any update, @chenxu14? This patch is almost there (I closed others of yours just now that have not had updates... can reopen if you are around).

@chenxu14 chenxu14 closed this Mar 29, 2021