-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-23355 Bypass the prefetch operation if HFiles are generated through flush or compaction #909
Conversation
through flush or compaction
💔 -1 overall
This message was automatically generated. |
@@ -107,6 +107,8 @@ | |||
|
|||
private RegionCoprocessorHost coprocessorHost; | |||
|
|||
private boolean prefetchOnOpen = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trick is here. By default it will be false and only for region open and the bulk load you will make it true. Good. LGTM.
|
||
public ReaderContext(Path filePath, FSDataInputStreamWrapper fsdis, long fileSize, | ||
HFileSystem hfs, boolean primaryReplicaReader, ReaderType type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just because the PreadREader uses the Context you are adding it here and StorefileInfo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, use it to determin whether the prefetch operation is needed.
Thanks for your review @ramkrish86
Patch looks good. Where is the new addition being exploited though? I don't see it in here. Also, while we have tests to prove the new addtions work, what about the original supposition by Anoop -- double cache. Are we NOT double caching after this fix? Thanks. |
We declared a prefetchOnOpen variable in ReaderContext, it’s default value is false (means Prefetch is not performed by default), But when region opens(code in HStore#openStoreFiles) or bulkload happend, we will modify the prefetchOnOpen value according to CacheConf#shouldPrefetchOnOpen()
The double cache what I understand is that we may cache the same block twice through cacheOnWrite and prefetchOnFlush(Pardon the name) or prefetchOnCompaction, |
As I said in the other JIRA, already double cache was not happening at the code level - means an already cached block is never cached by the HFileReaderImpl#readBlock() call. But this patch by design will avoid the caching to happen during compaction and flushes. |
How to progress here? We do prefetch on open but not anywhere else which seems good. You like this patch @ramkrish86 ? |
So after this patch if prefetch config is ON, that will be honored at region open time alone. And also for bulk loaded files. correct? Say the cache on write (flush) and cache on compaction are turned off, we will NOT do eager caching at all? Sorry its been some time since I see this so totally forgot. |
another thing. Not related to this item directly. when we open a replica region, that will also open the HFiles there and will do the prefetch. Should we not do? Anyways another topic of discuss and so another jira. cc @saintstack |
yes, that is |
🎊 +1 overall
This message was automatically generated. |
This one seems good for block cache efficiency. Can we get a refresh on the patch, and maybe a PR for branch-2? Reviewers are happy? |
Any update @chenxu14 This patch is almost there (I closed others of yours just now that have not had updates... can reopen if you around). |
No description provided.