-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-23157 WAL unflushed seqId tracking may wrong when Durability.AS… #762
Conversation
@saintstack @binlijin PTAL. Thanks. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea. Worried about incrementing sequenceids outside of mvcc.
// then it will be recorded as the lowestUnflushedSeqId by the above update method, which is | ||
// less than the current maxFlushedSeqId. And if next time we only flush the family with this | ||
// unusual lowestUnflushedSeqId, the maxFlushedSeqId will go backwards. | ||
// This is an unexpected behavior so we should fix it, otherwise it may cause unexpected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How we going to fix it? Can't we drive through the flush marker before completing the flush?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we need to do a wal sync under the updateLock.writeLock, which will impact the performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... maybe a slight hiccup for the async wal case -- maybe, given sync'ing to HDFS is such an erratic affair -- but we will also be more 'correct' having pushed out all in the ring buffer ahead of this flush sync w/o having to rewrite their sequenceid.
// This is an unexpected behavior so we should fix it, otherwise it may cause unexpected | ||
// behavior in other area and then cause data loss maybe. | ||
// The solution here is a bit hack but fine. Just replace the lowestUnflushedSeqId with | ||
// maxFlushedSeqId + 1 if it is lesser. Durability.ASYNC_WAL means we do not care data loss for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we +1 it and not just let it be the maxFlushedSeqID? The maxFlushedSeqId comes from mvcc. +1'ing outside of mvcc management could have an interesting side-effect not seen currently where next edit given out by mvcc might clash w/ this +1'd record.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the comment below, I explain why +1 is safe. The way we get the maxFlushedSeqId is mvcc -1, so here we just add it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I read that. Seems fragile. Logic is spread about being here and up in caller. Messing w/ sequenceid w/o going via mvcc will bite us later I think.
We have to do this math?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is all about the meaning of these sequence ids. The maxFlushedSeqId means the maximum flush id that have been flushed, so it must be the current lowest sequence id in memstore - 1, otherwise there will be data loss for the edge record. And here we store the current lowest sequence id in memstore so we have to +1 to get the maxFlushedSeqId.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the edits on the ring buffer that are outstanding meant to be part of this flush or are they supposed to be counted in the next flush? Are you adding the +1 so they are not part of the current flush?
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceIdAccounting.java
Show resolved
Hide resolved
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/AbstractTestFSWAL.java
Show resolved
Hide resolved
🎊 +1 overall
This message was automatically generated. |
LGTM |
💔 -1 overall
This message was automatically generated. |
Oh, this one is still pending? Any other concerns? @saintstack |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took another look after doing a bit of study. I could go for this 'hack' if we removed the +1. I could work on a follow-on that does the suggested drive through of the sync edit if that'd help.
// then it will be recorded as the lowestUnflushedSeqId by the above update method, which is | ||
// less than the current maxFlushedSeqId. And if next time we only flush the family with this | ||
// unusual lowestUnflushedSeqId, the maxFlushedSeqId will go backwards. | ||
// This is an unexpected behavior so we should fix it, otherwise it may cause unexpected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... maybe a slight hiccup for the async wal case -- maybe, given sync'ing to HDFS is such an erratic affair -- but we will also be more 'correct' having pushed out all in the ring buffer ahead of this flush sync w/o having to rewrite their sequenceid.
// means we have flushed all the stores so the seq id for actual data should be at least plus 1. | ||
// And if we do not flush all the stores, then the maxFlushedSeqId is calculated by | ||
// lowestUnflushedSeqId - 1, so here let's plus the 1 back. | ||
Long wrappedSeqId = Long.valueOf(maxFlushedSeqId + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The +1 makes me wary. Just add these edits w/ maxFlushedSeqId? Wouldn't that be safer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I explained why we need to +1 here, so what's your real problem? If you do not want to +1, you need to change bunch of comments and field names...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
Your comment/explanation here is good. It is also where it should be, in the class named SequenceIdAccounting. Though hard to follow, it should be possible to read this one class to find how sequnceid accounting is done in the system including tricks to keep the system working when the likes of ASYNC_WAL is enabled.
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/AbstractTestFSWAL.java
Show resolved
Hide resolved
@saintstack Let's finish this one? |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
// means we have flushed all the stores so the seq id for actual data should be at least plus 1. | ||
// And if we do not flush all the stores, then the maxFlushedSeqId is calculated by | ||
// lowestUnflushedSeqId - 1, so here let's plus the 1 back. | ||
Long wrappedSeqId = Long.valueOf(maxFlushedSeqId + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
Your comment/explanation here is good. It is also where it should be, in the class named SequenceIdAccounting. Though hard to follow, it should be possible to read this one class to find how sequnceid accounting is done in the system including tricks to keep the system working when the likes of ASYNC_WAL is enabled.
…YNC_WAL is used (#762) Signed-off-by: stack <stack@apache.org>
…YNC_WAL is used (#762) Signed-off-by: stack <stack@apache.org>
…YNC_WAL is used (#762) Signed-off-by: stack <stack@apache.org>
…YNC_WAL is used (apache#762) Signed-off-by: stack <stack@apache.org>
…YNC_WAL is used (apache#762) Signed-off-by: stack <stack@apache.org> (cherry picked from commit 1cbe383) Change-Id: Ibdf7ab100974408b97d745d35fb55e59f2a6b82c
…YNC_WAL is used