-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
engine: set the MVCC timestamp on reads due to historical intents #52358
engine: set the MVCC timestamp on reads due to historical intents #52358
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good, save for one point about impact of this change.
Reviewed 5 of 5 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/storage/pebble_mvcc_scanner.go, line 581 at r1 (raw file):
// was the only case in which a value could have been returned from a read // without an MVCC timestamp. if p.curKey.Timestamp.IsEmpty() {
Instead of adding this case in here, I'd much rather add it in the value, found := p.getFromIntentHistory()
case, and just modify p.curKey.Timestamp
to the meta timestamp. Doing this change here has a much wider impact as it's used in all of the conditionals. It should be fine either way, but it seems safer (both for correctness and performance) to keep the impact as minimal as possible, and the change as close to the RocksDB version as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal)
pkg/storage/pebble_mvcc_scanner.go, line 581 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Instead of adding this case in here, I'd much rather add it in the
value, found := p.getFromIntentHistory()
case, and just modifyp.curKey.Timestamp
to the meta timestamp. Doing this change here has a much wider impact as it's used in all of the conditionals. It should be fine either way, but it seems safer (both for correctness and performance) to keep the impact as minimal as possible, and the change as close to the RocksDB version as possible.
I was worried about mucking with p.curKey.Timestamp
. Should add code to set it back to zero after this call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal)
pkg/storage/pebble_mvcc_scanner.go, line 581 at r1 (raw file):
Previously, ajwerner wrote…
I was worried about mucking with
p.curKey.Timestamp
. Should add code to set it back to zero after this call?
missing an I
between Should
and add
.
9c64cf6
to
0005908
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @itsbilal)
pkg/storage/pebble_mvcc_scanner.go, line 581 at r1 (raw file):
Previously, ajwerner wrote…
missing an
I
betweenShould
andadd
.
It should be safe without having to zero it again. We will eventually call advanceKey
in all cases inside the if ...; found { }
conditional anyway (and if we don't there, we will call it once in scan()
), which will set curTS
to the next (or previous) value's TS. It's really only used for adding to the result set.
The other option is to inline addAndAdvance
in that case and add this part. But that brings back the same issue of duplicated logic that caused us to not inline it in the first place.
pkg/storage/pebble_mvcc_scanner.go, line 434 at r2 (raw file):
// intent was written. However, this was the only case in which a value // could have been returned from a read without an MVCC timestamp. p.curKey.Timestamp = metaTS
Might also be worthwhile throwing in a short point about why this is safe; that, in all cases, we only use curTS
beyond this point to add to the result set before resetting it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @ajwerner and @itsbilal)
0005908
to
934dca7
Compare
Before this commit, we'd return a zero-value MVCC timestamp when reading an intent from the intent history. This was problematic because elsewhere in the code we assume that we will always get a non-zero MVCC timestamp when a read returns a value. This is especially bizarre given that a read of the latest intent will return its write timestamp. The semantics here are such that we'll return the timestamp of the MVCC metadata for the row. I think this is the most reasonable thing to do as that timestamp ought to reflect the timestamp we return when we return the current intent and furthermore is the only timestamp we really have around. We could return the transactions current read or write timestamp but those both seem like worse choices. It's also worth noting that in the case where we need to read a non-zero value, we don't really care what that value is and the fact that we are reading this intent itself is somewhat suspect. That being said, I still think we should make this change in addition to any change made to prevent the code referenced in cockroachdb#49266 from needing it. Fixes cockroachdb#49266. Informs cockroachdb#50102. Release note: None
934dca7
to
55467a8
Compare
TFTR! bors r=itsbilal |
bors r+ |
This PR was included in a batch that was canceled, it will be automatically retried |
Build succeeded: |
Before this commit, we'd return a zero-value MVCC timestamp when reading an
intent from the intent history. This was problematic because elsewhere in the
code we assume that we will always get a non-zero MVCC timestamp when a read
returns a value. This is especially bizarre given that a read of the latest
intent will return its write timestamp.
The semantics here are such that we'll return the timestamp of the MVCC
metadata for the row. I think this is the most reasonable thing to do as
that timestamp ought to reflect the timestamp we return when we return the
current intent and furthermore is the only timestamp we really have around.
We could return the transactions current read or write timestamp but those
both seem like worse choices.
It's also worth noting that in the case where we need to read a non-zero
value, we don't really care what that value is and the fact that we are
reading this intent itself is somewhat suspect. That being said, I still
think we should make this change in addition to any change made to prevent
the code referenced in #49266 from needing it.
Fixes #49266.
Informs #50102.
Release note: None