Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sstable: fix sstable.Writer key order checks for nil keys #1135

Merged
merged 1 commit into from
May 11, 2021

Conversation

petermattis
Copy link
Collaborator

Fix the sstable.Writer.addPoint key order check which was incorrectly
firing when a nil user key was added to the DB. The problem was that
comparing the zero value of w.meta.LargestPoint vs a nil user-key was
simply invalid. Instead, we need to avoid this check when a point record
has never been added to the sstable, similar to what is done in
sstable.Writer.addTombstone.

Also fix the setting of w.meta.SmallestPoint to ensure that
w.meta.SmallestPoint.UserKey is always non-nil after a point record
has been added. This is necessary for WriterMetadata.Smallest to work
correctly so that the sstable bounds can be set correctly if the
smallest record in an sstable as a nil or zero-length user key.

Fixes #1133

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@petermattis petermattis requested a review from jbowens May 10, 2021 12:51
@petermattis
Copy link
Collaborator Author

Note that this doesn't affect CRDB because we never add nil or zero-length user keys.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r1.
Reviewable status: 1 of 2 files reviewed, all discussions resolved (waiting on @jbowens)

@sumeerbhola
Copy link
Collaborator

sumeerbhola left a comment

very odd. The actual comment seems to have been lost.

@petermattis
Copy link
Collaborator Author

very odd. The actual comment seems to have been lost.

I had that happen to me on Reviewable the other day too.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 2 files reviewed, 1 unresolved discussion (waiting on @jbowens and @petermattis)


sstable/writer.go, line 223 at r1 (raw file):

func (w *Writer) addPoint(key InternalKey, value []byte) error {
	if !w.disableKeyOrderChecks && w.meta.LargestPoint.UserKey != nil {

My understanding is that we are not distinguishing between InternalKey.UserKey values that are nil and the zero length slice, because we can't make that distinction once data is persisted.
Would it be simpler to do the following at the start of this function, instead of doing nil checks later:

if key.UserKey == nil {
  key.UserKey = zeroLengthSlice
}

Copy link
Collaborator Author

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 2 files reviewed, 1 unresolved discussion (waiting on @jbowens and @sumeerbhola)


sstable/writer.go, line 223 at r1 (raw file):

My understanding is that we are not distinguishing between InternalKey.UserKey values that are nil and the zero length slice, because we can't make that distinction once data is persisted.

That is correct at the pebble API level, but we have been using UserKey == nil in the sstable code to distinguish between unset and set. Not using the difference would be a significantly larger lift (plumbing through booleans about whether a key has ever been set). Do you think it is worth it?

Would it be simpler to do the following at the start of this function, instead of doing nil checks later:

And then do key.Clone(). Interestingly, (InternalKey{UserKey: []byte{}}).Clone() returns a result where InternalKey.UserKey == nil. That was mildly surprising to me. Seems like append([]byte(nil), []byte{}) returns nil.

@hiqsociety
Copy link

since this is for pebble, is it possible to do what's best at the level of pebble? afterall, speed efficiency is what it's aiming for etc.
question : which is more important? to be compatible / compliant with CRDB api layer or for pebble to be complete standalone?

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 2 files reviewed, 1 unresolved discussion (waiting on @jbowens and @petermattis)


sstable/writer.go, line 223 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

My understanding is that we are not distinguishing between InternalKey.UserKey values that are nil and the zero length slice, because we can't make that distinction once data is persisted.

That is correct at the pebble API level, but we have been using UserKey == nil in the sstable code to distinguish between unset and set. Not using the difference would be a significantly larger lift (plumbing through booleans about whether a key has ever been set). Do you think it is worth it?

Would it be simpler to do the following at the start of this function, instead of doing nil checks later:

And then do key.Clone(). Interestingly, (InternalKey{UserKey: []byte{}}).Clone() returns a result where InternalKey.UserKey == nil. That was mildly surprising to me. Seems like append([]byte(nil), []byte{}) returns nil.

I took another look here and what I said earlier was incomplete.

  • I understand the use of {Largest,Smalles}PointUserKey comparisons with nil in some of these if-blocks which are confirming that this is the first point added. I am not suggesting changing that.
  • Related to the previous bullet I did "git grep" with "UserKey == nil" and with "UserKey != nil" and there was nothing surprising there.
  • It is the
      	if w.meta.SmallestPoint.UserKey == nil {
      		w.meta.SmallestPoint.UserKey = make([]byte, 0)
      	}
    
buried deeper in the code here that is bothering me since it seems less principled that ensuring at the start of `addPoint` that we change a nil UserKey to a non-nil empty slice. The Clone behavior is surprising, but I think we can fix that by changing `if k.UserKey == nil` to `if len(k.UserKey) == 0`.


<!-- Sent from Reviewable.io -->

Fix the `sstable.Writer.addPoint` key order check which was incorrectly
firing when a nil user key was added to the DB. The problem was that
comparing the zero value of `w.meta.LargestPoint` vs a nil user-key was
simply invalid. Instead, we need to avoid this check when a point record
has never been added to the sstable, similar to what is done in
`sstable.Writer.addTombstone`.

Also fix the setting of `w.meta.SmallestPoint` to ensure that
`w.meta.SmallestPoint.UserKey` is always non-nil after a point record
has been added. This is necessary for `WriterMetadata.Smallest` to work
correctly so that the sstable bounds can be set correctly if the
smallest record in an sstable as a nil or zero-length user key.

Fixes cockroachdb#1133
Copy link
Collaborator Author

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 3 files reviewed, 1 unresolved discussion (waiting on @jbowens and @sumeerbhola)


sstable/writer.go, line 223 at r1 (raw file):

Previously, sumeerbhola wrote…

I took another look here and what I said earlier was incomplete.

  • I understand the use of {Largest,Smalles}PointUserKey comparisons with nil in some of these if-blocks which are confirming that this is the first point added. I am not suggesting changing that.
  • Related to the previous bullet I did "git grep" with "UserKey == nil" and with "UserKey != nil" and there was nothing surprising there.
  • It is the
      	if w.meta.SmallestPoint.UserKey == nil {
      		w.meta.SmallestPoint.UserKey = make([]byte, 0)
      	}
    
buried deeper in the code here that is bothering me since it seems less principled that ensuring at the start of `addPoint` that we change a nil UserKey to a non-nil empty slice. The Clone behavior is surprising, but I think we can fix that by changing `if k.UserKey == nil` to `if len(k.UserKey) == 0`.
</blockquote></details>

I adjusted how this works. Not quite your entire suggestion, but it does remove the second nested `if w.meta.SmallestPoint.UserKey == nil` check. PTAL.


<!-- Sent from Reviewable.io -->

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 2 files at r1, 2 of 2 files at r2.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jbowens and @sumeerbhola)

@petermattis
Copy link
Collaborator Author

TFTR!

@petermattis petermattis merged commit 3ebefb1 into cockroachdb:master May 11, 2021
@petermattis petermattis deleted the pmattis/flush-empty-key branch May 11, 2021 18:12
@hiqsociety
Copy link

Do i get bountry reward for discovering this bug?

@petermattis
Copy link
Collaborator Author

@hiqsociety There is no bug bounty program for Pebble or CockroachDB. I can only offer my heartbeat thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2021/05/09 20:45:41 background error: pebble: keys must be added in order: #0,DEL, #204716,SET
4 participants