Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

db: refactor ingest overlap checking #3580

Merged
merged 1 commit into from
May 3, 2024

Conversation

jbowens
Copy link
Collaborator

@jbowens jbowens commented May 2, 2024

Refactor the overlap checking to avoid using the levelIter to populate the range deletion iterator. This was subtle and confusing. This refactor will also benefit the implementation of #2112.

Informs #2112.
Informs #2863.

@jbowens jbowens requested a review from a team as a code owner May 2, 2024 22:53
@jbowens jbowens requested a review from sumeerbhola May 2, 2024 22:53
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

New overlap code looks great. I didn't bother trying to understand the code that was removed.

Reviewable status: 0 of 5 files reviewed, 6 unresolved discussions (waiting on @jbowens and @sumeerbhola)


overlap.go line 17 at r1 (raw file):

)

// An overlapChecker provides facilities for checking whether any keys within a

Can this live in manifest?


overlap.go line 31 at r1 (raw file):

}

func (c *overlapChecker) determineAnyOverlapInLevel(

[nit] I'd add a short comment and maybe capitalize it since it's the one we use. It's important to mention that we don't just check sst bounds, but check for actual overlap against the contents.


overlap.go line 42 at r1 (raw file):

	} else if c.comparer.ImmediateSuccessor != nil {
		si := c.comparer.Split(bounds.End.Key)
		c.upperBoundBuf = c.comparer.ImmediateSuccessor(c.upperBoundBuf[:0], bounds.End.Key[:si])

Nice!


overlap.go line 52 at r1 (raw file):

		for subLevel := 0; subLevel < len(c.v.L0SublevelFiles); subLevel++ {
			manifestIter := c.v.L0Sublevels.Levels[subLevel].Iter()
			pointOverlap, err := c.determinePointKeyOverlapInLevel(

Would it be worth first checking if there is any overlap with sst file bounds? Or would the level iterator basically not do any I/O in that case anyway?


overlap.go line 54 at r1 (raw file):

			pointOverlap, err := c.determinePointKeyOverlapInLevel(
				ctx, bounds, manifest.Level(0), manifestIter)
			if err != nil {

[nit] this can be if err != nil || pointOverlap { return pointOverlap, err }, for consistency with some of the other code below


overlap.go line 61 at r1 (raw file):

			rangeOverlap, err := c.determineRangeKeyOverlapInLevel(
				ctx, bounds, manifest.Level(0), manifestIter)
			if err != nil {

ditto

Refactor the overlap checking to avoid using the levelIter to populate the
range deletion iterator. This was subtle and confusing. This refactor will also
benefit the implementation of cockroachdb#2112.

Informs cockroachdb#2112.
Informs cockroachdb#2863.
Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: 0 of 5 files reviewed, 3 unresolved discussions (waiting on @RaduBerinde and @sumeerbhola)


overlap.go line 17 at r1 (raw file):

Previously, RaduBerinde wrote…

Can this live in manifest?

it can't because of dependencies on levelIter and keyspanimpl.LevelIter


overlap.go line 31 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] I'd add a short comment and maybe capitalize it since it's the one we use. It's important to mention that we don't just check sst bounds, but check for actual overlap against the contents.

Done!


overlap.go line 52 at r1 (raw file):

Previously, RaduBerinde wrote…

Would it be worth first checking if there is any overlap with sst file bounds? Or would the level iterator basically not do any I/O in that case anyway?

yeah, the level iterator won't do any I/O in that case


overlap.go line 54 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] this can be if err != nil || pointOverlap { return pointOverlap, err }, for consistency with some of the other code below

done


overlap.go line 61 at r1 (raw file):

Previously, RaduBerinde wrote…

ditto

done

@jbowens jbowens merged commit 8d7f7f6 into cockroachdb:master May 3, 2024
11 checks passed
@jbowens jbowens deleted the overlapchecker branch May 3, 2024 14:10
@RaduBerinde
Copy link
Member

overlap.go line 141 at r2 (raw file):

) (bool, error) {
	// NB: The spans surfaced by the fragment iterator are non-overlapping.
	span, err := iter.SeekGE(bounds.Start)

Unless I'm missing something, this could open an arbitrary number of files on the level looking for the next RangeDel span, no? In particular, in L6 this will usually mean looking through the entire level. Is this accurate?

This code is about to be refactored on master, but do you think the old 24.1 code suffered from the same issue?

Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 4 unresolved discussions


overlap.go line 141 at r2 (raw file):
ah, yeah, I think you're right. When configured to surface range deletions the keyspanimpl.LevelIter avoids unnecessarily loading files by generating empty spans for known empty spans, but this doesn't apply when configured for range deletions.

but do you think the old 24.1 code suffered from the same issue?

The old 24.1 code used a levelIter so it would've avoided stepping to the next file

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants