-
Notifications
You must be signed in to change notification settings - Fork 456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: Wire up L0SubLevels with iterator and version creation #624
*: Wire up L0SubLevels with iterator and version creation #624
Conversation
23f1cfc
to
3ab9a4d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good, but then I wrote a chunk of this code. 😃
Reviewable status: 0 of 30 files reviewed, 6 unresolved discussions (waiting on @itsbilal, @jbowens, @petermattis, and @sumeerbhola)
db.go, line 711 at r2 (raw file):
levels := buf.levels[:] addLevelIterForFiles := func(files []*manifest.FileMetadata, level int) {
Can you check that this structure isn't causing extra allocations? Sometimes that can happen with the use of closures. There is a script in the Cockroach repo which can show which allocations escape to the heap. Another technique is to find a benchmark which uses this function and to run the benchmark with -memprofile mem.prof -memprofilerate=1
which will cause every memory allocation to be sampled.
metrics_test.go, line 173 at r2 (raw file):
// Some iterators (eg. levelIter) do not instantiate the underlying // iterator until the first positioning call. Call First() to // ensure full instantiation.
Why is full instantiation needed for this test?
Update: Ok, I think I understand the need here. Instantiation didn't capture the need for me. How about something like: Position the iterator so that levelIters will have loaded an sstable.
internal/manifest/version.go, line 209 at r2 (raw file):
refs int32 // The level 0 sstables organized in a series of sublevels. Similar to the
s/organized/are organized/g
internal/manifest/version.go, line 217 at r2 (raw file):
// sublevel n+1. // // L0SubLevels.Files contains L0 files ordered by sublevels.
Should probably also mention that L0Sublevels.Files
contains every table in Files[0]
.
internal/manifest/version.go, line 218 at r2 (raw file):
// // L0SubLevels.Files contains L0 files ordered by sublevels. L0SubLevels *L0SubLevels
Making this a pointer implies that the zero value for Version
needs some initialization. Is being able to copy the pointer when L0 isn't changing saving a lot vs making this a value?
internal/manifest/version.go, line 437 at r2 (raw file):
// Pass 1 as level instead of 0 to trigger the level-related SST overlap // checks. if err := CheckOrdering(cmp, format, 1, v.L0SubLevels.Files[sublevel]); err != nil {
Passing 1
will cause the error messages to be prefixed with L1
which is incorrect. Perhaps we should define a level
type here which includes both level
and sublevel
components. Then if the sublevel
component is some magic value we'd engage in the legacy L0 ordering check. This type could also provide the String()
method which formats L<level>[.<sublevel>]
correctly. I'm imagining something like:
const invalidSublevel = int32(-1)
type level struct {
level int32
sublevel int32
}
func makeSublevel(l, s int32) level {
return level{
level: l,
sublevel: s,
}
}
func makeLevel(level int32) level {
return level{
level: l,
sublevel: invalidSublevel,
}
}
func (l *level) String() string {
if l.sublevel == invalidSublevel {
return fmt.Sprintf("L%d", l.level)
}
return fmt.Sprintf("L%d.%d", l.level, l.sublevel)
}
The name level
isn't create for this type. levelAndSublevel
is more explicit, but also a bit of a mouthful to speak and type. Perhaps you have a better suggestion.
3ab9a4d
to
3119b4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: 0 of 31 files reviewed, 6 unresolved discussions (waiting on @jbowens, @petermattis, and @sumeerbhola)
db.go, line 711 at r2 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Can you check that this structure isn't causing extra allocations? Sometimes that can happen with the use of closures. There is a script in the Cockroach repo which can show which allocations escape to the heap. Another technique is to find a benchmark which uses this function and to run the benchmark with
-memprofile mem.prof -memprofilerate=1
which will cause every memory allocation to be sampled.
Just confirmed with a benchmark in cockroach. All the allocations in this function are either in the iterAllocPool.Get
above, or down at the buf.merging.init
. This closure doesn't add any.
metrics_test.go, line 173 at r2 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Why is full instantiation needed for this test?
Update: Ok, I think I understand the need here. Instantiation didn't capture the need for me. How about something like:
Position the iterator so that levelIters will have loaded an sstable.
Done. Yeah, that's basically it. If I didn't add this part, all the cache hit stats would be 0.
internal/manifest/version.go, line 209 at r2 (raw file):
Previously, petermattis (Peter Mattis) wrote…
s/organized/are organized/g
Done.
internal/manifest/version.go, line 217 at r2 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Should probably also mention that
L0Sublevels.Files
contains every table inFiles[0]
.
Done.
internal/manifest/version.go, line 218 at r2 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Making this a pointer implies that the zero value for
Version
needs some initialization. Is being able to copy the pointer when L0 isn't changing saving a lot vs making this a value?
I think suggesting initialization in general is a good idea - member variables like Compare
and Formatter
would make less sense at their zero values as well. The fact that we're able to reuse the relatively hefty struct in a no-change case is an added bonus.
internal/manifest/version.go, line 437 at r2 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Passing
1
will cause the error messages to be prefixed withL1
which is incorrect. Perhaps we should define alevel
type here which includes bothlevel
andsublevel
components. Then if thesublevel
component is some magic value we'd engage in the legacy L0 ordering check. This type could also provide theString()
method which formatsL<level>[.<sublevel>]
correctly. I'm imagining something like:const invalidSublevel = int32(-1) type level struct { level int32 sublevel int32 } func makeSublevel(l, s int32) level { return level{ level: l, sublevel: s, } } func makeLevel(level int32) level { return level{ level: l, sublevel: invalidSublevel, } } func (l *level) String() string { if l.sublevel == invalidSublevel { return fmt.Sprintf("L%d", l.level) } return fmt.Sprintf("L%d.%d", l.level, l.sublevel) }
The name
level
isn't create for this type.levelAndSublevel
is more explicit, but also a bit of a mouthful to speak and type. Perhaps you have a better suggestion.
That would involve exporting the type, and for a little helper struct, it seemed too overkill. I just added a sublevel
argument and exported InvalidSublevel
so callers can use it. PTAL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 31 files reviewed, 3 unresolved discussions (waiting on @itsbilal, @jbowens, and @sumeerbhola)
db.go, line 711 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Just confirmed with a benchmark in cockroach. All the allocations in this function are either in the
iterAllocPool.Get
above, or down at thebuf.merging.init
. This closure doesn't add any.
Great. Thanks for checking.
internal/manifest/version.go, line 218 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
I think suggesting initialization in general is a good idea - member variables like
Compare
andFormatter
would make less sense at their zero values as well. The fact that we're able to reuse the relatively hefty struct in a no-change case is an added bonus.
Ack.
internal/manifest/version.go, line 437 at r2 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
That would involve exporting the type, and for a little helper struct, it seemed too overkill. I just added a
sublevel
argument and exportedInvalidSublevel
so callers can use it. PTAL?
Ok. I think this is good for now. There is an argument to be made that we should expert a Level
type so that we can ensure consistent formatting of L<level>[.<sublevel>]
everywhere, but it doesn't have to be done in this PR.
internal/manifest/testdata/version_check_ordering, line 19 at r4 (raw file):
a.SET.1-b.SET.2 ---- L0 files 000001 and 000002 are not properly ordered: <#3-#4> vs <#1-#2>
I don't see any test cases which have an error referencing L0.<sublevel>
. Are those possible to create with the current test setup? I suspect we'll need to modify the test harness so that we can explicitly specify the files in a sublevel.
170c602
to
3ca6ed7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 31 files reviewed, 3 unresolved discussions (waiting on @jbowens, @petermattis, and @sumeerbhola)
internal/manifest/testdata/version_check_ordering, line 19 at r4 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I don't see any test cases which have an error referencing
L0.<sublevel>
. Are those possible to create with the current test setup? I suspect we'll need to modify the test harness so that we can explicitly specify the files in a sublevel.
Yup, that's what I did - updated the test harness to be able to work around the initialization code and specifically specify sublevels. Thanks!
This change effectively lowers read amplification in L0 by taking advantage of knowledge about sublevels contained in Version.L0SubLevels and using it to create levelIters for each sublevel instead of directly adding one iterator per file to the mergingIter. The get_iter is also updated to take advantage of this information. The Version struct is also updated, as a result of this change, to hold a reference to an L0SubLevels instance. That reference is initialized during Version initialization in version_edit. Most of the test changes are a result of always printing the sublevel alongside the level for L0.
3ca6ed7
to
089202d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 29 files reviewed, all discussions resolved (waiting on @jbowens and @sumeerbhola)
TFTR! |
This change effectively lowers read amplification in L0 by taking
advantage of knowledge about sublevels contained in
Version.L0SubLevels and using it to create levelIters for each sublevel
instead of directly adding one iterator per file to the mergingIter.
The get_iter is also updated to take advantage of this information.
The Version struct is also updated, as a result of this change, to
hold a reference to an L0SubLevels instance. That reference is initialized
during Version initialization in version_edit.
Most of the test changes are a result of always printing the sublevel
alongside the level for L0.
The first commit is from #614; only review the second commit.