-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: add catalog.Index interface, replace catalog.TableDescriptor methods for indexes #58273
Conversation
261f14b
to
a548db9
Compare
0c2c6b6
to
0af0bfd
Compare
ae9728d
to
adf0ec1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The biggest thing missing in the first commit, as I review it, is unit testing of all of these methods. This code is an opportunity for us to have good unit testing. There are some different approaches you could take. You could do it externally using an _test
package to create a server and use SQL syntax to construct descriptors. That's probably less painful than doing it by hand.
Reviewed 15 of 15 files at r1, 7 of 7 files at r2, 15 of 15 files at r3, 7 of 7 files at r4, 8 of 8 files at r5, 15 of 15 files at r6, 16 of 16 files at r7, 38 of 38 files at r8, 39 of 39 files at r9, 42 of 42 files at r10.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @postamar)
pkg/sql/truncate.go, line 198 at r8 (raw file):
oldIndexes := make([]descpb.IndexDescriptor, len(tableDesc.ActiveIndexes())) for _, idx := range tableDesc.ActiveIndexes() {
It seems somewhat subtle that these are going to line up in a contiguous manner. I think, maybe, I don't have a clear enough mental model for Ordinal
.
pkg/sql/catalog/descriptor.go, line 150 at r1 (raw file):
Quoted 16 lines of code…
ForEachIndex(opts IndexOpts, f func(idx Index) error) error ForEachActiveIndex(f func(idx Index) error) error ForEachNonDropIndex(f func(idx Index) error) error ForEachPartialIndex(f func(idx Index) error) error ForEachPublicNonPrimaryIndex(f func(idx Index) error) error ForEachWritableNonPrimaryIndex(f func(idx Index) error) error ForEachDeletableNonPrimaryIndex(f func(idx Index) error) error ForEachDeleteOnlyNonPrimaryIndex(f func(idx Index) error) error FindIndex(opts IndexOpts, test func(idx Index) bool) Index FindActiveIndex(test func(idx Index) bool) Index FindNonDropIndex(test func(idx Index) bool) Index FindPartialIndex(test func(idx Index) bool) Index FindPublicNonPrimaryIndex(test func(idx Index) bool) Index FindWritableNonPrimaryIndex(test func(idx Index) bool) Index FindDeletableNonPrimaryIndex(test func(idx Index) bool) Index FindDeleteOnlyNonPrimaryIndex(test func(idx Index) bool) Index
Should these be functions implemented on top of the interface rather than in the interface itself? Seems like all of the implementations are going to end up interacting with the slices. At some point there is a cost to growing the interface to have all of the functionality.
pkg/sql/catalog/descriptor.go, line 220 at r1 (raw file):
// Index is an interface around the index descriptor types. type Index interface {
A bit of grouping might be good here. Eventually we'll want comments too.
pkg/sql/catalog/descriptor.go, line 215 at r8 (raw file):
IndexDescDeepCopy() descpb.IndexDescriptor Ordinal() int
In particular I think it's really important to comment this. I believe that it corresponds to indexes in AllIndexes()
and for active indexes it corresponds to ActiveIndexes()
pkg/sql/catalog/tabledesc/index.go, line 39 at r1 (raw file):
// IndexDescDeepCopy returns a deep copy of the underlying protobuf descriptor. func (w index) IndexDescDeepCopy() descpb.IndexDescriptor {
These methods should almost definitely be on the pointer and not the struct.
pkg/sql/catalog/tabledesc/index.go, line 314 at r1 (raw file):
// indexCache contains lazily precomputed slices of catalog.Index interfaces. // A field value of nil indicates that the slice hasn't been precomputed yet. type indexCache struct {
There are some tricks we use take to eliminate some allocations here. Let's revisit them later. Ideally the allocations shouldn't matter if we follow through on #53747 / #55960
pkg/sql/catalog/tabledesc/table_desc.go, line 318 at r1 (raw file):
// catalog.Index interface. func (desc *wrapper) PrimaryIndexInterface() catalog.Index { return index{desc: &desc.PrimaryIndex}
This is stealthily going to incur an allocation every time you call it. Instead I think it may make sense to stick this index object inside the wrapper and then return the pointer here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some testing, among other things. Coverage is not strictly complete but it goes most of the way. Depending on how you define unit testing this might not qualify as such but IMHO it's as low-level a test as can be done while remaining meaningful.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/truncate.go, line 198 at r8 (raw file):
Previously, ajwerner wrote…
oldIndexes := make([]descpb.IndexDescriptor, len(tableDesc.ActiveIndexes())) for _, idx := range tableDesc.ActiveIndexes() {
It seems somewhat subtle that these are going to line up in a contiguous manner. I think, maybe, I don't have a clear enough mental model for
Ordinal
.
I've added more comments on how Ordinal() works which I'm hoping will clear this up.
pkg/sql/catalog/descriptor.go, line 150 at r1 (raw file):
Previously, ajwerner wrote…
ForEachIndex(opts IndexOpts, f func(idx Index) error) error ForEachActiveIndex(f func(idx Index) error) error ForEachNonDropIndex(f func(idx Index) error) error ForEachPartialIndex(f func(idx Index) error) error ForEachPublicNonPrimaryIndex(f func(idx Index) error) error ForEachWritableNonPrimaryIndex(f func(idx Index) error) error ForEachDeletableNonPrimaryIndex(f func(idx Index) error) error ForEachDeleteOnlyNonPrimaryIndex(f func(idx Index) error) error FindIndex(opts IndexOpts, test func(idx Index) bool) Index FindActiveIndex(test func(idx Index) bool) Index FindNonDropIndex(test func(idx Index) bool) Index FindPartialIndex(test func(idx Index) bool) Index FindPublicNonPrimaryIndex(test func(idx Index) bool) Index FindWritableNonPrimaryIndex(test func(idx Index) bool) Index FindDeletableNonPrimaryIndex(test func(idx Index) bool) Index FindDeleteOnlyNonPrimaryIndex(test func(idx Index) bool) Index
Should these be functions implemented on top of the interface rather than in the interface itself? Seems like all of the implementations are going to end up interacting with the slices. At some point there is a cost to growing the interface to have all of the functionality.
Done
pkg/sql/catalog/descriptor.go, line 220 at r1 (raw file):
Previously, ajwerner wrote…
A bit of grouping might be good here. Eventually we'll want comments too.
Done
pkg/sql/catalog/descriptor.go, line 215 at r8 (raw file):
Previously, ajwerner wrote…
In particular I think it's really important to comment this. I believe that it corresponds to indexes in
AllIndexes()
and for active indexes it corresponds toActiveIndexes()
Done
pkg/sql/catalog/tabledesc/index.go, line 39 at r1 (raw file):
Previously, ajwerner wrote…
These methods should almost definitely be on the pointer and not the struct.
I agree with you in principle, however in this case the index
struct is basically a pointer to the desc along with a few metadata. It's tiny and should remain tiny. I'll do the change if you insist but imho it feels like we're already calling by reference.
pkg/sql/catalog/tabledesc/index.go, line 314 at r1 (raw file):
Previously, ajwerner wrote…
There are some tricks we use take to eliminate some allocations here. Let's revisit them later. Ideally the allocations shouldn't matter if we follow through on #53747 / #55960
I've refined this somewhat (I think? Please LMK) and in the process fixed a small regression of mine in index_encoding.go
.
pkg/sql/catalog/tabledesc/table_desc.go, line 318 at r1 (raw file):
Previously, ajwerner wrote…
This is stealthily going to incur an allocation every time you call it. Instead I think it may make sense to stick this index object inside the wrapper and then return the pointer here.
OK
CI had one failure on an acceptance test which appears to be unrelated to these changes, see #58489 . |
Have you rebased master? Try rebasing and pushing again |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tests and comments. Just nits from me now.
Reviewed 1 of 3 files at r11, 19 of 19 files at r12, 2 of 2 files at r13.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
pkg/sql/catalog/descriptor.go, line 458 at r13 (raw file):
} if err := f(idx); err != nil { return err
One thing we may want to support here is https://godoc.org/github.com/cockroachdb/cockroach/pkg/util/iterutil#Done. It allows users to supply an error to stop iteration that gets swallowed. This is a pattern we've been picking up and is better than having the closures put a bool somewhere and then early returning.
In short this would look like:
if iterutils.Done(err) {
err = nil
}
return err
pkg/sql/catalog/descriptor.go, line 466 at r13 (raw file):
func forEachIndex(slice []Index, f func(idx Index) error) error { for _, idx := range slice { if err := f(idx); err != nil {
same iterutil comment here
pkg/sql/catalog/descriptor.go, line 516 at r13 (raw file):
// Indexes are visited in their canonical order, see Index.Ordinal(). func FindIndex(desc TableDescriptor, opts IndexOpts, test func(idx Index) bool) Index { for _, idx := range desc.AllIndexes() {
If you had the iterutil.Done
thing then you could just call ForEachIndex
inside of this.
pkg/sql/catalog/descriptor.go, line 535 at r13 (raw file):
func findIndex(slice []Index, test func(idx Index) bool) Index { for _, idx := range slice { if test(idx) {
Ditto the iterutil.Done
thing here
pkg/sql/catalog/tabledesc/index_test.go, line 52 at r13 (raw file):
CONSTRAINT pk PRIMARY KEY (c1 ASC, c2 ASC, c3 ASC), INDEX s1 (c4 DESC, c5 DESC), INVERTED INDEX s2 (c6),
super nit: indentation
pkg/sql/rowenc/index_encoding.go, line 550 at r13 (raw file):
interleaves := make([]catalog.Index, 0, len(desc.ActiveIndexes())) interleaves = append(interleaves, desc.ActiveIndexes()...)
extremely minor nit: just do this all in one expression?
interleaves := append(make([]catalog.Index, 0, len(desc.ActiveIndexes())),
desc.ActiveIndexes()...)
4428fdc
to
aa87b0f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick turnaround. I rebased on master, looking good now.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/catalog/descriptor.go, line 458 at r13 (raw file):
Previously, ajwerner wrote…
One thing we may want to support here is https://godoc.org/github.com/cockroachdb/cockroach/pkg/util/iterutil#Done. It allows users to supply an error to stop iteration that gets swallowed. This is a pattern we've been picking up and is better than having the closures put a bool somewhere and then early returning.
In short this would look like:
if iterutils.Done(err) { err = nil } return err
Done.
pkg/sql/catalog/descriptor.go, line 466 at r13 (raw file):
Previously, ajwerner wrote…
same iterutil comment here
Done.
pkg/sql/catalog/descriptor.go, line 516 at r13 (raw file):
Previously, ajwerner wrote…
If you had the
iterutil.Done
thing then you could just callForEachIndex
inside of this.
Agreed, though I tried rewriting this using ForEachIndex but it actually turned out less clear. This did lead me to move the logic surrounding opts
into its own helper function.
pkg/sql/catalog/descriptor.go, line 535 at r13 (raw file):
Previously, ajwerner wrote…
Ditto the
iterutil.Done
thing here
See above.
pkg/sql/catalog/tabledesc/index_test.go, line 52 at r13 (raw file):
Previously, ajwerner wrote…
super nit: indentation
Done.
pkg/sql/rowenc/index_encoding.go, line 550 at r13 (raw file):
Previously, ajwerner wrote…
extremely minor nit: just do this all in one expression?
interleaves := append(make([]catalog.Index, 0, len(desc.ActiveIndexes())), desc.ActiveIndexes()...)
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 15 files at r1, 97 of 97 files at r14, 5 of 15 files at r16, 8 of 15 files at r19, 7 of 16 files at r20, 15 of 38 files at r21, 36 of 42 files at r23, 1 of 3 files at r24, 20 of 20 files at r25, 2 of 2 files at r26.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
pkg/sql/catalog/tabledesc/table_desc.go, line 313 at r26 (raw file):
// in their canonical order. This is equivalent to the Indexes array in the // proto. func (desc *wrapper) PublicNonPrimaryIndexes() []catalog.Index {
nit: is there a reason why these aren't PublicSecondaryIndexes
, etc.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @lucy-zhang)
pkg/sql/catalog/tabledesc/table_desc.go, line 313 at r26 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
nit: is there a reason why these aren't
PublicSecondaryIndexes
, etc.?
Historical reasons, I guess. There was already a GetPublicNonPrimaryIndexes
method so I just stuck with that and tried to be consistent.
Previously, the catalog.TableDescriptor interface and its implementing types would liberally return descpb.IndexDescriptor values, pointers and slices. In an effort to stop manipulating such protos directly, this patch introduces a catalog.Index interface to encapsulate it. In order to enventually propagate this change throughout the code base, this patch marks some existing catalog.TableDescriptor methods as deprecated and introduces new ones to eventually replace them. Partially addresses cockroachdb#57465. Release note: None
Previously, these methods would be called on a table descriptor interface (or backing struct) to obtain a slice of descpb.IndexDescriptors. This patch removes these calls, along with the method definitions, in favour of new methods which use the catalog.Index interface type instead. Partially addresses cockroachdb#57465. Release note: None
Previously, this method would be called on a table descriptor interface (or backing struct) to obtain a slice of descpb.IndexDescriptor pointers. This patch removes these calls, along with the method definitions, in favour of new methods which use the catalog.Index interface type instead. Partially addresses cockroachdb#57465. Release note: None
Previously, this method would be called on a table descriptor interface (or backing struct) to apply a function on the *descpb.IndexDescriptor of each of the table's non-drop indexes. This patch removes these calls, along with the method definition, in favour of new methods which use the catalog.Index interface type instead. Partially addresses cockroachdb#57465. Release note: None
Previously, this method would be called on a table descriptor interface (or backing struct) to apply a function on the *descpb.IndexDescriptor of each of the table's indexes. This patch removes these calls, along with the method definition, in favour of new methods which use the catalog.Index interface type instead. Partially addresses cockroachdb#57465. Release note: None
Previously, this method would be called on a table descriptor interface (or backing struct) to return a target *descpb.IndexDescriptor. This patch removes these calls, along with the method definition, in favour of new methods which use the catalog.Index interface type instead. Partially addresses cockroachdb#57465. Release note: None
Previously, this method would be called on a table descriptor interface (or backing struct) to return a target *descpb.IndexDescriptor. This patch removes these calls, along with the method definition, in favour of new methods which use the catalog.Index interface type instead. Partially addresses cockroachdb#57465. Release note: None
Previously, this method would be called on a table descriptor interface (or backing struct) to obtain a slice of descpb.IndexDescriptors. This patch removes these calls, along with the method definition, in favour of new methods which use the catalog.Index interface type instead. Partially addresses cockroachdb#57465. Release note: None
Previously, this method would be called on a table descriptor interface (or backing struct) to obtain the primary index descriptor in the form of a *descpb.IndexDescriptor. This patch removes these calls, along with the method definition, in favour of new methods which use the catalog.Index interface type instead. Partially addresses cockroachdb#57465. Release note: None
Previously, the name PrimaryIndexInterface had been chosen because a deprecated GetPrimaryIndex() method already existed and because PrimaryIndex is an existing field in descpb.IndexDescriptor which is embedded in the structs implementing the catalog.TableDescriptor interface. This patch renames the method to GetPrimaryIndex: although its return type differs it has the same meaning as the recently-removed method of the same name. Release note: None
This patch changes the indexCache struct in a bid to reduce the number of memory allocations that take place when accessing its contents. The previous behaviour of GetPrimaryIndex() was particularly bad in that it resulted in an allocation in every call, which is easily avoided. Release note: None
The catalog.TableDescriptor has ForEach*Index and Find*Index methods which can just as well be expressed as functions, thereby lightening that interface definition somewhat. This patch also adds iterutils.StopIteration() support for the above ForEach*Index functions, and otherwise also adds comments to many catalog.Index methods. Release note: None
This commit introduces unit tests for the new catalog.Index interface, which were missing in my earlier commits. Release note: None
aa87b0f
to
5a09894
Compare
Rebased on master because of merge conflict. |
bors r+ |
Build succeeded: |
In an effort to not directly use
descpb.IndexDescriptor protos
everywhere, this multi-commit PR:catalog.Index
interface to encapsulate them, and a bunch of new methods incatalog.TableDescriptor
to replace those which havedescpb.IndexDescriptor
in their signatures, this is done in the first commit;This breakdown into multiple commits is done to ease review, after all most of the contents of this diff are effectively little more than noise. Still, there are a few non-straightforward changes, but I figured they were worth it, though.
This PR is motivated by #57465, see also its parent issue #56306 for more details.
Subsequent work should consist in propagating this
descpb.IndexDescriptor -> catalog.Index
change further throughout the codebase.