Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add catalog.Index interface, replace catalog.TableDescriptor methods for indexes #58273

Merged
merged 13 commits into from
Jan 7, 2021

Conversation

postamar
Copy link
Contributor

@postamar postamar commented Dec 24, 2020

In an effort to not directly use descpb.IndexDescriptor protos everywhere, this multi-commit PR:

  1. introduces a new catalog.Index interface to encapsulate them, and a bunch of new methods in catalog.TableDescriptor to replace those which have descpb.IndexDescriptor in their signatures, this is done in the first commit;
  2. replaces calls to the old methods with calls to the new methods throughout the codebase, this is done one method at a time in the subsequent commits.

This breakdown into multiple commits is done to ease review, after all most of the contents of this diff are effectively little more than noise. Still, there are a few non-straightforward changes, but I figured they were worth it, though.

This PR is motivated by #57465, see also its parent issue #56306 for more details.

Subsequent work should consist in propagating this descpb.IndexDescriptor -> catalog.Index change further throughout the codebase.

@postamar postamar requested a review from a team December 24, 2020 17:35
@postamar postamar requested a review from a team as a code owner December 24, 2020 17:35
@postamar postamar requested review from dt and removed request for a team December 24, 2020 17:35
@postamar postamar marked this pull request as draft December 24, 2020 17:35
@postamar postamar removed request for a team and dt December 24, 2020 17:35
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@postamar postamar force-pushed the use-new-index-interface branch 3 times, most recently from 261f14b to a548db9 Compare December 24, 2020 20:20
@postamar postamar changed the title [WIP] Use new index interface [WIP] sql: add catalog.Index interface, replace methods with descpb.IndexDescriptor types with new ones in catalog.TableDescriptor Dec 24, 2020
@postamar postamar force-pushed the use-new-index-interface branch 2 times, most recently from 0c2c6b6 to 0af0bfd Compare December 24, 2020 20:59
@postamar postamar force-pushed the use-new-index-interface branch 3 times, most recently from ae9728d to adf0ec1 Compare January 4, 2021 16:13
@postamar postamar changed the title [WIP] sql: add catalog.Index interface, replace methods with descpb.IndexDescriptor types with new ones in catalog.TableDescriptor sql: add catalog.Index interface, replace catalog.TableDescriptor methods for indexes Jan 4, 2021
@postamar postamar marked this pull request as ready for review January 4, 2021 17:10
@postamar postamar requested a review from a team January 4, 2021 17:10
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The biggest thing missing in the first commit, as I review it, is unit testing of all of these methods. This code is an opportunity for us to have good unit testing. There are some different approaches you could take. You could do it externally using an _test package to create a server and use SQL syntax to construct descriptors. That's probably less painful than doing it by hand.

Reviewed 15 of 15 files at r1, 7 of 7 files at r2, 15 of 15 files at r3, 7 of 7 files at r4, 8 of 8 files at r5, 15 of 15 files at r6, 16 of 16 files at r7, 38 of 38 files at r8, 39 of 39 files at r9, 42 of 42 files at r10.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @postamar)


pkg/sql/truncate.go, line 198 at r8 (raw file):

	oldIndexes := make([]descpb.IndexDescriptor, len(tableDesc.ActiveIndexes()))
	for _, idx := range tableDesc.ActiveIndexes() {

It seems somewhat subtle that these are going to line up in a contiguous manner. I think, maybe, I don't have a clear enough mental model for Ordinal.


pkg/sql/catalog/descriptor.go, line 150 at r1 (raw file):

Quoted 16 lines of code…
	ForEachIndex(opts IndexOpts, f func(idx Index) error) error
	ForEachActiveIndex(f func(idx Index) error) error
	ForEachNonDropIndex(f func(idx Index) error) error
	ForEachPartialIndex(f func(idx Index) error) error
	ForEachPublicNonPrimaryIndex(f func(idx Index) error) error
	ForEachWritableNonPrimaryIndex(f func(idx Index) error) error
	ForEachDeletableNonPrimaryIndex(f func(idx Index) error) error
	ForEachDeleteOnlyNonPrimaryIndex(f func(idx Index) error) error
	FindIndex(opts IndexOpts, test func(idx Index) bool) Index
	FindActiveIndex(test func(idx Index) bool) Index
	FindNonDropIndex(test func(idx Index) bool) Index
	FindPartialIndex(test func(idx Index) bool) Index
	FindPublicNonPrimaryIndex(test func(idx Index) bool) Index
	FindWritableNonPrimaryIndex(test func(idx Index) bool) Index
	FindDeletableNonPrimaryIndex(test func(idx Index) bool) Index
	FindDeleteOnlyNonPrimaryIndex(test func(idx Index) bool) Index

Should these be functions implemented on top of the interface rather than in the interface itself? Seems like all of the implementations are going to end up interacting with the slices. At some point there is a cost to growing the interface to have all of the functionality.


pkg/sql/catalog/descriptor.go, line 220 at r1 (raw file):

// Index is an interface around the index descriptor types.
type Index interface {

A bit of grouping might be good here. Eventually we'll want comments too.


pkg/sql/catalog/descriptor.go, line 215 at r8 (raw file):

	IndexDescDeepCopy() descpb.IndexDescriptor

	Ordinal() int

In particular I think it's really important to comment this. I believe that it corresponds to indexes in AllIndexes() and for active indexes it corresponds to ActiveIndexes()


pkg/sql/catalog/tabledesc/index.go, line 39 at r1 (raw file):

// IndexDescDeepCopy returns a deep copy of the underlying protobuf descriptor.
func (w index) IndexDescDeepCopy() descpb.IndexDescriptor {

These methods should almost definitely be on the pointer and not the struct.


pkg/sql/catalog/tabledesc/index.go, line 314 at r1 (raw file):

// indexCache contains lazily precomputed slices of catalog.Index interfaces.
// A field value of nil indicates that the slice hasn't been precomputed yet.
type indexCache struct {

There are some tricks we use take to eliminate some allocations here. Let's revisit them later. Ideally the allocations shouldn't matter if we follow through on #53747 / #55960


pkg/sql/catalog/tabledesc/table_desc.go, line 318 at r1 (raw file):

// catalog.Index interface.
func (desc *wrapper) PrimaryIndexInterface() catalog.Index {
	return index{desc: &desc.PrimaryIndex}

This is stealthily going to incur an allocation every time you call it. Instead I think it may make sense to stick this index object inside the wrapper and then return the pointer here.

Copy link
Contributor Author

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some testing, among other things. Coverage is not strictly complete but it goes most of the way. Depending on how you define unit testing this might not qualify as such but IMHO it's as low-level a test as can be done while remaining meaningful.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/truncate.go, line 198 at r8 (raw file):

Previously, ajwerner wrote…
	oldIndexes := make([]descpb.IndexDescriptor, len(tableDesc.ActiveIndexes()))
	for _, idx := range tableDesc.ActiveIndexes() {

It seems somewhat subtle that these are going to line up in a contiguous manner. I think, maybe, I don't have a clear enough mental model for Ordinal.

I've added more comments on how Ordinal() works which I'm hoping will clear this up.


pkg/sql/catalog/descriptor.go, line 150 at r1 (raw file):

Previously, ajwerner wrote…
	ForEachIndex(opts IndexOpts, f func(idx Index) error) error
	ForEachActiveIndex(f func(idx Index) error) error
	ForEachNonDropIndex(f func(idx Index) error) error
	ForEachPartialIndex(f func(idx Index) error) error
	ForEachPublicNonPrimaryIndex(f func(idx Index) error) error
	ForEachWritableNonPrimaryIndex(f func(idx Index) error) error
	ForEachDeletableNonPrimaryIndex(f func(idx Index) error) error
	ForEachDeleteOnlyNonPrimaryIndex(f func(idx Index) error) error
	FindIndex(opts IndexOpts, test func(idx Index) bool) Index
	FindActiveIndex(test func(idx Index) bool) Index
	FindNonDropIndex(test func(idx Index) bool) Index
	FindPartialIndex(test func(idx Index) bool) Index
	FindPublicNonPrimaryIndex(test func(idx Index) bool) Index
	FindWritableNonPrimaryIndex(test func(idx Index) bool) Index
	FindDeletableNonPrimaryIndex(test func(idx Index) bool) Index
	FindDeleteOnlyNonPrimaryIndex(test func(idx Index) bool) Index

Should these be functions implemented on top of the interface rather than in the interface itself? Seems like all of the implementations are going to end up interacting with the slices. At some point there is a cost to growing the interface to have all of the functionality.

Done


pkg/sql/catalog/descriptor.go, line 220 at r1 (raw file):

Previously, ajwerner wrote…

A bit of grouping might be good here. Eventually we'll want comments too.

Done


pkg/sql/catalog/descriptor.go, line 215 at r8 (raw file):

Previously, ajwerner wrote…

In particular I think it's really important to comment this. I believe that it corresponds to indexes in AllIndexes() and for active indexes it corresponds to ActiveIndexes()

Done


pkg/sql/catalog/tabledesc/index.go, line 39 at r1 (raw file):

Previously, ajwerner wrote…

These methods should almost definitely be on the pointer and not the struct.

I agree with you in principle, however in this case the index struct is basically a pointer to the desc along with a few metadata. It's tiny and should remain tiny. I'll do the change if you insist but imho it feels like we're already calling by reference.


pkg/sql/catalog/tabledesc/index.go, line 314 at r1 (raw file):

Previously, ajwerner wrote…

There are some tricks we use take to eliminate some allocations here. Let's revisit them later. Ideally the allocations shouldn't matter if we follow through on #53747 / #55960

I've refined this somewhat (I think? Please LMK) and in the process fixed a small regression of mine in index_encoding.go.


pkg/sql/catalog/tabledesc/table_desc.go, line 318 at r1 (raw file):

Previously, ajwerner wrote…

This is stealthily going to incur an allocation every time you call it. Instead I think it may make sense to stick this index object inside the wrapper and then return the pointer here.

OK

@postamar
Copy link
Contributor Author

postamar commented Jan 7, 2021

CI had one failure on an acceptance test which appears to be unrelated to these changes, see #58489 .

@ajwerner
Copy link
Contributor

ajwerner commented Jan 7, 2021

Have you rebased master? Try rebasing and pushing again

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tests and comments. Just nits from me now.

Reviewed 1 of 3 files at r11, 19 of 19 files at r12, 2 of 2 files at r13.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)


pkg/sql/catalog/descriptor.go, line 458 at r13 (raw file):

		}
		if err := f(idx); err != nil {
			return err

One thing we may want to support here is https://godoc.org/github.com/cockroachdb/cockroach/pkg/util/iterutil#Done. It allows users to supply an error to stop iteration that gets swallowed. This is a pattern we've been picking up and is better than having the closures put a bool somewhere and then early returning.

In short this would look like:

if iterutils.Done(err) {
    err = nil
}
return err

pkg/sql/catalog/descriptor.go, line 466 at r13 (raw file):

func forEachIndex(slice []Index, f func(idx Index) error) error {
	for _, idx := range slice {
		if err := f(idx); err != nil {

same iterutil comment here


pkg/sql/catalog/descriptor.go, line 516 at r13 (raw file):

// Indexes are visited in their canonical order, see Index.Ordinal().
func FindIndex(desc TableDescriptor, opts IndexOpts, test func(idx Index) bool) Index {
	for _, idx := range desc.AllIndexes() {

If you had the iterutil.Done thing then you could just call ForEachIndex inside of this.


pkg/sql/catalog/descriptor.go, line 535 at r13 (raw file):

func findIndex(slice []Index, test func(idx Index) bool) Index {
	for _, idx := range slice {
		if test(idx) {

Ditto the iterutil.Done thing here


pkg/sql/catalog/tabledesc/index_test.go, line 52 at r13 (raw file):

			CONSTRAINT pk PRIMARY KEY (c1 ASC, c2 ASC, c3 ASC),
			INDEX s1 (c4 DESC, c5 DESC),
      INVERTED INDEX s2 (c6),

super nit: indentation


pkg/sql/rowenc/index_encoding.go, line 550 at r13 (raw file):

	interleaves := make([]catalog.Index, 0, len(desc.ActiveIndexes()))
	interleaves = append(interleaves, desc.ActiveIndexes()...)

extremely minor nit: just do this all in one expression?

	interleaves := append(make([]catalog.Index, 0, len(desc.ActiveIndexes())), 
                desc.ActiveIndexes()...)

Copy link
Contributor Author

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick turnaround. I rebased on master, looking good now.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/catalog/descriptor.go, line 458 at r13 (raw file):

Previously, ajwerner wrote…

One thing we may want to support here is https://godoc.org/github.com/cockroachdb/cockroach/pkg/util/iterutil#Done. It allows users to supply an error to stop iteration that gets swallowed. This is a pattern we've been picking up and is better than having the closures put a bool somewhere and then early returning.

In short this would look like:

if iterutils.Done(err) {
    err = nil
}
return err

Done.


pkg/sql/catalog/descriptor.go, line 466 at r13 (raw file):

Previously, ajwerner wrote…

same iterutil comment here

Done.


pkg/sql/catalog/descriptor.go, line 516 at r13 (raw file):

Previously, ajwerner wrote…

If you had the iterutil.Done thing then you could just call ForEachIndex inside of this.

Agreed, though I tried rewriting this using ForEachIndex but it actually turned out less clear. This did lead me to move the logic surrounding opts into its own helper function.


pkg/sql/catalog/descriptor.go, line 535 at r13 (raw file):

Previously, ajwerner wrote…

Ditto the iterutil.Done thing here

See above.


pkg/sql/catalog/tabledesc/index_test.go, line 52 at r13 (raw file):

Previously, ajwerner wrote…

super nit: indentation

Done.


pkg/sql/rowenc/index_encoding.go, line 550 at r13 (raw file):

Previously, ajwerner wrote…

extremely minor nit: just do this all in one expression?

	interleaves := append(make([]catalog.Index, 0, len(desc.ActiveIndexes())), 
                desc.ActiveIndexes()...)

Done.

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 15 files at r1, 97 of 97 files at r14, 5 of 15 files at r16, 8 of 15 files at r19, 7 of 16 files at r20, 15 of 38 files at r21, 36 of 42 files at r23, 1 of 3 files at r24, 20 of 20 files at r25, 2 of 2 files at r26.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)


pkg/sql/catalog/tabledesc/table_desc.go, line 313 at r26 (raw file):

// in their canonical order. This is equivalent to the Indexes array in the
// proto.
func (desc *wrapper) PublicNonPrimaryIndexes() []catalog.Index {

nit: is there a reason why these aren't PublicSecondaryIndexes, etc.?

Copy link
Contributor Author

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @lucy-zhang)


pkg/sql/catalog/tabledesc/table_desc.go, line 313 at r26 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

nit: is there a reason why these aren't PublicSecondaryIndexes, etc.?

Historical reasons, I guess. There was already a GetPublicNonPrimaryIndexes method so I just stuck with that and tried to be consistent.

Marius Posta added 13 commits January 7, 2021 16:01
Previously, the catalog.TableDescriptor interface and its implementing
types would liberally return descpb.IndexDescriptor values, pointers and
slices. In an effort to stop manipulating such protos directly, this
patch introduces a catalog.Index interface to encapsulate it. In order
to enventually propagate this change throughout the code base, this
patch marks some existing catalog.TableDescriptor methods as deprecated
and introduces new ones to eventually replace them.

Partially addresses cockroachdb#57465.

Release note: None
Previously, these methods would be called on a table descriptor interface
(or backing struct) to obtain a slice of descpb.IndexDescriptors. This
patch removes these calls, along with the method definitions, in favour
of new methods which use the catalog.Index interface type instead.

Partially addresses cockroachdb#57465.

Release note: None
Previously, this method would be called on a table descriptor interface
(or backing struct) to obtain a slice of descpb.IndexDescriptor pointers.
This patch removes these calls, along with the method definitions, in
favour of new methods which use the catalog.Index interface type instead.

Partially addresses cockroachdb#57465.

Release note: None
Previously, this method would be called on a table descriptor interface
(or backing struct) to apply a function on the *descpb.IndexDescriptor
of each of the table's non-drop indexes.
This patch removes these calls, along with the method definition, in
favour of new methods which use the catalog.Index interface type instead.

Partially addresses cockroachdb#57465.

Release note: None
Previously, this method would be called on a table descriptor interface
(or backing struct) to apply a function on the *descpb.IndexDescriptor
of each of the table's indexes.
This patch removes these calls, along with the method definition, in
favour of new methods which use the catalog.Index interface type instead.

Partially addresses cockroachdb#57465.

Release note: None
Previously, this method would be called on a table descriptor interface
(or backing struct) to return a target *descpb.IndexDescriptor.
This patch removes these calls, along with the method definition, in
favour of new methods which use the catalog.Index interface type instead.

Partially addresses cockroachdb#57465.

Release note: None
Previously, this method would be called on a table descriptor interface
(or backing struct) to return a target *descpb.IndexDescriptor.
This patch removes these calls, along with the method definition, in
favour of new methods which use the catalog.Index interface type instead.

Partially addresses cockroachdb#57465.

Release note: None
Previously, this method would be called on a table descriptor interface
(or backing struct) to obtain a slice of descpb.IndexDescriptors. This
patch removes these calls, along with the method definition, in favour
of new methods which use the catalog.Index interface type instead.

Partially addresses cockroachdb#57465.

Release note: None
Previously, this method would be called on a table descriptor interface
(or backing struct) to obtain the primary index descriptor in the form
of a *descpb.IndexDescriptor.
This patch removes these calls, along with the method definition, in
favour of new methods which use the catalog.Index interface type instead.

Partially addresses cockroachdb#57465.

Release note: None
Previously, the name PrimaryIndexInterface had been chosen because
a deprecated GetPrimaryIndex() method already existed and because
PrimaryIndex is an existing field in descpb.IndexDescriptor which is
embedded in the structs implementing the catalog.TableDescriptor
interface.

This patch renames the method to GetPrimaryIndex: although its return
type differs it has the same meaning as the recently-removed method of
the same name.

Release note: None
This patch changes the indexCache struct in a bid to reduce the number
of memory allocations that take place when accessing its contents. The
previous behaviour of GetPrimaryIndex() was particularly bad in that it
resulted in an allocation in every call, which is easily avoided.

Release note: None
The catalog.TableDescriptor has ForEach*Index and Find*Index methods
which can just as well be expressed as functions, thereby lightening
that interface definition somewhat.

This patch also adds iterutils.StopIteration() support for the above
ForEach*Index functions, and otherwise also adds comments to many
catalog.Index methods.

Release note: None
This commit introduces unit tests for the new catalog.Index interface,
which were missing in my earlier commits.

Release note: None
@postamar
Copy link
Contributor Author

postamar commented Jan 7, 2021

Rebased on master because of merge conflict.

@postamar
Copy link
Contributor Author

postamar commented Jan 7, 2021

bors r+

@craig
Copy link
Contributor

craig bot commented Jan 7, 2021

Build succeeded:

@craig craig bot merged commit c4947ad into cockroachdb:master Jan 7, 2021
@postamar postamar deleted the use-new-index-interface branch January 8, 2021 14:24
postamar pushed a commit to postamar/cockroach that referenced this pull request Jan 8, 2021
This reverts commit c4947ad, reversing
changes made to d8b5cb0.

Release note: None
craig bot pushed a commit that referenced this pull request Jan 8, 2021
58642: sql: Revert "Merge #58273" r=postamar a=postamar

This reverts commit c4947ad, reversing
changes made to d8b5cb0.

Release note: None

Co-authored-by: Marius Posta <marius@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants