sql, *: make CockroachDB understand schema names #22371

knz · 2018-02-05T05:46:35Z

"Why":

Prior to this patch, CockroachDB only recognized names of the form
"db.tbl", like MySQL, whereas PostgreSQL compatibility mandates that
"db.public.tbl" be also valid. This needed a change.

"What"

The visible tip of this iceberg patch is as follows:

new name resolution functions which are more correct and more easy
to use than the previous code!

Use Normalize() on a TableName, then ResolveExistingObject()
or ResolveTargetObject().
this in turn relies on generic, well-abstracted name resolution
algorithms in sem/tree/name_resolution.go, the definition
of which closely follows the specs in the accompanying RFC (rfcs: proposal to make schemas more pg-compatible #21456).

This handles the pg syntax for catalog and schema names, together
with compatibility rules with CockroachDB v1.x.
a new schema accessor interface that truly encapsulates schema access!
Check out the gorgeous documentation in sql/schema_accessors.go.

In particular:
- planner now implements a new interface SchemaResolver. Use it.
- if you are not happy with SchemaResolver directly, use
  SchemaAccessor or its various consistuent interfaces.
  
  Depending on whether virtual schemas are to be considered, consider
  alternatively planner.LogicalSchemaAccessor() and
  planner.PhysicalSchemaAccessor().

One may be surprised to see this schema accessor work happen in the
same patch. This was, unfortunately, a requirement to successfully
plug catalog and schema name resolution in all the right places.
Also, it incidentally answers a long-standing demand for a better
interface to access descriptors.

"How"

The itinerary to achieve this brought me through the following steps:

the various name resolution algorithms are concentrated in a new
file sql/sem/tree/name_resolution.go. They use abstract
interfaces to interface with the "name provider": database, table,
column things coming from the database.
in the process of writing the name resolution algorithms, I found
that our previous understanding of name resolution was problematic:
- the previous code assumed it was always possible to "qualify a
  name using the current database" by just looking at the name
  itself and the current database, without any additional
  information.
  
  In contrast, the correct qualification algorithms requires both
  the current database, the search path, and descriptor
  lookups.
  
  This is why this patch fuses all occurrences of separate
  "qualification" and "resolution" phases together.
  
  Before: tn = tn.QualifyWithDatabase(tcurDb); desc = MustGetDesc(tn)
  
  After: desc = ResolveExistingObject(p, tn)
- the resolution code was pushing a VirtualTabler very deep in the
  lookup routines, with the mistaken understanding that
  VirtualTabler is able to respond to requests for database names.
  
  In contrast, VirtualTabler really has nothing to do with database
  names, and the corresponding code had to get rid of it.
  
  This was the first motivation for implementing new schema accessor
  interfaces.
- the path to decide whether to use cached or non-cached descriptors
  was very hard to read; in many instances the code was using
  uncached descriptors where cached descriptors would be needed
  instead. The various APIs were a mess, and information needed to
  decide whether a lookup was possible or not was not available at
  the right level(s) of abstraction.
  
  This was the second motivation for implementing new schema accessor
  interfaces.
Once this all was said done, the various consumers of name
resolution had to implement the interfaces. They are:
- resolution of zone specifiers;
- resolution of target names for CCL statements;
- resolution of names (both existing and targets) in the sql package;
- some testing code in sql/opt and sql/opt/build.

Details and suggestions to the reviewers

The following items are fairly self-contained and can be reviewed in
parallel, and probably should be reviewed first:

the changes in ccl/sqlccl. This achieves the following two things:
- the search path is plumbed where needed.
- the check for valid table names does not use
  QualifyWithDatabase any more (because that was removed);
  instead they use the new name resolution interface with a shim
  that just checks the schema name is valid.
the changes in sql/opt and sql/opt/build. This merely
interfaces with the name resolution algorithm for column items. I
noticed that the underlying data structures there only know about
unqualified table names, so I did not attempt to plumb
catalog/schema name resolution logic in there. No functionality is
lost.
the changes in pkg/config for zone specifiers. These are
straightforward and merely deal with the correspondance between the
schema part in the CLI zone specifier syntax and the catalog part of
the internal ZoneSpecifier struct.
the new file resolver.go, which contains the entry-point
functions for name resolution using a SchemaResolver.
the changes to server/admin.go. These simply get rid of the
virtual database checks (we have virtual schemas, not databases,
now.)
the name resolution algorithms in sem/tree/name_resolution.go
and the corresponding tests in sem/tree/name_resolution_test.go.

This is where it all started and should be compared to the
specifications in the accompanying RFC (rfcs: proposal to make schemas more pg-compatible #21456).
the interface definitions in schema_accessors.go and their
implementations in phy_schema_accessors.go and
logical_schema_accessors.go.

Once the last two items in the list above have been reviewed,
it becomes possible to proceed as follows:

check that the various planNode constructors access the new APIs
consistently, with no surprises.
check that no harm was done in table.go (table cache) and
database.go (database cache).
check that the construction of the main accessors in
session.go make sense.

Suggested review attributions:

@vivekmenezes: items 7 and 9.
@petermattis: items 2 and 6.
@jordanlewis: items 4 and 7.
@benesch: items 1 and 3.
@m-schneider: item 8.
@justinj: item 6.
@rytaft, @andy-kimball: items 2 and 6.
@BramGruneir, @a-robinson: item 5.
@andreimatei: item 10.

Release note (sql change): CockroachDB now recognizes the syntax
db.public.tbl in addition to db.tbl, for better compatibility with
PostgreSQL.

cockroach-teamcity · 2018-02-05T05:46:41Z

This change is

knz · 2018-02-05T05:49:49Z

Note especially the section "Details and suggestions to the reviewers" in the PR description.

cc:

@vivekmenezes: items 7 and 9.
@petermattis, @rytaft, @andy-kimball : items 2 and 6.
@jordanlewis: items 4 and 7.
@benesch: items 1 and 3.
@m-schneider: item 8.
@justinj: items 6.
@BramGruneir, @a-robinson: item 5.
@andreimatei: item 10.

RaduBerinde · 2018-02-05T13:42:41Z

sql/opt changes (item 2) LGTM

Review status: 0 of 90 files reviewed at latest revision, 1 unresolved discussion, some commit checks broke.

pkg/sql/opt/scope.go, line 53 at r1 (raw file):

}

// FindSourceMatchingName implements the tree.ColumnItemResolver interface.

[nit] incorrect name.

Also "is part of" is better than "implements" (which really only makes sense if this is the only method in the interface).

Comments from Reviewable

RaduBerinde · 2018-02-05T14:02:05Z

Review status: 0 of 90 files reviewed at latest revision, 5 unresolved discussions, some commit checks broke.

pkg/sql/sem/tree/name_part.go, line 159 at r1 (raw file):

	// allocation for the slice every time we construct an
	// UnresolvedName. It does imply however that Parts does not have
	// a meaningful "length"; its actually length (the number of parts

[nit, not this change] actual length

pkg/sql/sem/tree/name_resolution.go, line 32 at r1 (raw file):

// - resolution algorithms. These are used to map an unresolved name
//   or pattern to something looked up from the database.
//

[nit] extra lines

pkg/sql/sem/tree/name_resolution.go, line 58 at r1 (raw file):

	}

	return TableName{tblName{

This construction code is repeated a bunch of times, maybe it should be a function.

pkg/sql/sem/tree/name_resolution.go, line 115 at r1 (raw file):

	}

	// Check that all the parts specified are not empty.

This empty checking code appears in many places, it should probably be a separate function.

Comments from Reviewable

jordanlewis · 2018-02-05T14:43:54Z

Is there any way to break this up? It will be very difficult for your reviewers to successfully determine whether a +3,301 −1,574 patch over 90 files looks good.

knz · 2018-02-05T15:06:38Z

That's why I proposed a split of reviewing responsibilities so that each part is bite sized.

…

-- Sent from my Android device with K-9 Mail. Please excuse my brevity.

jordanlewis · 2018-02-05T15:08:00Z

Yes! I should have started by saying thanks for the excellent executive summary and directions for review. I do think that will help.

However, I worry that it'll be hard for any single reviewer to get a big picture of what's happening here. The whole is more than the sum of the parts and all that.

andreimatei · 2018-02-05T15:45:16Z

A silly question - the new functionality (as opposed to refactoring) in this PR is for more than supporting one schema literally called "public", right?

Review status: 0 of 90 files reviewed at latest revision, 5 unresolved discussions, some commit checks broke.

Comments from Reviewable

petermattis · 2018-02-05T15:46:33Z

Splitting the review responsibility seems like a good way for something to slip through the cracks. I imagine there was a way to break this into smaller chunks, but what is done is done.

My main anxiety is that name resolution can affect performance. @jordanlewis can you either show Raphael how to run kv and tpcc or take this PR for a spin? I'm less concerned about correctness given the test cases and the lack of changes to the logic tests. The one exception is that the addition of .public to a few random tests (noted in comments below).

Review status: 0 of 90 files reviewed at latest revision, 32 unresolved discussions, some commit checks broke.

pkg/ccl/sqlccl/targets_test.go, line 70 at r1 (raw file):

		{"data", "TABLE foo, baz", nil, nil, `table "foo" does not exist`},

		{"", "TABLE system.public.foo", []string{"system", "foo"}, nil, ``},

Does TABLE system.foo no longer work? That seems like a backward incompatible change. I thought we were trying to make that work, though, which means we should be adding new test cases instead of modifying existing ones.

pkg/config/zone.go, line 135 at r1 (raw file):

	// Table prefixes in CLI zone specifiers always have the form
	// "table" or "db.table" and do not know about schemas. They never
	// refer to virtual schemas. So migrate the db part in catalog position.

s/part in/part to the/g?

pkg/sql/create_index.go, line 42 at r1 (raw file):

	}

	defer p.useNewDescriptors()()

Note to self: what is this? Why is it not needed before ResolveTargetObject.

pkg/sql/create_stats.go, line 38 at r1 (raw file):

	}

	tableDesc, err := ResolveExistingObject(ctx, p, tn, true /*required*/, requireTableDesc)

No call to defer p.useNewDescriptors()(). I'm not sure what it does, but you're using it elsewhere that ResolveExistingObject is called.

pkg/sql/data_source.go, line 384 at r1 (raw file):

		}
		// Usual case: a table.
		log.VEventf(ctx, 2, "before table src: %q", t)

Did you intend to keep these? log.VEventf calls are not free and this is a low-level routine.

pkg/sql/data_source.go, line 587 at r1 (raw file):

	ctx context.Context, tn *tree.TableName,
) (*sqlbase.TableDescriptor, error) {
	return ResolveExistingObject(ctx, p, tn, true /*required*/, anyDescType)

Do you need a defer p.useNewDescriptors()() line?

pkg/sql/data_source.go, line 806 at r1 (raw file):

// for the set of sources and if it is, qualifies the missing database name.
func (sources multiSourceInfo) checkDatabaseName(tn tree.TableName) (tree.TableName, error) {
	// FIXME XXX

Reminder to address this. And below too.

pkg/sql/database.go, line 107 at r1 (raw file):

// DatabaseDescEditor provides helper methods for using SQL database descriptors.
type DatabaseDescEditor interface {

nit: The previous name wasn't accurate, but this new name isn't accurate either. How about DatabaseStorage? Hmm, that is a bit odd too as this is only the namespace for databases.

pkg/sql/phy_schema_accessors.go, line 1 at r1 (raw file):

// Copyright 2018 The Cockroach Authors.

nit: Do we use the abbreviation phy_ elsewhere? I'd prefer to spell this out as physical_schema_accessors.go for consistency with logical_schema_accessors.

pkg/sql/phy_schema_accessors.go, line 169 at r1 (raw file):

		return nil, nil, newInvalidSchemaError(name.Catalog(), name.Schema())
	}
	dbName, tblName := name.Catalog(), name.Table()

It is unfortunate that we're using Database in some places and Catalog in others and this is leaking throughout the code. Not for this PR: I think we should settle on one name and then leave the other isolated to some small piece of code.

pkg/sql/planner.go, line 216 at r1 (raw file):

	ctx := log.WithLogTagStr(context.Background(), opName, "")

	curDb := "" // FIXME XXX: change this to system later

Reminder to address this.

pkg/sql/resolver.go, line 110 at r1 (raw file):

// resolutions will be able to observe newly added descriptors still
// in the ADD state.
// The caller should use this as follows: defer p.useNewDescriptors()()

Can you expand on when this needs to be called? You're often calling it before calling ResolveExistingObject, but not always. It isn't clear to me what the rule is.

pkg/sql/resolver.go, line 127 at r1 (raw file):

)

var requiredTypeNames = map[requiredType]string{

nit: This can be [...]string. The initialization otherwise stays the same.

pkg/sql/resolver.go, line 162 at r1 (raw file):

func (p *planner) commonLookupFlags(ctx context.Context) CommonLookupFlags {
	return CommonLookupFlags{ctx: ctx, txn: p.txn, required: false, avoidCached: p.avoidCachedDescriptors}

nit: This would be more readable as a line per field.

pkg/sql/schema_accessors.go, line 51 at r1 (raw file):

type (
	// ObjectName is provided for convenience and to make the interface definitions below more intuitive.

I'm not sure I buy the convenience argument. Type aliases were mainly added to ease refactoring when you can't do so "atomically" (i.e. the refactor is spread across repos). On the other hand, I don't particularly mind this.

nit: These comments exceed 100 columns.

pkg/sql/sequence.go, line 263 at r1 (raw file):

			return nil, err
		}
		// FIXME XXX: uses nil resolver

Reminder to address this (or convert to TODO(knz)).

pkg/sql/set_zone_config.go, line 76 at r1 (raw file):

	}

	table, err := params.p.resolveTableForzone(params.ctx, &n.zoneSpecifier)

s/Forzone/ForZone/g

pkg/sql/zone_test.go, line 68 at r1 (raw file):

	jobsRow := sqlutils.ZoneRow{
		ID:           keys.JobsTableID,
		CLISpecifier: "system.public.jobs",

Shouldn't there be test cases for both system.jobs and system.public.jobs?

pkg/sql/pgwire/conn_test.go, line 176 at r1 (raw file):

		}
		query := exec.Stmt.String()
		t.Logf("received query: %s", query)

Did you intend to leave this in?

pkg/sql/sem/tree/name_resolution.go, line 103 at r1 (raw file):

		return &AllTablesSelector{prefix}, nil
	}
	return &TableName{tblName{TableName: Name(n.Parts[0]), TableNamePrefix: prefix}}, nil

Heh, TableName, tblName, TableName, and TableNamePrefix.

pkg/sql/sem/tree/name_resolution.go, line 252 at r1 (raw file):

		}
		// Two parts: D.T.
		// Try use the current database, and be satisfied if it's sufficient to find the object.

s/Try use/Try to use/g.

Should this be protected by if curDB != ""?

pkg/sql/sem/tree/name_resolution.go, line 295 at r1 (raw file):

		}
		// Two parts: D.T.
		// Try use the currebt database, and be satisfied if it's sufficient to find the object.

s/Try use/Try to use/g, s/currebt/current/g

pkg/sql/sem/tree/name_resolution_test.go, line 589 at r1 (raw file):

}

type spath = sessiondata.SearchPath

I believe you can define this inside TestResolveTablePatternOrName.

pkg/sql/sem/tree/name_resolution_test.go, line 591 at r1 (raw file):

type spath = sessiondata.SearchPath

var mpath = sessiondata.MakeSearchPath

You can define this inside TestResolveTablePatternOrName.

pkg/sql/sem/tree/name_resolution_test.go, line 593 at r1 (raw file):

var mpath = sessiondata.MakeSearchPath

func TestResolveTablePatternOrName(t *testing.T) {

Nice tests.

pkg/sql/sem/tree/name_resolution_test.go, line 604 at r1 (raw file):

		expanded string // The prefix after resolution, with hidden fields revealed.
		scName   string // The schema name after resolution.
		err      string // Trror, if expected.

s/Trror/Error/g

pkg/sql/sem/tree/name_resolution_test.go, line 612 at r1 (raw file):

		// Names of length 1.

		{`kv`, `db1`, mpath([]string{"public", "pg_catalog"}), true, `kv`, `db1.public.kv`, `db1.public[1]`, ``},

You could get rid of the []string by changing MakeSearchPath take a ...string argument.

Comments from Reviewable

andreimatei · 2018-02-05T16:05:47Z

Here's my main area of concern: historically we haven't been principled about when we can use "cached" descriptors and when we can't. Our understanding has improved significantly with the last iteration on the schema change RFC, but I'm not sure to what extent the code has caught up. I believe we can always use cached descriptors, as long as the cache is aware of the timestamps involved. I always thought that the allowCached in the Executor can probably go away, and also most of the session.tables structure. So I'd love it if you discussed the current state of caching here and check together that it makes any sense.

Other than that, for point #10 - the changes in session.go - LGTM with the comments I've sprinkled elsewhere - I suspect that we have too many types and interfaces now, but if you tell me we need them, I believe you.

Review status: 0 of 90 files reviewed at latest revision, 36 unresolved discussions, some commit checks broke.

pkg/sql/schema_accessors.go, line 25 at r1 (raw file):

)

// This file proposes to abstract access to the physical schema (the descriptors).

nit: "this file proposes" is a weird phrasing for a code file. "This file contains interfaces for accessing descriptors (physical schema)". It'd also be nice if a hint to what is the opposite of physical - what's the "logical" schema?

pkg/sql/schema_accessors.go, line 28 at r1 (raw file):

//
// The following interfaces are defined:
// - DatabaseAccessor, which provides access to database descriptors.

Unless there's good reason for so many interfaces and without knowing anything about anything, they look too many for my taste. If possible to have less...

pkg/sql/session.go, line 675 at r1 (raw file):

		StatusServer:          statusServer,
		MemMetrics:            s.memMetrics,
		Tables:                &s.tables,

can this guy go away now with the new schemaAccessors?

pkg/sql/session.go, line 681 at r1 (raw file):

		TxnModesSetter:        &s.TxnState,
		SchemaChangers:        &s.TxnState.schemaChangers,
		schemaAccessors:       scInterface,

I'd make this exported like everything else unless you have reason not to

Comments from Reviewable

rytaft · 2018-02-05T21:01:12Z

items 2 and 6 LGTM

Reviewed 7 of 90 files at r1.
Review status: 7 of 90 files reviewed at latest revision, 40 unresolved discussions, some commit checks broke.

pkg/sql/opt/build/scope.go, line 67 at r1 (raw file):

var _ tree.Visitor = &scope{}

// FindSourceMatchingName implements the tree.ColumnItemResolver interface.

[nit] fix name.

pkg/sql/opt/build/scope.go, line 76 at r1 (raw file):

			col := &s.cols[i]
			if col.matches("", colName) {
				// TODO(whomever): source names in a FROM clause also have a

Do you have a pointer we can look at for where this is done correctly (to fix this issue)?

pkg/sql/opt/build/scope.go, line 96 at r1 (raw file):

			if col.table == tblName {
				// TODO(whomever): this improperly fails to recognize when a source table
				// is ambiguous, e.g. SELECT kv.k FROM db1.kv, db2.kv

Ditto - do you have a pointer we can look at for where this is done correctly (to fix these issues)?

pkg/sql/sem/tree/name_resolution.go, line 124 at r1 (raw file):

	// `select "".crdb_internal.tables.table_id from "".crdb_internal.tables`.
	lastCheck := n.NumParts
	if lastCheck > 3 {

There are a lot of bare numbers here. Maybe make an enum for the different name parts?

Comments from Reviewable

BramGruneir · 2018-02-05T21:06:48Z

Item 5, won't this fail admin_test.go\TestAdminAPIDatabaseVirtual() ?

Reviewed 1 of 90 files at r1.
Review status: 8 of 90 files reviewed at latest revision, 40 unresolved discussions, some commit checks broke.

Comments from Reviewable

knz · 2018-02-05T22:03:05Z

Review status: 8 of 90 files reviewed at latest revision, 40 unresolved discussions, some commit checks broke.

pkg/sql/opt/build/scope.go, line 76 at r1 (raw file):

Previously, rytaft wrote…

Do you have a pointer we can look at for where this is done correctly (to fix this issue)?

pkg/sql/data_source.go, func (sources multiSourceInfo) findColumn

As I am currently rewriting this, I'll make an attempt at documenting it better.

Comments from Reviewable

vivekmenezes · 2018-02-06T17:39:37Z

Item 10 LGTM

Review status: 8 of 90 files reviewed at latest revision, 40 unresolved discussions, some commit checks broke.

Comments from Reviewable

vivekmenezes · 2018-02-06T17:54:47Z

Item 7 LGTM

Review status: 8 of 90 files reviewed at latest revision, 40 unresolved discussions, some commit checks broke.

Comments from Reviewable

knz · 2018-02-07T03:25:27Z

Bram yes indeed it broke that test. I had to remove it.

Peter, Rebecca, Vivek thank you!

Andrei regarding desc caching it would be great if we could confirm that indeed we can always use the cache, it would simplify the code very greatly. I am not fond of the hoops I had to jump through to keep that conditional. Let's talk soon.

I am still working on the PR, as CI revealed my last amend broke several things and Nathan reminded me I still had to update the virtual table generators. Will ping when I have news.

Review status: 0 of 88 files reviewed at latest revision, 40 unresolved discussions, all commit checks successful.

pkg/ccl/sqlccl/targets_test.go, line 70 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Does TABLE system.foo no longer work? That seems like a backward incompatible change. I thought we were trying to make that work, though, which means we should be adding new test cases instead of modifying existing ones.

It still works, I was just a bit too trigger-happy on the global search-and-replace. Added the issing tests.

pkg/config/zone.go, line 135 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/part in/part to the/g?

Done.

pkg/sql/create_index.go, line 42 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Note to self: what is this? Why is it not needed before ResolveTargetObject.

It exposes non-public descriptors to the resolution algorithm. This is because CREATE INDEX can target newly created tables.

pkg/sql/create_stats.go, line 38 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

No call to defer p.useNewDescriptors()(). I'm not sure what it does, but you're using it elsewhere that ResolveExistingObject is called.

CREATE STATS cannot target newly created tables inside the same txn (as indicated by the allowAdding=false argument to MustGetTableDesc in the previous code)

pkg/sql/data_source.go, line 384 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Did you intend to keep these? log.VEventf calls are not free and this is a low-level routine.

woops, oversight.

pkg/sql/data_source.go, line 587 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Do you need a defer p.useNewDescriptors()() line?

no, in here we're conditional on what the caller has set (idem as the previous code)

pkg/sql/data_source.go, line 806 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Reminder to address this. And below too.

yes thanks for noticing!

pkg/sql/database.go, line 107 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

nit: The previous name wasn't accurate, but this new name isn't accurate either. How about DatabaseStorage? Hmm, that is a bit odd too as this is only the namespace for databases.

I'll welcome any suggestion. No skin in this game.

pkg/sql/planner.go, line 216 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Reminder to address this.

yep thanks

pkg/sql/resolver.go, line 110 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Can you expand on when this needs to be called? You're often calling it before calling ResolveExistingObject, but not always. It isn't clear to me what the rule is.

Done.

pkg/sql/resolver.go, line 127 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

nit: This can be [...]string. The initialization otherwise stays the same.

oh cool TIL. Thanks!

pkg/sql/resolver.go, line 162 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

nit: This would be more readable as a line per field.

Done.

pkg/sql/schema_accessors.go, line 25 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: "this file proposes" is a weird phrasing for a code file. "This file contains interfaces for accessing descriptors (physical schema)". It'd also be nice if a hint to what is the opposite of physical - what's the "logical" schema?

Done.

pkg/sql/schema_accessors.go, line 28 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Unless there's good reason for so many interfaces and without knowing anything about anything, they look too many for my taste. If possible to have less...

I'd like to discuss this with you and jordan. Fewer interfaces make it harder to layer the functionality cleanly.

pkg/sql/schema_accessors.go, line 51 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I'm not sure I buy the convenience argument. Type aliases were mainly added to ease refactoring when you can't do so "atomically" (i.e. the refactor is spread across repos). On the other hand, I don't particularly mind this.

nit: These comments exceed 100 columns.

Fixed the width.

The utility came in by enabling me to write less code, and making the remaining declarations much more compact. It accelerated my work; for me this is a net win.

pkg/sql/sequence.go, line 263 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Reminder to address this (or convert to TODO(knz)).

no I need to fix this really before this can merge. looking into it.

pkg/sql/session.go, line 675 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

can this guy go away now with the new schemaAccessors?

huh hmm maybe I haven't thought about it. If there are cleanup opportunities I'll look into it at the end.

pkg/sql/session.go, line 681 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I'd make this exported like everything else unless you have reason not to

Yes I have a reason not to, that's to ensure the user clearly decides to either call LogicalSchemaAccessor() or PhysicalSchemaAccessor() as defined below.

pkg/sql/set_zone_config.go, line 76 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/Forzone/ForZone/g

Done.

pkg/sql/opt/scope.go, line 53 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] incorrect name.

Also "is part of" is better than "implements" (which really only makes sense if this is the only method in the interface).

Interesting, TIL. Modified.

pkg/sql/opt/build/scope.go, line 67 at r1 (raw file):

Previously, rytaft wrote…

[nit] fix name.

Doneprevious

pkg/sql/sem/tree/name_part.go, line 159 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit, not this change] actual length

Done.

pkg/sql/sem/tree/name_resolution.go, line 32 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] extra lines

Done.

pkg/sql/sem/tree/name_resolution.go, line 58 at r1 (raw file):

Previously, RaduBerinde wrote…

This construction code is repeated a bunch of times, maybe it should be a function.

Done.

pkg/sql/sem/tree/name_resolution.go, line 103 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Heh, TableName, tblName, TableName, and TableNamePrefix.

Simplified.

pkg/sql/sem/tree/name_resolution.go, line 115 at r1 (raw file):

Previously, RaduBerinde wrote…

This empty checking code appears in many places, it should probably be a separate function.

The 3 occurrences are oh so slightly different from each other. One doesn't use the same start value, and the 2 others don't use the same error message. Suggestions welcome.

pkg/sql/sem/tree/name_resolution.go, line 124 at r1 (raw file):

Previously, rytaft wrote…

There are a lot of bare numbers here. Maybe make an enum for the different name parts?

How do you suggest to call them?

pkg/sql/sem/tree/name_resolution.go, line 252 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/Try use/Try to use/g.

Should this be protected by if curDB != ""?

I had that initially, but the tests showed it was wrong :)

Added a comment to clarify.

pkg/sql/sem/tree/name_resolution.go, line 295 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/Try use/Try to use/g, s/currebt/current/g

Done.

pkg/sql/sem/tree/name_resolution_test.go, line 589 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I believe you can define this inside TestResolveTablePatternOrName.

TIL, done

pkg/sql/sem/tree/name_resolution_test.go, line 591 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

You can define this inside TestResolveTablePatternOrName.

Done

pkg/sql/sem/tree/name_resolution_test.go, line 604 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/Trror/Error/g

Done.

pkg/sql/sem/tree/name_resolution_test.go, line 612 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

You could get rid of the []string by changing MakeSearchPath take a ...string argument.

Yay! done.

pkg/sql/zone_test.go, line 68 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Shouldn't there be test cases for both system.jobs and system.public.jobs?

no my changes here were mistaken. the CLI specifier syntax (which will soon disappear, btw) is not the same as SQL syntax.

pkg/sql/pgwire/conn_test.go, line 176 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Did you intend to leave this in?

Nope!

pkg/sql/phy_schema_accessors.go, line 1 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

nit: Do we use the abbreviation phy_ elsewhere? I'd prefer to spell this out as physical_schema_accessors.go for consistency with logical_schema_accessors.

Ok done

pkg/sql/phy_schema_accessors.go, line 169 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

It is unfortunate that we're using Database in some places and Catalog in others and this is leaking throughout the code. Not for this PR: I think we should settle on one name and then leave the other isolated to some small piece of code.

Agreed

Comments from Reviewable

vivekmenezes · 2018-02-07T10:46:27Z

please do not make the change to have AS OF SYSTEM TIME queries use the descripotor cache in this change. I tried doing that this week and ran into some trouble. The scope of this change is large enough that we shouldn't do that here. Thanks!

The information_schema introspection queries were incorrect after the change made in cockroachdb/cockroach#22371.

knz · 2018-02-15T23:34:13Z

Merged the 6 schema accessor interfaces into just 2 as requested by Andrei.
Was also needed for correctness.

Review status: 98 of 233 files reviewed at latest revision, 34 unresolved discussions, some commit checks pending.

pkg/sql/logictest/testdata/logic_test/information_schema, line 284 at r7 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why are the other schemas hidden though? Shouldn't we see pg_catalog tables here?

Yes we'll see the pg_catalog tables here but only with table_catalog = 'test' (and table_schema = 'pg_catalog'). There is no row where table_catalog = 'other_db' because of the filtering.

pkg/sql/logictest/testdata/logic_test/information_schema, line 293 at r7 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Remove ORDER BY 1, use rowsort.

Done.

pkg/sql/logictest/testdata/logic_test/information_schema, line 1203 at r7 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Don't fully qualify these targets in the next two queries, since we just set the database.

Done.

pkg/sql/opt/optbuilder/builder.go, line 770 at r6 (raw file):

Previously, justinj (Justin Jaffray) wrote…

nit: I always thought these names were less about responsibility and more about giving the person who eventually tackles it a clue as to who to ask for help - I think it might be more valuable to have your name in here?

Oh interesting. TIL.
Nevertheless what is written here still matches your definition.

pkg/sql/sem/tree/name_resolution.go, line 211 at r5 (raw file):

Previously, RaduBerinde wrote…

It should be documented what the interface{} is (or that it's passed through opaquely), both here and for the Resolve functions. Actually, we should make some interfaces with dummy marker functions and use those instead of interface{}. That makes it clearer what's going on, and much easier to find out the possible types for scMeta/objMeta across the codebase.

Ok will do before this merges.

Comments from Reviewable

The information_schema introspection queries were incorrect after the change made in cockroachdb/cockroach#22371.

knz · 2018-02-16T01:58:08Z

Implemented the interface abstractions suggested by Radu.

Review status: 92 of 308 files reviewed at latest revision, 34 unresolved discussions, some commit checks pending.

Comments from Reviewable

"Why": Prior to this patch, CockroachDB only recognized names of the form "db.tbl", like MySQL, whereas PostgreSQL compatibility mandates that "db.public.tbl" be also valid. This needed a change. **What:** The visible tip of this iceberg patch is as follows: - new name resolution functions which are more correct and more easy to use than the previous code! Use `Normalize()` on a `TableName`, then `ResolveExistingObject()` or `ResolveTargetObject()`. - this in turn relies on generic, well-abstracted name resolution algorithms in `sem/tree/name_resolution.go`, the definition of which closely follows the specs in the accompanying RFC. This handles the pg syntax for catalog and schema names, together with compatibility rules with CockroachDB v1.x. - a new schema accessor interface that truly encapsulates schema access! Check out the gorgeous documentation in `sql/schema_accessors.go`. In particular: - `planner` now implements a new interface `SchemaResolver`. Use it. - if you are not happy with `SchemaResolver` directly, use `SchemaAccessor` or its various consistuent interfaces. Depending on whether virtual schemas are to be considered, consider alternatively `planner.LogicalSchemaAccessor()` and `planner.PhysicalSchemaAccessor()`. One may be surprised to see this schema accessor work happen in the same patch. This was, unfortunately, a requirement to successfully plug catalog and schema name resolution in all the right places. Also, it incidentally answers a long-standing demand for a better interface to access descriptors. **How:** The itinerary to achieve this brought me through the following steps: - the various name resolution algorithms are concentrated in a new file `sql/sem/tree/name_resolution.go`. They use abstract interfaces to interface with the "name provider": database, table, column things coming from the database. - in the process of writing the name resolution algorithms, I found that our previous understanding of name resolution was problematic: - the previous code assumed it was always possible to "qualify a name using the current database" by just looking at the name itself and the current database, without any additional information. In contrast, the correct qualification algorithms requires both the current database, the search path, and descriptor lookups. This is why this patch fuses all occurrences of separate "qualification" and "resolution" phases together. Before: `tn = tn.QualifyWithDatabase(tcurDb); desc = MustGetDesc(tn)` After: `desc = ResolveExistingObject(p, tn)` - the resolution code was pushing a `VirtualTabler` very deep in the lookup routines, with the mistaken understanding that VirtualTabler is able to respond to requests for database names. In contrast, VirtualTabler really has nothing to do with database names, and the corresponding code had to get rid of it. This was the first motivation for implementing new schema accessor interfaces. - the path to decide whether to use cached or non-cached descriptors was very hard to read; in many instances the code was using uncached descriptors where cached descriptors would be needed instead. The various APIs were a mess, and information needed to decide whether a lookup was possible or not was not available at the right level(s) of abstraction. This was the second motivation for implementing new schema accessor interfaces. - Once this all was said done, the various consumers of name resolution had to implement the interfaces. They are: - resolution of zone specifiers; - resolution of target names for CCL statements; - resolution of names (both existing and targets) in the `sql` package; - some testing code in `sql/opt` and `sql/opt/build`. - the virtual schemas and tables. Release note (sql change): CockroachDB now recognizes the syntax `db.public.tbl` in addition to `db.tbl`, for better compatibility with PostgreSQL. The handling of the session variable `search_path`, as well as that of the built-in functions `current_schemas()` and `current_schema()`, is now closer to that of PostgreSQL. Release note (sql change): SHOW TABLES FROM can now inspect the tables of a specific shema, for example `SHOW TABLES FROM db.public` or `SHOW TABLES FROM db.pg_catalog`. Release note (sql change): SHOW GRANTS now also shows the schema of the databases and tables.

knz · 2018-02-16T03:45:18Z

Replaced by #22753.

vivekmenezes · 2018-02-16T14:00:43Z

Great job! I'm cuurious as to why you needed another PR to merge this?

…

On Thu, Feb 15, 2018 at 10:45 PM kena ***@***.***> wrote: Closed #22371 <#22371>. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#22371 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ALOpBDzPOiOBv-egU7eoWWg62gljW_UXks5tVPnhgaJpZM4R48p8> .

knz · 2018-02-16T14:13:15Z

I had based this PR off another branch than master, where I had accumulated some unrelated troubleshooting changes that were aiding development. Unfortunately github does not allow me to change the base branch before merging.

I could have merged this PR, and then created another PR to merge the base branch into master; but in the end there would still be two PRs.

The name resolution overhaul (cockroachdb#22371) caused TestDockerCSharp to start failing. Specifically, I suspect that Npgsql performs some query on a table in the pg_catalog schema, which now requires the session database to exist. There seems to be no way to prevent Npgsql from performing this query, so simply instruct it to connect to the system database, which is guaranteed to exist. Also delete a test that checked the oid of the pg_catalog namespace entry. That's changed since the test has been skipped, and seems like a moving target. I'm not sure why we cared about the oid's value in the first palce. Fix cockroachdb#22769. Release note: None

knz requested review from a team February 5, 2018 05:46

knz requested a review from a team as a code owner February 5, 2018 05:46

knz requested review from a team February 5, 2018 05:46

knz force-pushed the 20180202-schema-proto branch from cee36a0 to aaaba1e Compare February 5, 2018 05:53

knz force-pushed the knz/schema-base branch from 6bb56ab to 9651b3e Compare February 7, 2018 02:36

knz force-pushed the 20180202-schema-proto branch from aaaba1e to 603eded Compare February 7, 2018 03:22

knz force-pushed the 20180202-schema-proto branch from 603eded to 3e201f7 Compare February 7, 2018 03:26

nvanbenschoten added a commit to cockroachdb/examples-orms that referenced this pull request Feb 15, 2018

testing: adapt information_schema queries to new schema structure

bbe2745

The information_schema introspection queries were incorrect after the change made in cockroachdb/cockroach#22371.

knz force-pushed the 20180202-schema-proto branch from 1f2c20d to a91748c Compare February 15, 2018 23:31

nvanbenschoten added a commit to cockroachdb/examples-orms that referenced this pull request Feb 15, 2018

testing: adapt information_schema queries to new schema structure

ddaf74b

The information_schema introspection queries were incorrect after the change made in cockroachdb/cockroach#22371.

nvanbenschoten mentioned this pull request Feb 15, 2018

testing: adapt information_schema queries to new schema structure cockroachdb/examples-orms#45

Merged

knz force-pushed the 20180202-schema-proto branch from a91748c to b470b02 Compare February 15, 2018 23:38

nvanbenschoten added a commit to cockroachdb/examples-orms that referenced this pull request Feb 15, 2018

testing: adapt information_schema queries to new schema structure

96d6396

The information_schema introspection queries were incorrect after the change made in cockroachdb/cockroach#22371.

knz force-pushed the 20180202-schema-proto branch 4 times, most recently from fe8c690 to 310277e Compare February 16, 2018 00:31

knz mentioned this pull request Feb 16, 2018

acceptance/csharp: the dotnet pg connector expects a db named after the username #22769

Closed

knz force-pushed the 20180202-schema-proto branch 3 times, most recently from 528adaf to b0d2d32 Compare February 16, 2018 01:55

knz force-pushed the 20180202-schema-proto branch 2 times, most recently from da47831 to fa6797e Compare February 16, 2018 02:26

knz force-pushed the 20180202-schema-proto branch from fa6797e to 15b73e3 Compare February 16, 2018 03:07

knz closed this Feb 16, 2018

knz mentioned this pull request Feb 16, 2018

rfcs: proposal to make schemas more pg-compatible #21456

Merged

benesch mentioned this pull request Mar 22, 2018

acceptance: fix and reenable TestDockerCSharp #24141

Merged

knz deleted the 20180202-schema-proto branch April 27, 2018 18:41

knz mentioned this pull request May 3, 2018

sql: allow virtual table descriptors in a concrete db #10319

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql, *: make CockroachDB understand schema names #22371

sql, *: make CockroachDB understand schema names #22371

knz commented Feb 5, 2018 •

edited

Loading

cockroach-teamcity commented Feb 5, 2018

knz commented Feb 5, 2018 •

edited

Loading

RaduBerinde commented Feb 5, 2018

RaduBerinde commented Feb 5, 2018

jordanlewis commented Feb 5, 2018

knz commented Feb 5, 2018 via email

jordanlewis commented Feb 5, 2018

andreimatei commented Feb 5, 2018

petermattis commented Feb 5, 2018

andreimatei commented Feb 5, 2018

rytaft commented Feb 5, 2018

BramGruneir commented Feb 5, 2018

knz commented Feb 5, 2018

vivekmenezes commented Feb 6, 2018

vivekmenezes commented Feb 6, 2018

knz commented Feb 7, 2018

vivekmenezes commented Feb 7, 2018

knz commented Feb 15, 2018

knz commented Feb 16, 2018

knz commented Feb 16, 2018

vivekmenezes commented Feb 16, 2018 via email

knz commented Feb 16, 2018

sql, *: make CockroachDB understand schema names #22371

sql, *: make CockroachDB understand schema names #22371

Conversation

knz commented Feb 5, 2018 • edited Loading

"What"

"How"

Details and suggestions to the reviewers

cockroach-teamcity commented Feb 5, 2018

knz commented Feb 5, 2018 • edited Loading

RaduBerinde commented Feb 5, 2018

RaduBerinde commented Feb 5, 2018

jordanlewis commented Feb 5, 2018

knz commented Feb 5, 2018 via email

jordanlewis commented Feb 5, 2018

andreimatei commented Feb 5, 2018

petermattis commented Feb 5, 2018

andreimatei commented Feb 5, 2018

rytaft commented Feb 5, 2018

BramGruneir commented Feb 5, 2018

knz commented Feb 5, 2018

vivekmenezes commented Feb 6, 2018

vivekmenezes commented Feb 6, 2018

knz commented Feb 7, 2018

vivekmenezes commented Feb 7, 2018

knz commented Feb 15, 2018

knz commented Feb 16, 2018

knz commented Feb 16, 2018

vivekmenezes commented Feb 16, 2018 via email

knz commented Feb 16, 2018

knz commented Feb 5, 2018 •

edited

Loading

knz commented Feb 5, 2018 •

edited

Loading