sql: make pg_catalog OIDs stable and refer to the descriptor IDs directly #33697

hueypark · 2019-01-13T11:47:25Z

Fixes #32940
Fixes #32963
Fixes #34710
Enables #32964

This patch changes the definition of vtables to assign a unique table
ID to every virtual table. Moreover, it also extends vtable
descriptors to assign proper column IDs to virtual table columns. This
aims to support the logical planning code, the table and plan caches,
and simplify+fix the introspection tables. Incidentally it also makes
it possible to use COMMENT ON on virtual tables and their columns too.

The table IDs are picked descending from MaxUint32 although this may
be refined in future PRs to accommodate the numbering of virtual
schema descriptors.
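
As a rough illustration of "descending from MaxUint32", here is a minimal sketch of such an allocator. The `vtableIDAllocator` type and its methods are hypothetical names invented for this example and are not taken from the patch.

```go
package main

import (
	"fmt"
	"math"
)

// vtableIDAllocator hands out IDs for virtual tables, starting at
// math.MaxUint32 and counting down. The type and method names are
// hypothetical and exist only for this illustration.
type vtableIDAllocator struct {
	next uint32
}

func newVTableIDAllocator() *vtableIDAllocator {
	return &vtableIDAllocator{next: math.MaxUint32}
}

// nextID returns the next unused ID and moves the cursor down by one.
func (a *vtableIDAllocator) nextID() uint32 {
	id := a.next
	a.next--
	return id
}

func main() {
	alloc := newVTableIDAllocator()
	// The table names are real pg_catalog tables, but the IDs they
	// receive here are purely illustrative.
	for _, name := range []string{"pg_attribute", "pg_class", "pg_namespace"} {
		fmt.Printf("%-14s -> %d\n", name, alloc.nextID())
	}
}
```

Counting down from the top of the range keeps these IDs far away from the IDs of regular descriptors, which grow upward from small values.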

Incidentally pg_catalog.pg_attribute is fixed to properly use the
column ID in column attnum, so that attnum remains stable across
column drops.
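
To show why the column ID is the stable choice, the sketch below compares two numbering schemes after a column drop. It assumes, for illustration only, that the unstable alternative numbers columns by their current position; the `column` type and helper names are hypothetical.

```go
package main

import "fmt"

// column is a simplified, hypothetical stand-in for a column descriptor.
type column struct {
	ID   uint32
	Name string
}

// attnumByPosition numbers columns by their current 1-based position.
// This is the assumed unstable scheme used for comparison.
func attnumByPosition(cols []column) map[string]uint32 {
	out := make(map[string]uint32, len(cols))
	for i, c := range cols {
		out[c.Name] = uint32(i + 1)
	}
	return out
}

// attnumByID uses the column ID directly, as the PR description states.
func attnumByID(cols []column) map[string]uint32 {
	out := make(map[string]uint32, len(cols))
	for _, c := range cols {
		out[c.Name] = c.ID
	}
	return out
}

// dropColumn returns a copy of cols without the named column.
func dropColumn(cols []column, name string) []column {
	var out []column
	for _, c := range cols {
		if c.Name != name {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	cols := []column{{1, "a"}, {2, "b"}, {3, "c"}}
	after := dropColumn(cols, "b")
	fmt.Println("by position:", attnumByPosition(after))
	fmt.Println("by ID:      ", attnumByID(after))
}
```

With position-based numbering, dropping `b` renumbers `c`; with ID-based numbering, `c` keeps the value it had before the drop.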

Release note (sql change): Virtual tables in pg_catalog,
information_schema etc now support COMMENT ON like regular tables.

knz

Hi again!
Thank you very much for your change, and the code as usual looks very good.

I'd like to come back to two points.

1. The numbering approach. You have chosen 1, 19, 12 bits for the 3 parts. This raises several questions:
- How did you choose these 3 values? Did you perform experiments? Did you conduct reasoning? In any case, the motivation for this choice must be explained in the commit message and in a comment close to where this is defined in the code.
- Why do you allocate separate areas in the OID for table IDs and column IDs? If the first bit is set, then I'd expect all remaining 30 bits to become a column ID, and conversely if the first bit is not set, then all remaining 30 bits become a table ID. Why do you need both table and column ID side-by-side?
2. The release note: this must explain what problem this is solving. You should explain in the commit message what the problem was before this PR, what the new situation is afterwards (how this is different), and why it is good. For example: "previous to this patch, CockroachDB had problem X. This was inconvenient because it would cause users to experience Y and Z. This patch addresses problem X by doing ABC. Now users cannot experience Y any more because of AB, and Z because of C."
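
For readers following the bit-layout question, here is a minimal sketch of what a 1/19/12 split of a 32-bit OID could look like. The field order (flag in the top bit, then table ID, then column ID) and the function names are assumptions made for this illustration; the thread does not show the actual layout used in the first revision.

```go
package main

import "fmt"

const (
	tableBits   = 19
	columnBits  = 12
	maxTableID  = 1<<tableBits - 1  // largest table ID that fits in 19 bits
	maxColumnID = 1<<columnBits - 1 // largest column ID that fits in 12 bits
)

// packOID builds a 32-bit value from a 1-bit flag, a 19-bit table ID and
// a 12-bit column ID. The field order is an assumption for illustration.
func packOID(flag bool, tableID, columnID uint32) (uint32, error) {
	if tableID > maxTableID {
		return 0, fmt.Errorf("table ID %d does not fit in %d bits", tableID, tableBits)
	}
	if columnID > maxColumnID {
		return 0, fmt.Errorf("column ID %d does not fit in %d bits", columnID, columnBits)
	}
	var oid uint32
	if flag {
		oid = 1 << 31
	}
	oid |= tableID << columnBits
	oid |= columnID
	return oid, nil
}

// unpackOID reverses packOID.
func unpackOID(oid uint32) (flag bool, tableID, columnID uint32) {
	flag = oid>>31 == 1
	tableID = (oid >> columnBits) & maxTableID
	columnID = oid & maxColumnID
	return flag, tableID, columnID
}

func main() {
	oid, err := packOID(true, 53, 7)
	if err != nil {
		panic(err)
	}
	flag, tableID, columnID := unpackOID(oid)
	fmt.Println(flag, tableID, columnID)
	fmt.Println("max table ID:", maxTableID, "max column ID:", maxColumnID)
}
```

The sketch makes the cost of a side-by-side layout visible: any table ID above `maxTableID` or column ID above `maxColumnID` is rejected, which is the reduced ID range discussed later in the thread.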

Reviewed 15 of 15 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained

It turns out that many virtual tables (e.g. in `pg_catalog` schema) have instances for each catalog (and the context differs within each catalog). Unfortunately, these instances all share the same virtual table ID. This violates our assumptions and prevents us from invalidating cached plans properly. This change fixes this by also putting the database ID in the StableID for virtual tables.

Note that the situation is in fact even worse: all virtual tables have the *same* ID. This is being addressed separately (cockroachdb#33697, cockroachdb#32963). Fortunately this doesn't currently cause problems in practice because virtual tables have static names and can't be involved in FKs.

Release note: None
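
One way to picture "putting the database ID in the StableID" is a 64-bit value with the database ID in the upper half and the table ID in the lower half. This is a sketch under that assumption; the `stableID` type and `virtualTableStableID` function are hypothetical and the real encoding may differ.

```go
package main

import "fmt"

// stableID is a hypothetical 64-bit identifier used by a plan cache to
// tell catalog objects apart.
type stableID uint64

// virtualTableStableID combines a database ID and a virtual table ID so
// that the same virtual table seen through two databases gets two
// different identifiers. The high/low split is an assumption.
func virtualTableStableID(dbID, tableID uint32) stableID {
	return stableID(dbID)<<32 | stableID(tableID)
}

func main() {
	const vtableID = 4294967290 // illustrative virtual table ID
	fmt.Println(virtualTableStableID(50, vtableID))
	fmt.Println(virtualTableStableID(51, vtableID))
}
```

Because both inputs are 32 bits wide, the two halves never overlap, so distinct (database, table) pairs always map to distinct values.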

hueypark · 2019-02-03T03:10:51Z

@knz
Thank you for your kind review.

I would like to hear your advice before improving the PR. I have to allocate separate areas in the OID for table IDs and column IDs, because column IDs in CockroachDB are not unique.

Solution 1. Make CockroachDB column IDs unique.
- Pros: A column ID can easily be turned into an OID, like a table ID, and the overall OID processing is simplified.
- Cons: Because the previous method of determining column IDs is different, separate work is needed to support older versions. Maybe it will be available in 3.x versions.
Solution 2. Assign an appropriate number of bits to each ID, as in my PR.
- Pros: No need to worry about backward compatibility.
- Cons: The range of IDs we can use is dramatically reduced.

Which way do you think is more appropriate?

knz · 2019-02-04T10:12:32Z

> I have to allocate separate areas in the OID for table IDs and column IDs, because column IDs in CockroachDB are not unique.

I understand the column IDs in CockroachDB are not unique. They are not unique in PostgreSQL either.

Whether that matters is another question -- why do you think it matters? Is there a pg_catalog table or elsewhere where all the columns are listed, and their column ID is a primary key or unique column for the table?

hueypark · 2019-02-06T14:24:29Z

Thank you for your advice. Now I agree that the column ID need not be unique.

I have added a new commit, so please check back.

knz

Yes this looks better.

Two final remarks:

1. Since you have now ensured that all tables, even virtual tables, have distinct IDs, it is possible to simplify the h.IndexOid method; see my comment below.
2. I have double-checked, and it appears that Postgres does indeed allocate globally unique OIDs for columns (not column IDs) in a very few places, such as pg_attrdef. For these uses (and these uses only) I encourage you to add a hash function that combines the table ID with the column ID. See my recommendation below.
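
A hash that combines the table ID with the column ID could be sketched as follows. This is not the code from the PR: the function name `hashColumnOid`, the choice of 32-bit FNV, and the byte encoding are assumptions made for this illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// hashColumnOid derives a 32-bit value from a (table ID, column ID) pair.
// Column IDs are only unique within a table, so the table ID is mixed in
// to tell apart columns of different tables. The name and the use of
// FNV are assumptions for this sketch.
func hashColumnOid(tableID, columnID uint32) uint32 {
	h := fnv.New32()
	var buf [8]byte
	binary.BigEndian.PutUint32(buf[0:4], tableID)
	binary.BigEndian.PutUint32(buf[4:8], columnID)
	h.Write(buf[:]) // Write on a hash.Hash never returns an error.
	return h.Sum32()
}

func main() {
	fmt.Println(hashColumnOid(53, 1))
	fmt.Println(hashColumnOid(54, 1))
}
```

As with any 32-bit hash, collisions between different pairs are possible in principle, so this only fits uses where a very small collision risk is acceptable.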

Reviewed 14 of 14 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @hueypark)

pkg/sql/pg_catalog.go, line 334 at r2 (raw file):

					defSrc := tree.NewDString(*column.DefaultExpr)
					return addRow(
						columnOid(column.ID),            // oid

here use h.ColumnID(table.ID, column.ID)

pkg/sql/pg_catalog.go, line 2429 at r2 (raw file):

func (h oidHasher) writeDB(db *sqlbase.DatabaseDescriptor) {
	h.writeUInt32(uint32(db.ID))
	h.writeStr(db.Name)

I think the use of the database name can be removed here.

pkg/sql/pg_catalog.go, line 2438 at r2 (raw file):

func (h oidHasher) writeTable(table *sqlbase.TableDescriptor) {
	h.writeUInt32(uint32(table.ID))
	h.writeStr(table.Name)

The table name can be removed here, because now all the table IDs are unique.

pkg/sql/pg_catalog.go, line 2471 at r2 (raw file):

	h.writeTypeTag(indexTypeTag)
	h.writeDB(db)
	h.writeSchema(scName)

you can remove the database and schema from here because now all table IDs are unique.

pkg/sql/pg_catalog.go, line 2505 at r2 (raw file):

}

func (h oidHasher) ColumnOid(

I recommend re-introducing this, with only the two arguments TableDescriptor+ColumnDescriptor.

pkg/sql/pg_catalog.go, line 2574 at r2 (raw file):

}

func columnOid(columnID sqlbase.ColumnID) *tree.DOid {

prefer h.ColumnOid instead.

knz

This is very good! thank you!

Reviewed 10 of 10 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained

knz · 2019-02-08T13:11:54Z

bors r+

knz · 2019-02-08T13:12:39Z

bors r-

craig · 2019-02-08T13:12:39Z

Canceled

knz · 2019-02-08T13:12:46Z

bors r+

33697: sql: make pg_catalog OIDs stable and refer to the descriptor IDs directly r=knz a=hueypark

Fixes #32940
Fixes #32963
Fixes #34710
Enables #32964

This patch changes the definition of vtables to assign a unique table ID to every virtual table. Moreover, it also extends vtable descriptors to assign proper column IDs to virtual table columns. This aims to support the logical planning code, the table and plan caches, and simplify+fix the introspection tables. Incidentally it also makes it possible to use COMMENT ON on virtual tables and their columns too.

The table IDs are picked descending from MaxUint32 although this may be refined in future PRs to accommodate the numbering of virtual schema descriptors.

Incidentally `pg_catalog.pg_attribute` is fixed to properly use the column ID in column `attnum`, so that `attnum` remains stable across column drops.

Release note (sql change): Virtual tables in `pg_catalog`, `information_schema` etc now support `COMMENT ON` like regular tables.

Co-authored-by: Jaewan Park <jaewan.huey.park@gmail.com>

hueypark · 2019-02-08T13:15:17Z

@knz
Thank you for your careful and detailed review.

knz · 2019-02-08T13:15:43Z

@andy-kimball @RaduBerinde you're going to love this :)

knz · 2019-02-08T13:20:45Z

@hueypark we are very grateful for this change.

Even though you were primarily interested in ensuring compatibility with COMMENT ON, this change is very influential for many projects inside CockroachDB. With this change you have unlocked progress in multiple areas at once. This is not only very good work; it has larger-scale value for the future of CockroachDB too.

This uses column IDs directly for the `attnum` column, so that the value remains stable across column drops. This is a subset of the changes in cockroachdb#33697. Required by ORMs, and requested by the TypeORM developer looking at CockroachDB compatibility in TypeORM.

Release note (bug fix): the value of the `attnum` column in `pg_catalog.pg_attribute` now remains stable across column drops.

craig · 2019-02-08T13:43:20Z

Build succeeded

hueypark · 2019-02-08T13:54:48Z

@knz Thank you for your praise. It is a great motivation for me.

RaduBerinde · 2019-02-08T19:11:26Z

Thanks @hueypark, this is great!

hueypark requested review from a team January 13, 2019 11:47

hueypark requested a review from a team as a code owner January 13, 2019 11:47

hueypark requested review from a team January 13, 2019 11:47

hueypark mentioned this pull request Jan 13, 2019

sql: make pg_catalog OIDs stable and refer to the descriptor IDs directly #32940

Closed

hueypark force-pushed the pg-oid-2 branch from c645c8b to b056e7f Compare January 15, 2019 14:11

knz mentioned this pull request Jan 29, 2019

opt: fix cached plan invalidation with some virtual tables #34302

Merged

knz reviewed Jan 29, 2019

hueypark force-pushed the pg-oid-2 branch from b056e7f to 0e6771a Compare February 6, 2019 14:19

knz reviewed Feb 7, 2019

hueypark force-pushed the pg-oid-2 branch from 0e6771a to 03e2ec7 Compare February 8, 2019 12:23

knz approved these changes Feb 8, 2019

knz force-pushed the pg-oid-2 branch from 03e2ec7 to b1580f2 Compare February 8, 2019 13:09

knz mentioned this pull request Feb 8, 2019

Unexpected behaviour in "pg_attribute" on column drop and recreate #34710

Closed

knz mentioned this pull request Feb 8, 2019

backport-2.1: sql: fix pg_catalog.pg_attribute #34734

Merged

craig bot merged commit b1580f2 into cockroachdb:master Feb 8, 2019

smklein mentioned this pull request Jan 2, 2024

Optimize OID lookup for user-defined types oxidecomputer/omicron#4735

Merged
