Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: implement COMMENT ON TABLE #32442

Merged
merged 1 commit into from
Dec 10, 2018
Merged

sql: implement COMMENT ON TABLE #32442

merged 1 commit into from
Dec 10, 2018

Conversation

hueypark
Copy link
Contributor

@hueypark hueypark commented Nov 17, 2018

Informs #19472.

This patch introduces support for table comments.

The syntax to set or delete a comment is the same as postgres:
COMMENT ON TABLE ... IS .... See:
https://postgresql.org/docs/9.1/sql-comment.html

This also makes pg_catalog.pg_description and obj_description() do
the right thing for compatibility with 3rd party clients.

This is supported by a new system table system.comments, which is
extensible to support comments on other database objects than tables:

  • its type column indicates the type of object, to distinguish
    between db, table, column and others. For now just one type is
    defined.
  • object_id: table or database ID, relative to the type.
  • sub_id: when a comment is placed on an object "inside" another, eg
    a column inside a table.
  • comment: the comment proper.

This design of system.comments mimics pg's own pg_description
which uses the same schema.

Release note (sql change): CockroachDB now supports associating
comments to SQL tables using PostgreSQL's COMMENT ON TABLE
syntax. This also provides proper support for pg's
pg_catalog.pg_description and built-in function obj_description().

Release note (sql change): The SHOW TABLES statement
now supports printing out table comments using the optional phrase
WITH COMMENT, e.g SHOW TABLES FROM mydb WITH COMMENT.

@hueypark hueypark requested review from a team November 17, 2018 04:25
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@knz
Copy link
Contributor

knz commented Nov 20, 2018

Sorry I did not get back to you earlier. Will look now.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so your change overall is valid and achieves the functionality. Thank you for taking a look.

That being said it is creating a "special case" for table comments. If we wanted to generalize this PR to also support database comments, we would need a separate field on database descriptors. Then if we wanted to support column comments, we would need to add separate fields on column descriptors. And so on.

All these new fields would increase the overa;l memory usage by descriptors. Not to count that all the updates will need different code paths to flush the descriptor caches, query them in introspection, etc.

This is the pitfall that Bram was pointing out in #19472 (comment): the more extensible approach is to store the comments in a system table.

The system table would be created like this:

CREATE TABLE system.comments (
   type INT NOT NULL, -- type of object, to distinguish between db and non-db objects
   object_id INT NOT NULL, -- main object ID, this will be usually db/table desc ID
   sub_id INT, -- sub-ID for columns inside table, NULL for other types
   comment STRING, -- the comment
   UNIQUE (type, object_id, sub_id)
)

Then COMMENT ON TABLE db.x would be translated (and planned as)

UPSERT INTO system.comments (type, object_id, comment)
  VALUES('table', tbl_id, comment)

And we would need to be careful during TRUNCATE (which generates new table IDs) to issue an UPDATE on system.comments as well.

Does this make sense to you?

Reviewed 16 of 16 files at r1, 5 of 5 files at r2, 1 of 2 files at r3, 2 of 2 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@hueypark
Copy link
Contributor Author

I understand. I'll be back with improved implementation soon.

@hueypark hueypark closed this Nov 20, 2018
@hueypark hueypark reopened this Nov 23, 2018
@hueypark hueypark requested review from a team November 23, 2018 23:58
@hueypark hueypark requested a review from a team as a code owner November 23, 2018 23:58
@hueypark hueypark requested review from a team November 23, 2018 23:58
@hueypark hueypark requested a review from a team as a code owner November 23, 2018 23:58
@hueypark hueypark requested a review from a team November 23, 2018 23:58
@CLAassistant
Copy link

CLAassistant commented Nov 23, 2018

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very good! I have looked at the PR and this goes all in the right direction.

To complete the reviews we are going to work in two steps. First I will make general high-level comments, and we can iterate on that. Once we have addressed high-level comments I will make more detailed / fine comments.

The first high level comments:

  1. please include a unit test in pkg/sql that adds a comment on a table, then verifies via pg_description that it is visible/available. Then include another unit test that verifies that a comment can be erased.

  2. you have chosen to place postgres OIDs (h.TableOid()) directly in the system.comment table. I think for now, we would prefer to not use postgres-specific logic (e.g. the OID computation) in the system tables. So I think it would be better to store the CockroachDB table IDs / column IDs inside system.comments and compute the postgres OID only when pg_description is queried.

Regarding this point, however, I would also like some advice from @jordanlewis: the OID computation (oidHasher.TableOid) currently uses table descriptors as input. However the proposed change here will populate just the table ID in the system.comments table. So we would need to look up the descriptor before we can produce an Oid. This seems wasteful: why cannot we compute a table Oid from the table ID directly?

  1. this point is related to point 1 above. You need to explain (in English, in the commit message) what is the data model you are using in this change. The data model is an explanation of the data structure used to represent comments, with examples. Your explanation must state:
  • how the system.comment table is holding comments for tables.
  • what happens when new tables are added, existing tables are deleted or renamed.
  • it should also suggest how we can add support for database and column comments in the future.
  1. you need to extend your PR to also support TRUNCATE. In CockroachDB, truncate is equivalent to a DROP + CREATE, so the table ID changes. Your PR must extend the TRUNCATE logic to move the comment from the old ID to the new ID. This also must be exercised in a unit test.

  2. I think you also need to extend the PR to support DROP TABLE (so that it also removes the comment). This also needs a test.

  3. it would be useful to include the table comment as an additional column in the output of SHOW TABLES (this is optional - do not feel obliged to do this. It is "nice to have". If you choose to do this, consider using a LEFT OUTER JOIN to populate this column.) It is interesting to think about this problem because it can help think about points 2 and 3 above.

Reviewed 18 of 18 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/keys/constants.go, line 334 at r5 (raw file):

	// CommentType is type for system.comments
	TableCommentType = 1

Can you explain to me what this value is for?

Copy link
Contributor Author

@hueypark hueypark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I will do that soon.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making the extra step!

There is just one simplification needed and then this will be good to go.

I also kindly request that you squash all the commits into one before we merge. The various release notes must appear under each other in the (single) commit message.

Thank you again

Reviewed 1 of 2 files at r8, 15 of 25 files at r9, 1 of 1 files at r10.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/ccl/importccl/import_stmt_test.go, line 457 at r10 (raw file):

			data: testPgdumpFk,
			query: map[string][][]string{
				`SHOW TABLES WITH COMMENT`: {{"cities", "NULL"}, {"weather", "NULL"}},

I think in this test it is better to remove with comment and come back to the original code.


pkg/sql/logictest/testdata/logic_test/crdb_internal, line 13 at r10 (raw file):


query TT
SHOW TABLES FROM crdb_internal WITH COMMENT

please remove with comment here and revert the test to the original code.


pkg/sql/logictest/testdata/logic_test/fk, line 809 at r10 (raw file):

# refers is not visible because it is in the ADD state.
query TT
SHOW TABLES FROM test WITH COMMENT

ditto here and all the other logic test files.


pkg/sql/logictest/testdata/logic_test/show_source, line 142 at r10 (raw file):


query TT colnames
SELECT * FROM [SHOW TABLES FROM system WITH COMMENT]

This is the only test file where you are going to keep WITH COMMENT.

However, it should test both variants: one test with "WITH COMMENT", and one test without.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also rebase your branch on top of master to resolve the conflicts.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Contributor Author

@hueypark hueypark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/logictest/testdata/logic_test/show_source, line 142 at r10 (raw file):

Previously, knz (kena) wrote…

This is the only test file where you are going to keep WITH COMMENT.

However, it should test both variants: one test with "WITH COMMENT", and one test without.

Done.


pkg/ccl/importccl/import_stmt_test.go, line 457 at r10 (raw file):

Previously, knz (kena) wrote…

I think in this test it is better to remove with comment and come back to the original code.

Done.


pkg/sql/logictest/testdata/logic_test/crdb_internal, line 13 at r10 (raw file):

Previously, knz (kena) wrote…

please remove with comment here and revert the test to the original code.

Done.


pkg/sql/logictest/testdata/logic_test/fk, line 809 at r10 (raw file):

Previously, knz (kena) wrote…

ditto here and all the other logic test files.

Done.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: - thank you so much for your contribution.

Reviewed 28 of 28 files at r11.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/show_tables.go, line 56 at r6 (raw file):

Previously, hueypark (Jaewan Park) wrote…

I still think we need that check. Examples follow,

Before the start we need preparation.

CREATE TABLE foo (a INT);
TRUNCATE foo;
  1. If I do not check PUBLIC case, we got DROP state rows,
SELECT
	i.table_name, c.comment
FROM
	huey.information_schema.tables AS i
	LEFT JOIN crdb_internal.tables AS t
	ON
		i.table_name = t.name
		AND i.table_catalog = t.database_name
	LEFT JOIN system.comments AS c
	ON t.table_id = c.object_id
WHERE
	table_schema = 'public'
	AND (t.state = 'PUBLIC' OR t.state IS NULL)
	AND (t.database_name = 'huey' OR t.database_name IS NULL)
ORDER BY
	table_schema, table_name;
---
table_name comment
foo NULL

If I remove 'PUBLIC' state check,

SELECT
	i.table_name, c.comment
FROM
	huey.information_schema.tables AS i
	LEFT JOIN crdb_internal.tables AS t
	ON
		i.table_name = t.name
		AND i.table_catalog = t.database_name
	LEFT JOIN system.comments AS c
	ON t.table_id = c.object_id
WHERE
	table_schema = 'public'
	--AND (t.state = 'PUBLIC' OR t.state IS NULL)
	AND (t.database_name = 'huey' OR t.database_name IS NULL)
ORDER BY
	table_schema, table_name;
---
table_name comment
foo NULL
foo NULL
  1. and if I do not check NULL case SHOW TABLES FROM crdb_internal will return 0 row.
SELECT
	i.table_name, c.comment
FROM
	huey.information_schema.tables AS i
	LEFT JOIN crdb_internal.tables AS t
	ON
		i.table_name = t.name
		AND i.table_catalog = t.database_name
	LEFT JOIN system.comments AS c
	ON t.table_id = c.object_id
WHERE
	table_schema = 'crdb_internal'
	AND (t.state = 'PUBLIC' OR t.state IS NULL)
	AND (t.database_name = 'huey' OR t.database_name IS NULL)
ORDER BY
	table_schema, table_name;
---
many rows...

If I remove NULL state check,

SELECT
	i.table_name, c.comment
FROM
	huey.information_schema.tables AS i
	LEFT JOIN crdb_internal.tables AS t
	ON
		i.table_name = t.name
		AND i.table_catalog = t.database_name
	LEFT JOIN system.comments AS c
	ON t.table_id = c.object_id
WHERE
	table_schema = 'crdb_internal'
	AND (t.state = 'PUBLIC')-- OR t.state IS NULL)
	AND (t.database_name = 'huey' OR t.database_name IS NULL)
ORDER BY
	table_schema, table_name;
---
no row

Yes I see it now. Thank you.

@knz knz force-pushed the master branch 2 times, most recently from 3e04d7a to 1f2cdbb Compare December 9, 2018 18:50
@knz
Copy link
Contributor

knz commented Dec 9, 2018

@bdarnell I have decided to keep the comment output of SHOW TABLES optional in the first iteration because it is rather disruptive to a lot of tests, and we have made 3rd party tools (dbeaver) very unhappy recently by changing the schema.

I suggest we postpone the change "make comment visible always" until later in the cycle when we enter the QA process of verifying compatibility.

@knz
Copy link
Contributor

knz commented Dec 9, 2018

bors r+

craig bot pushed a commit that referenced this pull request Dec 9, 2018
32442: sql: implement `COMMENT ON TABLE` r=knz a=hueypark

Informs  #19472.

This patch introduces support for table comments.

The syntax to set or delete a comment is the same as postgres:
`COMMENT ON TABLE ... IS ...`. See:
postgresql.org/docs/9.1/sql-comment.html

This also makes `pg_catalog.pg_description` and `obj_description()` do
the right thing for compatibility with 3rd party clients.

This is supported by a new system table `system.comments`, which is
extensible to support comments on other database objects than tables:

- its `type` column indicates the type of object, to distinguish
  between db, table, column and others. For now just one type is
  defined.
- `object_id`: table or database ID, relative to the `type`.
- `sub_id`: when a comment is placed on an object "inside" another, eg
  a column inside a table.
- `comment`: the comment proper.

This design of `system.comments` mimics pg's own `pg_description`
which uses the same schema.

Release note (sql change): CockroachDB now supports associating
comments to SQL tables using PostgreSQL's `COMMENT ON TABLE`
syntax. This also provides proper support for pg's
`pg_catalog.pg_description` and built-in function `obj_description()`.

Release note (sql change): The `SHOW TABLES` statement
now supports printing out table comments using the optional phrase
`WITH COMMENT`, e.g `SHOW TABLES FROM mydb WITH COMMENT`.

Co-authored-by: Jaewan Park <jaewan.huey.park@gmail.com>
@craig
Copy link
Contributor

craig bot commented Dec 9, 2018

Build failed

This patch introduces support for table comments.

The syntax to set or delete a comment is the same as postgres:
`COMMENT ON TABLE ... IS ...`. See:
https://www.postgresql.org/docs/9.1/sql-comment.html

This also makes `pg_catalog.pg_description` and `obj_description()` do
the right thing for compatibility with 3rd party clients.

This is supported by a new system table `system.comments`, which is
extensible to support comments on other database objects than tables:

- its `type` column indicates the type of object, to distinguish
  between db, table, column and others. For now just one type is
  defined.
- `object_id`: table or database ID, relative to the `type`.
- `sub_id`: when a comment is placed on an object "inside" another, eg
  a column inside a table.
- `comment`: the comment proper.

This design of `system.comments` mimics pg's own `pg_description`
which uses the same schema.

Release note (sql change): CockroachDB now supports associating
comments to SQL tables using PostgreSQL's `COMMENT ON TABLE`
syntax. This also provides proper support for pg's
`pg_catalog.pg_description` and built-in function `obj_description()`.

Release note (sql change): The `SHOW TABLES` statement
now supports printing out table comments using the optional phrase
`WITH COMMENT`, e.g `SHOW TABLES FROM mydb WITH COMMENT`.
@knz
Copy link
Contributor

knz commented Dec 10, 2018

bors r+

craig bot pushed a commit that referenced this pull request Dec 10, 2018
32442: sql: implement `COMMENT ON TABLE` r=knz a=hueypark

Informs  #19472.

This patch introduces support for table comments.

The syntax to set or delete a comment is the same as postgres:
`COMMENT ON TABLE ... IS ...`. See:
https://postgresql.org/docs/9.1/sql-comment.html

This also makes `pg_catalog.pg_description` and `obj_description()` do
the right thing for compatibility with 3rd party clients.

This is supported by a new system table `system.comments`, which is
extensible to support comments on other database objects than tables:

- its `type` column indicates the type of object, to distinguish
  between db, table, column and others. For now just one type is
  defined.
- `object_id`: table or database ID, relative to the `type`.
- `sub_id`: when a comment is placed on an object "inside" another, eg
  a column inside a table.
- `comment`: the comment proper.

This design of `system.comments` mimics pg's own `pg_description`
which uses the same schema.

Release note (sql change): CockroachDB now supports associating
comments to SQL tables using PostgreSQL's `COMMENT ON TABLE`
syntax. This also provides proper support for pg's
`pg_catalog.pg_description` and built-in function `obj_description()`.

Release note (sql change): The `SHOW TABLES` statement
now supports printing out table comments using the optional phrase
`WITH COMMENT`, e.g `SHOW TABLES FROM mydb WITH COMMENT`.

Co-authored-by: Jaewan Park <jaewan.huey.park@gmail.com>
@craig
Copy link
Contributor

craig bot commented Dec 10, 2018

Build succeeded

@craig craig bot merged commit 4dbf01a into cockroachdb:master Dec 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants