Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add labels to Tuples #25283

Merged
merged 1 commit into from
May 16, 2018
Merged

Conversation

BramGruneir
Copy link
Member

@BramGruneir BramGruneir commented May 3, 2018

In order to further solving #16971, the ability to access the content of a
tuple based on a label is required. Until now, tuples had no labels and thus
there was no way access a specific element in a tuple. This commit adds labels
to tuples and also the ability to specify them directly in the SQL dialect.
However, it does not go to the next step of adding the ability to choose an
element based on the label. That will be in a follow up commit.

Note that the ability to give tuples labels is an extension on top of Postgres'
SQL grammar. But it should be noted that this functionality is mostly for
internal use and ease of testing.

Specifically, one can now specify the names for a tuple using the following:

((1,2,3) AS a,b,c)

or

(ROW(1,2,3) AS a,b,c)

Similar to unlabeled tuples, the ROW keyword is optional unless there is only
a single element in the tuple.

(ROW('a') AS name)

However, the enclosing parentheses are always required.

Furthermore, when comparing tuples, the labels are ignored and only the types
of the elements must be comparable to each other.

This change started out by adding a Labels element to both the Tuple and
TTuple structs. And following the though where all the required changes led.
After this, updates to Tuple's TypeCheck where required and then a lot of
testing to ensure the correct format was applied when printing or typing them.

Release note (sql change): Tuples can now be specified with labels. They can be
specified using the new grammar ((1,2,3) AS a,b,c).

@BramGruneir BramGruneir added this to the 2.1 milestone May 3, 2018
@BramGruneir BramGruneir requested review from knz and a team May 3, 2018 17:39
@BramGruneir BramGruneir requested a review from a team as a code owner May 3, 2018 17:39
@BramGruneir BramGruneir requested review from a team May 3, 2018 17:39
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@BramGruneir BramGruneir requested a review from rytaft May 3, 2018 17:40
@BramGruneir
Copy link
Member Author

@rytaft There are some minor changes to optimizer code here and this introduces a change to our SQL grammar.

@knz
Copy link
Contributor

knz commented May 3, 2018

Thank you for this. The code is mostly good. I am agreeably surprised by the regularity and the thoroughness of this change. Minor comments below.


Reviewed 20 of 20 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved.


pkg/sql/logictest/testdata/logic_test/tuple, line 689 at r1 (raw file):

SELECT ((1, 2, 'hello', NULL, NULL) AS a1, b2, c3, d4, e5), ((true, NULL, (false, 6.6, false)) AS a1, b2, c3)
----
(1,2,'hello',,) (true,,(false, 6.6, false))

wait why is NULL not printed. I'd have expected it to be printed.


pkg/sql/logictest/testdata/logic_test/tuple, line 694 at r1 (raw file):

query BBB
SELECT
	((2, 2) AS a, b) < ((1, 1) AS c, d),

remove tabs here and below


pkg/sql/parser/sql.y, line 6749 at r1 (raw file):

    t.Labels = $4.nameList()
    $$.val = &t
  }

This wants some changes in parse_test.go.


pkg/sql/sem/tree/type_check.go, line 813 at r1 (raw file):

	// If there are labels, some extra checks are needed.
	if len(expr.Labels) != len(expr.Exprs) {

Pull this check up above, then share the rest of the code for the two cases.


Comments from Reviewable

@andy-kimball
Copy link
Contributor

How does this relate to Postgres composite types? How come we're going off in this direction rather than implementing some subset of composite types? Or is the plan to take an initial step in that direction? Is our Tuple type (before this change and after) compatible with Postgres composite types?


Review status: all files reviewed at latest revision, 4 unresolved discussions.


pkg/sql/sem/types/types.go, line 323 at r1 (raw file):

// TTuple is the type of a DTuple.
type TTuple struct {

TTuple is embedded in TTable. That means TTable now has two sets of labels.


Comments from Reviewable

@andy-kimball
Copy link
Contributor

Review status: all files reviewed at latest revision, 5 unresolved discussions.


pkg/sql/sem/types/types.go, line 339 at r1 (raw file):

			}
			buf.WriteString(typ.String())
			if t.Labels != nil {

I believe this string gets printed out by the pg_typeof function? Are we sure it's a good idea to make the labels part of that? I notice that TTable is printing something like set of {int, string} rather than the labels. Why the inconsistency?


Comments from Reviewable

@knz
Copy link
Contributor

knz commented May 4, 2018

How does this relate to Postgres composite types?

It relates well (see below)

How come we're going off in this direction rather than implementing some subset of composite types?

There's no "off" nor "rather". We are going purposefully in this direction exactly in order to implement postgres' composite types.

Or is the plan to take an initial step in that direction?

Yes this is one of the steps towards addressing #24866 in turn needed to solve #16971 in turn needed to solve #16491.

Is our Tuple type (before this change and after) compatible with Postgres composite types?

With a resounding yes: if you look at pg's code they actually use a single data structure with a field column labels to represent every tuple type (with or without label). We just had failed to recognize the usefulness of this earlier. This patch fixes that.


Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.


pkg/sql/sem/types/types.go, line 323 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

TTuple is embedded in TTable. That means TTable now has two sets of labels.

Bram and I found out that TTable needs to go away. I introduced TTable on a hunch that since then proved to be a mistake. There's no need for it -- anything that looks like a table disappears from scalar expressions prior to type checking (with the SRF conversion which we're not yet doing completely, but will be done before 2.1).


pkg/sql/sem/types/types.go, line 339 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

I believe this string gets printed out by the pg_typeof function? Are we sure it's a good idea to make the labels part of that? I notice that TTable is printing something like set of {int, string} rather than the labels. Why the inconsistency?

  1. What is printed for TTable is purely arbitary and has no precedence in pg. See my comment above.

  2. I'd be indeed interested to ensure pg_typeof does the right thing for TTuple. However at the same time we must be mindful that pg does not report the detailed type composition of a tuple with pg_typeof (neither for labeled nor non-labeled tuples! Our current implementation, regardless of this PR, is wrong) -- instead, if uses the name of the thing that produced the tuple type. For example:

  • select pg_typeof(kv) from kv -> reported type name is just "kv" even though the implementation of that type is a labeled tuple
  • `select pg_typeof(_pg_expandarray(...)) -> reported type name is just "record"
  • `select pg_typeof(row(1,2,3)) -> also just "record"

We'll implement this transient type name logic in a later iteration. The logic is simple: any composite type derived from a table name gets the name of the table; everything else gets named "record".


Comments from Reviewable

@BramGruneir BramGruneir removed this from the 2.1 milestone May 15, 2018
@BramGruneir
Copy link
Member Author

All comments addressed, PTAL.


Review status: 8 of 22 files reviewed at latest revision, 6 unresolved discussions.


pkg/sql/logictest/testdata/logic_test/tuple, line 689 at r1 (raw file):

Previously, knz (kena) wrote…

wait why is NULL not printed. I'd have expected it to be printed.

Nope.

Here's the result from Postgres:

SELECT (1, 2, 'hello', NULL, NULL), (true, NULL, (false, 6.6, false));
      row      |       row
---------------+------------------
 (1,2,hello,,) | (t,,"(f,6.6,f)")
(1 row)

And from us without the labels:

SELECT (1, 2, 'hello', NULL, NULL), (true, NULL, (false, 6.6, false));
+-----------------------------+--------------------------------+
| (1, 2, 'hello', NULL, NULL) |   (true, NULL, (false, 6.6,    |
|                             |            false))             |
+-----------------------------+--------------------------------+
| (1,2,'hello',,)             | (true,,(false, 6.6, false))    |
+-----------------------------+--------------------------------+
(1 row)

But I will add colnames, because I want us to be explicit about any changes to them.

-(2.1)
Actually, went even further and added colnames to all non-explain queries in this file. I think it's important for us to monitor any changes to them.


pkg/sql/logictest/testdata/logic_test/tuple, line 694 at r1 (raw file):

Previously, knz (kena) wrote…

remove tabs here and below

Done.


pkg/sql/parser/sql.y, line 6749 at r1 (raw file):

Previously, knz (kena) wrote…

This wants some changes in parse_test.go.

Indeed. Added.


pkg/sql/sem/tree/type_check.go, line 813 at r1 (raw file):

Previously, knz (kena) wrote…

Pull this check up above, then share the rest of the code for the two cases.

Done. Much cleaner. I didn't like the duplication.


pkg/sql/sem/types/types.go, line 339 at r1 (raw file):

Previously, knz (kena) wrote…
  1. What is printed for TTable is purely arbitary and has no precedence in pg. See my comment above.

  2. I'd be indeed interested to ensure pg_typeof does the right thing for TTuple. However at the same time we must be mindful that pg does not report the detailed type composition of a tuple with pg_typeof (neither for labeled nor non-labeled tuples! Our current implementation, regardless of this PR, is wrong) -- instead, if uses the name of the thing that produced the tuple type. For example:

  • select pg_typeof(kv) from kv -> reported type name is just "kv" even though the implementation of that type is a labeled tuple
  • `select pg_typeof(_pg_expandarray(...)) -> reported type name is just "record"
  • `select pg_typeof(row(1,2,3)) -> also just "record"

We'll implement this transient type name logic in a later iteration. The logic is simple: any composite type derived from a table name gets the name of the table; everything else gets named "record".

I just added the labels here, but I'm cool with changing how the output looks. Let me know if we want a different format.


Comments from Reviewable

@knz
Copy link
Contributor

knz commented May 15, 2018

LGTM modulo minor nits and missing links to new follow-up issues.


Reviewed 13 of 14 files at r2.
Review status: 21 of 22 files reviewed at latest revision, 3 unresolved discussions.


pkg/sql/logictest/testdata/logic_test/tuple, line 689 at r1 (raw file):

Previously, BramGruneir (Bram Gruneir) wrote…

Nope.

Here's the result from Postgres:

SELECT (1, 2, 'hello', NULL, NULL), (true, NULL, (false, 6.6, false));
      row      |       row
---------------+------------------
 (1,2,hello,,) | (t,,"(f,6.6,f)")
(1 row)

And from us without the labels:

SELECT (1, 2, 'hello', NULL, NULL), (true, NULL, (false, 6.6, false));
+-----------------------------+--------------------------------+
| (1, 2, 'hello', NULL, NULL) |   (true, NULL, (false, 6.6,    |
|                             |            false))             |
+-----------------------------+--------------------------------+
| (1,2,'hello',,)             | (true,,(false, 6.6, false))    |
+-----------------------------+--------------------------------+
(1 row)

But I will add colnames, because I want us to be explicit about any changes to them.

-(2.1)
Actually, went even further and added colnames to all non-explain queries in this file. I think it's important for us to monitor any changes to them.

The pretty-printing of tuples as now implemented is sufficient for a first round but is still incomplete to be compatible with pg's. Please file an issue with your example:

(true,,(false, 6.6, false)) should really be (t,,"(f,6.6,f)")

then add a TODO pointing to the new issue both around here and the corresponding pretty-printing code.

We had to do something similar for arrays before. cc Justin on the new issue.


pkg/sql/logictest/testdata/logic_test/tuple, line 13 at r2 (raw file):

query BBBBBBBBB colnames
SELECT
	(2, 2) < (1, 1),

tabs here and below (I know you didn't introduce the tabs, but since you're at it already....)


pkg/sql/sem/types/types.go, line 339 at r1 (raw file):

Previously, BramGruneir (Bram Gruneir) wrote…

I just added the labels here, but I'm cool with changing how the output looks. Let me know if we want a different format.

I think this function is OK for troubleshooting/debugging purposes. We need to change the logic of pg_typeof separately to do what I explained above. Mind filing an issue specifically about that? Then add a TODO to the pg_typeof builtin code that refers to the new issue.


Comments from Reviewable

@BramGruneir
Copy link
Member Author

Review status: 18 of 23 files reviewed at latest revision, 4 unresolved discussions.


pkg/sql/logictest/testdata/logic_test/tuple, line 689 at r1 (raw file):

Previously, knz (kena) wrote…

The pretty-printing of tuples as now implemented is sufficient for a first round but is still incomplete to be compatible with pg's. Please file an issue with your example:

(true,,(false, 6.6, false)) should really be (t,,"(f,6.6,f)")

then add a TODO pointing to the new issue both around here and the corresponding pretty-printing code.

We had to do something similar for arrays before. cc Justin on the new issue.

Done.


pkg/sql/logictest/testdata/logic_test/tuple, line 13 at r2 (raw file):

Previously, knz (kena) wrote…

tabs here and below (I know you didn't introduce the tabs, but since you're at it already....)

Done.


pkg/sql/sem/types/types.go, line 339 at r1 (raw file):

Previously, knz (kena) wrote…

I think this function is OK for troubleshooting/debugging purposes. We need to change the logic of pg_typeof separately to do what I explained above. Mind filing an issue specifically about that? Then add a TODO to the pg_typeof builtin code that refers to the new issue.

Done.


Comments from Reviewable

@knz
Copy link
Contributor

knz commented May 15, 2018

Reviewed 5 of 5 files at r3.
Review status: all files reviewed at latest revision, 2 unresolved discussions.


Comments from Reviewable

@BramGruneir
Copy link
Member Author

BramGruneir commented May 15, 2018 via email

@craig
Copy link
Contributor

craig bot commented May 15, 2018

Merge conflict (retrying...)

@BramGruneir
Copy link
Member Author

bors r+

@craig
Copy link
Contributor

craig bot commented May 15, 2018

Merge conflict (retrying...)

@BramGruneir
Copy link
Member Author

bors r+

@craig
Copy link
Contributor

craig bot commented May 15, 2018

Build failed (retrying...)

BramGruneir added a commit to BramGruneir/cockroach that referenced this pull request May 22, 2018
This work is yet another step towards cockroachdb#16971.

The labeled tuples introduced in cockroachdb#25283 can now be accessed using their labels.
It's important to note that they require an extra parentheses around them to
do so.

```sql
SELECT (((1,'2',true) AS a, b, c)).a
+---+
| a |
+---+
| 1 |
+---+

SELECT (((1,'2',true) AS a, b, c)).b
+---+
| b |
+---+
| 2 |
+---+

SELECT (((1,'2',true) AS a, b, c)).c
+------+
|  c   |
+------+
| true |
+------+
```

This change prompted the need to restrict the labels in the tuple to be unique;
checks and test cases for that have been added as well.

Star expansion is still not allowed.

Release note: Labeled tuples can now be accessed using their labels (e.g.
`SELECT (((1,'2',true) AS a, b, c)).a` but it requires an extra level of
parentheses to do so.
BramGruneir added a commit to BramGruneir/cockroach that referenced this pull request May 24, 2018
This work is yet another step towards cockroachdb#16971.

The labeled tuples introduced in cockroachdb#25283 can now be accessed using their labels.
It's important to note that they require an extra parentheses around them to
do so.

```sql
SELECT (((1,'2',true) AS a, b, c)).a
+---+
| a |
+---+
| 1 |
+---+

SELECT (((1,'2',true) AS a, b, c)).b
+---+
| b |
+---+
| 2 |
+---+

SELECT (((1,'2',true) AS a, b, c)).c
+------+
|  c   |
+------+
| true |
+------+
```

This change prompted the need to restrict the labels in the tuple to be unique;
checks and test cases for that have been added as well.

Star expansion is still not allowed.

Release note: Labeled tuples can now be accessed using their labels (e.g.
`SELECT (((1,'2',true) AS a, b, c)).a` but it requires an extra level of
parentheses to do so.
BramGruneir added a commit to BramGruneir/cockroach that referenced this pull request May 25, 2018
This work is yet another step towards cockroachdb#16971.

The labeled tuples introduced in cockroachdb#25283 can now be accessed using their labels.
It's important to note that they require an extra parentheses around them to
do so.

```sql
SELECT (((1,'2',true) AS a, b, c)).a
+---+
| a |
+---+
| 1 |
+---+

SELECT (((1,'2',true) AS a, b, c)).b
+---+
| b |
+---+
| 2 |
+---+

SELECT (((1,'2',true) AS a, b, c)).c
+------+
|  c   |
+------+
| true |
+------+
```

This change prompted the need to restrict the labels in the tuple to be unique;
checks and test cases for that have been added as well.

Star expansion is still not allowed.

Release note: Labeled tuples can now be accessed using their labels (e.g.
`SELECT (((1,'2',true) AS a, b, c)).a` but it requires an extra level of
parentheses to do so.
craig bot pushed a commit that referenced this pull request May 25, 2018
25810: sql: Add the ability to access a specific column in a labeled tuple r=BramGruneir a=BramGruneir

This work is yet another step towards #16971.

The labeled tuples introduced in #25283 can now be accessed using their labels.
It's important to note that they require an extra parentheses around them to
do so.

```sql
SELECT (((1,'2',true) AS a, b, c)).a
+---+
| a |
+---+
| 1 |
+---+

SELECT (((1,'2',true) AS a, b, c)).b
+---+
| b |
+---+
| 2 |
+---+

SELECT (((1,'2',true) AS a, b, c)).c
+------+
|  c   |
+------+
| true |
+------+
```

This change prompted the need to restrict the labels in the tuple to be unique;
checks and test cases for that have been added as well.

Star expansion is still not allowed.

Release note: Labeled tuples can now be accessed using their labels (e.g.
`SELECT (((1,'2',true) AS a, b, c)).a` but it requires an extra level of
parentheses to do so.

26096: scripts: add Yahor to release notes r=yuzefovich a=yuzefovich

This adds Yahor Yuzefovich to the roachers list.

Release note: None

Co-authored-by: Bram Gruneir <bram@cockroachlabs.com>
Co-authored-by: yuzefovich <yahor@cockroachlabs.com>
knz added a commit to knz/cockroach that referenced this pull request Jun 12, 2018
This work is yet another step towards cockroachdb#16971.

The labeled tuples introduced in cockroachdb#25283 can now be accessed using their labels.

Star expansion is still not allowed for tuples with fewer or more than
1 columns.

Release note (sql change): Labeled tuples can now be accessed using
their labels (e.g. `SELECT (x).word FROM (SELECT pg_expand_keywords() AS
x)`.

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
knz added a commit to knz/cockroach that referenced this pull request Jun 13, 2018
This work is yet another step towards cockroachdb#16971.

The labeled tuples introduced in cockroachdb#25283 can now be accessed using their labels.

Star expansion is still not allowed for tuples with fewer or more than
1 columns.

Release note (sql change): Labeled tuples can now be accessed using
their labels (e.g. `SELECT (x).word FROM (SELECT pg_expand_keywords() AS
x)`.

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
knz added a commit to knz/cockroach that referenced this pull request Jun 13, 2018
This work is yet another step towards cockroachdb#16971.

The labeled tuples introduced in cockroachdb#25283 can now be accessed using
their labels or using a star, e.g. `(E).*`.

Release note (sql change): Labeled tuples can now be accessed using
their labels (e.g. `SELECT (x).word FROM (SELECT pg_expand_keywords()
AS x)` or a star (e.g. `SELECT (x).* FROM (SELECT pg_expand_keywords()
AS x)`).

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
knz added a commit to knz/cockroach that referenced this pull request Jun 13, 2018
This work is yet another step towards cockroachdb#16971.

The labeled tuples introduced in cockroachdb#25283 can now be accessed using
their labels or using a star, e.g. `(E).*`.

Release note (sql change): Labeled tuples can now be accessed using
their labels (e.g. `SELECT (x).word FROM (SELECT pg_expand_keywords()
AS x)` or a star (e.g. `SELECT (x).* FROM (SELECT pg_expand_keywords()
AS x)`).

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
knz added a commit to knz/cockroach that referenced this pull request Jun 13, 2018
This work is yet another step towards cockroachdb#16971.

The labeled tuples introduced in cockroachdb#25283 can now be accessed using
their labels or using a star, e.g. `(E).*`.

Release note (sql change): Labeled tuples can now be accessed using
their labels (e.g. `SELECT (x).word FROM (SELECT pg_expand_keywords()
AS x)` or a star (e.g. `SELECT (x).* FROM (SELECT pg_expand_keywords()
AS x)`).

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
knz added a commit to knz/cockroach that referenced this pull request Jun 14, 2018
This work is yet another step towards cockroachdb#16971.

The labeled tuples introduced in cockroachdb#25283 can now be accessed using
their labels or using a star, e.g. `(E).*`.

Release note (sql change): Labeled tuples can now be accessed using
their labels (e.g. `SELECT (x).word FROM (SELECT pg_expand_keywords()
AS x)` or a star (e.g. `SELECT (x).* FROM (SELECT pg_expand_keywords()
AS x)`).

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
craig bot pushed a commit that referenced this pull request Jun 14, 2018
26628: sql: Support tuple column access and tuple stars r=knz a=knz

First commits from #26621.
Completes the fix to #24866 by re-activating disabled tests.
This work is yet another step towards #16971. It would actually fix #16971 if it were not for #26627, #26624 and #26629.

This work is yet another step towards #16971.

The labeled tuples introduced in #25283 can now be accessed using
their labels or using a star, e.g. `(E).*`.

Release note (sql change): Labeled tuples can now be accessed using
their labels (e.g. `SELECT (x).word FROM (SELECT pg_expand_keywords()
AS x)` or a star (e.g. `SELECT (x).* FROM (SELECT pg_expand_keywords()
AS x)`).

Fixes #26720.

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
Co-authored-by: Bram Gruneir <bram@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants