Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: desugar "TIMESTAMP WITH TIME ZONE" to "TIMESTAMPTZ" #28955

Merged
merged 3 commits into from
Aug 23, 2018

Conversation

knz
Copy link
Contributor

@knz knz commented Aug 22, 2018

First commits from #28952 and priors.
Forked off #28690.

This is unfortunate because this is a pretty common type, and is
moreover used to disambiguate the return type of now() and thus will
be emitted pretty often in distsql physical plans. Having such a long
type name is unnecessary and indeed adds 50%+ storage, communication
and parsing overhead for every occurrence of this type in
distsql plans.

This patch shortens the canonical CockroachDB name to "TIMESTAMPTZ".

The representation of the type in introspection tables (pg_catalog,
information_schema) is unaffected.

Release note: None

@knz knz requested a review from BramGruneir August 22, 2018 13:24
@knz knz requested a review from a team as a code owner August 22, 2018 13:24
@knz knz requested a review from a team August 22, 2018 13:24
@knz knz requested a review from a team as a code owner August 22, 2018 13:24
@knz knz requested review from a team August 22, 2018 13:24
@knz knz requested a review from a team as a code owner August 22, 2018 13:24
@knz knz requested review from a team August 22, 2018 13:24
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@knz knz requested a review from a team August 22, 2018 13:25
@justinj
Copy link
Contributor

justinj commented Aug 22, 2018

and indeed adds 50% planning and distribution overhead in common distsql executions.

can you expand on this? this seems quite high?

@knz
Copy link
Contributor Author

knz commented Aug 22, 2018

can you expand on this? this seems quite high?

It's a reference to the fact the string "TIMESTAMP WITH TIME ZONE" is more than 50% longer (to store, to parse, to send) than "TIMESTAMPTZ".

Copy link
Member

@BramGruneir BramGruneir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @justinj, I think the commit message needs to be updated a bit to tone down the 50% savings claim.

That being said, the change :LGTM: but I think we're missing a test or two to ensure the old verbose name still works.

Reviewed 4 of 4 files at r1, 1 of 1 files at r2, 18 of 18 files at r3, 2 of 2 files at r4, 7 of 7 files at r5, 3 of 3 files at r6, 13 of 13 files at r7, 31 of 31 files at r8, 23 of 23 files at r9, 16 of 16 files at r10.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/logictest/testdata/logic_test/datetime, line 36 at r10 (raw file):

  a BIGINT PRIMARY KEY,
  b TIMESTAMP,
  c TIMESTAMPTZ,

Since we do still allow the longer type name, we should have at least one test to ensure it still works.

Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the commit message needs to be updated a bit to tone down the 50% savings claim.

Done.

I think we're missing a test or two to ensure the old verbose name still works.

  • there is a parse test already
  • the pgadmin queries in the opt testdata still use the long form (I left them unchanged since they come from an external source)
  • filed a separate issue to add more pgdump tests: importccl: add more pgdump tests #29003.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/logictest/testdata/logic_test/datetime, line 36 at r10 (raw file):

Previously, BramGruneir (Bram Gruneir) wrote…

Since we do still allow the longer type name, we should have at least one test to ensure it still works.

I agree, this test exists in two forms: parse_test.go verifies that timestamp with time zone reduces to the same coltype; and opt/xform/testdata/external/pgadmin verifies it works when imported from pgadmin.

This patch corrects two long-standing bugs in CockroachDB which were
confusing both 3rd party tools/ORMs and users:

- the SQL standard type called "CHAR" is supposed to have a default
  maximum width of 1 character.

  Prior to this patch, CockroachDB did not support this and instead
  set no default maximum width on CHAR. This would cause CockroachDB
  to fail validating the width of values inserted in tables, fail
  to constrain the width of values during casts to CHAR, and report
  incorrect information in introspection tables.

- PostgresSQL's dialect distinguishes the table column types TEXT,
  CHAR, VARCHAR and a special thing called `"char"` (this is a legacy
  PostgreSQL type which is, unfortunately, still used by some
  pg_catalog tables).

  Prior to this patch, CockroachDB would map all of these types into
  the same column type "STRING". In turn this caused them to show
  up as "text" in introspection.

  While this was not incorrect from the perspective of value
  handling (all these types are, indeed, handled with about the same
  value semantics in both PostgreSQL and CockroachDB), ORMs
  do pay attention to this distinction and become confused if they
  see a different type in introspection (e.g. "text") than what
  the application model requires (e.g. "char"). The result of this
  confusion was to trigger a schema change to adapt the type, which
  in turn fails due to missing features in ALTER COLUMN TYPE.

This patch corrects both problems by giving the appropriate default
width to CHAR and preserving the distinction between the various string
types in column descriptors and thus introspection.

Additionally, a width constraint check on collated string columns was
missing. This is now fixed too.

Tables and columns created prior to this patch are unaffected.

Release note (bug fix): CockroachDB now distinguishes "CHAR" and
"VARCHAR" as mandated by the SQL standard and PostgreSQL
compatibility. When a width is not specified (e.g. `CHAR(3)`), the
maximum width of `VARCHAR` remains unconstrained whereas the maximum
width for `CHAR` is now 1 character.

Release note (bug fix): CockroachDB now properly checks the width of
strings inserted in a collated string column with a specified width.

Release note (sql change): CockroachDB now preserves the distinction
between different column types for string values like in PostgreSQL,
for compatibility with 3rd party tools and ORMs.
The distinction between the type names "JSON" and "JSONB" is
inconsequential. The types are already reported properly in
introspection using the right alias. This patch removes this
distinction in code.

Release note: None
Prior to this patch the canonical name for the timestamptz type in
CockroachDB was "TIMESTAMP WITH TIME ZONE" with spaces and all
glorious 24 characters.

This is unfortunate because this is a pretty common type, and is
moreover used to disambiguate the return type of `now()` and thus will
be emitted pretty often in distsql physical plans. Having such a long
type name is unnecessary and indeed adds 50%+ storage, communication
and parsing overhead for every occurrence of this type in
distsql plans.

This patch shortens the canonical CockroachDB name to "TIMESTAMPTZ".

The representation of the type in introspection tables (pg_catalog,
information_schema) is unaffected.

Release note: None
craig bot pushed a commit that referenced this pull request Aug 23, 2018
28959: sql/coltypes: remove the field `Name` in TArray r=knz a=knz

First commits from #28955 and priors.
Forked from #28690.

It was not needed. This simplifies.

Release note: None

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
@craig craig bot merged commit a18647e into cockroachdb:master Aug 23, 2018
@knz knz deleted the 20180822-c12-desugar-ttz branch August 23, 2018 12:51
craig bot pushed a commit that referenced this pull request Aug 23, 2018
29006: backport-2.1: fix the handling of SQL types r=knz a=knz

Backport 1/1 commits from #28937.
Backport 1/1 commits from #28942.
Backport 17/17 commits from #28943.
Backport 1/1 commits from #28944.
Backport 1/1 commits from #28945.
Backport 1/1 commits from #28949.
Backport 2/2 commits from #28950.
Backport 1/1 commits from #28951 (#28959).
Backport 1/1 commits from #28952 (#28959).
Backport 1/1 commits from #28955 (#28959).
Backport 1/1 commits from #28959.
Backport 1/1 commits from #28959.
Backport 1/1 commits from #28690.

cc @cockroachdb/release 

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
craig bot pushed a commit that referenced this pull request Aug 24, 2018
29006: backport-2.1: fix the handling of SQL types r=knz a=knz

Backport 1/1 commits from #28937.
Backport 1/1 commits from #28942.
Backport 17/17 commits from #28943.
Backport 1/1 commits from #28944.
Backport 1/1 commits from #28945.
Backport 1/1 commits from #28949.
Backport 2/2 commits from #28950.
Backport 1/1 commits from #28951 (#28959).
Backport 1/1 commits from #28952 (#28959).
Backport 1/1 commits from #28955 (#28959).
Backport 1/1 commits from #28959.
Backport 1/1 commits from #28959.
Backport 1/1 commits from #28690.

cc @cockroachdb/release 

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants