Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: adding too many enum values may render tables using them inaccessible #61814

Closed
arulajmani opened this issue Mar 10, 2021 · 4 comments · Fixed by #61933
Closed

sql: adding too many enum values may render tables using them inaccessible #61814

arulajmani opened this issue Mar 10, 2021 · 4 comments · Fixed by #61933
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3 Medium-low impact: incurs increased costs for some users (incl lower avail, recoverable bad data)

Comments

@arulajmani
Copy link
Collaborator

Describe the problem

I have no clue why this is happening but this seems quite bad.

@ajwerner I ran into this when trying to test the byte representation stuff we were discussing about on #61406.

Please describe the issue you observed, and any steps we can take to reproduce it:

To Reproduce

demo@127.0.0.1:26257/movr> CREATE TYPE typ AS ENUM('a');
CREATE TYPE

Time: 7ms total (execution 7ms / network 0ms)

demo@127.0.0.1:26257/movr> CREATE TABLE tab(a typ, b typ[]);
CREATE TABLE

Time: 22ms total (execution 7ms / network 15ms)

demo@127.0.0.1:26257/movr> ALTER TYPE typ ADD VALUE 'b' BEFORE 'a';
ALTER TYPE

Time: 29ms total (execution 5ms / network 24ms)

demo@127.0.0.1:26257/movr> ALTER TYPE typ ADD VALUE 'c' BEFORE 'b';
ALTER TYPE

Time: 27ms total (execution 5ms / network 23ms)

demo@127.0.0.1:26257/movr> ALTER TYPE typ ADD VALUE 'd' BEFORE 'c';
ALTER TYPE

Time: 27ms total (execution 5ms / network 22ms)

demo@127.0.0.1:26257/movr> ALTER TYPE typ ADD VALUE 'e' BEFORE 'd';
ALTER TYPE

Time: 26ms total (execution 5ms / network 22ms)

demo@127.0.0.1:26257/movr> ALTER TYPE typ ADD VALUE 'f' BEFORE 'e';
ALTER TYPE

Time: 40ms total (execution 5ms / network 35ms)

demo@127.0.0.1:26257/movr> ALTER TYPE typ ADD VALUE 'g' BEFORE 'f';
ALTER TYPE

Time: 75ms total (execution 5ms / network 70ms)

demo@127.0.0.1:26257/movr> ALTER TYPE typ ADD VALUE 'h' BEFORE 'g';
ALTER TYPE

Time: 40ms total (execution 5ms / network 35ms)

demo@127.0.0.1:26257/movr> SELECT * FROM tab;
  a | b
----+----
(0 rows)

Time: 1ms total (execution 1ms / network 0ms)

demo@127.0.0.1:26257/movr> ALTER TYPE typ ADD VALUE 'i' BEFORE 'h';
ALTER TYPE

Time: 67ms total (execution 5ms / network 63ms)

demo@127.0.0.1:26257/movr> SELECT * FROM tab;
ERROR: unsupported comparison operator: a IN (b'\x0080':::public.typ, b'\x01':::@100059, b'\x02':::@100059, b'\x04':::@100059, b'\b':::@100059, b'\x10':::@100059, b' ':::@100059, b'@':::@100059, b'\x80':::@100059): could not find [0 56 48] in enum "public.typ" representation PhysicalReps: [[0 128] [1] [2] [4] [8] [16] [32] [64] [128]]; LogicalReps: [i h g f e d c b a]
SQLSTATE: 22023
demo@127.0.0.1:26257/movr>
@arulajmani arulajmani added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Mar 10, 2021
@arulajmani arulajmani changed the title sql: adding enum values may render tables using them inaccessible sql: adding too many enum values may render tables using them inaccessible Mar 10, 2021
@rafiss rafiss added the S-3 Medium-low impact: incurs increased costs for some users (incl lower avail, recoverable bad data) label Mar 11, 2021
@rafiss
Copy link
Collaborator

rafiss commented Mar 11, 2021

random thought: Maybe it relates to the limit on number of times we can find a "midpoint" floating point number.

@ajwerner
Copy link
Contributor

it's not a floating point number

@rafiss
Copy link
Collaborator

rafiss commented Mar 11, 2021

Oh I see now that it's bytes -- I was thinking of this but forgot the details:

func GenByteStringBetween(prev []byte, next []byte, spacing ByteSpacing) []byte {

@ajwerner
Copy link
Contributor

I got to the bottom of this. It was a misunderstanding of b'' string literals. We can write regular string literals in the postgres format like this:

> SELECT '\x565656'::BYTES = 'VVV'::BYTES;
true

However, for our b'' literals, you need the \x escape per character:

> SELECT b'\x565656'::BYTES = 'VVV'::BYTES;
false
> SELECT b'\x565656'::BYTES = 'V5656'::BYTES;
true

We'd need to encode the enum as:

> SELECT b'\x56\x56\x56'::BYTES = 'VVV'::BYTES;
true

or

> SELECT b'VVV'::BYTES = 'VVV'::BYTES;
true

Instead, I'm going to make a change to encode it as:

> SELECT x'565656'::BYTES = 'VVV'::BYTES;
true

ajwerner added a commit to ajwerner/cockroach that referenced this issue Mar 15, 2021
The optimizer generates check constraints for enums. It does so by
writing these constraints as physical values. Unfortunately the way
we serialized these physical values was bogus if the enum's physical
value was more than one byte long. The reason was because of a difference
between string literal and byte string literals and their use of `\x`.

This fixes that by using hex byte string literals for these enum
values.

Fixes cockroachdb#61814

Release note (bug fix): Fixed a bug whereby enums which have large numbers of
values may cause unexpected errors when attempting to read from tables with
columns using that enum.
craig bot pushed a commit that referenced this issue Mar 16, 2021
61933: sql/sem/tree: properly serialize physical values of enums r=ajwerner a=ajwerner

The optimizer generates check constraints for enums. It does so by
writing these constraints as physical values. Unfortunately the way
we serialized these physical values was bogus if the enum's physical
value was more than one byte long. The reason was because of a difference
between string literal and byte string literals and their use of `\x`.

This fixes that by using hex byte string literals for these enum
values.

Fixes #61814

Release note (bug fix): Fixed a bug whereby enums which have large numbers of
values may cause unexpected errors when attempting to read from tables with
columns using that enum.

Co-authored-by: Andrew Werner <awerner32@gmail.com>
@craig craig bot closed this as completed in 18bad51 Mar 16, 2021
ajwerner added a commit to ajwerner/cockroach that referenced this issue Mar 18, 2021
The optimizer generates check constraints for enums. It does so by
writing these constraints as physical values. Unfortunately the way
we serialized these physical values was bogus if the enum's physical
value was more than one byte long. The reason was because of a difference
between string literal and byte string literals and their use of `\x`.

This fixes that by using hex byte string literals for these enum
values.

Fixes cockroachdb#61814

Release note (bug fix): Fixed a bug whereby enums which have large numbers of
values may cause unexpected errors when attempting to read from tables with
columns using that enum.
ajwerner added a commit to ajwerner/cockroach that referenced this issue Mar 25, 2021
The optimizer generates check constraints for enums. It does so by
writing these constraints as physical values. Unfortunately the way
we serialized these physical values was bogus if the enum's physical
value was more than one byte long. The reason was because of a difference
between string literal and byte string literals and their use of `\x`.

This fixes that by using hex byte string literals for these enum
values.

Fixes cockroachdb#61814

Release note (bug fix): Fixed a bug whereby enums which have large numbers of
values may cause unexpected errors when attempting to read from tables with
columns using that enum.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3 Medium-low impact: incurs increased costs for some users (incl lower avail, recoverable bad data)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants