Skip to content

Column definition in SQL #2351

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pgulutzan opened this issue Apr 20, 2017 · 8 comments
Closed

Column definition in SQL #2351

pgulutzan opened this issue Apr 20, 2017 · 8 comments
Assignees
Milestone

Comments

@pgulutzan
Copy link
Contributor

Some problems with column definition in SQL.

I can put any words at the end of a column definition; they're ignored.
So I can say ASC|DESC, COLLATE RTRIM, REFERENCES table-name,
ON CONFLICT IGNORE, NOT NULL NULL, INTEGER moonbat, etc.
-- but they all do nothing.
box.sql.execute("CREATE TABLE te13 (s1 INT CHAR NOT NULL NULL, PRIMARY KEY (s1));")
Result = No error

"s1 INT, s2 INT DEFAULT s1" is allowed, but the result
is not that s2 has default = the value in the s1 column,
instead it has default = 's1'. There seems to be a general
rule that single quotes don't matter much.
box.sql.execute("CREATE TABLE te14 (s1 INT PRIMARY KEY, s2 INT DEFAULT s1);")
box.sql.execute("INSERT INTO te14 (s1) VALUES (1);")
box.sql.execute("SELECT s2 FROM te14;")
Result = ['s1']

@pgulutzan pgulutzan added the sql label Apr 20, 2017
@kyukhin kyukhin added the needs feedback Something is unclear with the issue label Apr 21, 2017
@kyukhin
Copy link
Contributor

kyukhin commented Apr 21, 2017

So, should we default s2 to value s1?
and select after insert you mentioned [1, 1] should be returned?

@pgulutzan
Copy link
Contributor Author

I'm sure it would be easier to just make it
illegal to default to anything except a literal.
Of course it's great to have column values generated
from expressions of any kind, but that's not what
this issue is about.

@pgulutzan pgulutzan removed the needs feedback Something is unclear with the issue label Apr 21, 2017
@pgulutzan
Copy link
Contributor Author

pgulutzan commented Jun 5, 2017

Moving to SQL-standard data support does not require
significant change to underlying structures.
It's mostly a matter of: some definitions that are
optional should be mandatory, some operations
that are legal should be illegal.

  1. It is mandatory to state a data type.
    Therefore this becomes illegal:
    CREATE TABLE t (s1 PRIMARY KEY);

  2. It is mandatory to state a data type by its
    name or by its synonym. The current method
    allows for stating any string at all, with
    rules about what substrings can be searched.
    Therefore this becomes illegal:
    CREATE TABLE t (s1 xcharx PRIMARY KEY);

  3. The stated data type is a statement of a
    requirement not an option.
    Therefore affinity rules do not exist.

  4. The mandatory data types are
    CHAR (synonyms CHARACTER),
    VARCHAR (synonyms CHAR VARYING, CHARACTER VARYING),
    SMALLINT,
    INT (synonyms INTEGER),
    FLOAT,
    REAL,
    DOUBLE PRECISION (non-standard synonym DOUBLE),
    NUMERIC,
    DECIMAL (synonym DEC),
    DATE,
    TIME,
    TIMESTAMP.
    Currently all these data types can be stored,
    the only problems are: enforcing the maximum
    value ranges and size limits, and (sometimes) formats.
    (Here we are only considering storage and comparison
    of the values, not operations on them.)

  5. It is illegal to assign or compare
    values of incompatible data types.
    There are no automatic incompatible-to-compatible
    conversions.
    Ranges of values and lengths are limited by their
    definitions and those limits are to be enforced.
    Therefore these become illegal:
    UPDATE t SET integer_column = 'x';
    UPDATE t SET character_column_with_length_4 = 'ABCDE';
    UPDATE t SET smallint_column = 999999;

  6. It is mandatory to enclose character literals
    in single quotes. Double quotes are not allowed
    because "xxxx" is an identifier, not a literal.
    And anything that is enclosed in single quotes is
    a character literal, not an identifier.
    Therefore this becomes illegal:
    UPDATE t SET 'column1' = "literal-value";
    (updated comment: this was fixed, it is no longer an issue)

  7. Assuming that we are claiming to support UTF8,
    all UTF8 "noncharacters" should cause exceptions.
    Therefore this becomes illegal:
    UPDATE t SET char_column = CAST(X'FFFF' AS CHAR);
    (This requirement is pre-rejected, and will be mentioned
    again in the description of issue#2121 Integrate ICU into Tarantool.)

  8. INT.
    Synonyms: INTEGER.
    Range is conventionally -2147483648 to 2147483647.
    Therefore this becomes illegal:
    UPDATE t SET int_column = 1e50;

  9. SMALLINT.
    Range is conventionally -32768 to +32767.
    Therefore this becomes illegal:
    UPDATE t SET smallint_column = 32768;

  10. FLOAT [(n)], REAL, DOUBLE PRECISION.
    The requirements mostly correspond to what everybody
    does with IEEE754, except that
    the approximate-numeric ranges do not include Inf
    or NaN. (This requirement is pre-rejected, and will
    be mentioned again in the description of issue#2135
    Division by zero should yield error or infinity.)

x. DECIMAL(x,y) and NUMERIC(x,y)
Synonyms: DEC.
These are not floating-point numbers, they are
exact numbers which may be big and have multiple
post-decimal positions. Range is not specified yet
because the maximum precision and scale will depend
on looking more at the existing code.

  1. DATE.
    Range: '0001-01-01' to '9999-12-31'.
    Proleptic Gregorian; invalid dates cause errors.
    Literals can have the form '0001-01-01' (the
    actual requirement is DATE '0001-01-01' but nobody
    bothers with that).
    (updated comment: nobody will mind negative dates.)

  2. TIME.
    Range: '00:00:00' to '23:59:59'.
    No need for fractions of seconds, they're optional.

  3. TIMESTAMP.
    Range: '0001-01-01 00:00:00' to '9999-12-31 23:59:59'.
    (updated comment: nobody will mind negative dates)

  4. BLOB.
    BLOB and BINARY and VARBINARY are not mandatory in
    standard SQL. But they differ from CHAR / VARCHAR.
    Assume they will be supported as a new data type.

See also:
Follow ANSI SQL quotation rules. #2120
Improve values comparison of different types. #2210 -- I'm suggesting the opposite
[strict typing] INTEGER column allows string values. #2315
String literals in SQL #2344
Automatic data type conversion in SQL #2352

@pgulutzan
Copy link
Contributor Author

There has been a behaviour change. Now the result of the example statements is an assertion:
tarantool: /home/pgulutzan/tarantool-2.0/src/box/sql/expr.c:4230: sqlite3ExprCodeTarget: Assertion `op == 118' failed.
Aborted (core dumped)

@Korablev77
Copy link
Contributor

I moved this bug to separate issue: #3695

@Korablev77
Copy link
Contributor

Korablev77 commented Sep 20, 2018

@pgulutzan
Currently we are working on static types. I've checked your list and that is what we have now:

It is mandatory to state a data type.
Therefore this becomes illegal:
CREATE TABLE t (s1 PRIMARY KEY);

It is mandatory to state a data type by its
name or by its synonym. The current method
allows for stating any string at all, with
rules about what substrings can be searched.
Therefore this becomes illegal:
CREATE TABLE t (s1 xcharx PRIMARY KEY);

The stated data type is a statement of a
requirement not an option.
Therefore affinity rules do not exist.

It is done (but still not hit the trunk - it is on the branch). Now every column must feature type.

CHAR (synonyms CHARACTER),
VARCHAR (synonyms CHAR VARYING, CHARACTER VARYING),
SMALLINT,
INT (synonyms INTEGER),
FLOAT,
REAL,
DOUBLE PRECISION (non-standard synonym DOUBLE),
NUMERIC,
DECIMAL (synonym DEC),
DATE,
TIME,
TIMESTAMP.

Almost all of these types are supported by syntax, but they are mapped onto poor set of available types in Tarantool (in fact, only integer, number, string and scalar for blob).
Issues below are aimed at extending this set:
Introduce SMALLINT and BIGINT types: #3687
Introduce scale and precision for numeric types: #3672
Introduce string type with fixed length: #3665
Introduce BOOLEAN type: #3648
Introduce DATETIME type: #3694

It is illegal to assign or compare
values of incompatible data types.
There are no automatic incompatible-to-compatible
conversions.
Ranges of values and lengths are limited by their
definitions and those limits are to be enforced.
Therefore these become illegal:
UPDATE t SET integer_column = 'x';
UPDATE t SET character_column_with_length_4 = 'ABCDE';
UPDATE t SET smallint_column = 999999;

First example now results in error:

tarantool> create table t2(id int primary key, a int, b char(10))
---
...

tarantool> insert into t2 values(1, 1, 'abc')
---
...

tarantool> update t2 set id = 'x'
---
- error: Can't convert x to INTEGER
...

Other two examples are related to issues mentioned above (so obviously now they don't raise an error). Note that query like update t2 set id = '3' will be successfully executed (in all DBs which I checked it is also legal: integer literal in quotes is converted to simple integer).

The rest of notes are irrelevant now, since they are related to not yet implemented types.
You can check branch with static types (it may be still raw): np/sql-static-types

@Korablev77
Copy link
Contributor

Examples from original issue are invalid now, so I guess we can close this ticket.

@kyukhin
Copy link
Contributor

kyukhin commented Nov 9, 2018

Closing as done.

@kyukhin kyukhin closed this as completed Nov 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants