Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: Introduce SQL DATE data type #37693

Merged
merged 18 commits into from
Jan 24, 2019
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions docs/reference/sql/functions/grouping.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -76,3 +76,8 @@ Instead one can rewrite the query to move the expression on the histogram _insid
----
include-tagged::{sql-specs}/docs.csv-spec[histogramDateTimeExpression]
----

[NOTE]
When the histogram in SQL is applied on **DATE** type instead of **DATETIME**, then the interval specified is rounded to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make this IMPORTANT.
Also drop the "then" : `when... DATETIME, the interval".

the nearest multiple of a day. E.g.: for `HISTOGRAM(CAST(birth_date AS DATE), INTERVAL '2 3:04' DAY TO MINUTE)`
the interval actually used will be `INTERVAL 2 DAYS`.
13 changes: 10 additions & 3 deletions docs/reference/sql/language/data-types.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,6 @@

beta[]

Most of {es} <<mapping-types, data types>> are available in {es-sql}, as indicated below.
As one can see, all of {es} <<mapping-types, data types>> are mapped to the data type with the same
name in {es-sql}, with the exception of **date** data type which is mapped to **datetime** in {es-sql}:

[cols="^,^m,^,^"]

Expand Down Expand Up @@ -46,13 +43,22 @@ s|SQL precision

|===

[NOTE]
Most of {es} <<mapping-types, data types>> are available in {es-sql}, as indicated above.
As one can see, all of {es} <<mapping-types, data types>> are mapped to the data type with the same
name in {es-sql}, with the exception of **date** data type which is mapped to **datetime** in {es-sql}.
This is to avoid confusion with the ANSI SQL **DATE** (date only) type, which is also supported by {es-sql}
in queries (with the use of <<sql-functions-type-conversion-cast>>/<<sql-functions-type-conversion-convert>>),
but doesn't correspond to an actual mapping in {es} (see the <<es-sql-extra-types, `table`>> below).

Obviously, not all types in {es} have an equivalent in SQL and vice-versa hence why, {es-sql}
uses the data type _particularities_ of the former over the latter as ultimately {es} is the backing store.

In addition to the types above, {es-sql} also supports at _runtime_ SQL-specific types that do not have an equivalent in {es}.
Such types cannot be loaded from {es} (as it does not know about them) however can be used inside {es-sql} in queries or their results.

[[es-sql-extra-types]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this to es-sql-only-types, extra doesn't really mean much.


The table below indicates these types:

[cols="^m,^"]
Expand All @@ -62,6 +68,7 @@ s|SQL type
s|SQL precision


| date | 24
| interval_year | 7
| interval_month | 7
| interval_day | 23
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ public enum EsType implements SQLType {
OBJECT( Types.STRUCT),
NESTED( Types.STRUCT),
BINARY( Types.VARBINARY),
DATE( Types.DATE),
DATETIME( Types.TIMESTAMP),
IP( Types.VARCHAR),
INTERVAL_YEAR( ExtraTypes.INTERVAL_YEAR),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -41,10 +41,9 @@ final class JdbcDateUtils {
.appendFraction(MILLI_OF_SECOND, 3, 3, true)
.appendOffsetId()
.toFormatter(Locale.ROOT);

static long asMillisSinceEpoch(String date) {
ZonedDateTime zdt = ISO_WITH_MILLIS.parse(date, ZonedDateTime::from);
return zdt.toInstant().toEpochMilli();
return ISO_WITH_MILLIS.parse(date, ZonedDateTime::from).toInstant().toEpochMilli();
}

static Date asDate(String date) {
Expand All @@ -71,7 +70,7 @@ static <R> R asDateTimeField(Object value, Function<String, R> asDateTimeMethod,
}
}

private static long utcMillisRemoveTime(long l) {
static long utcMillisRemoveTime(long l) {
return l - (l % DAY_IN_MILLIS);
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,9 @@
import java.util.function.Function;

import static java.lang.String.format;
import static org.elasticsearch.xpack.sql.jdbc.JdbcDateUtils.asDateTimeField;
import static org.elasticsearch.xpack.sql.jdbc.JdbcDateUtils.asMillisSinceEpoch;
import static org.elasticsearch.xpack.sql.jdbc.JdbcDateUtils.utcMillisRemoveTime;

class JdbcResultSet implements ResultSet, JdbcWrapper {

Expand Down Expand Up @@ -252,8 +255,11 @@ private Long dateTime(int columnIndex) throws SQLException {
if (val == null) {
return null;
}
return JdbcDateUtils.asDateTimeField(val, JdbcDateUtils::asMillisSinceEpoch, Function.identity());
};
return asDateTimeField(val, JdbcDateUtils::asMillisSinceEpoch, Function.identity());
}
if (EsType.DATE == type) {
return utcMillisRemoveTime(asMillisSinceEpoch(val.toString()));
}
return val == null ? null : (Long) val;
} catch (ClassCastException cce) {
throw new SQLException(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -213,6 +213,8 @@ static Object convert(Object v, EsType columnType, String typeString) throws SQL
return doubleValue(v); // Double might be represented as string for infinity and NaN values
case FLOAT:
return floatValue(v); // Float might be represented as string for infinity and NaN values
case DATE:
return JdbcDateUtils.asDateTimeField(v, JdbcDateUtils::asDate, Date::new);
case DATETIME:
return JdbcDateUtils.asDateTimeField(v, JdbcDateUtils::asTimestamp, Timestamp::new);
case INTERVAL_YEAR:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ public static List<Object[]> readScriptSpec() throws Exception {
tests.addAll(readScriptSpec("/fulltext.csv-spec", parser));
tests.addAll(readScriptSpec("/agg.csv-spec", parser));
tests.addAll(readScriptSpec("/columns.csv-spec", parser));
tests.addAll(readScriptSpec("/date.csv-spec", parser));
tests.addAll(readScriptSpec("/datetime.csv-spec", parser));
tests.addAll(readScriptSpec("/alias.csv-spec", parser));
tests.addAll(readScriptSpec("/null.csv-spec", parser));
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -139,6 +139,7 @@ public static void assertResultSetMetadata(ResultSet expected, ResultSet actual,
if (expectedType == Types.TIMESTAMP_WITH_TIMEZONE) {
expectedType = Types.TIMESTAMP;
}

// since csv doesn't support real, we use float instead.....
if (expectedType == Types.FLOAT && expected instanceof CsvResultSet) {
expectedType = Types.REAL;
Expand Down Expand Up @@ -204,6 +205,9 @@ private static void doAssertResultSetData(ResultSet expected, ResultSet actual,
// fix for CSV which returns the shortName not fully-qualified name
if (!columnClassName.contains(".")) {
switch (columnClassName) {
case "Date":
columnClassName = "java.sql.Date";
break;
case "Timestamp":
columnClassName = "java.sql.Timestamp";
break;
Expand Down
90 changes: 90 additions & 0 deletions x-pack/plugin/sql/qa/src/main/resources/date.csv-spec
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
//
// Date
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

//

dateExtractDateParts
SELECT
DAY(CAST(birth_date AS DATE)) d,
DAY_OF_MONTH(CAST(birth_date AS DATE)) dm,
DAY_OF_WEEK(CAST(birth_date AS DATE)) dw,
DAY_OF_YEAR(CAST(birth_date AS DATE)) dy,
ISO_DAY_OF_WEEK(CAST(birth_date AS DATE)) iso_dw,
WEEK(CAST(birth_date AS DATE)) w,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could, also, add the ISO_WEEK_OF_YEAR function, since you added the iso_dow one.

IW(CAST(birth_date AS DATE)) iso_w,
QUARTER(CAST(birth_date AS DATE)) q,
YEAR(CAST(birth_date AS DATE)) y,
birth_date, last_name l FROM "test_emp" WHERE emp_no < 10010 ORDER BY emp_no;

d:i | dm:i | dw:i | dy:i | iso_dw:i | w:i |iso_w:i | q:i | y:i | birth_date:ts | l:s
2 |2 |4 |245 |3 |36 |35 |3 |1953 |1953-09-02T00:00:00Z |Facello
2 |2 |3 |154 |2 |23 |22 |2 |1964 |1964-06-02T00:00:00Z |Simmel
3 |3 |5 |337 |4 |49 |49 |4 |1959 |1959-12-03T00:00:00Z |Bamford
1 |1 |7 |121 |6 |18 |18 |2 |1954 |1954-05-01T00:00:00Z |Koblick
21 |21 |6 |21 |5 |4 |3 |1 |1955 |1955-01-21T00:00:00Z |Maliniak
20 |20 |2 |110 |1 |17 |16 |2 |1953 |1953-04-20T00:00:00Z |Preusig
23 |23 |5 |143 |4 |21 |21 |2 |1957 |1957-05-23T00:00:00Z |Zielinski
19 |19 |4 |50 |3 |8 |8 |1 |1958 |1958-02-19T00:00:00Z |Kalloufi
19 |19 |7 |110 |6 |16 |16 |2 |1952 |1952-04-19T00:00:00Z |Peac
;


dateExtractTimePartsTimeSecond
SELECT
SECOND(CAST(birth_date AS DATE)) d,
MINUTE(CAST(birth_date AS DATE)) m,
HOUR(CAST(birth_date AS DATE)) h
FROM "test_emp" WHERE emp_no < 10010 ORDER BY emp_no;

d:i | m:i | h:i
0 |0 |0
0 |0 |0
0 |0 |0
0 |0 |0
0 |0 |0
0 |0 |0
0 |0 |0
0 |0 |0
0 |0 |0
;

dateAsFilter
SELECT birth_date, last_name FROM "test_emp" WHERE birth_date <= CAST('1955-01-21' AS DATE) ORDER BY emp_no LIMIT 5;

birth_date:ts | last_name:s
1953-09-02T00:00:00Z |Facello
1954-05-01T00:00:00Z |Koblick
1955-01-21T00:00:00Z |Maliniak
1953-04-20T00:00:00Z |Preusig
1952-04-19T00:00:00Z |Peac
;


// The date return can differ due to the randomized timezones (it's transformed to the JDBC client's timezone)
dateAsGroupingKey-Ignore
SELECT CAST(birth_date AS DATE) AS d, CAST(SUM(emp_no) AS INT) s FROM test_emp GROUP BY d ORDER BY d LIMIT 5;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use a function than to extract just the month or the year?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then maybe use the year. I don't understand why we would add a test which is disabled since the start.
Either there's a bug/missing feature so the test will be enabled in the future or the test is too random and as such, it should be rewritten or removed.

Copy link
Contributor Author

@matriv matriv Jan 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there is not point, I will remove.


d:date | s:i
null |100445
1952-02-26 |10097
1952-04-18 |10009
1952-05-14 |10072
1952-06-12 |10076
;

dateAndFunctionAsGroupingKey
SELECT MONTH(CAST(birth_date AS DATE)) AS m, CAST(SUM(emp_no) AS INT) s FROM test_emp GROUP BY m ORDER BY m LIMIT 5;

m:i | s:i
null |100445
1 |60288
2 |80388
3 |20164
4 |80401
;

dateAndInterval
SELECT YEAR(CAST('2019-01-21' AS DATE) + INTERVAL '1-2' YEAR TO MONTH) AS y, MONTH(INTERVAL '1-2' YEAR TO MONTH + CAST('2019-01-21' AS DATE)) AS m;

y:i | m:i
2020 | 3
;
9 changes: 5 additions & 4 deletions x-pack/plugin/sql/qa/src/main/resources/datetime.sql-spec
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,9 @@
// Time NOT IMPLEMENTED in H2 on TIMESTAMP WITH TIME ZONE - hence why these are moved to CSV
//

// WEEK_OF_YEAR moved to CSV tests, because H2 builds its Calendar with the local Locale, we consider ROOT as the default Locale
// This has implications on the results, which could change given specific locales where the rules for determining the start of a year are different.
// WEEK_OF_YEAR moved to CSV tests, because H2 builds its Calendar with the local Locale,
// we consider ROOT as the default Locale. This has implications on the results, which could
// change given specific locales where the rules for determining the start of a year are different.

//
// DateTime
Expand All @@ -31,10 +32,10 @@ SELECT MONTHNAME(CAST('2018-09-03' AS TIMESTAMP)) month FROM "test_emp" limit 1;
dayNameFromStringDateTime
SELECT DAYNAME(CAST('2018-09-03' AS TIMESTAMP)) day FROM "test_emp" limit 1;

quarterSelect
dateTimeQuarter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the file name is already used as a prefix and adding dateTime in the test name becomes redundant:
{datetime.dateTimeQuarter}.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see - the others are inconsistent though :)

SELECT QUARTER(hire_date) q, hire_date FROM test_emp ORDER BY hire_date LIMIT 15;

dayOfWeek
dateTimeDayOfWeek
SELECT DAY_OF_WEEK(birth_date) day, birth_date FROM test_emp ORDER BY DAY_OF_WEEK(birth_date);

//
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,9 @@ public Object extract(Bucket bucket) {
if (object == null) {
return object;
} else if (object instanceof Long) {
object = DateUtils.of(((Long) object).longValue(), zoneId);
object = DateUtils.asDateTime(((Long) object).longValue(), zoneId);
} else if (object instanceof String) { // CAST(<value> AS DATE) is used
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks suspicious - why would the key be a string and that would represent only a date and not a datetime?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If zoneId != null we now we are expecting a DATETIME or DATE.
If it's a Long it's a DATETIME (ES date), If it's a DATE then there is a CAST applied with painless and the object extracted is a String. Do you have some other idea? Maybe I'm missing something.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a more detailed comment for the instanceof String branch?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The painless script is unnecessary and complicated namely because a histogram with a granularity of 1 day is going to ignore the time component anyway. We can and should identify this pattern GROUP BY - CAST(DATETIME->DATE) and simply convert that to a GROUP BY (DATETIME).
I'm not sure if the time will show-up or not (I would expect it to be zero anyway due to 1 day granularity).
Even if we were to use the painless script, since the casting is to a date, it should still return a ZonedDateTime and not a String however as I mention, we can optimize the script away.

I'm fine with addressing this in a different PR.

object = DateUtils.asDateOnly(object.toString());
} else {
throw new SqlIllegalArgumentException("Invalid date key returned: {}", object);
}
Expand Down Expand Up @@ -129,4 +131,4 @@ public boolean equals(Object obj) {
public String toString() {
return "|" + key + "|";
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -130,11 +130,11 @@ private Object unwrapMultiValue(Object values) {
}
if (dataType == DataType.DATETIME) {
if (values instanceof String) {
return DateUtils.of(Long.parseLong(values.toString()));
return DateUtils.asDateTime(Long.parseLong(values.toString()));
}
// returned by nested types...
if (values instanceof DateTime) {
return DateUtils.of((DateTime) values);
return DateUtils.asDateTime((DateTime) values);
}
}
if (values instanceof Long || values instanceof Double || values instanceof String || values instanceof Boolean) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@
*/
package org.elasticsearch.xpack.sql.expression;

import org.elasticsearch.common.Strings;
import org.elasticsearch.xpack.sql.SqlIllegalArgumentException;
import org.elasticsearch.xpack.sql.expression.Expression.TypeResolution;
import org.elasticsearch.xpack.sql.expression.gen.pipeline.Pipe;
Expand All @@ -16,11 +15,13 @@
import java.util.Collection;
import java.util.List;
import java.util.Locale;
import java.util.StringJoiner;
import java.util.function.Predicate;

import static java.lang.String.format;
import static java.util.Collections.emptyList;
import static java.util.Collections.emptyMap;
import static org.elasticsearch.xpack.sql.type.DataType.BOOLEAN;

public final class Expressions {

Expand Down Expand Up @@ -155,7 +156,7 @@ public static List<Pipe> pipe(List<Expression> expressions) {
}

public static TypeResolution typeMustBeBoolean(Expression e, String operationName, ParamOrdinal paramOrd) {
return typeMustBe(e, dt -> dt == DataType.BOOLEAN, operationName, paramOrd, "boolean");
return typeMustBe(e, dt -> dt == BOOLEAN, operationName, paramOrd, "boolean");
}

public static TypeResolution typeMustBeInteger(Expression e, String operationName, ParamOrdinal paramOrd) {
Expand All @@ -171,11 +172,11 @@ public static TypeResolution typeMustBeString(Expression e, String operationName
}

public static TypeResolution typeMustBeDate(Expression e, String operationName, ParamOrdinal paramOrd) {
return typeMustBe(e, dt -> dt == DataType.DATETIME, operationName, paramOrd, "date");
return typeMustBe(e, DataType::isDateBased, operationName, paramOrd, "date", "datetime");
}

public static TypeResolution typeMustBeNumericOrDate(Expression e, String operationName, ParamOrdinal paramOrd) {
return typeMustBe(e, dt -> dt.isNumeric() || dt == DataType.DATETIME, operationName, paramOrd, "numeric", "date");
return typeMustBe(e, dt -> dt.isNumeric() || dt.isDateBased(), operationName, paramOrd, "date", "datetime", "numeric");
}

public static TypeResolution typeMustBe(Expression e,
Expand All @@ -188,8 +189,20 @@ public static TypeResolution typeMustBe(Expression e,
new TypeResolution(format(Locale.ROOT, "[%s]%s argument must be [%s], found value [%s] type [%s]",
operationName,
paramOrd == null || paramOrd == ParamOrdinal.DEFAULT ? "" : " " + paramOrd.name().toLowerCase(Locale.ROOT),
Strings.arrayToDelimitedString(acceptedTypes, " or "),
acceptedTypesForErrorMsg(acceptedTypes),
Expressions.name(e),
e.dataType().esType));
}

private static String acceptedTypesForErrorMsg(String... acceptedTypes) {
StringJoiner sj = new StringJoiner(", ");
for (int i = 0; i < acceptedTypes.length - 1; i++) {
sj.add(acceptedTypes[i]);
}
if (acceptedTypes.length > 1) {
return sj.toString() + " or " + acceptedTypes[acceptedTypes.length - 1];
} else {
return acceptedTypes[0];
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -47,4 +47,4 @@ public String innerName() {
protected TypeResolution resolveType() {
return Expressions.typeMustBeNumericOrDate(field(), sourceText(), ParamOrdinal.DEFAULT);
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ protected TypeResolution resolveType() {
TypeResolution resolution = Expressions.typeMustBeNumericOrDate(field(), "HISTOGRAM", ParamOrdinal.FIRST);
if (resolution == TypeResolution.TYPE_RESOLVED) {
// interval must be Literal interval
if (field().dataType() == DataType.DATETIME) {
if (field().dataType().isDateBased()) {
resolution = Expressions.typeMustBe(interval, DataTypes::isInterval, "(Date) HISTOGRAM", ParamOrdinal.SECOND, "interval");
} else {
resolution = Expressions.typeMustBeNumeric(interval, "(Numeric) HISTOGRAM", ParamOrdinal.SECOND);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -74,4 +74,4 @@ public boolean equals(Object obj) {
public int hashCode() {
return Objects.hash(field(), zoneId());
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -357,11 +357,11 @@ private static Object asDateTime(Object dateTime, boolean lenient) {
return ((JodaCompatibleZonedDateTime) dateTime).getZonedDateTime();
}
if (dateTime instanceof ZonedDateTime) {
return (ZonedDateTime) dateTime;
return dateTime;
}
if (false == lenient) {
if (dateTime instanceof Number) {
return DateUtils.of(((Number) dateTime).longValue());
return DateUtils.asDateTime(((Number) dateTime).longValue());
}

throw new SqlIllegalArgumentException("Invalid date encountered [{}]", dateTime);
Expand Down
Loading