-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL: Fix esType for DATETIME/DATE and INTERVALS #38179
Conversation
Since introduction of data types that don't have a corresponding type in ES the `esType` is error-prone when used for `unmappedType()` calls. Moreover since the renaming of `DATE` to `DATETIME` and the introduction of an actual date-only `DATE` the `esType` would return `datetime` which is not a valid type for ES mapping. Fixes: elastic#38051
Pinging @elastic/es-search |
@@ -661,17 +661,17 @@ public void testTopHitsAggregationWithTwoArgs() { | |||
|
|||
} | |||
{ | |||
PhysicalPlan p = optimizeAndPlan("SELECT LAST(keyword, int) FROM test"); | |||
PhysicalPlan p = optimizeAndPlan("SELECT LAST(date, int) FROM test"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this change required by the types change in this PR or you just wanted some diversity in FIRST/LAST tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to test the unmapped_type
https://github.com/elastic/elasticsearch/pull/38179/files#diff-aef2b0ce456b8fdd5cc09d6cfd55f0c2R674 which before the fix was datetime
*/ | ||
public final String esType; | ||
public final String esSQLType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be wrong, but why not esSqlType
name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
@@ -126,9 +127,9 @@ | |||
|
|||
|
|||
/** | |||
* Elasticsearch type name | |||
* Lowercase type name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure "Lowercase" is the right description for this field, I mean it is a true statement, but doesn't help much with the purpose of the field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1.
The purpose of a comment is to explain the non-obvious:
/** an integer */ vs /** array index */
public int i;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Left just few minor comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why a new property was introduced (esSQLType - by the way the naming is incorrect since we use Sql everywhere else) or what does it even mean - the type of esSQL?
If it's the Elasticsearch type then that's what esType
is for. if it's the sql type it's sqlType
.
If it's the type name itself why not use dt.name? If it's a lowercase vs uppercase (again arguable since it's es vs sql) a typeName() method might be more appropriate.
Further more the style of the code changes is confusing - the field access (.esType) was moved to method invocation (esType()) but not in all places.
@@ -44,13 +44,13 @@ AggregationBuilder toBuilder() { | |||
new FieldSortBuilder(sortField) | |||
.order(sortOrder) | |||
.missing(LAST.position()) | |||
.unmappedType(sortFieldDataType.esType)); | |||
.unmappedType(sortFieldDataType.esType())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there still a sortFieldDataType? I recall discussing about removing it in the date PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the date PR? You mean in the FIRST/LAST PR? I checked again, but don't get, and sorry maybe I missed something, but we need it for this the unmappedType
call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need for sortFieldDataType
- you know it already null (hence the FieldSortBuilder), simply do sortField.dataType()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot why I didn't do it after all:
We need the field name to be extracted from the field, which has some small logic and is something that we do in the QueryTranslator
for other cases as well, so I didn't want to pass the field and do the name extraction inside the TopHitsAgg
.
I wasn't sure about this name, so I was waiting for suggestions. I chose
So the idea is to use the lowercase name to everywhere it has to do with error messages or returning the column data type to the client as we used to, and use the new method |
Being a enum, Let's discuss this more next week. |
So, for error messages, return column types, etc.. we need the name of the enum in lowercase. BUT for calls to Moreover if we try to make a call to |
@costin Please check again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good however there's still the esType initialization that could be sorted out.
@@ -176,7 +181,8 @@ | |||
|
|||
DataType(SQLType sqlType, int size, int defaultPrecision, int displaySize, boolean isInteger, | |||
boolean isRational, boolean defaultDocValues) { | |||
this.esType = name().toLowerCase(Locale.ROOT); | |||
this.typeName = name().toLowerCase(Locale.ROOT); | |||
this.esType = name().equals("DATETIME") ? "date" : typeName; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better way would be to ask this parameter for the constructor and add it for each declaration.
So DATETIME
would be "date"
, INTERVAL_
... null
while NULL
would be "null"
. Potentially add an overloaded constructor to specify esType name which by default would be the enum name lowercase but with the possibility of being overwritten.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
OBJECT( "object", JDBCType.STRUCT, -1, 0, 0, false, false, false), | ||
NESTED( "nested", JDBCType.STRUCT, -1, 0, 0, false, false, false), | ||
BINARY( "binary", JDBCType.VARBINARY, -1, Integer.MAX_VALUE, 0, false, false, false), | ||
DATE( JDBCType.DATE, Long.BYTES, 10, 10, false, false, true), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be "date" as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But DATE
doesn't have a corresponding esType, same as all INTERVALS.
INTEGER( JDBCType.INTEGER, Integer.BYTES, 10, 11, true, false, true), | ||
LONG( JDBCType.BIGINT, Long.BYTES, 19, 20, true, false, true), | ||
// esType jdbc type, size, defPrecision,dispSize, int, rat, docvals | ||
NULL( "null", JDBCType.NULL, 0, 0, 0, false, false, false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have gone the other way - use the lower-case enum way as a default type and just override this convention with null or "date". Just a matter of preference really.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's more elegant yes, but I choose the more "explicit" way to show clearly what the corresponding esType is.
Since introduction of data types that don't have a corresponding type in ES the `esType` is error-prone when used for `unmappedType()` calls. Moreover since the renaming of `DATE` to `DATETIME` and the introduction of an actual date-only `DATE` the `esType` would return `datetime` which is not a valid type for ES mapping. Fixes: #38051
Backported to |
* 6.x: (25 commits) Backport of types removal for Put/Get index templates (elastic#38465) Add support for API keys to access Elasticsearch (elastic#38291) (elastic#38399) Deprecate support for internal versioning for concurrency control (elastic#38451) Deprecate types in rollover index API (elastic#38389) (elastic#38458) Add typless client side GetIndexRequest calls and response class (elastic#38422) [ML] Report index unavailable instead of waiting for lazy node (elastic#38444) await fix CurtIT#testIndex until elastic#38451 is merged (elastic#38466) Update ilm-api.asciidoc, point to REMOVE policy (elastic#38235) (elastic#38464) SQL: Fix esType for DATETIME/DATE and INTERVALS (elastic#38179) Clean up duplicate follow config parameter code (elastic#37688) (elastic#38443) Deprecation check for No Master Block setting (elastic#38383) Bubble-up exceptions from scheduler (elastic#38441) Lift retention lease expiration to index shard (elastic#38391) Deprecate maxRetryTimeout in RestClient and increase default value (elastic#38425) Update Rollup Caps to allow unknown fields (elastic#38446) Backport of elastic#38411: `if_seq_no` and `if_primary_term` parameters aren't wired correctly in REST Client's CRUD API Support unknown fields in ingest pipeline map configuration (elastic#38429) SQL: Implement CURRENT_DATE (elastic#38175) Backport changes to the release notes script. (elastic#38346) Fix ILM explain response to allow unknown fields (elastic#38363) ...
* master: Add an authentication cache for API keys (elastic#38469) Fix exit code in certutil packaging test (elastic#38393) Enable logs for intermittent test failure (elastic#38426) Disable BWC to backport recovering retention leases (elastic#38477) Enable bwc tests now that elastic#38443 is backported. (elastic#38462) Fix Master Failover and DataNode Leave Blocking Snapshot (elastic#38460) Recover retention leases during peer recovery (elastic#38435) Set update mappings mater node timeout to 30 min (elastic#38439) Assert job is not null in FullClusterRestartIT (elastic#38218) Update ilm-api.asciidoc, point to REMOVE policy (elastic#38235) (elastic#38463) SQL: Fix esType for DATETIME/DATE and INTERVALS (elastic#38179) Handle deprecation header-AbstractUpgradeTestCase (elastic#38396) XPack: core/ccr/Security-cli migration to java-time (elastic#38415) Disable bwc tests for elastic#38443 (elastic#38456) Bubble-up exceptions from scheduler (elastic#38317) Re-enable TasksClientDocumentationIT.testCancelTasks (elastic#38234) Allow custom authorization with an authorization engine (elastic#38358) CRUDDocumentationIT fix documentation references Remove support for internal versioning for concurrency control (elastic#38254)
Since introduction of data types that don't have a corresponding type
in ES the
esType
is error-prone when used forunmappedType()
calls.Moreover since the renaming of
DATE
toDATETIME
and the introductionof an actual date-only
DATE
theesType
would returndatetime
whichis not a valid type for ES mapping.
Fixes: #38051