-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add datetime aggregation fixes #144
Conversation
This comment was marked as spam.
This comment was marked as spam.
case TIME: | ||
return MILLIS.between(LocalTime.MIN, expr.timeValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Workaround for sesstion context
issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a TODO to fix this
opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/Utils.java
Outdated
Show resolved
Hide resolved
opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/Utils.java
Outdated
Show resolved
Hide resolved
.shouldMatch(new ExprTimestampValue("1984-03-17 22:16:42") | ||
.timestampValue().toEpochMilli()); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add the all-nulls case here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm planning to add ITs that cover too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Existing UT framework is very limited and can't properly cover such case. There is a test for that and it didn't detect such case:
Lines 102 to 108 in 3eba57e
@Test | |
void can_execute_expression_with_null_field() { | |
assertThat() | |
.docValues("age", null) | |
.evaluate(ref("age", INTEGER)) | |
.shouldMatch(null); | |
} |
case TIME: | ||
state.total += MILLIS.between(LocalTime.MIN, value.timeValue()); | ||
break; | ||
case DOUBLE: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we cannot compute AVG for SHORT, INTEGER, FLOAT, etc...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These types are casted to DOUBLE automatically
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the automatic casting of types to double be tested in AvgAggregatorTest? I think only INTEGER and DOUBLE are only in there for number values.
core/src/test/java/org/opensearch/sql/expression/aggregation/AvgAggregatorTest.java
Outdated
Show resolved
Hide resolved
public static long extractEpochMilliFromAnyDateTimeType(ExprValue value) { | ||
switch ((ExprCoreType)value.type()) { | ||
case DATE: | ||
return MILLIS.between(LocalDateTime.of(1970, 1, 1, 0, 0), value.datetimeValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be wrong, but it seems like we could use the Instant.toEpochMilli for this call too. Something like:
value.datetimeValue().toInstant().toEpochMilli();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch!
Initially case DATE:
fallen back with TIMESTAMP
and DATETIME
, conversion thru timestamp (Instant
) returned wrong result, because timestamp changes TZ to UTC.
DATE
-> MILLIS
-> DATE
-> MILLIS
check failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. While thinking on your comment I found a mistake. Fixed in 478141e.
Nice catch! It is fixed in #145 (not merged yet on upstream - opensearch-project#1000). |
I see. Thanks. LGTM! |
I think we should add integration tests for these cases once the PR upstream is merged. |
db98434
to
435e940
Compare
* Add aggregator fixes and some useless unit tests. * Add `avg` aggregation on datetime types. * Rework in-memory `AVG`. Fix parsing value returned from the OpenSearch node. Signed-off-by: Yury-Fridlyand <yuryf@bitquilltech.com>
435e940
to
31a72d4
Compare
Done! Thank you for the idea. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validation looks good.
Signed-off-by: Yury-Fridlyand yuryf@bitquilltech.com
Description
Make listed aggregations work with datetime types.
Fixes
min
max
avg
Not included
var_samp
var_pop
stddev_samp
stddev_pop
sum
[1]Implementation details:
Presicion
Aggregation done with milliseconds precision - this follows OpenSearch approach. See code snippets: one, two.
We convert datetimes to millis and back on our own (see below), so we can scale up to nanoseconds. Such fix requires additional changes - few tests fail.
ExpressionAggregationScript::execute
A callback function, called by OpenSearch node to extract a value during processing aggregation script.
This added for backward compatibility:
opensearch-project-sql/opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/ExpressionAggregationScript.java
Lines 53 to 55 in 272bf67
This - to extract value. Fortunately,
toEpochMilli()
returns negative values for pre-Epoch timestamps, so we are able to use it for group/compare any values.opensearch-project-sql/opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/ExpressionAggregationScript.java
Lines 56 to 67 in 272bf67
OpenSearch accepts dates in
Joda
lib types and converts to milliseconds since Epoch. We have java datetime types, to I do conversion there.https://github.com/opensearch-project/OpenSearch/blob/140e8d3e6c91519edc47be07b4cd053fdfac1769/server/src/main/java/org/opensearch/search/aggregations/support/values/ScriptDoubleValues.java#L123
OpenSearchExprValueFactory
parseTimestamp
andconstructTimestamp
OpenSearch returns aggregated values as strings, even milliseconds since Epoch. I have to extract it, so I combined these functions together.
opensearch-project-sql/opensearch/src/main/java/org/opensearch/sql/opensearch/data/value/OpenSearchExprValueFactory.java
Lines 178 to 198 in 975317c
ExprValueUtils
These two methods used by
avg
in-memory aggregation for datetime types.opensearch-project-sql/core/src/main/java/org/opensearch/sql/data/model/ExprValueUtils.java
Line 190 in 272bf67
opensearch-project-sql/core/src/main/java/org/opensearch/sql/data/model/ExprValueUtils.java
Line 213 in 272bf67
AvgAggregator
AvgState
was renamed toDoubleAvgState
DateTimeAvgState
does the same logic, but for datetime types.opensearch-project-sql/core/src/main/java/org/opensearch/sql/expression/aggregation/AvgAggregator.java
Lines 105 to 117 in 272bf67
AvgState
AggregatorFunction
New signature were added.
opensearch-project-sql/core/src/main/java/org/opensearch/sql/expression/aggregation/AggregatorFunction.java
Lines 69 to 76 in 272bf67
Actually, plugin was able to do push down aggregation on datetime types, but this weren't accepted.
[1] It works on MySQL, but I don't see any reason to implement this for datetime types.
Test queries
In-memory aggregation
opensearch-project-sql/integ-test/src/test/java/org/opensearch/sql/sql/AggregationIT.java
Lines 196 to 199 in 17e1ae9
(cast required to ensure that you're working with
TIME
).To try other types, use
Push Down aggregation (
metric
)Push Down aggregation (
composite_buckets
)Issues Resolved
opensearch-project#645
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.