Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force selection of calendar or fixed intervals in date histo agg #33727

Merged
merged 21 commits into from
May 6, 2019

Conversation

polyfractal
Copy link
Contributor

WIP: still needs documentation, and probably some more randomization tests

This adds a calendar_interval and fixed_interval parameter to the date histogram aggregation, as well as deprecating the interval parameter. The two new parameters are mutually exclusive, and cannot be set with interval either. It also deprecates the interval() and dateHistogramInterval() methods of the builder. Finally, it adjusts the composite agg and rollup's to use a similar scheme.

This makes the selection of calendar vs fixed time units explicit to the user, and provides immediate validation of the unit they provide.

This PR should be fully BWC, and throws deprecation warnings where possible.

Some notable points which may need adjusting:

  • If the user opts to use the new parameters, we will accept any valid unit even if it "overlaps". For example, 1h can be both a fixed and calendar unit. If the user specifies fixed_interval: 1h, we know it is a fixed hour, as opposed to calendar_interval: 1h which indicates calendar-aware hour.

    If the user continues to use the deprecated interval: 1h, we fall back to the old behavior where we assume units are calendar if possible.

    This felt like the "right" way to do it, rather than forcing the user to understand that a fixed hour had to be specified as 60m, etc.

  • DateHisto agg builder was a mess, since it accepted both a fixed long interval and a DateHistogramInterval which could be fixed or cal. And now we have two more parameters. Lastly, this logic needed to duplicated to composite agg too. To keep track of this (and reduce copy/paste between composite agg), there's a new DateIntervalWrapper which records which method is used and enforces which can be set together or not.

    Internally, the wrapper gets rid of the long interval and records everything as a DateHistogramInterval. Once the deprecation period is up this class will be greatly simplified.

  • The rollup date histo config just got rid of builders during HLRC refactoring. But now we have interval or one_of(fixed_interval, calendar_interval), which doesn't play nicely with constructing object parsers. So I opted to specialize the DateHistoGroupConfig object into fixed/calendar objects. They just do simple validation and provide a way to tell the configs apart at a later point.

    This let us avoid having weird ctor arguments where two of the three are null (interval, fixed_interval, calendar_interval), and made it easier to deprecate the old way. Happy to entertain a different idea if it's better.

/cc @colings86 @jimczi

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@polyfractal
Copy link
Contributor Author

Note: rollup still stores the the interval as interval in rollup docs, although we should move to also storing fixed_interval/calendar_interval. But since this PR is already enormous I'm leaving that for later.

@polyfractal
Copy link
Contributor Author

polyfractal commented Sep 25, 2018

Jenkins, retest this please

@polyfractal
Copy link
Contributor Author

@colings86 @jimczi this still needs documentation, but I wanted to get a sanity check about the approach (two fields, the various caveats/cleanups that had to happen, etc) before proceeding further. Think the approach is ok based on the OP's text?

If yes I'll go ahead and add docs and get tests passing, etc.

@colings86
Copy link
Contributor

I'm happy with the details outlined in the description. This looks like a good route to sorting out these issues to me

@polyfractal
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/packaging-sample

1 similar comment
@polyfractal
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/packaging-sample

@polyfractal
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/2

@polyfractal
Copy link
Contributor Author

@elasticmachine test this please

@polyfractal
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/1

@polyfractal polyfractal force-pushed the date_histo_explicit_intervals branch from 09b9a4c to 773a24b Compare April 8, 2019 20:22
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request May 7, 2019
* elastic/master: (414 commits)
  Add tasks to build Docker build context artifacts (elastic#41819)
  Replace more uses of immutable map builder (elastic#41823)
  Force selection of calendar or fixed intervals in date histo agg (elastic#33727)
  Switch run task to use real distro (elastic#41590)
  Clarify that path_match also considers object fields. (elastic#41658)
  [DOCS] remove 'es.scripting.update.ctx_in_params' system property for 7.0 (elastic#41643)
  Clarify _doc is a permanent part of certain document APIs. (elastic#41727)
  Remove the jdk directory to save space on bwc tests (elastic#41743)
  Fix full text queries test that start with now (elastic#41854)
  Remove `nonApplicationWrite` from `SSLDriver` (elastic#41829)
  SQL: [Docs] Add example for custom bucketing with CASE (elastic#41787)
  Cleanup Bulk Delete Exception Logging (elastic#41693)
  [DOCS] Rewrite `term` query docs for new format (elastic#41498)
  Mute PermissionsIT#testWhen[...]ByILMPolicy (elastic#41858)
  ReadOnlyEngine assertion fix (elastic#41842)
  [ML] addresses preview bug, and adds check to PUT (elastic#41803)
  Fix javadoc in WrapperQueryBuilder
  Testsclusters use seprate configurations per version (elastic#41504)
  Skip explain fetch sub phase when request holds only suggestions (elastic#41739)
  remove unused import
  ...
@polyfractal
Copy link
Contributor Author

#41906 has merged, removing backport label and this is done! 🎉

hendrikmuhs pushed a commit to hendrikmuhs/elasticsearch that referenced this pull request May 24, 2019
data frames should use the new syntax. There is no need to support the old syntax as this feature will be released with 7.2.

fixes elastic#42297
hendrikmuhs pushed a commit that referenced this pull request May 24, 2019
data frames should use the new syntax. There is no need to support the old syntax as this feature will be released with 7.2.

fixes #42297
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
…stic#33727)

The date_histogram accepts an interval which can be either a calendar 
interval (DST-aware, leap seconds, arbitrary length of months, etc) or 
fixed interval (strict multiples of SI units). Unfortunately this is inferred
by first trying to parse as a calendar interval, then falling back to fixed
if that fails.

This leads to confusing arrangement where `1d` == calendar, but 
`2d` == fixed.  And if you want a day of fixed time, you have to 
specify `24h` (e.g. the next smallest unit).  This arrangement is very
error-prone for users.

This PR adds `calendar_interval` and `fixed_interval` parameters to any
code that uses intervals (date_histogram, rollup, composite, datafeed, etc).
Calendar only accepts calendar intervals, fixed accepts any combination of
units (meaning `1d` can be used to specify `24h` in fixed time), and both
are mutually exclusive.  

The old interval behavior is deprecated and will throw a deprecation warning.
It is also mutually exclusive with the two new parameters. In the future the 
old dual-purpose interval will be removed.

The change applies to both REST and java clients.
codebrain added a commit to elastic/elasticsearch-net that referenced this pull request Aug 2, 2019
codebrain added a commit to elastic/elasticsearch-net that referenced this pull request Aug 9, 2019
codebrain added a commit to elastic/elasticsearch-net that referenced this pull request Aug 16, 2019
russcam pushed a commit to elastic/elasticsearch-net that referenced this pull request Sep 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants