Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more flexibility to MovingFunction window alignment #44360

Merged
merged 9 commits into from
Aug 2, 2019
Merged

Add more flexibility to MovingFunction window alignment #44360

merged 9 commits into from
Aug 2, 2019

Conversation

Hohol
Copy link
Contributor

@Hohol Hohol commented Jul 15, 2019

Introduce shift field to MovingFunction aggregation.
By default, shift = 0. Behavior, in this case, is the same as before.
Increasing shift by 1 moves starting window position by 1 to the right.

  • To simply include current bucket to the window, use shift = 1
  • For center alignment (n/2 values before and after the current bucket), use shift = window / 2
  • For right alignment (n values after the current bucket), use shift = window.

Closes #42181

@Hohol
Copy link
Contributor Author

Hohol commented Jul 15, 2019

@polyfractal, check this out.
It's only a draft, so a lot of things are still missing.
First of all, could you take a look at tests in MovFnUnitTests and say if you find the expected behaviour in these tests reasonable?
Particularly, I'm not entirely sure how the result should look like when the window moves outside the borders.

@Hohol
Copy link
Contributor Author

Hohol commented Jul 15, 2019

There is a serialization of MovFnPipelineAggregator values. Do we need cross-version compatibility this time?

@Hohol
Copy link
Contributor Author

Hohol commented Jul 15, 2019

We can allow specifying special shift values using descriptive names.
Namely for inclusive, center and right alignments.
Do we need it?

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo

Copy link
Contributor

@polyfractal polyfractal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Particularly, I'm not entirely sure how the result should look like when the window moves outside the borders.

Checking my understanding: the current behavior is to "clamp" to the available values in the window, correct? So as the window approaches the edge, the actual window of values offered to the function shrinks? E.g. there is no "padding" once the window goes past the values (on either side).

I think that's the correct choice. I dont think we want to introduce padding, or circularizing like is done with FFT. Too complicated, unintuitive. Clamping seems reasonable to me.

There is a serialization of MovFnPipelineAggregator values. Do we need cross-version compatibility this time?

Afraid so. :( We'll need to put version gates on the serialization, and handle the case where we are talking to old nodes that don't serialize shift (setting shift = 0 in that case)

We can allow specifying special shift values using descriptive names.

Hmm, good question. I'm leaning towards "no", and just use the numeric shift parameter. Simpler that way and as much flexibility as a user wants.

@Hohol
Copy link
Contributor Author

Hohol commented Jul 16, 2019

We'll need to put version gates on the serialization, and handle the case where we are talking to old nodes that don't serialize shift (setting shift = 0 in that case)

What if we're writing to an old node and shift != 0? Should we just ignore it, so the node will work as if shift was 0, or should we throw an exception?

@polyfractal
Copy link
Contributor

Yeah, in these cases we assume the old behavior, so here we'd just avoid sending shift and the old node works as if shift == 0.

Heterogeneous clusters are assumed to be short-lived and only during rolling upgrades, so with regards to bwc in mixed clusters we just try not to break things while in that mixed state.

Since pipelines are only executed on the coordinating node, the behavior will depend on if the query lands on a new or old node. Serialization is needed because they go out with the full agg tree, but the execution only happens on the coordinator.

@Hohol Hohol marked this pull request as ready for review July 16, 2019 16:37
@Hohol
Copy link
Contributor Author

Hohol commented Jul 16, 2019

Ready for review.

Copy link
Contributor

@polyfractal polyfractal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments, but otherwise LGTM! Will kick off a CI build after the last commit. Thanks!

@polyfractal
Copy link
Contributor

@elasticmachine test this please

@polyfractal
Copy link
Contributor

Sorry for the delay, was out on holiday last week. Not sure why CI got stuck like that, going to trigger another build.

@elasticmachine test this please

@Hohol
Copy link
Contributor Author

Hohol commented Jul 25, 2019

Maybe I need to merge with master?

@polyfractal
Copy link
Contributor

++ Let's merge in master, that tends to resolve unrelated failures :)

@polyfractal
Copy link
Contributor

@elasticmachine test this please

@polyfractal
Copy link
Contributor

Hmm, ILM EOFException exception on serialization. I think it's unrelated although we should keep an eye out if we see more serialization issues... might be side-effect of changes here. Re-running bwc tests.

@elasticmachine run elasticsearch-ci/bwc

@Hohol
Copy link
Contributor Author

Hohol commented Aug 2, 2019

Failed again. This time it's a different error.

@polyfractal
Copy link
Contributor

Bleh, that's an artifact from the version bump yesterday. Gonna merge master and start again. Sorry, it's not normally this tough to get a green CI :(

@polyfractal
Copy link
Contributor

@elasticmachine update branch

@polyfractal
Copy link
Contributor

@elasticmachine test this please

@polyfractal
Copy link
Contributor

🎉 merging! Thanks @Hohol :)

@polyfractal polyfractal merged commit ead4eb5 into elastic:master Aug 2, 2019
polyfractal pushed a commit that referenced this pull request Aug 2, 2019
Introduce shift field to MovingFunction aggregation.

By default, shift = 0. Behavior, in this case, is the same as before.
Increasing shift by 1 moves starting window position by 1 to the right.

    To simply include current bucket to the window, use shift = 1
    For center alignment (n/2 values before and after the current bucket), use shift = window / 2
    For right alignment (n values after the current bucket), use shift = window.
polyfractal added a commit that referenced this pull request Aug 2, 2019
polyfractal pushed a commit to polyfractal/elasticsearch that referenced this pull request Aug 2, 2019
Introduce shift field to MovingFunction aggregation.

By default, shift = 0. Behavior, in this case, is the same as before.
Increasing shift by 1 moves starting window position by 1 to the right.

    To simply include current bucket to the window, use shift = 1
    For center alignment (n/2 values before and after the current bucket), use shift = window / 2
    For right alignment (n values after the current bucket), use shift = window.
@Hohol Hohol deleted the flexible-moving-function branch August 5, 2019 10:16
Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request Oct 14, 2019
russcam pushed a commit to elastic/elasticsearch-net that referenced this pull request Oct 15, 2019
russcam pushed a commit to elastic/elasticsearch-net that referenced this pull request Oct 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add more flexibility to MovingFunction window alignment
4 participants