Identify partial buckets #2803

spalger · 2015-01-29T04:07:19Z

There has been some discussion surrounding the partial buckets created when a date_histogram is combined with the time filter. As a stopgap solution, we have decided to remove the current "clipping" behavior, as it negatively impacts grouped-style bar charts. This ticket is for additional discussion of how we will more completely fix the underlying issue.

the problem

When using a date_histogram, elasticsearch creates time buckets rounded based on the interval. If our timefilter was set to 8:30am - 11:30am, and our interval was set to "hourly", elasticsearch would bucket on the whole hour and return buckets for 8am, 9am, 10am, and 11am. We can already detect an issue: we requested 3 hours' worth of data bucketed hourly, but we ended up with 4 buckets.

Furthermore, the documents used to fill these buckets with metrics would still be limited by the time filter, so we would end up with two whole buckets (9 & 10am) and two partial buckets (8 & 11am). If we had chosen to calculate a sum aggregation on these buckets, the values in the partial buckets would be pretty much useless.
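To make the whole/partial distinction concrete, here is a minimal sketch that reproduces the rounding for a window yielding the four buckets above (8:30am to 11:30am is assumed). The function names are hypothetical, the time filter is treated as a half-open range, and rounding is done from the Unix epoch with no time zone handling, which may not match what elasticsearch does in every case.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def floor_to_interval(ts: datetime, interval: timedelta) -> datetime:
    """Round ts down to a multiple of interval, counted from the Unix epoch."""
    return ts - (ts - EPOCH) % interval


def classify_buckets(start: datetime, end: datetime, interval: timedelta):
    """Return (bucket_start, is_partial) for each bucket touching [start, end)."""
    buckets = []
    key = floor_to_interval(start, interval)
    while key < end:
        # A bucket is partial when the time filter cuts off either of its edges.
        is_partial = key < start or key + interval > end
        buckets.append((key, is_partial))
        key += interval
    return buckets


if __name__ == "__main__":
    window_start = datetime(2015, 1, 29, 8, 30)
    window_end = datetime(2015, 1, 29, 11, 30)
    for key, is_partial in classify_buckets(window_start, window_end, timedelta(hours=1)):
        print(key.strftime("%H:%M"), "partial" if is_partial else "whole")
```

Under these assumptions the 8am and 11am buckets are flagged as partial and the 9am and 10am buckets as whole.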

possible solutions

The first option below seems to be a popular one, though after detailing how it would work I am of the opinion that it only points out the issue and doesn't actually solve it. Options 2-4 are no doubt the more difficult ones, and each of the possible implementations has its own drawbacks. They do, however, prevent us from having to implement disclaimers or warnings, as the data should be what we assume users intend to request.

one

We could identify the partial buckets using visual indicators (which might be hidden on mouseover) and warn the user that the charted values might not be an accurate view of the data.

This approach has the benefit of showing the user exactly what their input produced in elasticsearch. Unfortunately, that means the data will still be invalid in some cases, and the update also draws more attention to the issue in order to tell you to ignore it.

two

Calculate the buckets we would prefer manually, and specify them via a filter aggregation. This puts the complexity in Kibana, and it isn't really clear what the edge cases are. It also forces us to decide whether to limit the final bucket based on the time range or to extend it to the end of the final interval.
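As a rough illustration of option two, the sketch below builds a `filters` aggregation whose buckets start at the beginning of the time range rather than on rounded boundaries. The helper name, the `manual_buckets` aggregation name, the `@timestamp` field, and the `clip_last` flag are all assumptions made for the example, not anything Kibana does today.

```python
from datetime import datetime, timedelta


def manual_bucket_filters(start, end, interval, field="@timestamp", clip_last=True):
    """Build a request body with one named range filter per hand-computed bucket."""
    filters = {}
    lower = start
    while lower < end:
        upper = lower + interval
        if clip_last and upper > end:
            # Choice A: limit the final bucket to the end of the time range.
            upper = end
        # Choice B (clip_last=False): let the final bucket run a full interval.
        filters[lower.isoformat()] = {
            "range": {field: {"gte": lower.isoformat(), "lt": upper.isoformat()}}
        }
        lower += interval
    return {"aggs": {"manual_buckets": {"filters": {"filters": filters}}}}


if __name__ == "__main__":
    import json

    body = manual_bucket_filters(
        datetime(2015, 1, 29, 8, 30), datetime(2015, 1, 29, 11, 0), timedelta(hours=1)
    )
    print(json.dumps(body, indent=2))
```

With a window of 8:30am to 11:00am this yields buckets starting at 8:30, 9:30, and 10:30, and the last one either stops at 11:00 or runs to 11:30 depending on `clip_last`. If the final bucket is extended, the query's own time filter would presumably need to be widened as well, otherwise that bucket is still only partially filled.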

three

Extend the date_histogram aggregation to support custom rounding rules, target bucket counts, etc. While this is definitely worth asking for, it probably won't be a simple thing to get in.

four

Extend the time range to the closest interval boundary, regardless of the time filter requested. This could have undesired effects for large intervals paired with small timeframes. It would also require that we either clip the bars or show dates that are not supposed to be in the data.
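Option four could be sketched roughly as follows. The function name is hypothetical, and boundaries are computed from the Unix epoch without any time zone handling, which is a simplification.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def expand_to_boundaries(start: datetime, end: datetime, interval: timedelta):
    """Widen [start, end) so that both ends fall on interval boundaries."""
    expanded_start = start - (start - EPOCH) % interval
    remainder = (end - EPOCH) % interval
    if remainder == timedelta(0):
        expanded_end = end  # already on a boundary
    else:
        expanded_end = end + (interval - remainder)
    return expanded_start, expanded_end


if __name__ == "__main__":
    # Hourly interval: the window grows by half an hour on each side.
    print(expand_to_boundaries(
        datetime(2015, 1, 29, 8, 30), datetime(2015, 1, 29, 11, 30), timedelta(hours=1)
    ))
    # Daily interval with a two-hour window: the request grows to a full day.
    print(expand_to_boundaries(
        datetime(2015, 1, 29, 8, 30), datetime(2015, 1, 29, 10, 30), timedelta(days=1)
    ))
```

The second call shows the undesired effect mentioned above: a two-hour request paired with a daily interval turns into a 24-hour query.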

rashidkpc · 2015-01-29T14:31:25Z

Here's a mockup that doesn't include clipping and might more accurately show what we think might work:

rashidkpc · 2015-01-29T14:32:31Z

Also clearing the milestone here: this seems important and hopefully isn't super hard to implement, so maybe we can get it in earlier. Would like to see input from @stormpython and @jthomassie here.

stormpython · 2015-01-29T14:38:09Z

Removing clipping seems like an easy enough fix. We would simply default to what we do normally. And we are given access to the time range the user specifies, so we could shade the range of times on both ends that you see in the images above. I do not think it would be very difficult to implement.

However, we will have to see if there are any adverse side effects to removing clipping, since we've set a lot of things up to deal with it in the code base.

In closing: easy to implement, unsure of the unintended side effects. But hopefully, there are only a few if any at all.

rashidkpc · 2015-01-29T14:41:14Z

IIRC one of the reasons for clipping was range selection: if a user brushed all the way to the end of the chart, they would actually end up expanding their selection in one direction rather than narrowing it overall as intended, because the lack of clipping causes the axis to extend beyond the bounds of the search. We'll need some way to halt the brush to avoid that.

Adding the grey bars or some other visual identifier to the end of the range would accomplish that.
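One way to picture "halting the brush" is to clamp whatever the user selects to the bounds of the current time filter. This is only a sketch with a hypothetical function name, not how the chart code actually handles brushing.

```python
from datetime import datetime


def clamp_brush(brush_start, brush_end, filter_start, filter_end):
    """Limit a brushed selection to the bounds of the current time filter."""
    clamped_start = max(brush_start, filter_start)
    clamped_end = min(brush_end, filter_end)
    if clamped_start >= clamped_end:
        # The brush lies entirely outside the searched range (the shaded area).
        return None
    return clamped_start, clamped_end


if __name__ == "__main__":
    # The axis runs to 12:00 because of the 11am bucket, but the search ends at 11:30.
    print(clamp_brush(
        datetime(2015, 1, 29, 10, 15), datetime(2015, 1, 29, 12, 0),
        datetime(2015, 1, 29, 8, 30), datetime(2015, 1, 29, 11, 30),
    ))
```

Here a brush dragged to the right edge of the axis stops at 11:30, so the new selection can only be narrower than the original search.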

spalger · 2015-01-29T15:15:54Z

I get that grouped bars are not showing all groups, but why are we so concerned about rendering the whole partial buckets when they often don't even show valid data?

I think we should really focus on a solution that makes those buckets useful instead of just visible.

rashidkpc · 2015-01-29T15:29:51Z

Shrug, I'm not totally against clipping, this solution might make implementation a bit simpler, but maybe not. Still open to ideas here.

francisca-lima · 2019-01-08T10:54:16Z

Any other solution to this? I don't want my data to disappear.

spalger added the discuss label Jan 29, 2015

spalger added this to the 4.1.0 milestone Jan 29, 2015

rashidkpc removed this from the 4.1.0 milestone Jan 29, 2015

rashidkpc mentioned this issue Jan 29, 2015

Option to drop partial buckets #2806

Closed

stormpython mentioned this issue Feb 2, 2015

Remove Clipping #2820

Merged

stormpython added a commit to stormpython/kibana that referenced this issue Feb 3, 2015

Merge pull request #8 from spenceralger/shelby_2803

6b6fab3

Fix texts for elastic#2803

spalger closed this as completed Mar 26, 2015
