[DOCS] Time Units for Date Histogram Interval incomplete #28432
Comments
/cc @colings86 I remember talking to you about this a few years ago.
This is tricky, and there are a couple of things that I should explain here and that we should explain (or explain better) in the docs. TL;DR: intervals are hard because time zones are hard, and we have made some trade-offs which we should probably clarify in the docs (though it's hard to do so without writing a book on just this subject). For the details of why, keep reading the wall of text. Firstly, there is a fundamental difference between what the interval It should also be noted that calendar intervals can be different periods from the fixed-duration intervals at pretty much all levels for some arbitrary date and time zone. The only levels at which I have not come across a difference are milliseconds and seconds (a second is always 1000 milliseconds in every calendar[1] system I have seen, but that may not be universally true). Let me show some examples at different levels. Note that differences between calendar and fixed-duration intervals at one particular level bleed into all the levels above it.
There are few, if any, rules about what a particular country (or part of a country) is allowed to do when it comes to changing its time zone offset. Countries do change time zone offsets, both as one-off cases and as regular occurrences, and these shifts can be, and have been, awkward amounts such as 45 minutes. There are also no hard and fast rules about when a time zone offset shift might occur. Most time zone offset shifts are performed at a time far enough into the day that the shift will not cause it to jump back into the previous day or forward into the next day, but there are of course exceptions, like The outcome of all the above is that we had to make a decision that we would not support fixed-duration intervals expressed in units higher than days of the form So why don't we just support calendar intervals of the form
What @colings86 said. Nothing longer than a second has a constant length in seconds, so you can't just map things to an equivalent number of seconds (milliseconds, whatever) without losing information. ISO8601 has a nice model for durations (except quarters) that's way too complicated for time units for timeouts etc. but might be a useful guide for more human-facing things. I wholeheartedly believe that the correct model for things that are supposed to be a whole number of days should be based on counting days, as an integer, and not by mucking around with timestamps that may or may not represent midnight in some timezone or other. I don't understand why things like Joda make this so hard by trying to combine the notion of a day (discrete values) and a time (essentially continuous). They're different things.
I'm not sure about this. Calculating the local timezone offset involves a little bit of a search but we have to do this anyway; calculating the number of days since
I think the ability to "offset" buckets like this would be very useful. For instance, the UK tax year is a whole year, but offset by -270 days so it always starts on 6 April. I can think of more exotic bucketing strategies that would be useful: for instance, some accounting techniques require years to be a whole number of weeks long, so most "years" are 52 weeks and then every few years there's a 53-week one to catch up. Other accounting techniques require years to be a whole number of 4-week periods long, so most years are 52 weeks long and then occasionally there's a 56-weeker to stop things getting too far out of kilter. The UK railway divides the year into 13 periods, the first of which starts on 1 April, the second starts on the Sunday before the first Thursday in May, and the rest start every 28 days after that. I could go on. Since it's nearly renewal time for my pedantry badge this year and I'm a few points short, I'm compelled to add:
Except the weeks containing 6 or 8 days that happen when a country decides to move itself across the international date line. We don't talk about those, tho.
Unless we start to support aggregations based on things like the Hebrew calendar. They have leap months.
... and there was this one time in Sweden in 1712 when February had 30 days.
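The UK tax year example above can be sketched as a calendar-based bucket key. This is a minimal illustration in Python, not anything Elasticsearch provides; the function name `uk_tax_year_start` is hypothetical. It also checks the "-270 days" observation: counting back 270 days from 1 January always lands on 6 April of the previous year, because no 29 February falls between those two dates.

```python
from datetime import date, timedelta


def uk_tax_year_start(d: date) -> date:
    """Return the 6 April that starts the UK tax year containing `d`.

    Hypothetical helper for illustration: the bucket key is derived from
    calendar fields, not from a fixed number of seconds.
    """
    start_this_year = date(d.year, 4, 6)
    if d >= start_this_year:
        return start_this_year
    return date(d.year - 1, 4, 6)


def offset_year_start(year: int) -> date:
    """Start of the bucket for calendar year `year`, shifted back 270 days."""
    return date(year, 1, 1) - timedelta(days=270)


if __name__ == "__main__":
    for d in (date(2018, 4, 5), date(2018, 4, 6), date(2020, 2, 29)):
        print(d, "->", uk_tax_year_start(d))
```

Note that the equivalent forward offset (1 January plus a fixed number of days) would not be stable, since it would cross 29 February in leap years.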
There is also watcher
@elastic/es-search-aggs
https://www.elastic.co/guide/en/elasticsearch/reference/6.1/common-options.html#time-units
The issue is that this table is incomplete and inaccurate. Units below `ms` make no sense because we do not represent dates at a better resolution than milliseconds. Both `1micros` and `1nanos` (thankfully) result in `"Zero or negative time interval not supported"`.

However, there are other units that can be used sometimes, but for some reason not other times. Like: `1y` works, but `2y` does not. `1M` works, but `2M` does not. If I recall, the reasoning for these surrounds the changing nature of it.

In some ways, I can appreciate that something like `2y` and `2M` are not consistent intervals (because of leap years and inconsistent month lengths). But weeks are consistent, and quarters can only be true if you consider Q1 to be January - March (etc.), which I feel like `3M` could do too.

Regardless, if we are going to block higher number ranges, then we should look at blocking all number ranges for a given time unit, and we need to drop units that simply do not work.
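To make the point about inconsistent lengths concrete, here is a small Python sketch (not Elasticsearch code; `add_months` and `bucket_lengths_in_days` are hypothetical helpers) that measures how many days calendar buckets of N months actually span. It works on plain dates and ignores time zones entirely, so it only shows the leap-year and month-length effects.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Calendar-aware month addition; clamps the day to the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def bucket_lengths_in_days(start: date, months_per_bucket: int, count: int) -> list:
    """Length in days of `count` consecutive buckets of N calendar months."""
    lengths = []
    for _ in range(count):
        end = add_months(start, months_per_bucket)
        lengths.append((end - start).days)
        start = end
    return lengths


if __name__ == "__main__":
    print("1M:", bucket_lengths_in_days(date(2018, 1, 1), 1, 12))
    print("3M:", bucket_lengths_in_days(date(2018, 1, 1), 3, 4))
    print("1y:", bucket_lengths_in_days(date(2019, 1, 1), 12, 2))
    print("2y:", bucket_lengths_in_days(date(2018, 1, 1), 24, 2))
```

Under these assumptions a `3M` bucket starting on 1 January lines up with calendar quarters, but no month-based or year-based bucket maps to a single fixed number of days.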