date_histogram issue when using "pre_zone_adjust_large_interval" and a timezone with DST #9491

Closed
@orenash

Hey there.

Since upgrading to version 1.3.7, we have noticed that our date_histogram aggregations sometimes return unexpected duplicate buckets.

After some experimenting, we found that it only happens when pre_zone_adjust_large_interval is set to true, pre_zone is set to our local time zone (which observes DST), and interval is 'day' or larger.

In the aggregation's result we get two buckets corresponding to the same interval, but with keys that are one hour apart.

Here is a simple example that should reproduce this issue:

POST /test/t
{
    "d": "2014-10-08T13:00:00Z"
}

POST /test/t
{
    "d": "2014-11-08T13:00:00Z"
}

GET /test/_search?size=0
{
   "aggs": {
      "test": {
         "date_histogram": {
            "field": "d",
            "interval": "year",
            "pre_zone": "Asia/Jerusalem",
            "pre_zone_adjust_large_interval": true
         }
      }
   }
}

And the result:

"aggregations": {
      "test": {
         "buckets": [
            {
               "key_as_string": "2013-12-31T21:00:00.000Z",
               "key": 1388523600000,
               "doc_count": 1
            },
            {
               "key_as_string": "2013-12-31T22:00:00.000Z",
               "key": 1388527200000,
               "doc_count": 1
            }
         ]
      }
   }

As you can see, although the two timestamps clearly belong to the same year, they are put into different buckets. From what we could tell, this happens because the time zone offset differs between the two timestamps: one of them falls within DST and the other does not.
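To make the arithmetic concrete, here is a minimal Python sketch that reproduces the two keys from the result above. It is not the Elasticsearch code, only our reconstruction of the effect: the function names are hypothetical, and the offsets (UTC+3 for the October document, UTC+2 for the November one) are inferred from the bucket keys rather than looked up in a time zone database.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def to_millis(utc_dt):
    """Convert a naive UTC datetime to epoch milliseconds."""
    return int((utc_dt - EPOCH).total_seconds() * 1000)

def local_year_start(utc_dt, doc_offset):
    """Shift to local time with the document's offset, then round down to the year."""
    local = utc_dt + doc_offset
    return local.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)

def per_document_key(utc_dt, doc_offset):
    """Hypothetical model of the reported behavior: shift back to UTC using
    the offset of the document itself, which depends on DST."""
    return local_year_start(utc_dt, doc_offset) - doc_offset

def per_bucket_key(utc_dt, doc_offset, offset_at_bucket_start):
    """Hypothetical alternative: shift back using the offset that applies
    at the start of the bucket, so every document agrees on the key."""
    return local_year_start(utc_dt, doc_offset) - offset_at_bucket_start

# Offsets assumed from the bucket keys in the result above.
DST_OFFSET = timedelta(hours=3)       # offset for the October document
STANDARD_OFFSET = timedelta(hours=2)  # offset for the November document and for 1 January

october = datetime(2014, 10, 8, 13, 0, 0)
november = datetime(2014, 11, 8, 13, 0, 0)

per_document_keys = [
    to_millis(per_document_key(october, DST_OFFSET)),
    to_millis(per_document_key(november, STANDARD_OFFSET)),
]
per_bucket_keys = [
    to_millis(per_bucket_key(october, DST_OFFSET, STANDARD_OFFSET)),
    to_millis(per_bucket_key(november, STANDARD_OFFSET, STANDARD_OFFSET)),
]

print(per_document_keys)  # two different keys, one hour apart
print(per_bucket_keys)    # the same key twice
```

With the per-document offset, the two keys come out as 1388523600000 and 1388527200000, matching the two buckets we see. Using a single offset for the whole bucket would give both documents the same key.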

We use the Asia/Jerusalem time zone, but the same problem occurs in every time zone with DST (e.g. CET), as long as pre_zone_adjust_large_interval is used. When it is not used, both documents go into the same bucket, "2014-01-01T00:00:00.000Z".

This problem never happened before we upgraded to 1.3.7. Moreover, while digging into the code we figured out that this bug is a direct result of the proposed fix for issue #8339. While that fix indeed solved the timezone problem in 'hour' intervals around DST switch, it caused the new problem with bigger intervals.

The problem is also present in version 1.4.2.
