- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 19.2k
ENH: Support TZ Aware IntervalIndex #18558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
        
          
                pandas/_libs/interval.pyx
              
                Outdated
          
        
      | msg = ("left and right must have the same time zone, got " | ||
| "'{left_tz}' and '{right_tz}'").format( | ||
| left_tz=left.tzinfo, right_tz=right.tzinfo) | ||
| raise ValueError(msg) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about pytz tzinfos? Eg US/Eastern is represented by a different tzinfo object for DST and non-DST
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does my switch to using get_timezone fully address this?  The docstring seems to indicate so, but I'm not super familiar with this part of the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that should do it, thanks.
bda1567    to
    2311143      
    Compare
  
    | Codecov Report
 @@            Coverage Diff             @@
##           master   #18558      +/-   ##
==========================================
- Coverage   91.35%   91.34%   -0.01%     
==========================================
  Files         164      164              
  Lines       49802    49813      +11     
==========================================
+ Hits        45496    45502       +6     
- Misses       4306     4311       +5
 
 Continue to review full report at Codecov. 
 | 
| Codecov Report
 @@            Coverage Diff             @@
##           master   #18558      +/-   ##
==========================================
- Coverage   91.59%   91.58%   -0.02%     
==========================================
  Files         153      153              
  Lines       51212    51217       +5     
==========================================
- Hits        46908    46906       -2     
- Misses       4304     4311       +7
 
 Continue to review full report at Codecov. 
 | 
        
          
                pandas/core/indexes/interval.py
              
                Outdated
          
        
      | # handle tz aware | ||
| if tz: | ||
| data = self.left.tz_localize(None) + 0.5 * delta | ||
| data = data.tz_localize(tz) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a better way to do this computation? Couldn't get the correct result without removing the time zone, then adding it back afterwards.
Trying to get the midpoint between elements in left and right.  Basic setup is along the lines of:
In [2]: left = pd.date_range('2017-01-01', periods=3, tz='US/Eastern')
   ...: right = pd.date_range('2017-01-02', periods=3, tz='US/Eastern')
   ...: delta = right.values - left.values
   ...:The existing method removed the tz, and had the wrong time (all times should be 12:00):
In [3]: left.values + 0.5 * delta
Out[3]:
array(['2017-01-01T17:00:00.000000000', '2017-01-02T17:00:00.000000000',
       '2017-01-03T17:00:00.000000000'], dtype='datetime64[ns]')Operating on the index itself instead of the values keeps the tz, but the time is still wrong:
In [4]: left + 0.5 * delta
Out[4]:
DatetimeIndex(['2017-01-01 17:00:00-05:00', '2017-01-02 17:00:00-05:00',
               '2017-01-03 17:00:00-05:00'],
              dtype='datetime64[ns, US/Eastern]', freq='D')Having delta as a TimdeltaIndex raises:
In [20]: left + 0.5 * (right - left)
---------------------------------------------------------------------------
TypeError: data type not understoodThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[20] should work, we might have an issue about this; let's fix this fix this first in another PR. don't use .values with a datetime generally, you either want to operate on the .asi8, the actual values (generally you will also need to .tz_localize(None) first to make them UTC, then reverse this later though).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I closed that other PR as its stale. So options are to: issue a new PR before this one to fix #17558, or to do hacky fix here with a TODO, or xfail the tests for this (for now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm planning to look into that issue over the next few days and open new PR to fix it prior to this one being merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great!
        
          
                pandas/core/indexes/interval.py
              
                Outdated
          
        
      | elif isinstance(left, ABCPeriodIndex): | ||
| msg = 'Period dtypes are not supported, use a PeriodIndex instead' | ||
| raise ValueError(msg) | ||
| elif isinstance(left, ABCDatetimeIndex) and left.tz != right.tz: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we usually compare
str(left.tz) != str(right.tz)
        
          
                pandas/core/indexes/interval.py
              
                Outdated
          
        
      | # handle tz aware | ||
| if tz: | ||
| data = self.left.tz_localize(None) + 0.5 * delta | ||
| data = data.tz_localize(tz) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[20] should work, we might have an issue about this; let's fix this fix this first in another PR. don't use .values with a datetime generally, you either want to operate on the .asi8, the actual values (generally you will also need to .tz_localize(None) first to make them UTC, then reverse this later though).
        
          
                pandas/core/indexes/interval.py
              
                Outdated
          
        
      | else: | ||
| data = self.left + 0.5 * delta | ||
|  | ||
| return DatetimeIndex(data, freq=freq, tz=tz) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you shouldn't need to reconvert the type
2311143    to
    8ff02ef      
    Compare
  
    8ff02ef    to
    79e255c      
    Compare
  
    | Fixed the  | 
| thanks @jschendel | 
git diff upstream/master -u -- "*.py" | flake8 --diffUpdates to
Interval:Intervalobjects with mixed time zonesUpdates to
IntervalIndex:IntervalIndexwith mixed time zonesIntervalIndex.midfor tz aware_get_next_labeland_get_previous_labelfor tz awareleft/righttypes