- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 19.2k
BUG: fix read_csv to parse timezone correctly #22380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
        
          
                pandas/io/parsers.py
              
                Outdated
          
        
      | errors='ignore', | ||
| infer_datetime_format=infer_datetime_format | ||
| ) | ||
| if not isinstance(converted, DatetimeIndex): | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what breaks if you just change box-> True?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
box=True will make a non-datetime array into an Index, resulting downstream errors. The downstream expects ndarray rather than Index
| pandas.io.parsers._infer_types breaks. Specifically, pandas._libs.parsers.sanitize_objects only accepts ndarray as the input. Otherwise, it raises an exception. | 
| right so just put a and see what happens | 
| Codecov Report
 @@           Coverage Diff           @@
##           master   #22380   +/-   ##
=======================================
  Coverage   92.05%   92.05%           
=======================================
  Files         169      169           
  Lines       50733    50733           
=======================================
  Hits        46702    46702           
  Misses       4031     4031
 
 Continue to review full report at Codecov. 
 | 
| We if cast data into np.ndarray, it will lost the timezone information. ndarray can't contain timezone information. link That's why I had to bypass it. | 
| So this is a case I didn't account for when I recently fixed  I don't have the patch off the top of my head, but what would happen if instead if the  | 
| @mroeschke Thinks for the idea! I don't want to modify  I think I have come up with a solution, inspired by @mroeschke . The whole problem was caused by  | 
| @swyoon object arrays of Timestamps are quite non performant | 
b0430cb    to
    45c5432      
    Compare
  
    | @jreback I have applied  | 
| tm.assert_frame_equal(result, expected) | ||
|  | ||
| def test_parse_timezone(self): | ||
| import pytz | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move import to top of file.
8ef5b1d    to
    e4fc116      
    Compare
  
    a3e4f7b    to
    52aa9b4      
    Compare
  
    | Yeah...Travis has been acting up a bit. Restarted the failing builds, as they didn't look related to this PR. | 
| @gfyoung Thank, but it failed again... failed job Could you run it one more time, please? | 
| @gfyoung Sorry, but travis still doesn't work | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small comment, lgtm. ping on green.
| tm.assert_frame_equal(result, expected) | ||
|  | ||
| def test_parse_timezone(self): | ||
| data = """dt,val | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the issue number as a comment
| I think we also might have another open issue about this (aside from the referenced one), if you'd do a search and see. | 
2f7e9c3    to
    91ef33e      
    Compare
  
    - use box=True for to_datetime(), and adjust downstream processing to the change. - resolve pandas-dev#22256
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
cc @jreback
| @jreback is there anything needs to be done before merging? | 
| thanks @swyoon | 
| @jreback @gfyoung @mroeschke Thank you all | 
Use
box=Trueforto_datetime(), and adjust downstream processing to the change.git diff upstream/master -u -- "*.py" | flake8 --diff