bug: read_csv incorrect output with skipfooter and skip_blank_lines=True #10164
Comments
I think the warning in the docs is pretty clear (see here): skipping 2 lines skips everything, but skipping 1 gets your data.
Thanks for the fast reply but I don't think the ambiguity is quite the same as that warning. Does skip_footer use row or line numbers? Your comment implies it uses row numbers. Swapping the blank and non-blank lines changes the behaviour:
So the ambiguity appears to be what pandas regards as the footer. It seems that blank lines between the last line in the data table and the first non-blank line in the footer are skipped but subsequent blank lines are not.
Maybe not a bug but I think it's surprising that a footer can contain blank lines but not start with one if skip_blank_lines=True.
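To make the row-versus-line question concrete, here is a minimal sketch in plain Python. It is not pandas' parser code; the two helper functions and the three-line sample input are hypothetical, written only to show how the two readings of "footer" diverge once blank lines are dropped.

```python
# Hypothetical helpers illustrating two readings of "skip the last N footer lines".
# Neither is taken from pandas; they only model the ambiguity discussed above.

def drop_footer_physical(lines, skipfooter):
    """Count the footer in physical lines first, then drop blank lines."""
    keep = max(0, len(lines) - skipfooter)
    return [ln for ln in lines[:keep] if ln.strip()]


def drop_footer_nonblank(lines, skipfooter):
    """Drop blank lines first, then count the footer in remaining rows."""
    rows = [ln for ln in lines if ln.strip()]
    keep = max(0, len(rows) - skipfooter)
    return rows[:keep]


# Assumed sample input: one data line, one blank line, one footer line.
lines = "1,2\n\nfooter\n".splitlines()

physical = drop_footer_physical(lines, 2)   # blank + footer counted as 2 lines
nonblank_2 = drop_footer_nonblank(lines, 2)  # data row + footer counted as 2 rows
nonblank_1 = drop_footer_nonblank(lines, 1)  # only the footer row is dropped

print(physical, nonblank_2, nonblank_1)
```

Under the row-based reading, asking for two footer lines removes the data row as well, while asking for one keeps it. Under the line-based reading, two footer lines cover exactly the blank line and the footer text.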
@jsspencer OK, I'll buy that. Do you want to take a shot at clarifying what it should be doing and/or updating the docs to be more specific? Thanks.
cc @mdmueller if you guys have thoughts about this
See pandas-dev/pandas#10164 for details. Affects pandas 0.15.0-0.16.1.
read_csv does not return the data table expected if there's only one line in the CSV file and skipfooter is used and the line directly after the table is blank and skip_blank_lines=True.
But if there is more than one line in the data table, or the line after the data table is not blank, or skip_blank_lines is set to False, everything is OK:
This occurs in every version from 0.15.0 onwards (i.e. since skip_blank_lines was introduced).