skipfooter doesn't really "skip" in read_csv #13879
Comments
…gine Python's native CSV library does not respect the skipfooter parameter, so if one of those skipped rows is malformed, it will still raise an error. Closes pandas-devgh-13879.
If this feature were implemented in the C engine, I would expect it to work in this case, so that the skipped lines need not parse correctly. But I am not sure whether this is actually possible. Questions about how to treat quotation marks (are they parsed or not to determine the number of lines to skip?), similar to those in the recent issues about skiprows, will also come up. So for this to be consistent, maybe they need to get parsed to some extent?
@jorisvandenbossche : You are correct. This code should not break, though whether that is possible is another story, as some parsing might be needed. In any case, I am not sure yet how to implement this for the C engine, though that can be dealt with separately from this issue.
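To make the quotation-mark question concrete, here is a minimal sketch using only Python's standard `csv` module (not pandas itself). The sample text is made up for illustration. It shows that a quoted field containing a newline makes "the last N lines" and "the last N records" different things, which is why some parsing may be needed before a footer can be skipped.

```python
import csv
from io import StringIO

# Hypothetical sample: the final record has a quoted field spanning two lines.
text = 'a,b\n1,2\n"x\ny",3\n'

# Counting raw physical lines ignores quoting entirely.
physical_lines = text.splitlines()

# Counting parsed records respects quoting, so the last two lines form one row.
records = list(csv.reader(StringIO(text)))

# Dropping one physical line would cut the quoted field in half,
# leaving an unterminated quote on the new last line.
after_line_skip = physical_lines[:-1]

print(len(physical_lines), len(records))
print(after_line_skip[-1])
```

Under this reading, skipping one footer "line" without looking at quotes leaves a dangling `"x`, whereas skipping one footer "record" removes both physical lines. Which of the two `skipfooter` should mean is exactly the open question raised above.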
On `master`:
If we were truly "skipping" the last row, no error should have been raised. However, an error is raised because the data is all parsed in memory first with Python's `csv` library.
Whether this is intended behaviour or not has implications for the C engine in terms of implementing analogous `skipfooter` behaviour. Or perhaps it has something to do with the fact that the `error_bad_lines` and `warn_bad_lines` parameters do not work with the Python engine?
xref #5232
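The difference between "parse everything, then drop the footer" and "drop the footer, then parse" can be sketched with the standard library alone. This is an illustrative assumption about the mechanism, not the pandas implementation: the helper names `parse_then_drop` and `drop_then_parse` are hypothetical, the sample data is made up, and `strict=True` is used so that `csv` raises on the malformed row.

```python
import csv
from io import StringIO

# Hypothetical sample: the last line is malformed (text after a closing quote).
data = 'a\n1\n"b"a'


def parse_then_drop(text, skipfooter):
    """Parse every row first, then discard the footer rows."""
    rows = list(csv.reader(StringIO(text), strict=True))
    return rows[:len(rows) - skipfooter]


def drop_then_parse(text, skipfooter):
    """Discard the footer lines first, then parse only what is left."""
    lines = text.splitlines()
    kept = lines[:len(lines) - skipfooter]
    return list(csv.reader(kept, strict=True))


# Parsing first touches the malformed footer, so the error surfaces
# even though that row would have been thrown away afterwards.
try:
    parse_then_drop(data, 1)
    parse_first_failed = False
except csv.Error:
    parse_first_failed = True

# Trimming first never hands the malformed line to the parser.
rows = drop_then_parse(data, 1)
print(parse_first_failed, rows)
```

Note that `drop_then_parse` splits on physical lines, so it would mishandle a footer that contains a quoted newline. That limitation is a deliberate simplification here.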