API/DEPR: error_bad_lines/warn_bad_lines in pd.read_csv #22677


Closed
h-vetinari opened this issue Sep 12, 2018 · 2 comments
Labels
API Design · Duplicate Report (Duplicate issue or pull request) · IO CSV (read_csv, to_csv)

Comments

@h-vetinari
Contributor

From discussion started in #22639:

Currently, pd.read_csv has two booleans

  • error_bad_lines : boolean, default True
    Lines with too many fields (e.g. a csv line with too many commas) will by default cause an exception to be raised, and no DataFrame will be returned. If False, then these “bad lines” will be dropped from the DataFrame that is returned.
  • warn_bad_lines : boolean, default True
    If error_bad_lines is False, and warn_bad_lines is True, a warning for each “bad line” will be output.
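
To make the interaction of the two flags concrete, here is a minimal sketch that collapses them into the single behavior they produce. The function name `effective_mode` is hypothetical, not part of pandas. The mapping follows the parameter descriptions quoted above; the case where both flags are False is inferred to mean that bad lines are dropped silently.

```python
def effective_mode(error_bad_lines=True, warn_bad_lines=True):
    """Hypothetical helper: map the two booleans to one behavior."""
    if error_bad_lines:
        # Per the quoted docs, an exception is raised, so the value of
        # warn_bad_lines makes no difference in this branch.
        return "raise"
    # Assumption: with both flags False, bad lines are dropped silently.
    return "warn" if warn_bad_lines else "ignore"


for error_flag in (True, False):
    for warn_flag in (True, False):
        print(error_flag, warn_flag, effective_mode(error_flag, warn_flag))
```

Four flag combinations yield only three distinct behaviors, which is why a single three-valued option can express the same thing.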

This is confusing (what happens if both are True?) and inconsistent with the errors-style keyword arguments used elsewhere in pandas. Something like error_bad_lines = {'raise'|'warn'|'ignore'}, with warn_bad_lines removed, would be clearer.
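
As a rough illustration of the proposed semantics, here is a sketch built on the standard library `csv` module rather than on pandas. The helper `read_rows` and its `bad_lines` parameter are hypothetical names chosen for this example, and "bad line" is taken to mean a row with more fields than the header, as in the description above.

```python
import csv
import io
import warnings


def read_rows(text, bad_lines="raise"):
    """Hypothetical helper, not pandas code: parse CSV text and handle
    rows with more fields than the header according to bad_lines."""
    if bad_lines not in ("raise", "warn", "ignore"):
        raise ValueError(f"invalid bad_lines value: {bad_lines!r}")
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if len(row) > len(header):
            msg = f"line {line_no}: expected {len(header)} fields, saw {len(row)}"
            if bad_lines == "raise":
                raise ValueError(msg)
            if bad_lines == "warn":
                warnings.warn(msg)
            continue  # both 'warn' and 'ignore' drop the row
        rows.append(row)
    return header, rows


data = "a,b\n1,2\n3,4,5\n6,7\n"
header, rows = read_rows(data, bad_lines="ignore")
print(header, rows)
```

With a single option there is no combination of values whose meaning has to be looked up: each setting names exactly one behavior.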

@gfyoung added the API Design and IO CSV (read_csv, to_csv) labels on Sep 12, 2018
@gfyoung
Member

gfyoung commented Sep 12, 2018

This is a pretty reasonable suggestion and is consistent with what we do with other functions from an API standpoint (e.g. to_numeric).

@lithomas1 added the Duplicate Report (Duplicate issue or pull request) label on Mar 14, 2021
@lithomas1
Member

Closing as a duplicate of #15122.
