DOC: Improved the docstring of errors.ParserWarning #20076

Merged
merged 8 commits into from
Mar 15, 2018
Changes from 7 commits
42 changes: 38 additions & 4 deletions pandas/errors/__init__.py
@@ -53,10 +53,44 @@ class EmptyDataError(ValueError):

class ParserWarning(Warning):
"""
Warning that is raised in `pd.read_csv` whenever it is necessary
to change parsers (generally from 'c' to 'python') contrary to the
one specified by the user due to lack of support or functionality for
parsing particular attributes of a CSV file with the requested engine.
Warning raised when reading a file that doesn't use the default 'c' parser.

Member

The summary needs to fit on a single line. Can you write something more concise, please? This paragraph is really useful and certainly belongs in the description, but the first line is used in some summaries, so it should be shorter. Something like "Warning raised when reading a table does not use the default parser." I'm not sure it's accurate or fits on one line, but it should give you an idea.

Contributor Author

Great Mark! Thanks for the suggestion. Already committed my version of it.

Raised by `pd.read_csv` and `pd.read_table` when it is necessary to change
parsers, generally from the default 'c' parser to 'python'.

It happens due to a lack of support or functionality for parsing a
particular attribute of a CSV file with the requested engine.

Currently, 'c' unsupported options include the following parameters:

1. `sep` other than a single character (e.g. regex separators)
2. `skipfooter` higher than 0
3. `sep=None` with `delim_whitespace=False`

The warning can be avoided by adding `engine='python'` as a parameter in
`pd.read_csv` and `pd.read_table` methods.

See Also
--------
pd.read_csv : Read CSV (comma-separated) file into DataFrame.
pd.read_table : Read general delimited file into DataFrame.

Member

I think read_csv and read_table are good candidates for a See Also section, as you're mentioning them.

Contributor Author

Added a See Also section with read_csv and read_table.

Examples
--------
Using a `sep` in `pd.read_csv` other than a single character:

>>> import io
>>> csv = u'''a;b;c
... 1;1,8
... 1;2,1'''
>>> df = pd.read_csv(io.StringIO(csv), sep='[;,]')
Traceback (most recent call last):
...
ParserWarning: Falling back to the 'python' engine...
Member

Did you check why the validation says that this test didn't pass, and that `read_csv` returned nothing?

Contributor Author

@joaoavf joaoavf Mar 9, 2018

When I ran the code in my console, this warning was displayed: `ParserWarning: Falling back to the 'python' engine...`

I thought the failure might be because it is a warning and not an error: the output of an error can be matched by a Traceback block, but the output of a warning cannot.

Any ideas on how to fix and approach this?
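
A minimal sketch of the behaviour discussed in this thread, assuming a pandas version in which the default engine falls back to 'python' for regex separators. It suggests that `pd.read_csv` returns normally and only emits a `ParserWarning`, so a doctest `Traceback` block has nothing to match unless the warning is escalated to an error. The variable names (`caught`, `parser_warnings`, `raised`) are illustrative, not part of the PR.

```python
import io
import warnings

import pandas as pd
from pandas.errors import ParserWarning

csv = "a;b;c\n1;1,8\n1;2,1"

# Record warnings instead of letting them print to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df = pd.read_csv(io.StringIO(csv), sep="[;,]")

# The call completes and returns a DataFrame; nothing is raised.
parser_warnings = [w for w in caught if issubclass(w.category, ParserWarning)]
print(df.shape)
print([str(w.message) for w in parser_warnings])

# Only when the warning is turned into an error does the call raise,
# which is the situation a Traceback block in a doctest would describe.
raised = False
with warnings.catch_warnings():
    warnings.simplefilter("error", ParserWarning)
    try:
        pd.read_csv(io.StringIO(csv), sep="[;,]")
    except ParserWarning:
        raised = True
print(raised)
```

One possible way forward for the docstring example is therefore to avoid asserting on the warning text as if it were exception output.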


Adding `engine='python'` to `pd.read_csv` removes the Warning:

>>> df = pd.read_csv(io.StringIO(csv), sep='[;,]', engine='python')
"""


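The options listed in the docstring can be exercised with a short sketch like the one below. It assumes a pandas version that behaves as the docstring describes; `parser_warning_messages` is a hypothetical helper written for this illustration, and the sample data is made up. For each case it counts the `ParserWarning` messages with the default engine and again with `engine='python'`.

```python
import io
import warnings

import pandas as pd
from pandas.errors import ParserWarning


def parser_warning_messages(text, **kwargs):
    """Return the ParserWarning messages emitted by one pd.read_csv call.

    Hypothetical helper for illustration; not part of pandas.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        pd.read_csv(io.StringIO(text), **kwargs)
    return [str(w.message) for w in caught if issubclass(w.category, ParserWarning)]


# One input per option listed in the docstring (sample data is made up).
cases = {
    "regex sep": ("a;b;c\n1;1,8\n1;2,1", {"sep": "[;,]"}),
    "skipfooter": ("a,b\n1,2\n3,4\ntotal,6", {"skipfooter": 1}),
    "sep=None": ("a,b\n1,2\n3,4", {"sep": None}),
}

for name, (text, kwargs) in cases.items():
    with_default = parser_warning_messages(text, **kwargs)
    with_python = parser_warning_messages(text, engine="python", **kwargs)
    # Print how many ParserWarnings each engine choice produced.
    print(name, len(with_default), len(with_python))
```

Filtering on the warning category keeps unrelated warnings from other pandas versions out of the comparison.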