bug: read_csv incorrect output with skipfooter and skip_blank_lines=True #10164

Open
@jsspencer

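read_csv returns an empty DataFrame instead of the expected table when all of the following hold: the CSV file contains only one data row, skipfooter is used (passed as skip_footer in the examples below), the line directly after the table is blank, and skip_blank_lines=True (the default).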

>>> import pandas as pd
>>> import StringIO
>>> test_csv = StringIO.StringIO('a,b,c\n1,2,3\n\nend\n')
>>> pd.read_csv(test_csv, skip_footer=2, engine='python')
Empty DataFrame
Columns: [a, b, c]
Index: []

But if the data table contains more than one row, or the line after the data table is not blank, or skip_blank_lines is set to False, everything is fine:

>>> test_csv = StringIO.StringIO('a,b,c\n1,2,3\n4,5,6\n\nend\n')
>>> pd.read_csv(test_csv, skip_footer=2, engine='python')
   a  b  c
0  1  2  3
1  4  5  6
>>> test_csv = StringIO.StringIO('a,b,c\n1,2,3\nend\n\n')
>>> pd.read_csv(test_csv, skip_footer=2, engine='python')
   a  b  c
0  1  2  3
>>> test_csv = StringIO.StringIO('a,b,c\n1,2,3\n\nend\n')
>>> pd.read_csv(test_csv, skip_footer=2, engine='python', skip_blank_lines=False)
   a  b  c
0  1  2  3

This occurs in every version from 0.15.0 onwards (i.e. since skip_blank_lines was introduced).
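
For reference, the behaviour expected in the examples above can be sketched with the standard library alone. This is only an illustration of the expected semantics, not pandas' implementation: read_rows is a hypothetical helper, and it assumes the footer is counted against the raw lines of the file before blank lines are dropped.

```python
import csv


def read_rows(text, skipfooter=0, skip_blank_lines=True):
    """Hypothetical reference reader (not pandas code).

    Assumption: footer lines are counted on the raw file, so a blank
    line sitting just before the footer text is part of the footer.
    """
    lines = text.splitlines()
    # Drop the footer first, counting every raw line, blank or not.
    if skipfooter:
        lines = lines[:-skipfooter]
    # Only then discard blank lines from what remains.
    if skip_blank_lines:
        lines = [line for line in lines if line.strip()]
    rows = list(csv.reader(lines))
    if not rows:
        return [], []
    # First surviving row is the header, the rest is data.
    return rows[0], rows[1:]


header, data = read_rows('a,b,c\n1,2,3\n\nend\n', skipfooter=2)
print(header, data)
```

Under that assumption the single data row survives in the failing case, which matches the output the other three examples already produce.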
