read_csv newline fix #10023
You need a test, as I'm not really sure what you are fixing.
Ok, test forthcoming.
There is a small self-contained test in the comments on issue #10022. Would it be desirable to make it into a unit test? It takes about a second to run on my machine.
yep
Unit test added. Any further comments?
@@ -359,6 +359,11 @@ def test_empty_field_eof(self):
                        names=list('abcd'), engine='c')
        assert_frame_equal(df, c)

    def test_chunk_begins_with_newline_whitespace(self):
        data = '\n hello\nworld\n'
add the issue number as a comment here
even though the fix is only in the c-parser, IIRC, this should work in python parser as well? hence pls move the tests to |
@jblackburne looks good.
Squash all into a single commit?
@jblackburne Yes, that's what @jreback is asking for.
…that start with newline. Changed a condition in tokenize_delim_customterm to account for data chunks that start with terminator. Added a unit test that fails in master and passes in this branch. Moved new unit test in order to test all parser engines. Added GH issue number. Added release note.
8f37413 to e693c3a
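The commit message above describes handling data chunks that start with the line terminator. As a rough illustration of that idea only, here is a toy line tokenizer in Python. It is not the pandas C tokenizer and does not reproduce the condition changed in `tokenize_delim_customterm()`; the function name `tokenize_chunks` and its parameters are hypothetical. The sketch assumes that whether a line is empty should be decided from state carried across chunk boundaries, not from where the terminator sits inside the current chunk.

```python
from typing import Iterable, List


def tokenize_chunks(
    chunks: Iterable[str],
    terminator: str = "\n",
    skip_empty_lines: bool = True,
) -> List[str]:
    """Split a stream delivered in chunks into lines (hypothetical helper)."""
    lines: List[str] = []
    current: List[str] = []  # characters of the line in progress; survives chunk boundaries
    for chunk in chunks:
        for ch in chunk:
            if ch == terminator:
                # The terminator may be the very first character of a chunk.
                # Emptiness is judged from the carried-over buffer, so the
                # position inside the chunk does not matter.
                if current or not skip_empty_lines:
                    lines.append("".join(current))
                current = []
            else:
                current.append(ch)
    if current:
        # Flush a final line that has no trailing terminator.
        lines.append("".join(current))
    return lines


single = tokenize_chunks(["\n hello\nworld\n"])
split = tokenize_chunks(["a,b", "\n hello\n", "\nworld\n"])
kept = tokenize_chunks(["\n hello\nworld\n"], skip_empty_lines=False)
```

In this sketch the same rows come out whether the input arrives as one chunk or several, including when a chunk begins with the terminator.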
pls take a look cc @evanpw
The logic looks right to me.
@jblackburne pls ping when this is green.
Ok, green.
Slight change to the logic in the tokenize_delimited() and tokenize_delim_customterm() functions of the C parser. Fixes #10022.
I believe the new logic is correct, but perhaps someone with more familiarity can double-check.
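For anyone who wants to double-check, the input used in the new unit test can be fed to both parser engines with a short script. This is a minimal sketch, not the test as committed: the diff excerpt shows only the `data` line, so the `header=None` argument and the helper name `read_with` are assumptions made for illustration.

```python
import io

import pandas as pd

# Input taken from the unit test in this PR: the data begins with a newline,
# and the next line begins with whitespace.
data = "\n hello\nworld\n"


def read_with(engine: str) -> pd.DataFrame:
    """Parse the sample with the given engine (hypothetical helper)."""
    # header=None is an assumption; blank lines are skipped by default.
    return pd.read_csv(io.StringIO(data), header=None, engine=engine)


results = {engine: read_with(engine) for engine in ("c", "python")}
for engine, frame in results.items():
    print(engine, len(frame), frame[0].tolist())
```

With the fix in place, both engines should agree on the number of rows, since the leading blank line is skipped and the two remaining lines are kept.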