ENH/BUG: ignore line comments in CSV files GH2685 #4505
@@ -1282,9 +1280,8 @@ class MyDialect(csv.Dialect):
sniff_sep = True
if sep is not None:
if (sep is not None) and (dia.quotechar is not None):
No need for parens here: `is not` binds tighter than `and`.
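A quick way to check the precedence point is to compare the parsed expression trees with and without the parentheses. This is only an illustrative sketch using the standard-library `ast` module; the expression strings mirror the condition in the diff, and the variable names `with_parens`, `without_parens`, and `same_tree` are made up for the example.

```python
import ast

# The condition from the diff, written with and without the extra parentheses.
with_parens = "(sep is not None) and (dia.quotechar is not None)"
without_parens = "sep is not None and dia.quotechar is not None"

# Parentheses are not stored in the AST, so if both strings parse to the
# same tree, the parentheses did not change how the expression groups.
same_tree = ast.dump(ast.parse(with_parens, mode="eval")) == ast.dump(
    ast.parse(without_parens, mode="eval")
)
print(same_tree)
```

If the two dumps compare equal, the parentheses are purely cosmetic here.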
Can you add a test and release notes? thx!
Sorry, I am new to pandas dev. I am guessing a unit test for commented lines in a CSV file is what you have in mind?
Yep!
Where should the test be created? There does not seem to be a particular file for parsers. Maybe test_frame since parsers return frames?
check out
* also fix bug in CSV format sniffer
Sorry, missed the tests folder in Having trouble setting up the test to expect different output for C and Python parsers. The tests seem to lock the parser engine and ignore the engine parameter in
it normally goes thru 3 different versions of the parser if you put your test in if you step thru it it call
@holocronweaver how's this coming along?
Almost done, though temporarily delayed due to work. I will try to get this finished up tomorrow if possible. Worst case would be next weekend.
gr8
@holocronweaver how's this coming along?
@jreback It is basically done, but I need time to test and debug. I am currently finishing a GSoC project which ends next week, so I will have a bit of free time again and will try to push this as soon as I get a chance.
@holocronweaver perfect...pls ping back when to take a look
@holocronweaver how's this coming?
@holocronweaver ping!
@holocronweaver going to be able to rebase this in the next couple of days?
@jreback Sorry, have been very busy at work. Will be at least another week, though I will try to get it done sooner. Apologies again for the long delay.
@holocronweaver ok...let us know
@holocronweaver care to rebase this?
@jreback Sure, when I get back from holiday travels.
@holocronweaver progress on this?
@jreback No, but it is on my TODO list. Crunch time is preventing anything extracurricular.
@holocronweaver update?
@holocronweaver update on this?
closing in favor of #7470
see here: https://github.com/pydata/pandas/pull/7470/files
Try skip_blank_lines=False (which is the original behavior).
Thanks!! On Sat, Dec 27, 2014 at 6:27 PM, jreback notifications@github.com wrote:
closes #2685
I have added the ability for both the C and Python CSV parsers to ignore commented lines (i.e., lines beginning with a comment character). Currently the C parser preserves commented lines as empty lines (all NaN), while the Python parser ignores them altogether.
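To illustrate the intended behavior (not the actual parser changes in this PR), here is a minimal sketch using only the standard-library `csv` module. The helper name `read_rows_skipping_comments` and the sample data are hypothetical; the sketch simply drops any line that begins with the comment character before parsing.

```python
import csv
import io


def read_rows_skipping_comments(text, comment="#"):
    """Parse CSV text, dropping lines that begin with the comment character.

    Hypothetical helper for illustration only; it is not pandas code.
    """
    # Filter raw lines first, then hand the remaining lines to csv.reader.
    lines = (line for line in io.StringIO(text) if not line.startswith(comment))
    return list(csv.reader(lines))


sample = "a,b\n# a full-line comment\n1,2\n3,4\n"
rows = read_rows_skipping_comments(sample)
print(rows)
```

Note that filtering raw lines before parsing is a simplification: it would also drop a continuation line of a quoted multi-line field that happens to start with the comment character.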
In addition, I fixed a small related problem with the CSV format sniffer in the Python parser.
I plan to finish up this work by ignoring empty lines as per #4466.