Clean and fast readtable implementation #287
@StefanKarpinski and @ViralBShah, this is failing on Travis because @tshort and @HarlanH, we need to replace |
It would be nice if julia started tagging subversions during development, so that the next version of DataFrames could depend on the version of Julia where For such reasons, it might actually be good to wait before changing I'm going to file this as a julia issue... |
Just to let people know, I did forget one important thing: this implementation doesn't parse column names automatically. |
Ugh. There are other bugs in this implementation since I blew out the entire old |
@staticfloat Are the nightlies behind? |
Looks like they were last built on the 18th, so not significantly, no. The bundled libuv was recently updated as well. |
Ok, I believe |
What's the |
The pipe operator is now |
- Change some keyword argument names
- Allow user to select whether to ignore whitespace padding
- Allow user to decide whether to convert strings to factors
- Read in column names correctly
- Allow user to skip lines at the start of a file
- Restore printtable and writetable
- Fix test that broke with Julia changes
This is now finished. It has all of the old functionality plus more (like keyword arguments, including one for automatic string to factor conversion if the user requests it). It's also super fast. @ViralBShah, it reads your really corrupt data set in about 7 seconds into a standard DataFrame with type inference. Barring objections, I'll merge it as soon as the nightlies catch up with the |
Merging since master is currently broken thanks to the removal of |
Clean and fast `readtable` implementation
This is an implementation of `readtable` that is much faster than our previous version and uses keyword arguments to provide a lot of additional functionality. Some of the functionality is still not in place, but I think this is close to the correct outline of all the features we should add to our `readtable` function.
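
For readers who want a feel for the keyword-argument style described above, here is a minimal sketch of a delimited-file reader with per-column type inference. It is not the code from this pull request: the function names `read_delimited_sketch` and `infer_column` and the keywords `separator`, `header`, `skipstart`, and `strip_padding` are hypothetical, chosen only for illustration, and the sketch is written for current Julia rather than the version that was current when this PR was opened. It returns plain vectors instead of a DataFrame and does not handle quoting or missing values.

```julia
# Hypothetical sketch; not the DataFrames implementation from this PR.

# Parse one column of strings into the narrowest of Int, Float64, or String.
function infer_column(cells::AbstractVector{<:AbstractString})
    ints = [tryparse(Int, c) for c in cells]
    all(!isnothing, ints) && return Vector{Int}(ints)
    floats = [tryparse(Float64, c) for c in cells]
    all(!isnothing, floats) && return Vector{Float64}(floats)
    return cells
end

# Read delimited text from `io`. All keyword names here are assumptions.
# Assumes every row has the same number of fields; no quoting support.
function read_delimited_sketch(io::IO; separator::Char = ',',
                               header::Bool = true,
                               skipstart::Int = 0,
                               strip_padding::Bool = true)
    lines = collect(eachline(io))
    lines = lines[(skipstart + 1):end]          # skip leading lines
    rows = [String.(split(line, separator)) for line in lines if !isempty(line)]
    if strip_padding
        # drop whitespace padding around each field
        rows = [String.(strip.(row)) for row in rows]
    end
    if isempty(rows)
        return (String[], Any[])
    end
    # take names from the first row, or generate x1, x2, ...
    colnames = header ? popfirst!(rows) : ["x$i" for i in 1:length(rows[1])]
    columns = [infer_column([row[j] for row in rows]) for j in 1:length(colnames)]
    return (colnames, columns)
end

# Usage: skip one leading line, then read a padded header and two data rows.
io = IOBuffer("# comment line\n id , score , label \n1, 2.5, a\n2, 3.0, b\n")
colnames, columns = read_delimited_sketch(io; skipstart = 1)
```

Keyword arguments keep each option named at the call site, so new options can be added later without breaking existing positional calls.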