Tags to dummies helper function #3695
Comments
N.B. this method is a bit slow; something more optimized would be nice.
Perhaps get_dummies could at the same time be made a Series method (as well as being top level)? Or does Series.str.get_dummies (also?) make sense for this delimiter-based splitting?
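For reference, current pandas releases do provide a string accessor method of this shape, `Series.str.get_dummies(sep=...)`. A minimal sketch of the call, using a small made-up `genres` Series (the sample values are illustrative, not taken from the thread):

```python
import pandas as pd

# Illustrative data only: pipe-delimited tags, one string per row.
genres = pd.Series(
    ["Animation|Children's|Comedy", "Adventure|Children's|Fantasy", "Comedy|Romance"]
)

# Split each string on the separator and build one 0/1 column per distinct tag.
dummies = genres.str.get_dummies(sep="|")
```

Each distinct tag becomes one indicator column, and the result keeps the index of the original Series.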
I don't think this should be a string method. You can always do s.extract().get_dummies(), which will operate on iterables inside the Series elements (in other words, lists).
Assigning you for 0.14!
Sounds good :)
While I'm looking at get_dummies, it may also be a good idea to add a bins argument, to exactly cover the use case with a cut (we already do the same with value_counts, which is related). Edit: Not sure what I was talking about re bins (it doesn't make sense for categorical data); additional munging can be done afterwards, I guess.
I think it makes sense to iterate the other way (over the tags); in this specific example it makes it a lot faster. If there were a lot of tags, perhaps it would be slower. With the #6132:
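The snippet originally posted here is not preserved. As a separate, rough sketch of what "iterating over the tags" could look like: collect the distinct tags once, then build one indicator column per tag. The name `tags_to_dummies_by_tag` and the sample data are hypothetical, and the sketch assumes no missing values in the input.

```python
import pandas as pd

def tags_to_dummies_by_tag(tags, sep="|"):
    """Hypothetical helper: build one 0/1 column per tag by looping over tags."""
    # Split each delimited string into a list of tags (assumes no NaN values).
    split = tags.str.split(sep)
    # Collect the distinct tags once, in sorted order.
    all_tags = sorted({tag for row in split for tag in row})
    # One membership pass over the rows per tag, instead of one Series per row.
    columns = {
        tag: split.apply(lambda row, tag=tag: int(tag in row)) for tag in all_tags
    }
    return pd.DataFrame(columns, index=tags.index)

# Illustrative data only.
genres = pd.Series(
    ["Animation|Children's|Comedy", "Adventure|Children's|Fantasy", "Comedy|Romance"]
)
by_tag = tags_to_dummies_by_tag(genres)
```

The outer loop runs once per distinct tag, so the cost grows with the number of tags. That matches the caveat above: fine for a handful of genres, potentially slower with a large tag vocabulary.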
From @jreback's clever data alignment trick: http://stackoverflow.com/questions/16637171/pandas-reshaping-data
Not sure where this should go, but it does come up quite frequently, e.g. with the MovieLens data set https://raw.github.com/pydata/pydata-book/master/ch02/movielens/movies.dat:
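The example originally attached here is not preserved. To illustrate the kind of helper being asked for, here is a minimal sketch under stated assumptions: the function name `tags_to_dummies`, the `sep` parameter, and the sample rows are all hypothetical, and this is not necessarily the code from the linked answer. It builds one small indicator Series per row and lets the DataFrame constructor align them on tag name.

```python
import pandas as pd

def tags_to_dummies(tags, sep="|"):
    """Hypothetical helper: one row per element, one 0/1 column per tag."""
    # Split each delimited string into a list of tags (assumes no NaN values).
    split = tags.str.split(sep)
    # One indicator Series per row, indexed by that row's distinct tags.
    rows = [pd.Series(1, index=sorted(set(row))) for row in split]
    # The constructor aligns the rows on tag name; absent tags become NaN.
    out = pd.DataFrame(rows)
    out.index = tags.index
    return out.fillna(0).astype(int)

# Illustrative data only, shaped like a pipe-delimited genres column.
genres = pd.Series(
    ["Animation|Children's|Comedy", "Adventure|Children's|Fantasy", "Comedy|Romance"]
)
dummies = tags_to_dummies(genres)
```

Creating a Series per row is simple but does a lot of Python-level work, which is one plausible reason a row-wise approach like this feels slow on larger inputs.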