Feature: %string to numerical value conversion #6

Open
vukosim opened this issue Mar 6, 2016 · 4 comments

Comments

@vukosim

vukosim commented Mar 6, 2016

Some datasets have percentage values stored as strings, e.g. '95%', '82%', etc.

It would be great if this could be handled automatically. On a pandas DataFrame this can be done with:

df = df.replace('%','',regex=True).astype('float')
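For illustration, here is that one-liner applied to a small made-up frame. The column names and values are hypothetical, and the sketch assumes every column in the frame holds percentage strings, since `astype('float')` is applied to the whole DataFrame.

```python
import pandas as pd

# Hypothetical example data: every column holds percentage strings.
df = pd.DataFrame({
    "accuracy": ["95%", "82%"],
    "recall": ["70%", "64.5%"],
})

# Strip every '%' (regex substring replacement), then cast all columns to float.
converted = df.replace("%", "", regex=True).astype("float")
print(converted)
```

Note that the values keep their percentage scale (95.0, not 0.95); dividing by 100 would be a separate step.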

@rhiever
Owner

rhiever commented Mar 6, 2016

I like this idea. Before we implement it I want to consider how this might affect other input values that aren't percentages.

For example, what if the user passes a DataFrame with class labels such as ">50%" and "<=50%"? We obviously don't want to parse those into numerical percentages, nor do we want to strip the percent signs from them.

Perhaps one way to accomplish this is:

  1. Check if the column is of type 'object'. If not, then it won't contain a '%' anyway.

  2. Check if any entry in the column contains a '%'. If not, skip the column.

  3. Make a copy of the column and apply the transformation you suggested. If it doesn't crash, then it very likely was a string encoding of a percentage. If it does crash, then it probably was some other string(s) that contained %s.

  4. In the non-crashing case, apply the change to the column.
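The four steps above could be sketched roughly as follows. This is only a minimal illustration, not an implementation from this project: the function name `convert_percent_columns` is hypothetical, and it assumes that a failed float conversion surfaces as a `ValueError`.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def convert_percent_columns(df):
    """Hypothetical helper: return a copy of df with percentage-string columns cast to float."""
    result = df.copy()
    for name in result.columns:
        column = result[name]
        # Step 1: only string-like (object) columns can contain a '%'.
        if not (is_object_dtype(column) or is_string_dtype(column)):
            continue
        # Step 2: skip the column if no entry contains a '%'.
        as_text = column.astype(str)
        if not as_text.str.contains("%", regex=False).any():
            continue
        # Step 3: try the transformation on a copy; a failure suggests the
        # column holds other strings that merely contain '%'.
        try:
            candidate = as_text.str.replace("%", "", regex=False).astype(float)
        except ValueError:
            continue
        # Step 4: the conversion succeeded, so apply it to the column.
        result[name] = candidate
    return result


# Hypothetical example data.
example = pd.DataFrame({
    "score": ["95%", "82%"],
    "label": [">50%", "<=50%"],
    "count": [1, 2],
})
print(convert_percent_columns(example))
```

With this sketch, "score" becomes a float column while "label" and "count" are left unchanged. One case it would not catch: a column that mixes entries such as "95%" and "0.5" converts without error, even though the two values may be on different scales.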

Are there any cases that such a procedure would miss or encode incorrectly?

@MagnetonBora

I will try to implement this feature request.

@rhiever
Owner

rhiever commented Oct 10, 2016

Looking forward to it! 👍

@vukosim
Author

vukosim commented Oct 10, 2016

Had forgotten about this, would be great 👍
