Make map conversions more efficient. #188
@groutr just a quick glance through the test logs:
Seems like a similar failure causes a few of the tests to fail. Thoughts?
The issue could be related to assumptions made in the original method. The original (slow) method was meant to work with both Python An additional nuance is that
Actually, come to think of it, it may be easier (and faster) to offload this task to pandas:
    return pd.Series(mapping, dtype=converter)
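A minimal sketch of that suggestion, for illustration only. The helper name `mapping_to_series`, the default `converter`, and the contingency-table keys are assumptions made for this example and are not taken from the project's code.

```python
import numpy as np
import pandas as pd


def mapping_to_series(mapping, converter=np.float64):
    """Hypothetical helper: let pandas convert all values in one call."""
    # pandas uses the dict keys as the index and casts the values to `converter`.
    return pd.Series(mapping, dtype=converter)


# Illustrative contingency-table counts (made-up numbers).
counts = {
    "true_positive": 10,
    "false_positive": 2,
    "false_negative": 3,
    "true_negative": 85,
}

series = mapping_to_series(counts)
print(series.dtype)
print(series["true_positive"])
```

The trade-off is that the result is a `pd.Series` rather than a mapping of the same type as the input, which is the point raised in the next comment.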
That was my initial thought; however, it seemed like there was effort made to make this general to any mapping type. If a pandas Series is assumed, this is going to be the best way. I was adhering to the type information in the function signature. pd.Series is not a Mapping type (though it sort of mimics the Python dictionary behavior):
    isinstance(pd.Series(), Mapping) # False
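The distinction can be checked directly. The sketch below uses made-up data and shows that a Series supports several dict-like operations while still failing the `collections.abc.Mapping` check, and that `Series.to_dict()` yields a real mapping when one is required.

```python
from collections.abc import Mapping

import pandas as pd

# Made-up example data.
series = pd.Series({"a": 1.0, "b": 2.0})

# A Series supports key lookup, membership tests on its index, and keys() ...
value = series["a"]
has_key = "a" in series
keys = list(series.keys())

# ... but it is not registered as a Mapping, so the isinstance check fails.
is_mapping = isinstance(series, Mapping)

# Converting back gives a plain dict, which is a Mapping.
as_dict = series.to_dict()

print(is_mapping, isinstance(as_dict, Mapping))
```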
Thus far, this method is only used internally to validate contingency tables. @groutr were you calling this method directly? If not, this might be a good argument for replacing this method with something more strict, but hopefully faster (that only returns
@jarq6c I am not calling the method directly. Replacing it with something more strict seems like a good idea to me.
Superseded by #191
Changes
Make convert_mapping_values work more efficiently (for a dict of length 1000, runtime drops from 471 µs to 294 µs on my machine).
Notes
Since the numpy dtypes are C functions, pushing the loop into C (by using map) will be far more efficient.
There was a type mismatch between the type of mapping and the return of the function. Updated the type of mapping to be MutableMapping.
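The idea in the notes can be sketched as follows. This is not the code from this PR: the function names are hypothetical, and the built-in `float` stands in for the converter so the example needs no third-party packages. A NumPy scalar type such as `np.float64` could presumably be passed the same way, since it is also a callable implemented in C.

```python
from collections.abc import MutableMapping
from typing import Any, Callable


def convert_values_loop(mapping: MutableMapping, converter: Callable[[Any], Any]) -> dict:
    """Baseline: call the converter once per item from a Python-level loop."""
    result = {}
    for key, value in mapping.items():
        result[key] = converter(value)
    return result


def convert_values_map(mapping: MutableMapping, converter: Callable[[Any], Any]) -> dict:
    """Same result, but map() and zip() drive the iteration from C."""
    return dict(zip(mapping.keys(), map(converter, mapping.values())))


# Made-up input: 1000 integer values keyed by string.
data = {f"key_{i}": i for i in range(1000)}

slow = convert_values_loop(data, float)
fast = convert_values_map(data, float)
print(slow == fast)
```

Timing both functions with `timeit` on the target machine is the only reliable way to see how much the `map` version gains for a given converter.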