copy_to_readonly_numpy_array needlessly copies pandas series objects #1081

Closed
RZachLamberty opened this issue Jul 28, 2018 · 3 comments

RZachLamberty commented Jul 28, 2018

The function _plotly_utils.basevalidators.copy_to_readonly_numpy_array is very slow when given a pd.Series, even when the series' values attribute is already an np.ndarray. Using the values attribute directly in that case would dramatically speed up trace generation, especially for large dataframes.

Environment: plotly 3.1.0, macOS High Sierra 10.13.6, plotly installed via conda.

working example:

import numpy as np
import pandas as pd
import plotly
import _plotly_utils.basevalidators

print('plotly version: {}'.format(plotly.__version__))

df = pd.DataFrame({'x': np.random.randint(0, 100, 1000000)})

# using `ipython` time magic
print('\ncoercing series')
%time v1 = _plotly_utils.basevalidators.copy_to_readonly_numpy_array(df.x)

print('\naccessing np values directly')
%time v2 = _plotly_utils.basevalidators.copy_to_readonly_numpy_array(df.x.values)

example output:

plotly version: 3.1.0

coercing series
CPU times: user 987 ms, sys: 35.5 ms, total: 1.02 s
Wall time: 854 ms

accessing np values directly
CPU times: user 1.45 ms, sys: 17 µs, total: 1.46 ms
Wall time: 1.49 ms

So a performance difference of roughly 570x in wall time (854 ms vs. 1.49 ms) and about 700x in total CPU time (1.02 s vs. 1.46 ms).

RZachLamberty (Author) commented

I may work on a pull request for this today, but I believe this can be addressed directly by adding an extra elif branch that checks whether v is a pd.Series and whether v.values is an np.ndarray (it could instead be a pd.Categorical, for example).
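For illustration, the proposed branch might look roughly like the sketch below. This is not the plotly implementation: the function name copy_to_readonly_sketch is hypothetical, and the generic fallback in the else branch is an assumption standing in for whatever the real function does with non-array input.

```python
import numpy as np
import pandas as pd


def copy_to_readonly_sketch(v):
    """Hypothetical stand-in for the proposed fix, not plotly's actual code."""
    if isinstance(v, np.ndarray):
        # Already an array: a single vectorized copy is enough.
        new_v = v.copy()
    elif isinstance(v, pd.Series) and isinstance(v.values, np.ndarray):
        # Proposed extra branch: copy the backing ndarray directly
        # instead of walking the Series element by element.
        new_v = v.values.copy()
    else:
        # Assumed generic fallback (e.g. lists, or a Series backed by
        # pd.Categorical): build the array from the individual elements.
        new_v = np.array(list(v))

    # Mark the copy read-only so callers cannot mutate it.
    new_v.flags.writeable = False
    return new_v
```

With this shape, a series whose values attribute is a pd.Categorical still goes through the slower element-wise path, which matches the caveat in the comment above.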

jonmmease (Contributor) commented

Awesome, thanks for digging into this. A PR would be very welcome!

jonmmease (Contributor) commented

Done in #1149
