It would be convenient to have a canonical set of dataframes for use in testing and/or benchmarking. Ideally this would be a set of named dataframes that represent common forms of data, such as the following:
Random floating point data
Random integer data
Strings with low entropy
Strings with high entropy
Mostly sorted datetimes
...
These could then be used either within Pandas or in other libraries for benchmarks. Having a consistent set of dataframes would probably aid consistent benchmarking.
Additionally, if these were then separately arranged into pytest fixtures, we could imagine setting things up and tearing things down in a way that makes benchmarking more consistent (such as controlling garbage collection), though this may be a separate endeavor. It would be nice to have access to the dataframes outside of the context of pytest as well.
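One possible shape for the setup/teardown idea is sketched below. The names `gc_paused` and `no_gc` are hypothetical; only the `gc` module calls and `pytest.fixture` are real APIs. Putting the logic in a plain context manager and wrapping it in a thin fixture is one way to keep it usable outside pytest.

```python
import contextlib
import gc

import pytest


@contextlib.contextmanager
def gc_paused():
    """Collect once, then keep the cyclic garbage collector off for the block."""
    was_enabled = gc.isenabled()
    gc.collect()   # start from a clean state
    gc.disable()   # avoid collection pauses during the timed code
    try:
        yield
    finally:
        # Restore the collector only if it was on before we touched it.
        if was_enabled:
            gc.enable()


@pytest.fixture
def no_gc():
    """Hypothetical fixture: the test body runs with the collector paused."""
    with gc_paused():
        yield
```

A test would request `no_gc` as an argument, while a standalone benchmark script could write `with gc_paused(): ...` directly.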
cc @jreback @wesm @cpcloud