PerformanceWarning: what is actually the problem I can change? #3622
Comments
You are storing Stores (meaning not a Table), which means that PyTables is pickling some type of data. There are several options: split out the data to separate nodes (that node will still have the warning, but the rest will be faster), or save it as a Table (which should support it a little better). Can you show me a sample of the data and df.dtypes?
Also, update to master; I just added #3623, which should make the warnings slightly more informative.
Here is some code which produces these warnings:
The objects are strings of variable length (some are paragraph length). Performance is not a problem (seconds at most, often less than a second, even for my biggest data file, which has ~300k rows), so I don't mind the time it takes. I only mind the warnings, which make my IPython notebook longer and make the important parts harder to read.
The open/close twice should not be necessary. Can you post df._data.blocks?
Not sure if you can, but it would help if you posted your data file (a link on, say, Dropbox).
are some of your |
Try getting rid of the unicode
you may need something like
FYI, very soon (with the release of PyTables 3.0) I think we will be able to support unicode.
Then I will simply wait until that happens. Right now the performance is no problem, just the annoying warnings :-)
The warning is just to alert the user that you are basically pickling those fields rather than storing them in a C type.
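If the pickling cost is acceptable and only the notebook clutter matters, one option is to silence this warning category around the write. The following is a minimal sketch, not something proposed in this thread: the helper name `without_performance_warnings` is hypothetical, and it assumes a pandas version that exposes the warning class as `pandas.errors.PerformanceWarning`.

```python
import warnings

import pandas as pd


def without_performance_warnings(func, *args, **kwargs):
    """Call func(*args, **kwargs) with pandas PerformanceWarning silenced.

    Hypothetical helper for illustration. The filter is only active inside
    the `with` block, so other warnings and other code are unaffected.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=pd.errors.PerformanceWarning)
        return func(*args, **kwargs)
```

A call such as `without_performance_warnings(df.to_hdf, "data.h5", key="df")` (file name and key are placeholders) would still pickle the object columns; it only hides the message.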
Closing for now. @JanSchulz, reopen or open a new issue if you have questions or concerns.
Hi @jreback, I'm on PyTables 3 (tables==3.2.0) and am still facing the same issue as @JanSchulz: warnings when I try to save my 'df' as 'h5'. My data frame does contain unicode. Is there anything I can do to avoid them?
Make sure you are storing with format='table'. py3 handles the Unicode. Please show code and versions if this doesn't work.
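For reference, storing with format='table' could look like the sketch below. It assumes a reasonably recent pandas with PyTables installed; the helper name `save_as_table` is made up for illustration, and whether the warning goes away still depends on what the object columns actually contain.

```python
import pandas as pd


def save_as_table(df, path, key):
    """Write df to an HDF5 file using the Table layout.

    Hypothetical helper for illustration. mode="w" overwrites any existing
    file at `path`; format="table" requests the PyTables Table layout
    instead of the default fixed layout.
    """
    df.to_hdf(path, key=key, mode="w", format="table")
```

Reading the data back would then be `pd.read_hdf(path, key)`.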
I found a weird case: when I ran the same command a second time, the warning disappeared:
P.S. I ran it in interactive mode; versions: python==3.6.7, pandas==0.23.4
I get several PerformanceWarnings when I store my DataFrame in an HDFStore:
What I can't tell from this is which column causes these problems; at least I don't have any "block0" columns :-) It would be nice if these warnings gave me an indicator of what I can actually do about them.
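One way to narrow down which columns are involved is to list the object-dtype columns and the Python types they hold, since those are the columns that cannot be mapped to a C type. This is a rough sketch under that assumption; the helper name `object_column_types` and the example frame are invented for illustration and are not part of pandas.

```python
import pandas as pd


def object_column_types(df):
    """Map each object-dtype column name to the sorted Python type names it holds."""
    report = {}
    for name, dtype in df.dtypes.items():
        if dtype == object:
            report[name] = sorted({type(value).__name__ for value in df[name]})
    return report


# Made-up example data: only "text" is stored as generic Python objects.
example = pd.DataFrame({
    "id": [1, 2, 3],
    "text": pd.Series(["short", "a much longer paragraph", None], dtype=object),
    "score": [0.5, 1.5, 2.5],
})
print(object_column_types(example))
```

Columns that show up with mixed types (for example both `str` and `NoneType`) are the first candidates to clean up or split out before storing.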