MultiIndex pandas dataframe from uproot.iterate #263
Comments
This is a bug: you're using iterate the way it's supposed to be used. In fact,
I fixed this in PR #264, where I found and fixed more issues than the one you found. I was wrong when I thought it might be a minor mismatch: so many (good) updates have gone into DataFrame handling, tested in See that PR for updates. This will be a new version of uproot when it's done.
The fix is in master, but Travis is having issues and it won't get pushed to PyPI until that gets resolved. If you need this fix,
Hello, thank you for the incredibly speedy response! I will try to set up a new uproot install using pip's git install feature (I'm currently using conda install which only pulls from binaries, I believe). Or else I'll just wait for the Travis issues to go away.
Sure. :) Based on a Google talk, I'm trying to encourage a "live at head" lifestyle, but that only works if head advances in small (and therefore frequent) changes. I just checked into Travis again, and they're apparently having serious issues. Only a few jobs have started, and those that need to install dependencies from conda time out at 10 minutes. I guess it won't happen today. The normal order is that Travis does the continuous integration and, if that's successful, I tag a release; Travis then runs again, but this time it deploys to PyPI at the end of its test. The new version in PyPI notifies the conda package maintainer, and he presses the button to deploy to conda. We're stuck at step one.
That makes a lot of sense! Actually, if this Google talk is available publicly, it would be awesome to watch it, if you can share the link here! :)
I thought it was at the last ROOT Workshop, but I can't find anything that looks like it. Even if I did manage to find slides, it actually wasn't what the speaker intended to talk about: he thought he was referencing a discipline we were familiar with, but it ended up being the most interesting thing in his talk. Apparently that phrase, "live at head," is the common way of describing it.
Very interesting! Someone needs to make this phrase into a t-shirt...
Hi! First of all thank you very much for this awesome package!
I have a question regarding MultiIndex pandas dataframes and uproot.iterate. When I open a ROOT file via uproot.open, I am able to select branches which contain JaggedArrays with the same dimensionality, and make them into a pandas dataframe with MultiIndex. For example:
Depending on the event, I can have (say) 0, 1, or 2 muons, so I can have accordingly 0, 1, or 2 subentries, and the resulting pandas dataframe reflects that.
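As a rough illustration of the entry/subentry layout described above, here is a minimal pure-Python sketch. It is not uproot's implementation and does not use uproot or pandas. The branch names `Muon_pt` and `Muon_eta`, the sample values, and the helper `flatten_jagged` are all hypothetical, chosen only to show how events with 2, 0, and 1 muons map onto (entry, subentry) keys.

```python
# Hypothetical jagged branches: one inner list per event (entry).
# Event 0 has 2 muons, event 1 has none, event 2 has 1 muon.
muon_pt = [[31.2, 12.5], [], [45.0]]
muon_eta = [[0.4, -1.1], [], [2.0]]


def flatten_jagged(columns):
    """Map equally shaped jagged columns to rows keyed by (entry, subentry).

    `columns` is a dict of name -> list of per-event lists. Raises
    ValueError if the columns disagree on the number of items in an event.
    """
    names = list(columns)
    first = columns[names[0]]
    rows = {}
    for entry in range(len(first)):
        # All selected columns must have the same multiplicity per event.
        counts = {len(columns[name][entry]) for name in names}
        if len(counts) != 1:
            raise ValueError(f"incompatible lengths in entry {entry}")
        for subentry in range(len(first[entry])):
            rows[(entry, subentry)] = {
                name: columns[name][entry][subentry] for name in names
            }
    return rows


rows = flatten_jagged({"Muon_pt": muon_pt, "Muon_eta": muon_eta})
for key, values in rows.items():
    print(key, values)
```

Events with no muons contribute no rows, so entry 1 does not appear among the keys. With pandas installed, keys of this form could be handed to `pandas.MultiIndex.from_tuples` with `names=["entry", "subentry"]` to build a comparable index by hand.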
But I would like to process several files with the same structure using uproot.iterate. I haven't found a way to make the pandas dataframe with MultiIndex by selecting the right branches from the iterate, e.g.:
Without "flatten=True" in the iterate command above, the dataframes come out containing JaggedArrays, and I'm not sure how to turn those into a MultiIndex structure. If I do include "flatten=True", however, I get an error about incompatible dimensionalities:
(I think this is because of the variable number of muons per entry). Is there a way to get the same behavior from uproot.iterate on many files, as I would from tree.pandas.df() on a single file?
Thank you!
Andre