Duplicate scalar columns (or custom index) in Pandas DF with flatten=True #179
Comments
My thinking on this was that anyone could use Pandas's |
That would cause issues if you have multiple jagged-array columns with different lengths. I was suggesting duplicating only the scalar columns. |
That is doable. I'll use |
In uproot 3.2.9, scalar columns get duplicated down, but jagged columns of different lengths do not. |
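To illustrate what "duplicated down" means, here is a minimal pandas-only sketch. It does not call uproot; the column names, values, and the `entry`/`subentry` index layout are assumptions made up for the example. Each scalar value is repeated once per subentry of its entry.

```python
import pandas as pd

# Hypothetical scalar column: one value per entry.
scalars = pd.DataFrame(
    {"runNumber": [1, 1, 2]},
    index=pd.Index([0, 1, 2], name="entry"),
)

# Hypothetical jagged column: entry 0 has 2 subentries, entry 1 has none,
# entry 2 has 3.
jagged_index = pd.MultiIndex.from_tuples(
    [(0, 0), (0, 1), (2, 0), (2, 1), (2, 2)],
    names=["entry", "subentry"],
)
jagged = pd.DataFrame({"pt": [10.0, 20.0, 30.0, 40.0, 50.0]}, index=jagged_index)

# Look up the scalar value for the entry of every flattened row.
entry_of_row = jagged.index.get_level_values("entry")
flat = jagged.copy()
flat["runNumber"] = scalars["runNumber"].reindex(entry_of_row).to_numpy()

print(flat)
```

Because every row finds a matching entry, no NaN is introduced here and the integer dtype survives. Note that entry 1, which has no subentries, does not appear in the flattened frame at all.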
Turns out there's a problem. The integer columns have turned into floats. |
That's something that Pandas does when it consolidates Numpy arrays internally. I don't know how to control it: I add columns to the DataFrame and it sometimes converts them. Do you know the mechanism behind that? It seems like something they really want to be hidden/transparent. |
It's probably because you used |
That makes sense. However, I didn't put NaN in myself: that's what Pandas does when you merge a dataset into one with a larger index, namely the one with nonzero subentries. That's intrinsic to the process. I suppose I could afterward determine whether any fillna'ed scalar columns used to be integers and change them back... |
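A small pandas-only sketch of the mechanism and the "change them back" idea, with made-up data (the column names and values are assumptions, and this is not how uproot does it internally): aligning an integer column onto a larger index leaves NaN in the new rows, which forces float64; filling within each entry and casting to the remembered dtype restores the integers.

```python
import pandas as pd

full_index = pd.MultiIndex.from_tuples(
    [(0, 0), (0, 1), (1, 0)], names=["entry", "subentry"]
)
df = pd.DataFrame({"pt": [1.5, 2.5, 3.5]}, index=full_index)

# Hypothetical integer scalar column that only exists at subentry 0.
scalar = pd.Series(
    [7, 8],
    index=pd.MultiIndex.from_tuples([(0, 0), (1, 0)], names=["entry", "subentry"]),
    name="eventNumber",
)
original_dtype = scalar.dtype  # remember that it was an integer column

# Column assignment aligns on the index; the unmatched row (0, 1) becomes NaN,
# and NaN cannot be stored in an integer array, so the column turns into float64.
df["eventNumber"] = scalar
dtype_after_alignment = df["eventNumber"].dtype

# Fill forward within each entry, then cast back to the original dtype.
df["eventNumber"] = (
    df.groupby(level="entry")["eventNumber"].ffill().astype(original_dtype)
)

print(dtype_after_alignment, "->", df["eventNumber"].dtype)
```

The cast back is only safe once no NaN remains, so the fill has to happen first.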
In my case, my tree contains `runNumber` and `eventNumber` columns that I would like to use as an index, but these columns are `NaN` for `subentry != 0`.
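As a possible workaround on the pandas side, one could fill the missing values from each entry's first row and then build the index from them. The sketch below uses invented data that mimics the situation described (values present only at subentry 0), and assumes the frame is indexed by `entry` and `subentry`.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2)], names=["entry", "subentry"]
)
# Hypothetical flattened frame: the scalar columns are NaN for subentry != 0.
df = pd.DataFrame(
    {
        "runNumber": [100, np.nan, 100, np.nan, np.nan],
        "eventNumber": [1, np.nan, 2, np.nan, np.nan],
        "pt": [10.0, 20.0, 30.0, 40.0, 50.0],
    },
    index=index,
)

id_columns = ["runNumber", "eventNumber"]

# Broadcast each entry's first non-null value to all of its subentries,
# then cast back to integers now that no NaN remains.
filled = df.groupby(level="entry")[id_columns].transform("first").astype("int64")
for name in id_columns:
    df[name] = filled[name]

# Replace the "entry" level with (runNumber, eventNumber), keeping "subentry".
indexed = (
    df.reset_index("entry", drop=True)
    .set_index(id_columns, append=True)
    .reorder_levels(["runNumber", "eventNumber", "subentry"])
)

print(indexed)
```

This assumes that `runNumber` and `eventNumber` together identify an entry uniquely; if they do not, dropping the `entry` level loses information and it should be kept in the index.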