Doc: Adds example of exploding lists into columns instead of storing in dataframe cells #19215
pdpark commented on Jan 12, 2018
- closes DOC: section on caveats of storing lists inside DataFrame/Series #17027
Codecov Report
@@ Coverage Diff @@
## master #19215 +/- ##
==========================================
+ Coverage 91.53% 91.55% +0.02%
==========================================
Files 147 147
Lines 48797 48797
==========================================
+ Hits 44664 44676 +12
+ Misses 4133 4121 -12
Alternative to storing lists in DataFrame Cells
------------------------------------------------------
the underline needs to be the same length as the title
nearest_neighbors = [['Zach LaVine', 'Jeremy Lin', 'Nate Robinson', 'Isaia']]*3
nearest_neighbors
make things into separate ipython:: python blocks, rather than using comments (you can simply write text and not use the #)
nearest_neighbors
#. Create an index with the "parent" columns to be included in the final Dataframe
df2 = pd.concat([df[['name','opponent']], pd.DataFrame(nearest_neighbors)], axis=1)
you don't need to keep naming the dataframes, just use df = ..... or whatever
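For reference, the column-wise approach in the diff line above can be sketched end to end as follows. This excerpt does not show how `df` was built, so the frame below and its `name` and `opponent` values are hypothetical stand-ins. Only the `pd.concat` call mirrors the diff.

```python
import pandas as pd

# Hypothetical parent frame; the PR excerpt does not show how df was built.
df = pd.DataFrame({
    "name": ["player_a", "player_b", "player_c"],
    "opponent": ["team_x", "team_y", "team_z"],
})

# One inner list per row, with the values copied verbatim from the diff.
nearest_neighbors = [["Zach LaVine", "Jeremy Lin", "Nate Robinson", "Isaia"]] * 3

# Each inner list becomes one row; list positions become integer columns 0..3.
neighbors = pd.DataFrame(nearest_neighbors)

# Place the exploded columns next to the parent columns, aligned on the index.
exploded = pd.concat([df[["name", "opponent"]], neighbors], axis=1)
print(exploded)
```

The result has one scalar per cell, so each neighbor position can be selected or filtered as an ordinary column.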
------------------------------------------------------
Storing nested lists/arrays inside a pandas object should be avoided for performance and memory use reasons. Instead they should be "exploded" into a flat ``DataFrame`` structure.

Example of exploding nested lists into a DataFrame:
since you have 2 examples you can use another level of sub-section
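One possible shape for a second example is the row-wise ("long") form, where each list element gets its own row. The sketch below is not taken from the PR. It assumes pandas 0.25 or later, where `DataFrame.explode` is available (a release that postdates this thread), and the frame contents are hypothetical.

```python
import pandas as pd

# Hypothetical frame with a list stored in each cell of "neighbors".
df = pd.DataFrame({
    "name": ["player_a", "player_b"],
    "neighbors": [["n1", "n2"], ["n3"]],
})

# explode() emits one row per list element and repeats the original index
# label, so reset_index(drop=True) is used to get a fresh 0..n-1 index.
long_df = df.explode("neighbors").reset_index(drop=True)
print(long_df)
```

The long form suits lists of varying length, since the column-wise form would pad shorter lists with missing values.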
can you update
can you rebase and update
Will do - have been absent due to starting new job, but plan to spend some time on this.
Closing as discontinued. Superseded by #23041