@pdpark pdpark commented Jan 12, 2018

@codecov
codecov bot commented Jan 13, 2018

Codecov Report

Merging #19215 into master will increase coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #19215      +/-   ##
==========================================
+ Coverage   91.53%   91.55%   +0.02%     
==========================================
  Files         147      147              
  Lines       48797    48797              
==========================================
+ Hits        44664    44676      +12     
+ Misses       4133     4121      -12

Flag        Coverage Δ
#multiple   89.92% <ø> (+0.02%) ⬆️
#single     41.6% <ø> (ø) ⬆️

Impacted Files                  Coverage Δ
pandas/plotting/_converter.py   66.95% <0%> (+1.73%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8347ff8...11ff8a7.

@jreback jreback added the Docs label Jan 13, 2018


Alternative to storing lists in DataFrame Cells
------------------------------------------------------
Contributor

needs to be the same length as the title

nearest_neighbors = [['Zach LaVine', 'Jeremy Lin', 'Nate Robinson', 'Isaia']]*3
nearest_neighbors
Contributor

make things into separate ipython:: python blocks, rather than using comments (you can simply write text and not use the #)

nearest_neighbors
#. Create an index with the "parent" columns to be included in the final Dataframe
df2 = pd.concat([df[['name','opponent']], pd.DataFrame(nearest_neighbors)], axis=1)
Contributor

you don't need to keep naming the dataframes, just use

df = ..... or whatever

------------------------------------------------------
Storing nested lists/arrays inside a pandas object should be avoided for performance and memory use reasons. Instead they should be "exploded" into a flat ``DataFrame`` structure.

Example of exploding nested lists into a DataFrame:
Contributor

since you have 2 examples you can use another level of sub-section
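
For anyone following the thread, here is a minimal sketch of the "exploding" that the proposed section describes. It is not the PR's actual example: the data, the `neighbors` column, and the `neighbor_` prefix are made up for illustration. Only the `name` and `opponent` columns and the `pd.concat(..., axis=1)` step are taken from the diff above.

```python
import pandas as pd

# Hypothetical data: each cell of "neighbors" holds a list (the pattern to avoid).
nested = pd.DataFrame({
    "name": ["A", "B"],
    "opponent": ["X", "Y"],
    "neighbors": [["n1", "n2", "n3"], ["n4", "n5", "n6"]],
})

# Step 1: spread each list over its own columns, one column per list position.
positions = pd.DataFrame(nested["neighbors"].tolist()).add_prefix("neighbor_")
wide = pd.concat([nested[["name", "opponent"]], positions], axis=1)

# Step 2: melt the positional columns into rows, so each row holds one neighbor.
flat = (
    wide.melt(id_vars=["name", "opponent"], var_name="rank", value_name="neighbor")
        .sort_values(["name", "rank"])
        .reset_index(drop=True)
)

print(flat)
```

The resulting frame has one scalar per cell, so ordinary filtering and grouping work on the `neighbor` column directly. pandas 0.25 and later also ship `DataFrame.explode`, which post-dates this thread and covers the same case in one call.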

@jreback
Contributor

jreback commented Feb 24, 2018

can you update

@jreback
Contributor

jreback commented Aug 2, 2018

can you rebase and update

@pdpark
Author

pdpark commented Aug 31, 2018

Will do - have been absent due to starting new job, but plan to spend some time on this.

@datapythonista
Member

Closing as discontinued. Superseded by #23041

Successfully merging this pull request may close these issues.

DOC: section on caveats of storing lists inside DataFrame/Series

3 participants