HTML (and text) reprs for large dataframes. #5550

Merged
5 commits merged into from
Nov 26, 2013
21 changes: 6 additions & 15 deletions doc/source/dsintro.rst
@@ -573,8 +573,9 @@ indexing semantics are quite different in places from a matrix.
Console display
~~~~~~~~~~~~~~~

-For very large DataFrame objects, only a summary will be printed to the console
-(here I am reading a CSV version of the **baseball** dataset from the **plyr**
+Very large DataFrames will be truncated to display them in the console.
+You can also get a summary using :meth:`~pandas.DataFrame.info`.
+(Here I am reading a CSV version of the **baseball** dataset from the **plyr**
R package):

.. ipython:: python
@@ -587,6 +588,7 @@ R package):

baseball = read_csv('data/baseball.csv')
print(baseball)
+baseball.info()

.. ipython:: python
:suppress:
@@ -622,19 +624,8 @@ option:

reset_option('line_width')

-You can also disable this feature via the ``expand_frame_repr`` option:
-
-.. ipython:: python
-
-set_option('expand_frame_repr', False)
-
-DataFrame(randn(3, 12))
-
-.. ipython:: python
-:suppress:
-
-reset_option('expand_frame_repr')
-
+You can also disable this feature via the ``expand_frame_repr`` option.
+This will print the table in one block.

DataFrame column attribute access and IPython completion
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
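
The ``expand_frame_repr`` behaviour described in the hunk above can be checked with a minimal sketch like the one below. It assumes a current pandas install where the option is spelled ``display.expand_frame_repr`` and ``option_context`` is available. The frame contents, the seed, and the width of 80 are arbitrary choices made for illustration and are not part of this change.

```python
import numpy as np
import pandas as pd

# Arbitrary 3 x 12 frame of floats; wide enough to exceed 80 characters per row.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((3, 12)))

# With expand_frame_repr enabled, the text repr is wrapped into several blocks.
with pd.option_context("display.width", 80, "display.expand_frame_repr", True):
    wrapped = repr(df)

# With it disabled, the table is printed in one block.
with pd.option_context("display.width", 80, "display.expand_frame_repr", False):
    one_block = repr(df)

print(wrapped)
print(one_block)
```

Using ``option_context`` rather than ``set_option`` / ``reset_option`` restores the previous settings automatically when the block exits.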
28 changes: 16 additions & 12 deletions doc/source/faq.rst
@@ -36,21 +36,25 @@ horizontal scrolling, auto-detection of width/height.
To appropriately address all these environments, the display behavior is controlled
by several options, which you're encouraged to tweak to suit your setup.

-As of 0.12, these are the relevant options, all under the `display` namespace,
-(e.g. display.width, etc'):
+As of 0.13, these are the relevant options, all under the `display` namespace,
+(e.g. ``display.width``, etc.):

- notebook_repr_html: if True, IPython frontends with HTML support will display
dataframes as HTML tables when possible.
-- expand_repr (default True): when the frame width cannot fit within the screen,
-the output will be broken into multiple pages to accomedate. This applies to
-textual (as opposed to HTML) display only.
-- max_columns: max dataframe columns to display. a wider frame will trigger
-a summary view, unless `expand_repr` is True and HTML output is disabled.
-- max_rows: max dataframe rows display. a longer frame will trigger a summary view.
-- width: width of display screen in characters, used to determine the width of lines
-when expand_repr is active, Setting this to None will trigger auto-detection of terminal
-width, this only works for proper terminals, not IPython frontends such as ipnb.
-width is ignored in IPython notebook, since the browser provides horizontal scrolling.
+- large_repr (default 'truncate'): when a :class:`~pandas.DataFrame`
+exceeds max_columns or max_rows, it can be displayed either as a
+truncated table or, with this set to 'info', as a short summary view.
+- max_columns (default 20): max dataframe columns to display.
+- max_rows (default 60): max dataframe rows display.
+
+Two additional options only apply to displaying DataFrames in terminals,
+not to the HTML view:
+
+- expand_repr (default True): when the frame width cannot fit within
+the screen, the output will be broken into multiple pages.
+- width: width of display screen in characters, used to determine the
+width of lines when expand_repr is active. Setting this to None will
+trigger auto-detection of terminal width.

IPython users can use the IPython startup file to import pandas and set these
options automatically when starting up.
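
As an illustration of the startup-file suggestion, a sketch of such a file might look like the following. The file name ``00-pandas.py`` is hypothetical. The option values simply repeat the defaults listed in the FAQ text, except for ``display.width``, where 100 is an arbitrary choice. The FAQ refers to ``expand_repr``; the full key used here, ``display.expand_frame_repr``, is the spelling used in the ``dsintro.rst`` hunk and is assumed to be the registered name.

```python
# Hypothetical IPython startup file, e.g.
# ~/.ipython/profile_default/startup/00-pandas.py
import pandas as pd

# Size thresholds beyond which the repr is truncated (or summarised).
pd.set_option("display.max_rows", 60)
pd.set_option("display.max_columns", 20)

# 'truncate' shows a cut-down table; 'info' switches to the df.info() view.
pd.set_option("display.large_repr", "truncate")

# Terminal-only options: wrap wide frames to a fixed character width.
pd.set_option("display.expand_frame_repr", True)
pd.set_option("display.width", 100)
```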
16 changes: 16 additions & 0 deletions doc/source/v0.13.0.txt
@@ -375,6 +375,22 @@ HDFStore API Changes
via the option ``io.hdf.dropna_table`` (:issue:`4625`)
- pass thru store creation arguments; can be used to support in-memory stores

+DataFrame repr Changes
+~~~~~~~~~~~~~~~~~~~~~~
+
+The HTML and plain text representations of :class:`DataFrame` now show
+a truncated view of the table once it exceeds a certain size, rather
+than switching to the short info view (:issue:`4886`, :issue:`5550`).
+This makes the representation more consistent as small DataFrames get
+larger.
+
+.. image:: _static/df_repr_truncated.png
+:alt: Truncated HTML representation of a DataFrame
+
+To get the info view, call :meth:`DataFrame.info`. If you prefer the
+info view as the repr for large DataFrames, you can set this by running
+``set_option('display.large_repr', 'info')``.
+
Enhancements
~~~~~~~~~~~~
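
The two repr modes described in the "DataFrame repr Changes" entry above can be compared with a short sketch. It assumes a current pandas install; the 100 x 4 frame and the ``max_rows`` value of 10 are arbitrary and chosen only so that the frame exceeds the threshold.

```python
import io

import numpy as np
import pandas as pd

# Arbitrary frame that is longer than the max_rows limit used below.
df = pd.DataFrame(np.arange(400).reshape(100, 4), columns=list("abcd"))

# Default from 0.13: a truncated table with a dimensions footer.
with pd.option_context("display.max_rows", 10, "display.large_repr", "truncate"):
    truncated = repr(df)

# Earlier behaviour: the repr switches to the info view.
with pd.option_context("display.max_rows", 10, "display.large_repr", "info"):
    info_repr = repr(df)

# The info view is also available explicitly, regardless of the option.
buf = io.StringIO()
df.info(buf=buf)
explicit_info = buf.getvalue()

print(truncated)
print(info_repr)
```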

25 changes: 18 additions & 7 deletions pandas/core/config_init.py
@@ -166,13 +166,19 @@

pc_max_info_rows_doc = """
: int or None
-max_info_rows is the maximum number of rows for which a frame will
-perform a null check on its columns when repr'ing To a console.
-The default is 1,000,000 rows. So, if a DataFrame has more
-1,000,000 rows there will be no null check performed on the
-columns and thus the representation will take much less time to
-display in an interactive session. A value of None means always
-perform a null check when repr'ing.
+Deprecated.
"""

+pc_max_info_rows_deprecation_warning = """\
+max_info_rows has been deprecated, as reprs no longer use the info view.
+"""
+
+pc_large_repr_doc = """
+: 'truncate'/'info'
+
+For DataFrames exceeding max_rows/max_cols, the repr (and HTML repr) can
+show a truncated table (the default from 0.13), or switch to the view from
+df.info() (the behaviour in earlier versions of pandas).
+"""
+
pc_mpl_style_doc = """
@@ -220,6 +226,8 @@ def mpl_style_cb(key):
cf.register_option('max_colwidth', 50, max_colwidth_doc, validator=is_int)
cf.register_option('max_columns', 20, pc_max_cols_doc,
validator=is_instance_factory([type(None), int]))
+cf.register_option('large_repr', 'truncate', pc_large_repr_doc,
+validator=is_one_of_factory(['truncate', 'info']))
cf.register_option('max_info_columns', 100, pc_max_info_cols_doc,
validator=is_int)
cf.register_option('colheader_justify', 'right', colheader_justify_doc,
@@ -258,6 +266,9 @@ def mpl_style_cb(key):
msg=pc_height_deprecation_warning,
rkey='display.max_rows')

+cf.deprecate_option('display.max_info_rows',
+msg=pc_max_info_rows_deprecation_warning)
+
tc_sim_interactive_doc = """
: boolean
Whether to simulate interactive mode for purposes of testing
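
To make the intent of the ``config_init.py`` changes easier to follow, here is a standard-library-only sketch of the two mechanisms they rely on: a "one of" validator and a deprecated option that warns on access. This is not the code in ``pandas.core.config``. Every name below (``one_of``, ``register_option``, ``deprecate_option``, the module-level dictionaries, and the 1,000,000 default) is a hypothetical stand-in chosen for illustration.

```python
import warnings

_options = {}     # key -> [current value, validator or None]
_deprecated = {}  # key -> deprecation message


def one_of(legal_values):
    """Return a validator that accepts only the listed values."""
    def validate(value):
        if value not in legal_values:
            allowed = "|".join(str(v) for v in legal_values)
            raise ValueError("Value must be one of %s" % allowed)
    return validate


def register_option(key, default, validator=None):
    # Validate the default too, so a bad registration fails early.
    if validator is not None:
        validator(default)
    _options[key] = [default, validator]


def deprecate_option(key, msg):
    _deprecated[key] = msg


def _warn_if_deprecated(key):
    if key in _deprecated:
        warnings.warn(_deprecated[key], DeprecationWarning, stacklevel=3)


def set_option(key, value):
    _warn_if_deprecated(key)
    validator = _options[key][1]
    if validator is not None:
        validator(value)
    _options[key][0] = value


def get_option(key):
    _warn_if_deprecated(key)
    return _options[key][0]


# Mirrors the shape of the registrations in the diff above.
register_option("display.large_repr", "truncate",
                validator=one_of(["truncate", "info"]))
register_option("display.max_info_rows", 1000000)
deprecate_option(
    "display.max_info_rows",
    "max_info_rows has been deprecated, as reprs no longer use the info view.",
)
```

With this sketch, ``set_option("display.large_repr", "summary")`` raises ``ValueError``, and reading ``display.max_info_rows`` emits a ``DeprecationWarning`` while still returning the stored value.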