ENH: Styler.format_index() to display index values similarly to data-values with format() #43101

Merged
merged 38 commits into from
Sep 16, 2021
e03e014
build format_index mechanics
attack68 Aug 16, 2021
bb9b5b2
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Aug 17, 2021
fb86279
test index formatter display_value, and clearing
attack68 Aug 17, 2021
846e5a2
prelim doc string
attack68 Aug 17, 2021
7e9400a
format_index docs
attack68 Aug 18, 2021
9c63ab8
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Aug 19, 2021
87a6c88
refactor for perf
attack68 Aug 19, 2021
9c969ad
add test
attack68 Aug 20, 2021
26f3906
add tests: escape
attack68 Aug 21, 2021
ec40418
add tests: escape na_rep
attack68 Aug 21, 2021
0de5397
add tests: raises
attack68 Aug 21, 2021
6fe8285
test decimal and thousands
attack68 Aug 21, 2021
6b61e31
test precision
attack68 Aug 21, 2021
6eaa933
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Aug 30, 2021
666e460
whats new
attack68 Aug 30, 2021
49bb731
level tests
attack68 Aug 30, 2021
044cd05
user guide
attack68 Aug 30, 2021
8fc497d
typing fix
attack68 Aug 31, 2021
8fb9519
user guide refactor
attack68 Aug 31, 2021
d039105
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 1, 2021
df7548c
input to axis
attack68 Sep 1, 2021
0d91aff
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 2, 2021
e36f198
fix tests
attack68 Sep 2, 2021
b87ef09
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 4, 2021
9a7a8e4
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 5, 2021
ecd01dd
fix recent merged tests
attack68 Sep 5, 2021
1a32d17
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 7, 2021
afa6da3
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 8, 2021
7e6bf97
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 8, 2021
06ce6bf
add to style.rst
attack68 Sep 8, 2021
b8c225a
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 8, 2021
9562b3d
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 9, 2021
4aa58e8
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 9, 2021
f31763e
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 10, 2021
4c6580f
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 10, 2021
3487845
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 12, 2021
3569a2f
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 13, 2021
e1220ec
Merge remote-tracking branch 'upstream/master' into styler_format_index
attack68 Sep 15, 2021
1 change: 1 addition & 0 deletions doc/source/reference/style.rst
@@ -39,6 +39,7 @@ Style application
Styler.apply_index
Styler.applymap_index
Styler.format
Styler.format_index
Styler.hide_index
Styler.hide_columns
Styler.set_td_classes
55 changes: 48 additions & 7 deletions doc/source/user_guide/style.ipynb
@@ -150,15 +150,14 @@
"\n",
"### Formatting Values\n",
"\n",
"Before adding styles it is useful to show that the [Styler][styler] can distinguish the *display* value from the *actual* value. To control the display value, the text is printed in each cell, and we can use the [.format()][formatfunc] method to manipulate this according to a [format spec string][format] or a callable that takes a single value and returns a string. It is possible to define this for the whole table or for individual columns. \n",
"Before adding styles it is useful to show that the [Styler][styler] can distinguish the *display* value from the *actual* value, in both data values and index or column headers. To control the display value, the text is printed in each cell as a string, and we can use the [.format()][formatfunc] and [.format_index()][formatfuncindex] methods to manipulate this according to a [format spec string][format] or a callable that takes a single value and returns a string. It is possible to define this for the whole table, or index, or for individual columns, or MultiIndex levels. \n",
"\n",
"Additionally, the format function has a **precision** argument to specifically help formatting floats, as well as **decimal** and **thousands** separators to support other locales, an **na_rep** argument to display missing data, and an **escape** argument to help displaying safe-HTML or safe-LaTeX. The default formatter is configured to adopt pandas' regular `display.precision` option, controllable using `with pd.option_context('display.precision', 2):`\n",
"\n",
"Here is an example of using the multiple options to control the formatting generally and with specific column formatters.\n",
"Additionally, the format function has a **precision** argument to specifically help formatting floats, as well as **decimal** and **thousands** separators to support other locales, an **na_rep** argument to display missing data, and an **escape** argument to help displaying safe-HTML or safe-LaTeX. The default formatter is configured to adopt pandas' regular `display.precision` option, controllable using `with pd.option_context('display.precision', 2):` \n",
"\n",
"[styler]: ../reference/api/pandas.io.formats.style.Styler.rst\n",
"[format]: https://docs.python.org/3/library/string.html#format-specification-mini-language\n",
"[formatfunc]: ../reference/api/pandas.io.formats.style.Styler.format.rst"
"[formatfunc]: ../reference/api/pandas.io.formats.style.Styler.format.rst\n",
"[formatfuncindex]: ../reference/api/pandas.io.formats.style.Styler.format_index.rst"
]
},
{
@@ -173,6 +172,49 @@
" })"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using Styler to manipulate the display is a useful feature because maintaining the indexing and data values for other purposes gives greater control. You do not have to overwrite your DataFrame to display it how you like. Here is an example of using the formatting functions whilst still relying on the underlying data for indexing and calculations."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"weather_df = pd.DataFrame(np.random.rand(10,2)*5,\n",
"                          index=pd.date_range(start=\"2021-01-01\", periods=10),\n",
"                          columns=[\"Tokyo\", \"Beijing\"])\n",
"\n",
"def rain_condition(v):\n",
"    if v < 1.75:\n",
"        return \"Dry\"\n",
"    elif v < 2.75:\n",
"        return \"Rain\"\n",
"    return \"Heavy Rain\"\n",
"\n",
"def make_pretty(styler):\n",
"    styler.set_caption(\"Weather Conditions\")\n",
"    styler.format(rain_condition)\n",
"    styler.format_index(lambda v: v.strftime(\"%A\"))\n",
"    styler.background_gradient(axis=None, vmin=1, vmax=5, cmap=\"YlGnBu\")\n",
"    return styler\n",
"\n",
"weather_df"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"weather_df.loc[\"2021-01-04\":\"2021-01-08\"].style.pipe(make_pretty)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -187,7 +229,7 @@
"\n",
"Hiding does not change the integer arrangement of CSS classes, e.g. hiding the first two columns of a DataFrame means the column class indexing will start at `col2`, since `col0` and `col1` are simply ignored.\n",
"\n",
"We can update our `Styler` object to hide some data and format the values.\n",
"We can update our `Styler` object from before to hide some data and format the values.\n",
"\n",
"[hideidx]: ../reference/api/pandas.io.formats.style.Styler.hide_index.rst\n",
"[hidecols]: ../reference/api/pandas.io.formats.style.Styler.hide_columns.rst"
@@ -1974,7 +2016,6 @@
}
],
"metadata": {
"celltoolbar": "Edit Metadata",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.4.0.rst
@@ -72,7 +72,7 @@ Styler

:class:`.Styler` has been further developed in 1.4.0. The following enhancements have been made:

- Styling of indexing has been added, with :meth:`.Styler.apply_index` and :meth:`.Styler.applymap_index`. These mirror the signature of the methods already used to style data values, and work with both HTML and LaTeX format (:issue:`41893`).
- Styling and formatting of indexes has been added, with :meth:`.Styler.apply_index`, :meth:`.Styler.applymap_index` and :meth:`.Styler.format_index`. These mirror the signature of the methods already used to style and format data values, and work with both HTML and LaTeX format (:issue:`41893`, :issue:`43101`).
- :meth:`.Styler.bar` introduces additional arguments to control alignment and display (:issue:`26070`, :issue:`36419`), and it also validates the input arguments ``width`` and ``height`` (:issue:`42511`).
- :meth:`.Styler.to_latex` introduces keyword argument ``environment``, which also allows a specific "longtable" entry through a separate jinja2 template (:issue:`41866`).
- :meth:`.Styler.to_html` introduces keyword arguments ``sparse_index``, ``sparse_columns``, ``bold_headers``, ``caption``, ``max_rows`` and ``max_columns`` (:issue:`41946`, :issue:`43149`, :issue:`42972`).
2 changes: 2 additions & 0 deletions pandas/io/formats/style.py
@@ -1184,6 +1184,8 @@ def _copy(self, deepcopy: bool = False) -> Styler:
]
deep = [ # nested lists or dicts
"_display_funcs",
"_display_funcs_index",
"_display_funcs_columns",
"hidden_rows",
"hidden_columns",
"ctx",
177 changes: 177 additions & 0 deletions pandas/io/formats/style_render.py
@@ -117,6 +117,12 @@ def __init__(
        self._display_funcs: DefaultDict[  # maps (row, col) -> format func
            tuple[int, int], Callable[[Any], str]
        ] = defaultdict(lambda: partial(_default_formatter, precision=precision))
        self._display_funcs_index: DefaultDict[  # maps (row, level) -> format func
            tuple[int, int], Callable[[Any], str]
        ] = defaultdict(lambda: partial(_default_formatter, precision=precision))
        self._display_funcs_columns: DefaultDict[  # maps (level, col) -> format func
            tuple[int, int], Callable[[Any], str]
        ] = defaultdict(lambda: partial(_default_formatter, precision=precision))
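
The three `defaultdict` mappings above can be illustrated with a minimal standard-library sketch. It assumes a simplified `default_formatter` (a hypothetical stand-in, not pandas' own `_default_formatter`) and a fixed precision of 2. It shows how a lookup keyed by `(row, level)` falls back to the default until a specific formatter is assigned, and how a header cell might keep its raw `value` alongside a computed `display_value`.

```python
from collections import defaultdict
from functools import partial
from typing import Any


def default_formatter(x: Any, precision: int) -> Any:
    """Hypothetical stand-in for the default formatter: only floats are reformatted."""
    if isinstance(x, float):
        return f"{x:.{precision}f}"
    return x


# Maps (row, level) -> format function; keys never assigned fall back to the default.
display_funcs_index = defaultdict(lambda: partial(default_formatter, precision=2))

# Override the formatter for the label at row 1, level 0 only.
display_funcs_index[(1, 0)] = lambda v: f"<{v}>"

labels = [3.14159, "b", 2.0]

# Each header cell keeps the raw value next to its display value.
cells = [
    {"value": v, "display_value": display_funcs_index[(r, 0)](v)}
    for r, v in enumerate(labels)
]

print([c["display_value"] for c in cells])
```

Keeping `value` untouched is what lets the underlying labels remain usable for indexing while only the rendered text changes.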

def _render_html(
self,
@@ -377,6 +383,7 @@ def _translate_header(
f"{col_heading_class} level{r} col{c}",
value,
_is_visible(c, r, col_lengths),
display_value=self._display_funcs_columns[(r, c)](value),
attributes=(
f'colspan="{col_lengths.get((r, c), 0)}"'
if col_lengths.get((r, c), 0) > 1
@@ -535,6 +542,7 @@ def _translate_body(
f"{row_heading_class} level{c} row{r}",
value,
_is_visible(r, c, idx_lengths) and not self.hide_index_[c],
display_value=self._display_funcs_index[(r, c)](value),
attributes=(
f'rowspan="{idx_lengths.get((c, r), 0)}"'
if idx_lengths.get((c, r), 0) > 1
@@ -834,6 +842,175 @@ def format(

return self

    def format_index(
        self,
        formatter: ExtFormatter | None = None,
        axis: int | str = 0,
        level: Level | list[Level] | None = None,
        na_rep: str | None = None,
        precision: int | None = None,
        decimal: str = ".",
        thousands: str | None = None,
        escape: str | None = None,
    ) -> StylerRenderer:
        r"""
        Format the text display value of index labels or column headers.

        .. versionadded:: 1.4.0

        Parameters
        ----------
        formatter : str, callable, dict or None
            Object to define how values are displayed. See notes.
        axis : {0, "index", 1, "columns"}
            Whether to apply the formatter to the index or column headers.
        level : int, str, list
            The level(s) over which to apply the generic formatter.
        na_rep : str, optional
            Representation for missing values.
            If ``na_rep`` is None, no special formatting is applied.
        precision : int, optional
            Floating point precision to use for display purposes, if not determined by
            the specified ``formatter``.
        decimal : str, default "."
            Character used as decimal separator for floats, complex and integers.
        thousands : str, optional, default None
            Character used as thousands separator for floats, complex and integers.
        escape : str, optional
            Use 'html' to replace the characters ``&``, ``<``, ``>``, ``'``, and ``"``
            in cell display string with HTML-safe sequences.
            Use 'latex' to replace the characters ``&``, ``%``, ``$``, ``#``, ``_``,
            ``{``, ``}``, ``~``, ``^``, and ``\`` in the cell display string with
            LaTeX-safe sequences.
            Escaping is done before ``formatter``.

        Returns
        -------
        self : Styler

        Notes
        -----
        This method assigns a formatting function, ``formatter``, to each level label
        in the DataFrame's index or column headers. If ``formatter`` is ``None``,
        then the default formatter is used.
        If a callable then that function should take a label value as input and return
        a displayable representation, such as a string. If ``formatter`` is
        given as a string this is assumed to be a valid Python format specification
        and is wrapped to a callable as ``string.format(x)``. If a ``dict`` is given,
        keys should correspond to MultiIndex level numbers or names, and values should
        be string or callable, as above.

        The default formatter currently expresses floats and complex numbers with the
        pandas display precision unless using the ``precision`` argument here. The
        default formatter does not adjust the representation of missing values unless
        the ``na_rep`` argument is used.

        The ``level`` argument defines which levels of a MultiIndex to apply the
        method to. If the ``formatter`` argument is given in dict form but does
        not include all levels within the level argument then these unspecified levels
        will have the default formatter applied. Any levels in the formatter dict
        specifically excluded from the level argument will be ignored.

        When using a ``formatter`` string the dtypes must be compatible, otherwise a
        `ValueError` will be raised.

        Examples
        --------
        Using ``na_rep`` and ``precision`` with the default ``formatter``

        >>> df = pd.DataFrame([[1, 2, 3]], columns=[2.0, np.nan, 4.0])
        >>> df.style.format_index(axis=1, na_rep='MISS', precision=3)  # doctest: +SKIP
            2.000    MISS   4.000
        0       1       2       3

        Using a ``formatter`` specification on consistent dtypes in a level

        >>> df.style.format_index('{:.2f}', axis=1, na_rep='MISS')  # doctest: +SKIP
             2.00   MISS    4.00
        0       1      2       3

        Using the default ``formatter`` for unspecified levels

        >>> df = pd.DataFrame([[1, 2, 3]],
        ...     columns=pd.MultiIndex.from_arrays([["a", "a", "b"],[2, np.nan, 4]]))
        >>> df.style.format_index({0: lambda v: v.upper()}, axis=1, precision=1)
        ...  # doctest: +SKIP
                       A       B
              2.0    nan     4.0
        0       1      2       3

        Using a callable ``formatter`` function.

        >>> func = lambda s: 'STRING' if isinstance(s, str) else 'FLOAT'
        >>> df.style.format_index(func, axis=1, na_rep='MISS')
        ...  # doctest: +SKIP
                  STRING  STRING
            FLOAT   MISS   FLOAT
        0       1      2       3

        Using a ``formatter`` with HTML ``escape`` and ``na_rep``.

        >>> df = pd.DataFrame([[1, 2, 3]], columns=['"A"', 'A&B', None])
        >>> s = df.style.format_index('$ {0}', axis=1, escape="html", na_rep="NA")
        <th .. >$ &#34;A&#34;</th>
        <th .. >$ A&amp;B</th>
        <th .. >NA</th>
        ...

        Using a ``formatter`` with LaTeX ``escape``.

        >>> df = pd.DataFrame([[1, 2, 3]], columns=["123", "~", "$%#"])
        >>> df.style.format_index("\\textbf{{{}}}", escape="latex", axis=1).to_latex()
        ...  # doctest: +SKIP
        \begin{tabular}{lrrr}
        {} & {\textbf{123}} & {\textbf{\textasciitilde }} & {\textbf{\$\%\#}} \\
        0 & 1 & 2 & 3 \\
        \end{tabular}
        """
        axis = self.data._get_axis_number(axis)
        if axis == 0:
            display_funcs_, obj = self._display_funcs_index, self.index
        else:
            display_funcs_, obj = self._display_funcs_columns, self.columns
        levels_ = refactor_levels(level, obj)

        if all(
            (
                formatter is None,
                level is None,
                precision is None,
                decimal == ".",
                thousands is None,
                na_rep is None,
                escape is None,
            )
        ):
            display_funcs_.clear()
            return self  # clear the formatter / revert to default and avoid looping

        if not isinstance(formatter, dict):
            formatter = {level: formatter for level in levels_}
        else:
            formatter = {
                obj._get_level_number(level): formatter_
                for level, formatter_ in formatter.items()
            }

        for lvl in levels_:
            format_func = _maybe_wrap_formatter(
                formatter.get(lvl),
                na_rep=na_rep,
                precision=precision,
                decimal=decimal,
                thousands=thousands,
                escape=escape,
            )

            for idx in [(i, lvl) if axis == 0 else (lvl, i) for i in range(len(obj))]:
                display_funcs_[idx] = format_func

        return self
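
The per-level assignment performed by `format_index` can be approximated with the standard-library sketch below. Both helpers, `wrap_formatter` and `assign_level_formatters`, are hypothetical names introduced for illustration; they are simplified stand-ins, not pandas' `_maybe_wrap_formatter` or the method itself. The sketch assumes plain `str()` as the default formatter and leaves out `precision`, `decimal`, `thousands` and `escape`. It shows a format spec string being wrapped into a callable, `na_rep` being applied to missing labels, and levels absent from a `dict` formatter falling back to the default.

```python
from typing import Any, Callable, Optional, Union

Formatter = Union[str, Callable[[Any], str], None]


def wrap_formatter(formatter: Formatter, na_rep: Optional[str] = None) -> Callable[[Any], str]:
    """Simplified stand-in: turn a spec string, callable or None into a callable."""
    if formatter is None:
        func = str  # assumption: str() replaces the precision-aware default formatter
    elif isinstance(formatter, str):
        func = formatter.format  # "{:.2f}" is applied as "{:.2f}".format(x)
    elif callable(formatter):
        func = formatter
    else:
        raise TypeError("formatter must be a str, callable or None")
    if na_rep is None:
        return func
    # NaN is the only value not equal to itself; None is also treated as missing.
    return lambda x: na_rep if x is None or x != x else func(x)


def assign_level_formatters(labels, formatter, levels, na_rep=None):
    """Return {(position, level): func} for each label position in each chosen level."""
    if not isinstance(formatter, dict):
        formatter = {lvl: formatter for lvl in levels}
    display_funcs = {}
    for lvl in levels:
        # Levels missing from the dict yield None, i.e. the default formatter.
        func = wrap_formatter(formatter.get(lvl), na_rep=na_rep)
        for i in range(len(labels)):
            display_funcs[(i, lvl)] = func
    return display_funcs


# Two-level labels, one tuple per position: (level 0, level 1).
labels = [("a", 2.0), ("a", float("nan")), ("b", 4.0)]
funcs = assign_level_formatters(labels, {1: "{:.2f}"}, levels=[0, 1], na_rep="MISS")
rendered = [
    tuple(funcs[(i, lvl)](label[lvl]) for lvl in (0, 1))
    for i, label in enumerate(labels)
]
print(rendered)
```

Because the loop runs only over the selected levels, a `dict` entry for a level outside `levels` is never used, which matches the behaviour described in the docstring notes.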


def _element(
html_element: str,
4 changes: 2 additions & 2 deletions pandas/io/formats/templates/html_table.tpl
@@ -21,13 +21,13 @@
{% if exclude_styles %}
{% for c in r %}
{% if c.is_visible != False %}
<{{c.type}} {{c.attributes}}>{{c.value}}</{{c.type}}>
<{{c.type}} {{c.attributes}}>{{c.display_value}}</{{c.type}}>
{% endif %}
{% endfor %}
{% else %}
{% for c in r %}
{% if c.is_visible != False %}
<{{c.type}} {%- if c.id is defined %} id="T_{{uuid}}_{{c.id}}" {%- endif %} class="{{c.class}}" {{c.attributes}}>{{c.value}}</{{c.type}}>
<{{c.type}} {%- if c.id is defined %} id="T_{{uuid}}_{{c.id}}" {%- endif %} class="{{c.class}}" {{c.attributes}}>{{c.display_value}}</{{c.type}}>
{% endif %}
{% endfor %}
{% endif %}