Skip to content

BUG: Styler cannot accept semicolon in CSS val #46152

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
2 of 3 tasks
tehunter opened this issue Feb 25, 2022 · 3 comments · Fixed by #46239
Closed
2 of 3 tasks

BUG: Styler cannot accept semicolon in CSS val #46152

tehunter opened this issue Feb 25, 2022 · 3 comments · Fixed by #46239
Labels
Bug Styler conditional formatting using DataFrame.style
Milestone

Comments

@tehunter
Copy link
Contributor

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

df = pd.DataFrame({"A": [1, 0, -1]})

css = 'text-align: left; number-format: "0;[Red](0);-;@"; ;'

try:
    (
        df
        .style
        # Replace zero with dash, negative with red
        .applymap(lambda x: css)
        .to_excel("test.xlsx")
    )
except Exception as e:
    print(e)

Issue Description

Styler does not accept CSS value string types enclosed in double quotes. In particular, it crashes if the string contains semi-colons. This means full number-format strings cannot be passed to Styler for exporting to Excel. Excel number format is of the form: <positive>;<negative>;<zero>;<text>. The semi-colon dividers here will conflict with CSS semi-colon dividers unless they can be properly encapsulated in a string.

The problem is rooted in the naive string split in maybe_convert_css_to_tuples and CSSResolver. A possible fix using the TextFileReader parser is shown in the Expected Behavior section.

Expected Behavior

In the above example, the function should export the following sheet:

image

Here is a set of changes that can produce the desired behavior using pandas parsing libraries:

### In style_render.py
def maybe_convert_css_to_tuples(style):
    from io import StringIO
    from pandas.io.parsers.readers import TextFileReader

    if style and isinstance(style, str):
        # Use "C" parser because of line terminator
        s = TextFileReader(StringIO(style), engine="c", delimiter=":", lineterminator=";", skipinitialspace=True, index=False, header=None).read()

        try:
            return [
                (row[0].strip(), row[1].strip())
                for _, row in s.iterrows()
                if row[0] != ""
            ]
        except IndexError:
            raise ValueError(
                "Styles supplied as string must follow CSS rule formats, "
                f"for example 'attr: val;'. '{style}' was given."
            )
    return style

### In css.py
class CSSResolver():
    # ...
    def parse(self, declarations_str: str):
        from io import StringIO
        from pandas.io.parsers.c_parser_wrapper import CParserWrapper
        from pandas.io.parsers.readers import TextFileReader

        if (declarations_str):
            s = TextFileReader(StringIO(declarations_str), engine="c", delimiter=":", lineterminator=";", skipinitialspace=True, index=False, header=None).read()

            try:
                for _,row in s.iterrows():
                    prop, val = row[0], row[1]

                    prop = prop.strip().lower()
                    # TODO: don't lowercase case sensitive parts of values (strings)
                    val = val.strip().lower()

                    yield prop, val
            except Exception as e:
                print(e)

### In io/formats/excel.py
class CssExcelCell(ExcelCell):
    def __init__(
        self,
        row: int,
        col: int,
        val,
        style: dict | None,
        css_styles: dict[tuple[int, int], list[tuple[str, Any]]] | None,
        css_row: int,
        css_col: int,
        css_converter: Callable | None,
        **kwargs,
    ):
        if css_styles and css_converter:
            # Not exhaustively tested. Might be better to add quotes conditionally based on type of `v`
            css = ";".join(a + ":" + "\"" + str(v) + "\"" for (a, v) in css_styles[css_row, css_col])
            style = css_converter(css)

        return super().__init__(row=row, col=col, val=val, style=style, **kwargs)

Installed Versions

INSTALLED VERSIONS

commit : 06d2301
python : 3.10.1.final.0
python-bits : 64
OS : Windows
OS-release : 10
...
pandas : 1.4.1
numpy : 1.22.0
pytz : 2021.3
dateutil : 2.8.2
pip : 22.0.3
setuptools : 60.2.0
Cython : None
pytest : 7.0.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.0.2
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 8.0.0
pandas_datareader: None
bs4 : None
bottleneck : 1.3.2
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.1
numba : None
numexpr : 2.8.0
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : None
zstandard : None

@tehunter tehunter added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Feb 25, 2022
@tehunter
Copy link
Contributor Author

Related to #42276

@attack68
Copy link
Contributor

if you are considering a solution a similar issue arises for LaTeX captions, which use a colon. The solution was in _parse_latex_table_styles

    Notes
    -----
    The replacement of "§" with ":" is to avoid the CSS problem where ":" has structural
    significance and cannot be used in LaTeX labels, but is often required by them.
    """
    for style in table_styles[::-1]:  # in reverse for most recently applied style
        if style["selector"] == selector:
            return str(style["props"][0][1]).replace("§", ":")
    return None

@rhshadrach rhshadrach added the Styler conditional formatting using DataFrame.style label Feb 27, 2022
@attack68
Copy link
Contributor

attack68 commented Mar 5, 2022

Since CSS is a specific language a need to parse semi-colons and colons differently is not required, imo.

The number-format css attribute is a pandas construct to hack the to_excel output. Since it is a pandas construct we can direct users how to interact with it so that it works. See the PR and comment if you object.

@jreback jreback added this to the 1.5 milestone Mar 5, 2022
@jreback jreback removed the Needs Triage Issue that has not been reviewed by a pandas team member label Mar 5, 2022
@jreback jreback modified the milestones: 1.5, 1.4.2 Mar 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Styler conditional formatting using DataFrame.style
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants