Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'apply_style_by_indexes' fails if the dataframe has a column called "index" #108

Closed
GuidoBorrelli opened this issue Nov 16, 2021 · 9 comments

Comments

@GuidoBorrelli
Copy link

GuidoBorrelli commented Nov 16, 2021

Hi, I am using the following versions:

  • Python: 3.8.10
  • StyleFrame: 4.0.0
  • Pandas: 1.3.1
  • Openpyxl: 3.0.7

The following code:

def _decorate_excel(df: pd.DataFrame, xl: str):
  excel_writer = StyleFrame.ExcelWriter(xl)
  sf = StyleFrame(df.reset_index())

  for color, col_name in product(TRAFFIC_LIGHTS_COLORS, TRAFFIC_LIGHTS_COLS):
      # Set the background color
      style = Styler(bg_color=color.lower(), font_color=utils.colors.black)
  
      sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                                cols_to_style=col_name,
                                styler_obj=style,
                                overwrite_default_style=False)

where color can be YELLOW, GREEN, etc. and col_name are some of the columns of my DataFrame transformed into a StyleFrame.

Though I get the following error:

  File "/home/mlpoc/projects/churn-v2/vchurn/main_predict.py", line 40, in _decorate_excel
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
  File "/home/mlpoc/miniconda3/envs/churn/lib/python3.8/site-packages/styleframe/style_frame.py", line 638, in apply_style_by_indexes
    self.iloc[self.index.get_loc(index), self.columns.get_loc(col)].style = style_to_apply
  File "/home/mlpoc/miniconda3/envs/churn/lib/python3.8/site-packages/pandas/core/generic.py", line 5478, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'get_loc'
@DeepSpace2
Copy link
Owner

DeepSpace2 commented Nov 16, 2021

Hi. Can you please provide a better example?

If your dataframe has multi-index please be aware that styleframe does not currently support multi-indexes.

I've tried the below contrived example and it seems to be working:

from itertools import product
from styleframe import StyleFrame, Styler

TRAFFIC_LIGHT_COLORS = ('yellow', 'green', 'red')
TRAFFIC_LIGHT_COLS = ('yellow', 'green', 'red')

excel_writer = StyleFrame.ExcelWriter('output.xlsx')
sf = StyleFrame({'yellow': ['yellow', 2], 'green': [1, 'green'], 'red': [1, 'red']})

for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
    styler = Styler(bg_color=color.lower())
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                              cols_to_style=col_name,
                              styler_obj=styler,
                              overwrite_default_style=False)
    
sf.to_excel(excel_writer).save()

image

@GuidoBorrelli
Copy link
Author

Hi. Can you please provide a better example?

If your dataframe has multi-index please be aware that styleframe does not currently support multi-indexes.

I've tried the below contrived example and it seems to be working:

from itertools import product
from styleframe import StyleFrame, Styler

TRAFFIC_LIGHT_COLORS = ('yellow', 'green', 'red')
TRAFFIC_LIGHT_COLS = ('yellow', 'green', 'red')

excel_writer = StyleFrame.ExcelWriter('output.xlsx')
sf = StyleFrame({'yellow': ['yellow', 2], 'green': [1, 'green'], 'red': [1, 'red']})

for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
    styler = Styler(bg_color=color.lower())
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                              cols_to_style=col_name,
                              styler_obj=styler,
                              overwrite_default_style=False)
    
sf.to_excel(excel_writer).save()

image

Hi

In the first message I added the signature of the method.
I think the issue comes from the fact that I get the StyleFrame from a Pandas Dataframe and then the Styleframe library doesn't find something it would actually expect.

Could you run your snippet of code by using this startup?


TRAFFIC_LIGHT_COLORS = ('yellow', 'green', 'red')
TRAFFIC_LIGHT_COLS = ('status1', 'status2', 'status3')

excel_writer = StyleFrame.ExcelWriter('output.xlsx')
df = pd.DataFrame({"status1": ["yellow", "green", "yellow"], "status2": ["yellow", "red", "red"], "status3": ["green", "green", "green"]}
sf = StyleFrame(df)

@DeepSpace2
Copy link
Owner

DeepSpace2 commented Nov 16, 2021

I still can't reproduce the issue.

Full code:

import pandas as pd

from itertools import product
from styleframe import StyleFrame, Styler

TRAFFIC_LIGHT_COLORS = ('yellow', 'green', 'red')
TRAFFIC_LIGHT_COLS = ('status1', 'status2', 'status3')

df = pd.DataFrame({"status1": ["yellow", "green", "yellow"],
                   "status2": ["yellow", "red", "red"],
                   "status3": ["green", "green", "green"]})

sf = StyleFrame(df)

excel_writer = StyleFrame.ExcelWriter('output.xlsx')

for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
    styler = Styler(bg_color=color.lower())
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                              cols_to_style=col_name,
                              styler_obj=styler,
                              overwrite_default_style=False)

sf.to_excel(excel_writer).save()

image

from styleframe import version

print(version.get_all_versions())
Python 3.8.12 (default, Oct 13 2021, 09:15:35)
[GCC 10.2.1 20210110]
pandas 1.3.1
openpyxl 3.0.7
StyleFrame 4.0.0

@GuidoBorrelli
Copy link
Author

GuidoBorrelli commented Nov 17, 2021

I could reproduce it by exporting just 2 rows of my dataset and plotting some characteristics of it.

import pandas as pd
from itertools import product
from styleframe import StyleFrame, Styler

df = pd.read_pickle("df_tst.pkl")
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 6 columns):

Column   Non-Null Count  Dtype 
 0   index    2 non-null      int64 
 1   STATUS1  2 non-null      object
 2   STATUS2  2 non-null      object
 3   STATUS3  2 non-null      object
 4   STATUS4  2 non-null      object
 5   STATUS5  2 non-null      object

dtypes: int64(1), object(5)
memory usage: 224.0+ bytes

df

index | STATUS1 | STATUS2 | STATUS3 | STATUS4 | STATUS5
| -- | -- | -- | -- | -- | --
0 | RED | GREEN | RED | RED | RED
1 | RED | RED | RED | RED | RED
TRAFFIC_LIGHT_COLORS = ('YELLOW', 'GREEN', 'RED')
TRAFFIC_LIGHT_COLS = ('STATUS1', 'STATUS2', 'STATUS3', 'STATUS4', 'STATUS5')
sf = StyleFrame(df)
excel_writer = StyleFrame.ExcelWriter('output.xlsx')

for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
    styler = Styler(bg_color=color.lower())
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                              cols_to_style=col_name,
                              styler_obj=styler,
                              overwrite_default_style=False)
AttributeError                            Traceback (most recent call last)
<ipython-input-4-be916ccb326b> in <module>
     12 for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
     13     styler = Styler(bg_color=color.lower())
---> 14     sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
     15                               cols_to_style=col_name,
     16                               styler_obj=styler,

~/miniconda3/envs/py38/lib/python3.8/site-packages/styleframe/style_frame.py in apply_style_by_indexes(self, indexes_to_style, styler_obj, cols_to_style, height, complement_style, complement_height, overwrite_default_style)
    636             index.style = style_to_apply
    637             for col in cols_to_style:
--> 638                 self.iloc[self.index.get_loc(index), self.columns.get_loc(col)].style = style_to_apply
    639 
    640         if height:

~/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5476         ):
   5477             return self[name]
-> 5478         return object.__getattribute__(self, name)
   5479 
   5480     def __setattr__(self, name: str, value) -> None:

AttributeError: 'Series' object has no attribute 'get_loc'
from styleframe import version
print(version.get_all_versions())
Python 3.8.10 (default, Jun  4 2021, 15:09:15) 
[GCC 7.5.0]
pandas 1.3.1
openpyxl 3.0.7
StyleFrame 4.0.0

Ps. how do you format so well your code and outputs? Are you pasting from a notebook?

@DeepSpace2
Copy link
Owner

Can you please share the actual pkl file that reproduces this issue (or a smaller one that still reproduces the issue)?

I'm formatting code snippets with code blocks and the python language tag:

```python

code here

```

@GuidoBorrelli
Copy link
Author

Thanks for your answer again.
I created a CSV file out of the guilty pickle one and once reloaded as CSV it still reproduces error.
Safer than sharing pickle files :)

Namely:

import pandas as pd
from itertools import product
from styleframe import StyleFrame, Styler

df_pkl = pd.read_pickle("df_tst.pkl")
df_pkl.info()

df_pkl.to_csv("df_test.csv", index=False)
df_csv = pd.read_csv("df_test.csv")

df = df_csv

df_test.csv

Then by running

TRAFFIC_LIGHT_COLORS = ('YELLOW', 'GREEN', 'RED')
TRAFFIC_LIGHT_COLS = ('STATUS1', 'STATUS2', 'STATUS3', 'STATUS4', 'STATUS5')
sf = StyleFrame(df)
excel_writer = StyleFrame.ExcelWriter('output.xlsx')

for color, col_name in product(TRAFFIC_LIGHT_COLORS, TRAFFIC_LIGHT_COLS):
    styler = Styler(bg_color=color.lower())
    sf.apply_style_by_indexes(indexes_to_style=sf[sf[col_name] == color],
                              cols_to_style=col_name,
                              styler_obj=styler,
                              overwrite_default_style=False)

I get the error mentioned in the above comments.

@DeepSpace2
Copy link
Owner

DeepSpace2 commented Nov 17, 2021

Great, I've managed to reproduce the issue. The reason is that you have a column called 'index'. This is easily reproducible:

df = pd.DataFrame({'index': [1, 2]})
sf = StyleFrame(df)
sf.apply_style_by_indexes(sf.data_df.index, Styler())

AttributeError: 'Series' object has no attribute 'get_loc'

sf = StyleFrame({'index': [1, 2]}) produces the same error.

As a quick workaround I'd suggest that you rename the column before converting to styleframe:

df = pd.DataFrame({'index': [1, 2]})
sf = StyleFrame(df.rename(columns={'index': '_index'}))
sf.apply_style_by_indexes(sf.data_df.index, Styler())

# NO ERROR

I'll have to do some digging to permanently fix this.

@DeepSpace2 DeepSpace2 self-assigned this Nov 17, 2021
@DeepSpace2 DeepSpace2 added the bug label Nov 17, 2021
@DeepSpace2 DeepSpace2 changed the title Adding background color to some columns 'apply_style_by_indexes' fails if the dataframe has a column called "index" Nov 17, 2021
@GuidoBorrelli
Copy link
Author

Oh great, that's a quick workaround. Thank you.

Best,
Guido

@stale stale bot added the stale label Jan 18, 2022
Repository owner deleted a comment from stale bot Jan 18, 2022
@DeepSpace2 DeepSpace2 added this to the 4.2 milestone Oct 6, 2022
@DeepSpace2
Copy link
Owner

Fixed in version 4.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants