Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Remove the arrow up and down from the column name in the dataframe #1260

Closed
glemaitre opened this issue Jan 29, 2025 · 5 comments · Fixed by #1312
Closed

fix: Remove the arrow up and down from the column name in the dataframe #1260

glemaitre opened this issue Jan 29, 2025 · 5 comments · Fixed by #1312
Assignees
Milestone

Comments

@glemaitre
Copy link
Member

When showing a score dataframe, e.g. report.metrics.report_metrics(), the column name contains some arrow up/down depending if we show a score or a loss. However, adding this information directly in the column is annoying to index a specific score.

We should come with another way to provide this information. Maybe, a dirty way but possible way would be to patch the HTML repr from pandas for those specific dataset to incorporate the information in the HTML representation itself.

@glemaitre
Copy link
Member Author

From a discussion with @adrinjalali IRL, we could make the metric being the row instead of column of the dataframe. It allows 2 things:

  • we can add a new column with the "favorability" (I think it is called like this in some part of the code) that show the indicator for each columns and it could be controlled by a parameter if someone wants the info.
  • for this to works it means that we need to transpose the report.

Transposing makes sense because right now, if one want to plot with pandas, you need to write:

report.metrics.report_metrics().T.plot.barh()

We can simplify it to

report.metrics.report_metrics().plot.barh()

which make total sense to me.

@glemaitre
Copy link
Member Author

An additional thought is about polars/pandas dataframe. Currently, we have multindex by default (because I like them). polars would support it, @GaelVaroquaux and @adrinjalali complain about them.

I think that we all agree that multindex are nice when we look at the HTML representation but they are not easy to deal with when indexing. Therefore, I think that we need a parameter to explicitely request flat vs. multindex. For instance, in the model cards, we want only the representation and multindex make sense. When a user call report.metrics.report_metrics, we can provide one an option to take the type of index that one knows how to deal with.

@sylvaincom
Copy link
Contributor

sylvaincom commented Feb 4, 2025

Thanks, +1 for "multi index are nice for visualization but not easy for indexing"

@glemaitre
Copy link
Member Author

I also think that it should be in the 0.7 milestone. Once we transpose, it will be pretty straightforward. ping @MarieS-WiMLDS

@MarieSacksick
Copy link
Contributor

MarieSacksick commented Feb 12, 2025

Same as the other issue: if it's ready, it's a bonus! It will be at least necessary for 0.8 where we want to include improvement about the reports.
I'm adding it to the next milestone to show that it's not a blocker for 0.7 release, but it doesn't mean it's forbidden to start it before hand obviously :)!

@MarieSacksick MarieSacksick added this to the skore 0.8 milestone Feb 12, 2025
auguste-probabl pushed a commit that referenced this issue Feb 20, 2025
closes #1260 

Add whether a metric is greater is better or lower is better as a column
indicator instead to have it in the name of the index.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants