-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH
add metric frame
#298
ENH
add metric frame
#298
Conversation
@BenjaminBossan I feel I should write a test for this, but I'm unsure where it should go. I was thinking of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't do a thorough review yet, just high level.
I feel I should write a test for this, but I'm unsure where it should go. I was thinking of skops/card/tests/test_card.py inside the TestTableSection class. Thoughts? Is there somewhere else it would be better?
I think it would be best to add a completely new TestAddMetricFrame
. You can take inspiration from other tests though.
skops/card/_model_card.py
Outdated
@@ -11,6 +11,7 @@ | |||
from typing import Any, Iterator, Literal, Protocol, Sequence, Union | |||
|
|||
import joblib | |||
from fairlearn.metrics import MetricFrame |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not import fairlearn at the top level, since it would mean that it's a required dependency. It should be imported inside the corresponding method, similar to how you import pandas there.
However, you should not import them directly, but use the helper function we introduced, like here:
skops/skops/card/_model_card.py
Line 1130 in b324b8d
plt = import_or_raise("matplotlib.pyplot", "permutation importance") |
skops/card/_model_card.py
Outdated
self, metrics: dict, y_true, y_pred, sensitive_features, pivot=True | ||
) -> Card: | ||
""" | ||
Add a metric frame table to the model card. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the name add_metric_frame
, I'm just wondering out loud if we can find a better name. For users who don't know fairlearn, the name could be very confusing, and the docstring "Add a metric frame table to the model card" doesn't add much.
@BenjaminBossan Thanks for the tip! Sorry, I missed that from the Contribution Guide. I'm currently working on making the tests. I'll ping you whenever it's ready again! |
@BenjaminBossan This is ready for you to take a look at! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this. I have a couple of code comments, but before considering those, I would like to discuss a bigger design decision.
Right now, the user more or less passes the arguments for MetricFrame
to the add_fairlearn_metric_frame
method, which takes care of creating the MetricFrame
instance. I would propose to change this so that the user has to create the MetricFrame
instance themselves, then pass it to add_fairlearn_metric_frame
("inversion of control"), which does not need to construct it but just takes care of creating and adding the table.
This approach is similar to add_permutation_importances
, which takes the computed permutation importances as input, instead of computing them inside the method.
Why do I think this could be better? Here are a few reasons:
With the current implementation, the user loses control over the instantiation of MetricFrame
. Therefore, if they want to use something like control_features
or sample_params
, it's not possible. Of course, we could add more parameters to the signature of add_fairlearn_metric_frame
, but it only gets bigger and bigger that way, and we have to keep it up to date when fairlearn changes.
Another advantage of having the user pass the instance is that we don't need to import fairlearn
inside the method. If a user creates that object, they have already imported fairlearn
.
One more advantage is that if a user has a custom MetricFrame
class, they can pass that to the method, whereas right now, it's impossible to use. For testing, we could even create a mock object instead of using a real MetricFrame
, and then skops would have no dependency on fairlearn at all! But I think adding a test dependency is fine here.
A disadvantage of my proposal is that users have to do a little bit of extra work by instantiating the object themselves.
Overall, I think this price is worth paying. What do you think? If we decide to make this change and you refactor the code accordingly, then many of my comments are obsolete. I think you will see which ones.
skops/card/_model_card.py
Outdated
y_pred, | ||
sensitive_features, | ||
table_name: str, | ||
pivot=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need the pivot
option? Also, is it not "transpose" more than "pivot"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I guess we don't really need the pivot
if we don't want to.
So if we don't transpose it the table looks like:
difference | group_max | group_min | ratio | |
---|---|---|---|---|
selection_rate | 0.4 | 0.8 | 0.4 | 0.5 |
But when transposing it looks like:
selection_rate | |
---|---|
difference | 0.4 |
group_max | 0.8 |
group_min | 0.4 |
ratio | 0.5 |
Personally, I see the transposed version as more useful so we could just make that one the version of the table that is generated. What do you think?
I'm fine changing the name of the parameter to transpose
if we decide to keep it.
skops/card/_model_card.py
Outdated
""" | ||
Add a Fairlearn MetricFrame table to the model card. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
""" | |
Add a Fairlearn MetricFrame table to the model card. | |
"""Add a Fairlearn MetricFrame table to the model card. |
Could you also add a few words of description here, + a link to to fairlearn? If you want to link to the class inside fairlearn docs, you would need to add an entry to the intersphinx mappings:
Line 145 in 3e1f138
intersphinx_mapping = { |
skops/card/_model_card.py
Outdated
|
||
Parameters | ||
---------- | ||
metrics: dict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the fairlearn docs, metrics can also be a callable. I think the outputs of metric_frame.difference()
etc. will be scalars instead of lists, which is a bit annoying. So when the table is created, we might need to add np.atleast_1d(metric_frame.difference())
or something like this to support callables.
@BenjaminBossan Thanks for the review! I haven't looked at your code comments yet since the decision of whether to pass in the parameters or a
I agree! I was thinking of that while writing the method but was unsure if that was something we wanted to do. I do wonder if we require users to install and import Beyond that, I don't really see too many disadvantages that would outway the advantages. What're your thoughts? |
We would do that anyway, fairlearn would not be installed by default when installing skops. And even if it were, users don't just discover and try out random packages that were installed :)
This + documenting it well are definitely the way to go for users to discover and use the feature.
Great, I think we agree then. |
@BenjaminBossan Thanks for explaining this! I think I've addressed/responded to most of your comments. I still need to handle scalars from the |
Great, thanks, please ping Adrin and me once it's ready for review. |
@BenjaminBossan and @adrinjalali This is ready to review again!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good to me, and we should probably incorporate it in our existing examples, but that could be a separate PR.
skops/card/_model_card.py
Outdated
def add_fairlearn_metric_frame( | ||
self, | ||
metric_frame, | ||
table_name: str = "Fairlearn MetricFrame Table", | ||
transpose=True, | ||
) -> Card: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we be adding a description
to our add_*
methods? From the user's perspective, it seems odd that it's tricky to add a description here for sections. I think it'd make sense for all of them to allow a description as well as a title. WDYT? also cc @skops-dev/maintainers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree it would be good to have, some existing methods also don't have it, e.g. add_table
. It's probably easier to have a separate PR where this argument is added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adrinjalali For clarification, should I add the description
argument to this method in this PR? Or do it in another one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can keep it for another PR as @BenjaminBossan suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing: Could you please change the type annotation to return -> Self
? This is a change we recently made on all the other methods as well.
------- | ||
self: Card | ||
The model card with the metric frame added. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
""" | |
Notes | |
-------- | |
You can check `fairlearn's documentation | |
<https://fairlearn.org/v0.8/user_guide/assessment/index.html>`__ on how to | |
work with `MetricFrame`s. | |
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not much to add from my side. I think the transpose feature could be implemented without pandas, but it's not a big deal, most users will probably have pandas installed anyway.
|
I didn't even think about that, then forget what I said :) |
@adrinjalali and @BenjaminBossan Thanks for reviewing this! I've addressed all the comments except for one I needed clarification on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic, just this minor comment, then we can merge.
skops/card/_model_card.py
Outdated
def add_fairlearn_metric_frame( | ||
self, | ||
metric_frame, | ||
table_name: str = "Fairlearn MetricFrame Table", | ||
transpose=True, | ||
) -> Card: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing: Could you please change the type annotation to return -> Self
? This is a change we recently made on all the other methods as well.
@BenjaminBossan I've addressed your comment! Sorry, it took me a couple of days to get to this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great. Thanks a lot for your contributions, really appreciated.
We merged skops-dev#298 and skops-dev#310 shortly after each other but they contained an incompatibility that broke the fairlearn tests (the code itself was fine). This PR fixes this incompatibility. On top, I added the description argument to add_fairlearn_metric_frame, to be consistent with all the other methods, and also as a test for it. Finally, 2 small fixes: - Added type annotation to transpose argument - Changed order of arguments in docstring to match order in signature
We merged #298 and #310 shortly after each other but they contained an incompatibility that broke the fairlearn tests (the code itself was fine). This PR fixes this incompatibility. To be clear, the only change needed to fix the tests is the following: ```python - actual_table = card.select("Metric Frame Table").content.format() + actual_table = card.select("Metric Frame Table").format() ``` On top, I added the `description` argument to `add_fairlearn_metric_frame`, to be consistent with all the other methods (also changed in #310), and I also added as a test for it. Since we now have 2 tests, I moved the `metric_frame` variable to a fixture. Finally, 2 small fixes: - Added type annotation to transpose argument - Changed order of arguments in docstring to match order in signature
Closes #278