initial shell implementation of the raitools package #427
Conversation
        pass

    @staticmethod
    def load(path):
This is a potential problem. If this is going to load a fully reusable RAIAnalyzer, then the saved file is going to have to embed the model.
Maybe we can take in a model serializer object that defines how to serialize the model? Otherwise we could allow the user to set the model on the RAIAnalyzer, but that risks breaking data consistency, since they may set the wrong model, so having a serializer seems safer.
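To make the serializer idea concrete, here is a minimal sketch of what such a hook could look like; ModelSerializer, PickleModelSerializer, and their methods are hypothetical names for illustration and are not part of this PR:

import pickle
from abc import ABC, abstractmethod


class ModelSerializer(ABC):
    """Hypothetical interface a user could supply to control model persistence."""

    @abstractmethod
    def save(self, model, path):
        """Write the model to the given path."""

    @abstractmethod
    def load(self, path):
        """Read the model back from the given path."""


class PickleModelSerializer(ModelSerializer):
    """Default sketch implementation backed by pickle."""

    def save(self, model, path):
        with open(path, 'wb') as f:
            pickle.dump(model, f)

    def load(self, path):
        with open(path, 'rb') as f:
            return pickle.load(f)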
I'll leave this question to a future PR, as I won't implement save/load here; this PR is already quite large for the functionality it contains.
force-pushed from b79c0f7 to 7e01207
        :type classes: list
        """
        self._model = model
        self._initialization_examples = \
I have a feeling that all the managers will be doing something like this... might want to refactor
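For illustration, the shared wiring the managers repeat could move into a common base class; this is only a sketch, and the attribute names are assumptions based on the snippet above:

class BaseManager:
    """Sketch of the setup each manager currently repeats."""

    def __init__(self, model, initialization_examples, evaluation_examples):
        # Keep references only; no copies of the data are made here.
        self._model = model
        self._initialization_examples = initialization_examples
        self._evaluation_examples = evaluation_examples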
doesn't fairness use the whole dataset?
That's actually something which popped up in the latest update to the AzureML API proposal. The add() method for the FairnessManager has acquired a dataset argument which can be initialisation, evaluation, or both (the last of which would concatenate the first two).
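As a sketch of what that could look like (the literal values and the concatenation behaviour are taken from this thread, not from committed API; the class and method bodies are hypothetical):

import pandas as pd


class FairnessManager:
    def __init__(self, initialization_examples, evaluation_examples):
        self._initialization_examples = initialization_examples
        self._evaluation_examples = evaluation_examples

    def add(self, sensitive_features, dataset='evaluation'):
        """Select which data the fairness assessment runs over.

        dataset may be 'initialization', 'evaluation', or 'both', where
        'both' concatenates the first two, per the proposal above.
        """
        if dataset == 'initialization':
            data = self._initialization_examples
        elif dataset == 'evaluation':
            data = self._evaluation_examples
        elif dataset == 'both':
            data = pd.concat([self._initialization_examples,
                              self._evaluation_examples])
        else:
            raise ValueError(
                "dataset must be 'initialization', 'evaluation' or 'both'")
        # ... run the fairness assessment over `data` ...
        return data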
Also, are these going to be copies or references? We don't want to be keeping multiple copies of the data around.
References, but removing datasets from the explanations means we need to make changes in interpret-community. I think that's out of scope for this PR, but it might be something we can do long-term.
Sorry I'm confused. Why would we need to reinsert them if we want to get rid of them from the explanation? And if we are ok with keeping them on the explanation, and they are references already, why would we want to reinsert them?
Will they still be references after we've saved to disk and reloaded?
If the user wants an explanation, then we would need to reinsert the datasets. However, if they're then firing up the ModelAssessmentDashboard, won't that know how to get the common datasets whenever required, rather than needing them embedded in the explanations themselves? Again, in the transition to the TS layer, will the data remain referenced, or would we end up with multiple copies?
I think you've highlighted multiple issues here, but that code hasn't been implemented -- when it is implemented, we should make sure to use references and not make copies.
But why not build it in from the beginning?
We certainly could make it so that each component is a literal copy of what we have in the individual dashboards now. For an absolute MVP, that will work. However, if we take that approach, then we carry technical debt going forward in removing the duplication. And if we've written files like that, then things get particularly painful, since we'll have to keep supporting the older files.
I wrote a new comment on Teams. Maybe we can allow users to return explanations without datasets on them. There was already a user request for this:
interpretml/interpret-community#368
I can add this as a new feature in interpret-community.
""" | ||
|
||
Classification = 'classification' | ||
Regression = 'regression' |
Probability? For when you have a classification problem but want to work with prediction probabilities for each class?
a probability task? Sorry I'm confused. Is there a corresponding list of tasks in fairlearn?
It's one of the options for Fairlearn, where instead of a predicted class, you have a 'probability of class 1'. Perhaps a PM question?
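For context, a 'probability' task here just means the metric consumes scores rather than hard labels. A minimal sketch using Fairlearn's MetricFrame with a score-based metric (the data below is made up for illustration):

import numpy as np
from sklearn.metrics import roc_auc_score
from fairlearn.metrics import MetricFrame

# Toy data: hard labels for y_true, class-1 probabilities for y_pred.
y_true = np.array([0, 1, 1, 0, 1, 0])
y_scores = np.array([0.2, 0.8, 0.6, 0.4, 0.9, 0.3])
sensitive = np.array(['a', 'a', 'b', 'b', 'a', 'b'])

# A 'probability' style metric: AUC consumes scores, not predicted classes.
mf = MetricFrame(metrics=roc_auc_score,
                 y_true=y_true,
                 y_pred=y_scores,
                 sensitive_features=sensitive)
print(mf.by_group)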
Asked PM to clarify, waiting for PM reply.
Actually, I'm a bit surprised by this issue, because the mock notebook from PM just says:
task_type = 'classification or regression'
It would definitely be great to get more clarity on this functionality.
"""Provide model task constants. Can be 'classification' or 'regression'. | ||
""" | ||
|
||
Classification = 'classification' |
BinaryClassification, surely?
We automatically recognize this in our explainers. I'm not sure about other libraries like fairlearn, but if they need this specified, I can add it.
We only have metrics for binary classification and regression (and the probability ones). The meaning of things like 'false positive rate' starts becoming a little fuzzy when you have multiclass classification. Again, perhaps a PM question.
Asked PM to clarify, waiting for PM reply.
        if self._is_run:
            return
        model_task = ModelTask.Unknown
        explainer = MimicExplainer(self._model,
Also, do the explainers embed copies of the data? I've never dug into the details of the structure of the returned objects. If so, then we ought to delete the data from this copy and refer to the centrally held data.
The explanation has a copy of the dataset, and the explainer does too. But, I'm not sure why we would want to delete the dataset from the explanation? Is that just to reduce memory usage?
RAIAnalyzer won't be modifying the input dataset internally, will it?
Well, yes - to reduce memory usage. On the fairness side, I was expecting that we would only store the metrics, and not copies of the y_true, y_pred, and sensitive_feature arrays. Those will all add up, for no good reason. Then, if the dashboard wanted the particular arrays, it could look them up in the central store in the Analyser object.
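A minimal sketch of that metrics-only idea, assuming the analyzer object holds the single shared copy of y_true, y_pred, and the sensitive features (all names here are assumptions, not this PR's API):

from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame


class FairnessManager:
    """Sketch: retain computed metrics, not copies of the underlying arrays."""

    def __init__(self, analyzer):
        # The analyzer is assumed to hold the only copy of the data.
        self._analyzer = analyzer
        self._metrics = None

    def compute(self):
        mf = MetricFrame(metrics=accuracy_score,
                         y_true=self._analyzer.y_true,
                         y_pred=self._analyzer.y_pred,
                         sensitive_features=self._analyzer.sensitive_features)
        # Only the aggregated results are kept on the manager.
        self._metrics = {'overall': mf.overall, 'by_group': mf.by_group}
        return self._metrics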
force-pushed from 17f740d to 8b87f2d
    SparseNumFeaturesThreshold = 1000


class ExplainerManager(BaseManager):
If managers are going to be public we should deliberate a bit on the name of explainability/interpret that we want to use for all of these things. One reason to prefer interpret is that counterfactuals are also considered explanations. The distinction might start to get confusing if we don't have good names for these two components.
Agree, but FeatureExplainerManager seems too long, and 'interpret' is never used in code by industry; it's only used in product names. Counterfactual explanations are different from feature importance explanations, but since the latter are more prevalent, I would argue to keep it as just Explainer?
force-pushed from 8b87f2d to 6d95593
force-pushed from 6d95593 to 36aeaa7
force-pushed from 36aeaa7 to e62442f
The Responsible AI Tools SDK enables users to analyze their machine learning models through a single API: they will be able to analyze errors, explain the most important features, validate fairness, compute counterfactuals, and run causal analysis.
Highlights of the package include:
- fairness.add() allows users to run Fairlearn to assess model fairness
- explainer.add() allows users to explain their model
- counterfactuals.add() allows users to compute counterfactuals
- error_analysis.add() allows users to run error analysis
- causal.add() allows users to run causal analysis
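Taken together, a purely illustrative sketch of how these calls might compose; the constructor arguments, the compute() call, and any parameters beyond the add() names above are assumptions, not defined in this excerpt:

from raitools import RAIAnalyzer  # import path assumed from the package name

# Wrap a trained model and its data in a single analyzer object.
analyzer = RAIAnalyzer(model, train_data, test_data,
                       target_column='target', task_type='classification')

# Each component is queued through its manager's add() entry point.
analyzer.fairness.add()         # run Fairlearn to assess model fairness
analyzer.explainer.add()        # explain the model
analyzer.counterfactuals.add()  # compute counterfactuals
analyzer.error_analysis.add()   # run error analysis
analyzer.causal.add()           # run causal analysis

# Run everything; result inspection would follow from here.
analyzer.compute()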