-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added outlier detection with iterativetrimming #227
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx, I let some comments.
from typing import Union | ||
try: | ||
from bofire.data_models.outlier_detection.outlier_detection import OutlierDetection,IterativeTrimming | ||
AnyOutlierDetection = Union[OutlierDetection,IterativeTrimming] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you have this try except structure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just copied it from api of data_models/surrogates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not needed here, please remove it.
type: str | ||
|
||
|
||
class IterativeTrimming(OutlierDetection): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a docstring in which you mention the method and the paper and explain the parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use google type docstrings
self.base_gp = data_model.base_gp | ||
super().__init__() | ||
|
||
def detect(self, experiments: pd.DataFrame): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def detect(self, experiments: pd.DataFrame): | |
def detect(self, experiments: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call it in the tests exactly as in the the main structure, outlier_validation
--> outlier_detection
Trimmed, outliers = ITGP.detect(experiments=experiments) | ||
assert isinstance(Trimmed, pd.DataFrame) | ||
assert isinstance(outliers, pd.DataFrame) | ||
assert len(experiments) == len(Trimmed) + len(outliers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also check it it was able to finde most of the outliers?
experiments["valid_y"] = 1 | ||
|
||
|
||
@pytest.mark.parametrize( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do not need the parameterization here. just write everything starting in line 20 into the test_IterativeTrimming method, and remove the argument experiments
from it.
I have answered them |
Please also add a specs module as shown here: https://github.com/experimental-design/bofire/blob/main/tests/bofire/data_models/specs/surrogates.py and add the serialization and deserialization tests. |
Furthermore, a notebook under tutorials would be nice in which the ITGP outlier detection stuff is demonstrated based on the ITGP paper. Call it outlier_detection.ipynb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Hi @jduerholt ,
I have added the robust GP outlier as in this paper (https://arxiv.org/abs/2011.11057