Some actions that should be taken to make the repository more usable and maintainable:
Test. The tests currently depend on an external repository. That dependency should be dropped and the tests rewritten. Tests should follow the same structure for every extractor. Suggestion: define a class TestExtractor that has all the methods needed to properly test an extractor; each extractor then gets one file test_extractor_name.py containing a test class that extends TestExtractor. We can use a fake predictor that simply outputs the class (or the real value) of a sample with a given accuracy. We should avoid complex predictors such as neural networks in tests. What to test? Rule generation and reproducibility for sure, and possibly other properties.
Demo. Avoid Jupyter notebooks in this repository. There will be a separate repository that uses the psyke package from PyPI, similar to https://github.com/psykei/demo-psyki-python.
GUI. Following the separation of concerns, as with the Demo point above, a separate repository can host a version of psyke with graphical interfaces.
API. Every extractor should have default values for its parameters. The predictor is the only parameter without a default; every other algorithm-specific parameter should have one. This keeps both tests and usage straightforward.
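As a rough sketch of what this could mean for a constructor signature (the class name, the parameter names max_depth, min_examples and seed, and their default values are all made up for illustration and do not correspond to any real extractor):

```python
class ExampleExtractor:
    """Hypothetical extractor: only the predictor is mandatory."""

    def __init__(self, predictor, *, max_depth=3, min_examples=10, seed=0):
        # The predictor has no default: extraction makes no sense without it.
        self.predictor = predictor
        # Algorithm-specific parameters are keyword-only and all defaulted.
        self.max_depth = max_depth
        self.min_examples = min_examples
        self.seed = seed


model = object()  # placeholder for any trained predictor

simple = ExampleExtractor(model)               # defaults everywhere
tuned = ExampleExtractor(model, max_depth=5)   # override just one parameter
```

Making the algorithm-specific parameters keyword-only is one possible design choice: it lets new parameters be added later without breaking existing calls.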
Preprocessing. All operations that affect the dataset used during extraction could be done elsewhere rather than inside the extractor itself. For instance, a dedicated class could handle data transformations, applied before the extractor is created. The common use case of psyke is a scientist who has a trained predictor and wants to extract symbolic knowledge from it to better understand its behavior. That scientist has probably already preprocessed the dataset used to train the predictor, so we should allow knowledge extraction without requiring any further data processing operations to be specified.
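A small sketch of the separation I have in mind, assuming a hypothetical ColumnRescaler class (not part of psyke) that lives outside the extractors and is applied before any extractor is created:

```python
class ColumnRescaler:
    """Hypothetical preprocessing step: rescales each column to [0, 1]."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.mins = [min(column) for column in columns]
        self.maxs = [max(column) for column in columns]
        return self

    def transform(self, rows):
        # Constant columns are mapped to 0.0 to avoid dividing by zero.
        return [
            [(value - low) / (high - low) if high > low else 0.0
             for value, low, high in zip(row, self.mins, self.maxs)]
            for row in rows
        ]


raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = ColumnRescaler().fit(raw).transform(raw)
# `scaled` is what the extractor would receive; the extractor itself
# needs no preprocessing arguments.
```

A user who has already preprocessed the dataset simply skips this step and passes the data to the extractor directly.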
If you need help, or if anything is unclear, feel free to write to me.