Enable multiclass support for TPOT-NN #1175

rachitk · 2021-02-23T22:57:53Z

What does this PR do?

Removes the requirement for binary output classes in TPOT-NN. Resolves #1149.

Where should the reviewer start?

Check tpot/builtins/nn.py and make sure the standard TPOT tests still pass.

How should this PR be tested?

Run the standard nosetests suite and check for failures. One existing TPOT-NN test asserts that multiclass problems are NOT supported; it will fail because this PR removes that check. That test can likely be repurposed to verify that all estimators support multiclass problems.

Multiclass problems will still fail when the output classes do not form a contiguous sequence starting at 0. For example, a problem with classes [0, 2, 3, 5, 6] will fail because classes 1 and 4 are missing, and a problem with classes [1, 2, 3] will fail because class 0 is missing. This will likely need to be addressed in a future PR.
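Until non-contiguous labels are handled, one possible workaround is to remap the labels onto 0..n_classes-1 before fitting and map predictions back afterwards. The following is a minimal sketch and is not part of this PR; `remap_labels` and `restore_labels` are hypothetical helper names, and the approach mirrors what scikit-learn's `LabelEncoder` does.

```python
def remap_labels(y):
    """Map arbitrary class labels onto 0..n_classes-1, in sorted label order.

    Hypothetical helper for illustration only; not part of TPOT.
    """
    classes = sorted(set(y))
    to_index = {label: idx for idx, label in enumerate(classes)}
    encoded = [to_index[label] for label in y]
    return encoded, classes


def restore_labels(encoded, classes):
    """Invert remap_labels: turn 0-based indices back into original labels."""
    return [classes[idx] for idx in encoded]


# Labels with gaps (classes 1 and 4 are missing), as in the example above.
y = [0, 2, 3, 5, 6, 2, 0]
encoded, classes = remap_labels(y)
print(encoded)                           # contiguous labels starting at 0
print(restore_labels(encoded, classes))  # original labels recovered
```

The encoded labels would be passed to `fit`, and `restore_labels` would be applied to the output of `predict` to recover the original class values.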

The following code should test the multiclass functionality of the NN MLP classifier using TPOT's dependencies:

from tpot import TPOTClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split


if __name__ == "__main__":
    # Iris has three classes (0, 1, 2), so it exercises the multiclass path.
    iris = load_iris()
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target, train_size=0.75, test_size=0.25
    )

    # Restrict the search to pipelines that end in the PyTorch MLP classifier.
    nn_classifier = TPOTClassifier(
        config_dict='TPOT NN',
        template='Selector-Transformer-PytorchMLPClassifier',
        generations=3,
        population_size=3,
    )

    nn_classifier.fit(X_train, y_train)
    print(nn_classifier.score(X_test, y_test))

    nn_classifier.export('pipeline_nn_iris_norestriction.py')

Any background context you want to provide?

This addresses #1149. Multiclass support was tested and validated using a variety of standard multiclass datasets, with no notable score degradation from base TPOT on multiclass problems.

What are the relevant issues?

#1149

Screenshots (if appropriate)

Questions:

Do the docs need to be updated? No, but adding an example of using TPOT-NN for multiclass classification may be helpful.
Does this PR add new (Python) dependencies? No.

JDRomano2 · 2021-02-25T19:03:39Z

Please note that the failing tests are not caused by changes in this PR; they fail elsewhere in the codebase. This work will be merged into master once those failing tests are resolved.

rachitk added 2 commits January 19, 2021 17:01

Comment out binary validation for multiclass testing

66ca59a

Delete lines relating to binary data check

05a1634

rachitk changed the title ~~Enable multiclass support~~ Enable multiclass support for TPOT-NN Feb 24, 2021

JDRomano2 merged commit 68ba5cb into EpistasisLab:nn Feb 25, 2021

rachitk mentioned this pull request Feb 25, 2021

Remove test checking that TPOT-NN errors for multiclass input #1177

Draft

rachitk mentioned this pull request Jul 1, 2021

Fix tests that fail due to changes in sklearn 0.24.0 #1215

Open
