Fix regression algorithms to give correct output dimensions #1335

Merged: 7 commits into development on Dec 12, 2021

eddiebergman
Contributor

@eddiebergman eddiebergman commented Dec 7, 2021

This PR addresses issue #1297, in which some regressors returned predictions with the wrong output dimensions. The cause was the use of StandardScaler on the target column(s) y: the regressors had not been updated to account for multi-output regression.

Issues

  • 1d targets were reshaped into a 2d column array, [0,0,0] -> [[0], [0], [0]]
  • The same reshape was applied to 2d targets, which collapsed their columns into one, [[1,1], [1,1]] -> [[1], [1], [1], [1]]
  • The inverse transform of the standard scaler gave back a 2d array even when the predictions were 1d, scaler.inverse_transform([1,1,1]) -> [[1], [1], [1]]

Added tests for all regression algorithms (classifiers do not use standard scalers). They run each algorithm with three kinds of target shape and check that the output shape is correct, with (n,1) flattened back to (n,):

  • (n,) -> (n,)
  • (n,1) -> (n,)
  • (n,m) -> (n,m)
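
For illustration, the shape contract above can be sketched with NumPy alone. This is not the code from this PR: the helper names `to_2d`, `restore_shape` and `scaled_round_trip` are hypothetical, and a hand-written per-column standardisation stands in for StandardScaler so the sketch has no scikit-learn dependency.

```python
import numpy as np


def to_2d(y):
    """Return y as a 2d array without merging the columns of an (n, m) target."""
    y = np.asarray(y, dtype=float)
    # Only a 1d target gains a column axis; 2d targets are left untouched.
    return y.reshape(-1, 1) if y.ndim == 1 else y


def restore_shape(pred):
    """Flatten (n, 1) predictions to (n,); keep (n,) and (n, m) as they are."""
    pred = np.asarray(pred)
    if pred.ndim == 2 and pred.shape[1] == 1:
        return pred.ravel()
    return pred


def scaled_round_trip(y):
    """Standardise y per column, undo the scaling, and fix the output shape.

    Hypothetical stand-in for "scale the target, predict, inverse transform".
    """
    y_2d = to_2d(y)
    mean = y_2d.mean(axis=0)
    std = y_2d.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    scaled = (y_2d - mean) / std
    return restore_shape(scaled * std + mean)


# One example per target shape listed above.
shapes = {
    "(n,)": scaled_round_trip([1.0, 2.0, 3.0]).shape,
    "(n,1)": scaled_round_trip([[1.0], [2.0], [3.0]]).shape,
    "(n,m)": scaled_round_trip([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]).shape,
}
```

The key choice in this sketch is that only 1d input is reshaped on the way in, and only a single-column 2d result is flattened on the way out, so multi-output targets keep their columns.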

This PR also consolidates all the ignored warnings into one file and provides a context manager to be used in tests if we want to ignore warnings.
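
A context manager of this kind could look roughly like the sketch below, built only on the standard library `warnings` module. The names `IGNORED_WARNINGS` and `ignore_warnings` and the example entry are assumptions for illustration; the actual list and helper are in the files changed by this PR.

```python
import warnings
from contextlib import contextmanager

# Hypothetical entries: (warning category, regex matched against the start
# of the warning message). The real list is defined in the PR itself.
IGNORED_WARNINGS = [
    (UserWarning, r".*example message.*"),
]


@contextmanager
def ignore_warnings(entries=IGNORED_WARNINGS):
    """Silence the listed warnings inside the `with` block only."""
    with warnings.catch_warnings():
        for category, message in entries:
            warnings.filterwarnings("ignore", category=category, message=message)
        yield
```

In a test, the body that is expected to emit a known warning would be wrapped in `with ignore_warnings():`; the previous filters are restored when the block exits.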

@codecov

codecov bot commented Dec 7, 2021

Codecov Report

Merging #1335 (84b01d3) into development (b90c228) will increase coverage by 0.07%.
The diff coverage is 97.01%.

@@               Coverage Diff               @@
##           development    #1335      +/-   ##
===============================================
+ Coverage        87.90%   87.97%   +0.07%     
===============================================
  Files              140      140              
  Lines            10938    10977      +39     
===============================================
+ Hits              9615     9657      +42     
+ Misses            1323     1320       -3     
Impacted Files Coverage Δ
...learn/pipeline/components/regression/libsvm_svr.py 89.77% <91.66%> (-1.70%) ⬇️
autosklearn/pipeline/components/regression/mlp.py 95.34% <93.33%> (-0.49%) ⬇️
...sklearn/pipeline/components/regression/adaboost.py 97.50% <100.00%> (+0.13%) ⬆️
...n/pipeline/components/regression/ard_regression.py 98.07% <100.00%> (+0.07%) ⬆️
...rn/pipeline/components/regression/decision_tree.py 94.82% <100.00%> (+0.18%) ⬆️
...earn/pipeline/components/regression/extra_trees.py 93.90% <100.00%> (+0.15%) ⬆️
...pipeline/components/regression/gaussian_process.py 97.36% <100.00%> (+0.07%) ⬆️
...ipeline/components/regression/gradient_boosting.py 93.39% <100.00%> (+0.12%) ⬆️
...eline/components/regression/k_nearest_neighbors.py 97.05% <100.00%> (+0.18%) ⬆️
...rn/pipeline/components/regression/liblinear_svr.py 98.03% <100.00%> (+0.08%) ⬆️
... and 5 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@eddiebergman eddiebergman mentioned this pull request Dec 8, 2021
@eddiebergman eddiebergman merged commit c03438b into development Dec 12, 2021
@eddiebergman eddiebergman mentioned this pull request Jan 24, 2022
eddiebergman added a commit that referenced this pull request Jan 25, 2022
* Added ignored_warnings file

* Use ignored_warnings file

* Test regressors with 1d, 1d as 2d and 2d targets

* Flake'd

* Fix broken relative imports to ignore_warnings

* Removed print and updated parameter type for tests

* Type import fix
@eddiebergman eddiebergman mentioned this pull request Jan 25, 2022
eddiebergman added a commit that referenced this pull request Aug 18, 2022
Successfully merging this pull request may close these issues.

Issue with array dimension error in regression models