Code introspection into autosklearn components #1530
Open: eddiebergman wants to merge 92 commits into development from document_model_capabilities
* only active if kernel == 'poly' * adapt the metadata to reflect this
* black checker * Simplified * add examples to black format check Co-authored-by: Matthias Feurer <feurerm@informatik.uni-freiburg.de>
* re-structure manual and use 'collapse' * ADD link to auto-sklearn-talks * unifying titles * Clarify default memory and cpu usage * FIX sphinx_gallery to <=0.10.0 0.10.1 would raise an error for '-D plot_gallery=0' * Re-structure faq * FIX comments by mfeurer * boldface items * merge manual into FAQ * FIX minor * FIX typo * Update doc/faq.rst Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com> * Update doc/faq.rst Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com> * Update doc/faq.rst Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com> * Update doc/faq.rst Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com> * Update doc/manual.rst Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com> * Update doc/manual.rst Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com> * Update doc/faq.rst Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com> * FIX link Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com>
If you're only exposure to using... -> If your only exposure to using...
* np.bool deprecation * Invalid escape sequence \_ * Series specify dtype * drop na requires keyword args deprecation * unspecified np.int size deprecated, use int instead * deprecated unspeicifed np.int precision * Element wise comparison failed, will raise error in the future * Specify explicit dtype for empty series * metric warnings for mismatch between y_pred and y_true label count * Quantile transformer n_quantiles larger than n_samples warning ignored * Silenced convergence warnings * pass sklearn args as keywords * np.bool deprecation * Invalid escape sequence \_ * Series specify dtype * drop na requires keyword args deprecation * unspecified np.int size deprecated, use int instead * deprecated unspeicifed np.int precision * Element wise comparison failed, will raise error in the future * Specify explicit dtype for empty series * metric warnings for mismatch between y_pred and y_true label count * Quantile transformer n_quantiles larger than n_samples warning ignored * Silenced convergence warnings * pass sklearn args as keywords * flake8'd * flake8'd * Fixed CategoricalImputation not accounting for sparse matrices * Updated to use distro for linux distribution * Ignore convergence warnings for gaussian process regressor * Averaging metrics now use zero_division parameter * Readded scorers to module scope * flake8'd * Fix * Fixed dtype for metalearner no run * Catch gaussian process iterative fit warning * Moved ignored warnings to tests * Correctly type pd.Series * Revert back to usual iterative fit * Readded missing iteration increment * Removed odd backslash * Fixed imputer for sparse matrices * Ignore warnings we are aware about in tests * Flake'd: * Revert "Fixed imputer for sparse matrices" This reverts commit 05675ad. * Revert "Revert "Fixed imputer for sparse matrices"" This reverts commit d031b0d. 
* Back to default values * Reverted to default behaviour with comment * Added xfail test to document * flaked * Fixed test, moved to np.testing for assertion * Update autosklearn/pipeline/components/data_preprocessing/categorical_encoding/encoding.py Co-authored-by: Matthias Feurer <feurerm@informatik.uni-freiburg.de> Co-authored-by: Matthias Feurer <feurerm@informatik.uni-freiburg.de>
* Added manual dispatch to tests * Removed parameters to manual dispatch
…tors (#1332) * Update docstrings and types * doc typo fix * flake'd
* added python 3.10 to versions * Added quotes around versions * Trigger tests
* Add submodule * Port to abstract_ensemble, backend from automl_common * Updated workflow files * Update imports * Trigger actions * Another import fix * update import * m * Backend fixes * Backend parameter update * fixture fix for backend * Fix tests * readd old abstract ensemble for now * flake8'd * Added install from source to readme * Moved installation w.r.t submodules to the docs * Temporarily remove submodule * Readded submodule * Updated to use automl_common under autosklearn * Updated MANIFEST * Removed uneeded statements from MANIFEST * Fixed import * Fixed comment line in MANIFEST.in * Added automl_common/setup.py to MANIFEST * Added prefix to script * Re-added removed title # * Added note for submodule for CONTRIBUTING * Made the submodule step a bit more clear for contributing.md * CONTRIBUTING fixes
* Added versioning for sphinx, docutils - introduced by sphinxtoolbox * Fixed bug with config value for `plot_gallery` in doc makefile * Update linkcheck command as well
* Added ignored_warnings file * Use ignored_warnings file * Test regressors with 1d, 1d as 2d and 2d targets * Flake'd * Fix broken relative imports to ignore_warnings * Removed print and updated parameter type for tests * Type import fix
* Added random state to classifiers * Added some doc strings * Removed random_state again * flake'd * Fix some test issues * Re-added seed to test * Updated test doc for unknown test * flake'd
* Added ignored_warnings file * Use ignored_warnings file * Test regressors with 1d, 1d as 2d and 2d targets * Flake'd * Fix broken relative imports to ignore_warnings * Removed print and updated parameter type for tests * Added warning catches to fit methods in tests * Added more warning catches * Flake'd * Created top-level module to allow relativei imports * Deleted blank line in __init__ * Remove uneeded ignore warnings from tests * Fix bad indent * Fix github merge conflict editor whitespaces and indents
* update workflow files * typo fix * Update pytest * remove bad semi-colon * Fix test runner command * Remove explicit steps required from older version * Explicitly add Conda python to path for subprocess command in test * Fix the mypy compliance check * Added PEP 561 compliance * Add py.typed to MANIFEST for dist * Remove py.typed from setup.py
* rename OSX -> macOS as it is the new name rename OSX -> macOS as it is the new name for the operating system. e.g. see https://www.apple.com/macos * Update doc/installation.rst Co-authored-by: Matthias Feurer <lists@matthiasfeurer.de> * Update doc/installation.rst Co-authored-by: Matthias Feurer <lists@matthiasfeurer.de> Co-authored-by: Matthias Feurer <feurerm@informatik.uni-freiburg.de> Co-authored-by: Matthias Feurer <lists@matthiasfeurer.de>
…semble (#1321) * Changed show_models() function to return a dictionary of models in the ensemble instead of a string
* Remove flaky dep * Remove unused pytest import
* Fix: MLPRegressor tests * Fix: Ordering of statements in test * Fix: MLP n_calls
* Fix: Raises errors with the config * Add: Skip error for kernal_pca Seems kernel_pca emits the error: * `"zero-size array to reduction operation maximum which has no identity"` This is gotten on the line `max_eig = lambdas.max()` which makes me assume it emits a matrix with no real eigen values, not something we can really control for
…ures (#1250) * Moved to new splitter, moved to util file * flake8'd * Fixed errors, added test specifically for CustomStratifiedShuffleSplit * flake8'd * Updated docstring * Updated types in docstring * reduce_dataset_size_if_too_large supports more types * flake8'd * flake8'd * Updated docstring * Seperated out the data subsampling into individual functions * Improved typing from Automl.fit to reduce_dataset_size_if_too_large * flak8'd * subsample tested * Finished testing and flake8'd * Cleaned up transform function that was touched * ^ * Removed double typing * Cleaned up typing of convert_if_sparse * Cleaned up splitters and added size test * Cleanup doc in data * rogue line added was removed * Test fix * flake8'd * Typo fix * Fixed ordering of things * Fixed typing and tests of target_validator fit, transform, inv_transform * Updated doc * Updated Type return * Removed elif gaurd * removed extraneuous overload * Updated return type of feature validator * Type fixes for target validator fit * flake8'd * Moved to new splitter, moved to util file * flake8'd * Fixed errors, added test specifically for CustomStratifiedShuffleSplit * flake8'd * Updated docstring * Updated types in docstring * reduce_dataset_size_if_too_large supports more types * flake8'd * flake8'd * Updated docstring * Seperated out the data subsampling into individual functions * Improved typing from Automl.fit to reduce_dataset_size_if_too_large * flak8'd * subsample tested * Finished testing and flake8'd * Cleaned up transform function that was touched * ^ * Removed double typing * Cleaned up typing of convert_if_sparse * Cleaned up splitters and added size test * Cleanup doc in data * rogue line added was removed * Test fix * flake8'd * Typo fix * Fixed ordering of things * Fixed typing and tests of target_validator fit, transform, inv_transform * Updated doc * Updated Type return * Removed elif gaurd * removed extraneuous overload * Updated return type of feature validator * Type fixes for 
target validator fit * flake8'd * Fixed err message str and automl sparse y tests * Flak8'd * Fix sort indices * list type to List * Remove uneeded comment * Updated comment to make it more clear * Comment update * Fixed warning message for reduce_dataset_if_too_large * Fix test * Added check for error message in tests * Test Updates * Fix error msg * reinclude csr y to test * Reintroduced explicit subsample values test * flaked * Missed an uncomment * Update the comment for test of splitters * Updated warning message in CustomSplitter * Update comment in test * Update tests * Removed overloads * Narrowed type of subsample * Removed overload import * Fix `todense` giving np.matrix, using `toarray` * Made subsampling a little less aggresive * Changed multiplier back to 10 * Allow argument to specfiy how auto-sklearn handles compressing dataset size (#1341) * Added dataset_compression parameter and validation * Fix docstring * Updated docstring for `resampling_strategy` * Updated param def and memory_allocation can now be absolute * insert newline * Fix params into one line * fix indentation in docs * fix import breaks * Allow absolute memory_allocation * Tests * Update test on for precision omitted from methods * Update test for akslearn2 with same args * Update to use TypedDict for better Mypy parsing * Added arg to asklearn2 * Updated tests to remove some warnings * flaked * Fix broken link? * Remove TypedDict as it's not supported in Python3.7 * Missing import * Review changes * Fix magic mock for python < 3.9 * Fixed bad merge
* commit meta learning data bases * commit changed files * commit new files * fixed experimental settings * implemented last comments on old PR * adapted metalearning to last commit * add a text preprocessing example * intigrated feedback * new changes on *.csv files * reset changes * add changes for merging * add changes for merging * add changes for merging * try to merge * fixed string representation for metalearning (some sort of hot fix, maybe this needs to be fixed in a bigger scale) * fixed string representation for metalearning (some sort of hot fix, maybe this needs to be fixed in a bigger scale) * fixed string representation for metalearning (some sort of hot fix, maybe this needs to be fixed in a bigger scale) * init * init * commit changes for text preprocessing * text prepreprocessing commit * fix metalearning * fix metalearning * adapted test to new text feature * fix style guide issues * integrate PR comments * integrate PR comments * implemented the comments to the last PR * fitted operation is not in place therefore we have to assgin the fitted self.preprocessor again to it self * add first text processing tests * add first text processing tests * including comments from 01.25. * including comments from 01.28. * including comments from 01.28. * including comments from 01.28. * including comments from 01.31.
* Update FAQ with text stuff * Take suggestions into account
* Push * `fit_ensemble` now has priority for kwargs to take * Change ordering of prefernce for ensemble params * Add TODO note for metrics * Add `metrics` arg to `fit_ensemble` * Add test for pareto front sizes * Remove uneeded file * Re-added tests to `test_pareto_front` * Add descriptions to test files * Add test to ensure argument priority * Add test to make sure X_data only loaded when required * Remove part of test required for performance history * Default to `self._metrics` if `metrics` not available
* Create simple example and doc for naive early stopping * Fix doc, pass through SMAC callbacks directly * Fix `isinstance` check * Add test for early stopping * Fix signature of early stopping example/test * Fix doc build
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 3 to 4. - [Release notes](https://github.com/actions/setup-python/releases) - [Commits](actions/setup-python@v3...v4) --- updated-dependencies: - dependency-name: actions/setup-python dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 2 to 3. - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](actions/download-artifact@v2...v3) --- updated-dependencies: - dependency-name: actions/download-artifact dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 2 to 3. - [Release notes](https://github.com/codecov/codecov-action/releases) - [Changelog](https://github.com/codecov/codecov-action/blob/master/CHANGELOG.md) - [Commits](codecov/codecov-action@v2...v3) --- updated-dependencies: - dependency-name: codecov/codecov-action dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 2 to 3. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](actions/upload-artifact@v2...v3) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Fix logging server cleanup * Add comment relating to the `try: finally:` * Remove nested try: except: from `fit`
Bumps [peter-evans/find-comment](https://github.com/peter-evans/find-comment) from 1 to 2. - [Release notes](https://github.com/peter-evans/find-comment/releases) - [Commits](peter-evans/find-comment@v1...v2) --- updated-dependencies: - dependency-name: peter-evans/find-comment dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/stale](https://github.com/actions/stale) from 4 to 5. - [Release notes](https://github.com/actions/stale/releases) - [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md) - [Commits](actions/stale@v4...v5) --- updated-dependencies: - dependency-name: actions/stale dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Init commit * Fix logging server cleanup (#1503) * Fix logging server cleanup * Add comment relating to the `try: finally:` * Remove nested try: except: from `fit` * Bump peter-evans/find-comment from 1 to 2 (#1520) Bumps [peter-evans/find-comment](https://github.com/peter-evans/find-comment) from 1 to 2. - [Release notes](https://github.com/peter-evans/find-comment/releases) - [Commits](peter-evans/find-comment@v1...v2) --- updated-dependencies: - dependency-name: peter-evans/find-comment dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump actions/stale from 4 to 5 (#1521) Bumps [actions/stale](https://github.com/actions/stale) from 4 to 5. - [Release notes](https://github.com/actions/stale/releases) - [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md) - [Commits](actions/stale@v4...v5) --- updated-dependencies: - dependency-name: actions/stale dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Init commit * Update evaluation module * Clean up other occurences of the word validation * Re-add test for test predictions Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add debug statements and 30s timeouts * Fix formatting * Update internal timeout param * +timeout, use allocated tmpdir * +timeout, use allocated tmpdir * Remove another occurence of explicit `tmp` * Increase timelimits once again * Remove incomplete comment
* Init commit * Fix DummyClassifiers in _load_pareto_set * Add test for dummy only in classifiers * Update no ensemble docstring * Add automl case where automl only has dummy * Remove tmp file * Fix `include` statement to be regressor
* Create PR * Update MLP regressor values
* Make docker file install from `setup.py` * Add pytest cache to gitignore * Up timeouts on test_metadata_generation
This reverts commit 4f691a1.
…sklearn into document_model_capabilities
This reverts commit b03dddd.
Codecov Report
@@ Coverage Diff @@
## development #1530 +/- ##
===============================================
- Coverage 83.79% 83.75% -0.04%
===============================================
Files 152 154 +2
Lines 11667 11730 +63
Branches 2037 2049 +12
===============================================
+ Hits 9776 9825 +49
- Misses 1343 1359 +16
+ Partials 548 546 -2
eddiebergman force-pushed the development branch from d813838 to 259ed3d (August 18, 2022 18:14)
auvipy suggested changes on Jul 31, 2023
It is a huge undertaking in one go! It would be better to split the changes into smaller, reviewable chunks.
This is a first draft in an attempt to address #1429. Redone to get clean history after rebasing.
Sample output:
`classifiers()` returns `dict[str, ClassifierInfo]`.
The same exists for all of `classifiers`, `regressors`, `data_preprocessors` and `feature_preprocessors`.
Issues
* Data preprocessors are their own beast: there's technically only one data preprocessor component, `FeatTypeSplit`.
* There's a lot of duplication existing already, e.g. `handles_sparse = True` and `input = (SPARSE, ...)` in `get_properties()`.
`.` notation instead of `["this"]`.
`autosklearn.info`
too. I would prefer to do what's below long term, it's a bit nicer to look at. The problem now is that if a user adds a custom component, I would like it to show up when they do `components`. This is fixable, just noting that it could be changed.
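On the duplication noted under Issues, one possible direction is to derive the sparse flag from the `input` tuple rather than storing both. The sketch below is only an illustration under stated assumptions: `DENSE` and `SPARSE` are string stand-ins for the real constants, the two toy classes are hypothetical, and `handles_sparse()` is a helper invented here, not part of this PR.

```python
from typing import Any, Dict

# Stand-ins for the constants used in the `input` tuple (assumed values).
DENSE = "dense"
SPARSE = "sparse"


class ToySparseComponent:
    """Hypothetical component that accepts dense and sparse input."""

    @staticmethod
    def get_properties() -> Dict[str, Any]:
        return {"shortname": "ToySparse", "input": (DENSE, SPARSE)}


class ToyDenseComponent:
    """Hypothetical component that accepts dense input only."""

    @staticmethod
    def get_properties() -> Dict[str, Any]:
        return {"shortname": "ToyDense", "input": (DENSE,)}


def handles_sparse(component: Any) -> bool:
    """Derive sparse support from `input` instead of a second flag."""
    return SPARSE in component.get_properties()["input"]


flags = {
    cls.__name__: handles_sparse(cls)
    for cls in (ToySparseComponent, ToyDenseComponent)
}
print(flags)
```

With a single source of truth, the flag and the `input` tuple cannot drift apart.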