Release 0.15 #1561

Merged
merged 118 commits into from
Sep 20, 2022

@mfeurer mfeurer commented Aug 10, 2022

No description provided.

codecov bot commented Aug 10, 2022

Codecov Report

Merging #1561 (013d7ee) into master (b2ac331) will decrease coverage by 3.09%.
The diff coverage is 82.69%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1561      +/-   ##
==========================================
- Coverage   88.05%   84.96%   -3.10%     
==========================================
  Files         140      155      +15     
  Lines       10995    11898     +903     
  Branches        0     2058    +2058     
==========================================
+ Hits         9682    10109     +427     
+ Misses       1313     1244      -69     
- Partials        0      545     +545     

eddiebergman and others added 29 commits August 18, 2022 20:08
* black checker

* Simplified

* add examples to black format check

Co-authored-by: Matthias Feurer <feurerm@informatik.uni-freiburg.de>
* re-structure manual and use 'collapse'

* ADD link to auto-sklearn-talks

* unifying titles

* Clarify default memory and cpu usage

* FIX sphinx_gallery to <=0.10.0

0.10.1 would raise an error for '-D plot_gallery=0'

* Re-structure faq

* FIX comments by mfeurer

* boldface items

* merge manual into FAQ

* FIX minor

* FIX typo

* Update doc/faq.rst

Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com>

* Update doc/faq.rst

Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com>

* Update doc/faq.rst

Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com>

* Update doc/faq.rst

Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com>

* Update doc/manual.rst

Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com>

* Update doc/manual.rst

Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com>

* Update doc/faq.rst

Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com>

* FIX link

Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com>
* np.bool deprecation

* Invalid escape sequence \_

* Series specify dtype

* drop na requires keyword args deprecation

* unspecified np.int size deprecated, use int instead

* deprecated unspecified np.int precision

* Element wise comparison failed, will raise error in the future

* Specify explicit dtype for empty series

* metric warnings for mismatch between y_pred and y_true label count

* Quantile transformer n_quantiles larger than n_samples warning ignored

* Silenced convergence warnings

* pass sklearn args as keywords

* flake8'd

* flake8'd

* Fixed CategoricalImputation not accounting for sparse matrices

* Updated to use distro for linux distribution

* Ignore convergence warnings for gaussian process regressor

* Averaging metrics now use zero_division parameter

* Readded scorers to module scope

* flake8'd

* Fix

* Fixed dtype for metalearner no run

* Catch gaussian process iterative fit warning

* Moved ignored warnings to tests

* Correctly type pd.Series

* Revert back to usual iterative fit

* Readded missing iteration increment

* Removed odd backslash

* Fixed imputer for sparse matrices

* Ignore warnings we are aware about in tests

* Flake'd:

* Revert "Fixed imputer for sparse matrices"

This reverts commit 05675ad.

* Revert "Revert "Fixed imputer for sparse matrices""

This reverts commit d031b0d.

* Back to default values

* Reverted to default behaviour with comment

* Added xfail test to document

* flaked

* Fixed test, moved to np.testing for assertion

* Update autosklearn/pipeline/components/data_preprocessing/categorical_encoding/encoding.py

Co-authored-by: Matthias Feurer <feurerm@informatik.uni-freiburg.de>

Co-authored-by: Matthias Feurer <feurerm@informatik.uni-freiburg.de>
* Added manual dispatch to tests

* Removed parameters to manual dispatch
…tors (#1332)

* Update docstrings and types

* doc typo fix

* flake'd
* added python 3.10 to versions

* Added quotes around versions

* Trigger tests
* Add submodule

* Port to abstract_ensemble, backend from automl_common

* Updated workflow files

* Update imports

* Trigger actions

* Another import fix

* update import

* m

* Backend fixes

* Backend parameter update

* fixture fix for backend

* Fix tests

* readd old abstract ensemble for now

* flake8'd

* Added install from source to readme

* Moved installation w.r.t submodules to the docs

* Temporarily remove submodule

* Readded submodule

* Updated to use automl_common under autosklearn

* Updated MANIFEST

* Removed unneeded statements from MANIFEST

* Fixed import

* Fixed comment line in MANIFEST.in

* Added automl_common/setup.py to MANIFEST

* Added prefix to script

* Re-added removed title #

* Added note for submodule for CONTRIBUTING

* Made the submodule step a bit more clear for contributing.md

* CONTRIBUTING fixes
* Added versioning for sphinx, docutils - introduced by sphinxtoolbox

* Fixed bug with config value for `plot_gallery` in doc makefile

* Update linkcheck command as well
* Added ignored_warnings file

* Use ignored_warnings file

* Test regressors with 1d, 1d as 2d and 2d targets

* Flake'd

* Fix broken relative imports to ignore_warnings

* Removed print and updated parameter type for tests

* Type import fix
* Added random state to classifiers

* Added some doc strings

* Removed random_state again

* flake'd

* Fix some test issues

* Re-added seed to test

* Updated test doc for unknown test

* flake'd
* Added ignored_warnings file

* Use ignored_warnings file

* Test regressors with 1d, 1d as 2d and 2d targets

* Flake'd

* Fix broken relative imports to ignore_warnings

* Removed print and updated parameter type for tests

* Added warning catches to fit methods in tests

* Added more warning catches

* Flake'd

* Created top-level module to allow relative imports

* Deleted blank line in __init__

* Remove unneeded ignore warnings from tests

* Fix bad indent

* Fix github merge conflict editor whitespaces and indents
* update workflow files

* typo fix

* Update pytest

* remove bad semi-colon

* Fix test runner command

* Remove explicit steps required from older version

* Explicitly add Conda python to path for subprocess command in test

* Fix the mypy compliance check

* Added PEP 561 compliance

* Add py.typed to MANIFEST for dist

* Remove py.typed from setup.py
* rename OSX -> macOS as it is the new name

rename OSX -> macOS as it is the new name for the operating system. e.g. see https://www.apple.com/macos

* Update doc/installation.rst

Co-authored-by: Matthias Feurer <lists@matthiasfeurer.de>

* Update doc/installation.rst

Co-authored-by: Matthias Feurer <lists@matthiasfeurer.de>

Co-authored-by: Matthias Feurer <feurerm@informatik.uni-freiburg.de>
Co-authored-by: Matthias Feurer <lists@matthiasfeurer.de>
…semble (#1321)

* Changed show_models() function to return a dictionary of models in the ensemble instead of a string
* Fix: MLPRegressor tests

* Fix: Ordering of statements in test

* Fix: MLP n_calls
…ures (#1250)

* Moved to new splitter, moved to util file

* flake8'd

* Fixed errors, added test specifically for CustomStratifiedShuffleSplit

* flake8'd

* Updated docstring

* Updated types in docstring

* reduce_dataset_size_if_too_large supports more types

* flake8'd

* flake8'd

* Updated docstring

* Separated out the data subsampling into individual functions

* Improved typing from Automl.fit to reduce_dataset_size_if_too_large

* flake8'd

* subsample tested

* Finished testing and flake8'd

* Cleaned up transform function that was touched

* ^

* Removed double typing

* Cleaned up typing of convert_if_sparse

* Cleaned up splitters and added size test

* Cleanup doc in data

* rogue line added was removed

* Test fix

* flake8'd

* Typo fix

* Fixed ordering of things

* Fixed typing and tests of target_validator fit, transform, inv_transform

* Updated doc

* Updated Type return

* Removed elif guard

* removed extraneous overload

* Updated return type of feature validator

* Type fixes for target validator fit

* flake8'd

* Fixed err message str and automl sparse y tests

* Flake8'd

* Fix sort indices

* list type to List

* Remove unneeded comment

* Updated comment to make it more clear

* Comment update

* Fixed warning message for reduce_dataset_if_too_large

* Fix test

* Added check for error message in tests

* Test Updates

* Fix error msg

* reinclude csr y to test

* Reintroduced explicit subsample values test

* flaked

* Missed an uncomment

* Update the comment for test of splitters

* Updated warning message in CustomSplitter

* Update comment in test

* Update tests

* Removed overloads

* Narrowed type of subsample

* Removed overload import

* Fix `todense` giving np.matrix, using `toarray`

* Made subsampling a little less aggressive

* Changed multiplier back to 10

* Allow argument to specify how auto-sklearn handles compressing dataset size (#1341)

* Added dataset_compression parameter and validation

* Fix docstring

* Updated docstring for `resampling_strategy`

* Updated param def and memory_allocation can now be absolute

* insert newline

* Fix params into one line

* fix indentation in docs

* fix import breaks

* Allow absolute memory_allocation

* Tests

* Update test on for precision omitted from methods

* Update test for asklearn2 with same args

* Update to use TypedDict for better Mypy parsing

* Added arg to asklearn2

* Updated tests to remove some warnings

* flaked

* Fix broken link?

* Remove TypedDict as it's not supported in Python3.7

* Missing import

* Review changes

* Fix magic mock for python < 3.9

* Fixed bad merge
* commit meta learning data bases

* commit changed files

* commit new files

* fixed experimental settings

* implemented last comments on old PR

* adapted metalearning to last commit

* add a text preprocessing example

* integrated feedback

* new changes on *.csv files

* reset changes

* add changes for merging

* add changes for merging

* add changes for merging

* try to merge

* fixed string representation for metalearning (some sort of hot fix, maybe this needs to be fixed in a bigger scale)

* fixed string representation for metalearning (some sort of hot fix, maybe this needs to be fixed in a bigger scale)

* fixed string representation for metalearning (some sort of hot fix, maybe this needs to be fixed in a bigger scale)

* init

* init

* commit changes for text preprocessing

* text prepreprocessing commit

* fix metalearning

* fix metalearning

* adapted test to new text feature

* fix style guide issues

* integrate PR comments

* integrate PR comments

* implemented the comments to the last PR

* the fit operation is not in place, therefore we have to assign the fitted self.preprocessor back to itself

* add first text processing tests

* add first text processing tests

* including comments from 01.25.

* including comments from 01.28.

* including comments from 01.28.

* including comments from 01.28.

* including comments from 01.31.
… and #1250 (#1386)

* Add: Doc for `dataset_compression`

* Fix: Shorten line

* Doc: Make more clear that the argument None still provides defaults
…#1387)

* Fix: ignore for certain configuration

* Fix: Extend timeout duration for tests
* Draft tidy of workflows

* Fix: mypy should not ignore missing imports

* Added pydocstyle to checkers

* Added check and format make options

* Fix: multiple entries in same line setup.py

* Change: black line length to 88

* Fix: make check to only perform checks

* Update: Flake8 ignores style (handled by black/isort)

* Add: pydocstyle, disabled in pre-commit

* Add: Makefile `make pre-commit`

* Fix: Ignores for mypy on untyped modules

* Limit scope of pre-commit steps

* Fix: flake8 no longer concerned about line length

* Add: Flake8 to `make check`

* Fix: reduce scope of black and isort

* Fix: Pydocstyle now uses numpy convention

* Fix: workaround for test imports of `automl_common`

* Fix: `mypy` ignores `automl_common` now

* Fix: Limit scope of `black` and `isort` formatting

* Fix: pre-commit performs no file changes now

* Add: `make pre-commit` to `make help`

* Fix: `make help` docstring for `make pre-commit`

* Fix: isort update sections autosklearn, types

* Fix: warnings by flake8 for line length

* Fix: Types section for isort

* Fix: reenable `flake8` formatting checking

* Update: flake8 to use black's line length of 88

* add: ignore D205 pydocstyle

* Fix: Import order for futures

* Fix: flake8 ignore E203

* Fix: Formatting and fixed long lines

* Del: black/isort checker, checked with pre-commit

* Fix: test dummy prediction error msg

* Add: `coverage` to `pyproject.yaml`

* Add: coverage ignore for `if TYPE_CHECKING`

* Fix: missing comma

* Fix: `toml` dependency for pydocstyle in pre-commit

* Fix: isort src path

* Add: `make test`

* Fix: Add name of module to check coverage of

* Maint: isort and black most recent dev

* Fix: import typo

* Change: format now performs individually on each directory
@eddiebergman

So it turns out the major issue, and the reason there were so many merge conflicts, is that I created a separate branch when creating v0.14.4, and that branch was never merged into development. master and development therefore disagreed on the state of things: the v0.14.4 commit was not in development, despite having all the same content apart from the release notes.

My strategy to fix everything was:

# master: -----* ---A --B --C --M
# devlop:       \-------------------D
# * = v0.14.7
# A, B, C useful commits from master


# Just stick development on top of the HEAD of master, with all conflicts resolved in favour of `dev`
# (during a rebase, `theirs` refers to the branch being replayed, here development)
git checkout development
git rebase -Xtheirs master

# This leaves us in a valid state to merge with master, as development now includes it, but
# we've discarded the content of any useful commits that caused a merge conflict

# master: -----* ---A --B --C --M
# devlop: -----*'---A'--B'--C'--M'----------D
# *', A', B', C' = the corrupted commits from master

# We then replay each commit on master, from `v0.14.4` all the way up to its `HEAD`,
# one by one, while staying in a valid merge state on develop

# `v0.14.4` is the biggest culprit of merge conflicts (700+),
# I simply took the release notes from here and favoured dev for everything else,
# as it should have only included content from dev at that time + new release notes.
git cherry-pick -m 1 v0.14.4

# master: -----* ---A --B --C --M
# devlop: -----*'---A'--B'--C'--M'----------D-* 

# For the remaining commits, like A', B', C', I replayed them on top of the branch D so 
# I can replay the changes and make sure it's all included
git cherry-pick -x -n A'..M'

# master: -----* ---A --B --C --M
# devlop: -----*'---A'--B'--C'--M'----------D-*--A--B--C--M

# We now have develop and master in a mergeable state and the important
# changes after `v0.14.4`, where things diverged, are now incorporated properly.
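
To make the effect of `-Xtheirs` more concrete, here is a toy Python model of a three-way merge that resolves conflicts in favour of one side. It is only a sketch under simplifying assumptions: it works on whole files rather than on hunks as git does, and the function name `three_way_merge`, the file names, and the file contents are all invented for illustration. As in a rebase, "theirs" stands for the commits being replayed (development) and "ours" for the branch they land on (master).

```python
from typing import Dict, Optional

# Hypothetical toy model: a snapshot maps a file path to its full content.
Snapshot = Dict[str, str]


def three_way_merge(
    base: Snapshot,
    ours: Snapshot,
    theirs: Snapshot,
    favour: Optional[str] = None,
) -> Snapshot:
    """Merge two snapshots against their common base, one whole file at a time."""
    merged: Snapshot = {}
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o  # both sides agree
        elif o == b:
            result = t  # only "theirs" changed the file
        elif t == b:
            result = o  # only "ours" changed the file
        elif favour == "theirs":
            result = t  # true conflict, resolved for "theirs"
        elif favour == "ours":
            result = o  # true conflict, resolved for "ours"
        else:
            raise ValueError(f"conflict in {path}")
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged


# Invented example data, not taken from the real repository.
base = {
    "setup.py": "0.14.3",
    "doc/releases.rst": "0.14.3 notes",
    "README.md": "old readme",
}
master = {
    "setup.py": "0.14.4",
    "doc/releases.rst": "0.14.3 notes + 0.14.4 notes",
    "README.md": "new readme",
}
development = {
    "setup.py": "0.15.0dev",
    "doc/releases.rst": "0.14.3 notes + dev notes",
    "README.md": "old readme",
}

merged = three_way_merge(base, ours=master, theirs=development, favour="theirs")
for path, content in merged.items():
    print(f"{path}: {content}")
```

In this toy run the README change made only on master survives, while master's conflicting release notes are dropped in favour of development's version, which mirrors why the discarded changes had to be replayed afterwards.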

@eddiebergman

Few points:

  • The docs failed to build. I think I missed a link that needs updating; I'll fix that.
  • For the failing meta-features test, I'm not sure of the root cause, but it seems to be trying to index into a SingleBestModel, which is the dummy model.
  • According to one user, there seems to be an issue with the latest branch: see callback function error! #1569 (comment). The issue isn't about callbacks but rather about meta-learning and data dtypes.

(cherry picked from commit b2ac331)

eddiebergman commented Aug 19, 2022

Actually, it has nothing to do with a model. The failure seems specific to the publish docker build workflow, which appears to run some incorrect code. It looks to me like `info` is set to a DummyDatamanager here:

if info["task"] in REGRESSION_TASKS:

TypeError: 'DummyDatamanager' object is not subscriptable

E       CompletedProcess(args='python3 /workspace/scripts/02_retrieve_metadata.py --working-
directory /tmp/autosklearn-unittest-tmp-dir-99ac6cd59ac2-7412-46734 ', returncode=1, 
stdout=b'binary.classification accuracy 0\n', stderr=b'Traceback (most recent call last):\n  
File "/workspace/scripts/02_retrieve_metadata.py", line 274, in <module>\n    main()\n  File 
"/workspace/scripts/02_retrieve_metadata.py", line 237, in main\n    configuration_space = 
pipeline.get_configuration_space(\n  File "/usr/local/lib/python3.8/dist-packages/autosklearn
/util/pipeline.py", line 47, in get_configuration_space\n    if info["task"] in 
REGRESSION_TASKS:\nTypeError: \'DummyDatamanager\' object is not subscriptable\n')
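
A minimal sketch of why this `TypeError` appears, assuming a manager object that keeps its metadata in an attribute and defines no `__getitem__`. The class body, the `info` attribute, and the contents of `REGRESSION_TASKS` below are stand-ins invented for illustration, not auto-sklearn's actual code.

```python
# Placeholder task set; the real constant's contents are not shown in the thread.
REGRESSION_TASKS = {"regression"}


class DummyDatamanager:
    """Hypothetical stand-in: holds dataset info but is not dict-like."""

    def __init__(self, task: str) -> None:
        # Assumption: the metadata lives in an attribute named `info`.
        self.info = {"task": task}


def is_regression(info) -> bool:
    # Same shape as the failing line: it expects `info` to support indexing.
    return info["task"] in REGRESSION_TASKS


manager = DummyDatamanager("binary.classification")

# Passing the manager itself fails, because the class defines no __getitem__.
try:
    is_regression(manager)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Passing a dict-like object (here the assumed `info` attribute) works.
print(is_regression(manager.info))
```

If the real cause matches this sketch, the fix would be on the calling side: pass a mapping with a `task` key rather than the manager object itself.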

mfeurer and others added 3 commits August 22, 2022 13:54
* Debug docker workflow failure

* Use new login action

* Remove broken arguments

* Add new push tag

* Use docker/meta action

* First push to docker, then to github, update repository names

* Add docker/meta workflow for docker push

* Disambiguate names

* Fix registry for github packages login

* extract hard-coded names into variables
* create new text preprocessing cs

* create new text preprocessing cs

* set new defaults for text encoding

* set new defaults for text encoding

* set new defaults for text encoding

* Fix bug, rework tests

Co-authored-by: lukas <lukas.j.m.strack@gmail.com>
@mfeurer mfeurer merged commit b7ff90c into master Sep 20, 2022
10 participants