Update to support AutoGluon v0.4 #455

Merged
2 commits merged into openml:master on Aug 1, 2022

Collaborator

@Innixma Innixma commented Mar 23, 2022

PR to add support for AutoGluon v0.4

  • Updates install from source logic to align with v0.4
  • Added AutoGluon_hq and AutoGluon_gq presets that represent mid-way points between the quality of AutoGluon_bestquality and the inference speed of AutoGluon.
  • Added loading test data prior to inference for a more genuine inference latency (previously loading data from file was included in inference time)
  • Added predictor.persist_models('best') call to avoid loading models from disk during inference (for more genuine inference latency estimates).
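
The last two points are about what ends up inside the timed region. A minimal sketch of the idea, using a stand-in predictor rather than AutoGluon itself: `StandInPredictor`, `load_rows` and `timed_predict` are hypothetical names made up for illustration, and the `persist` method only mimics the intent of `predictor.persist_models('best')`.

```python
import csv
import io
import time

CSV_TEXT = "x\n1\n2\n3\n"  # stands in for a test file on disk


def load_rows(text):
    """Parse the test data; in the benchmark this would read a file."""
    return [float(row["x"]) for row in csv.DictReader(io.StringIO(text))]


class StandInPredictor:
    """Hypothetical predictor: the model is fetched lazily unless persisted."""

    def __init__(self):
        self._model = None
        self.disk_loads = 0

    def _load_model(self):
        self.disk_loads += 1  # pretend this is a slow read from disk
        return lambda x: 2 * x

    def persist(self):
        """Keep the model in memory so predict() does not reload it."""
        self._model = self._load_model()

    def predict(self, rows):
        model = self._model or self._load_model()
        return [model(x) for x in rows]


def timed_predict(predictor, rows):
    """Time only the predict call; the data is already in memory."""
    start = time.perf_counter()
    predictions = predictor.predict(rows)
    return predictions, time.perf_counter() - start


predictor = StandInPredictor()
predictor.persist()          # model loading happens before the timer starts
rows = load_rows(CSV_TEXT)   # data loading happens before the timer starts
predictions, seconds = timed_predict(predictor, rows)
```

With both loading steps moved ahead of the timer, `seconds` reflects the prediction call only.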

Comment on lines 20 to 22
PIP install "mxnet<2.0.0"
PIP install "scikit-learn-intelex<2021.3"
PIP install "scikit-learn-intelex<2021.6"
Collaborator

Why are these dependencies installed separately?

Collaborator Author

@Innixma Innixma May 6, 2022

Currently it isn't part of our build-from-source install by default, and I haven't tested across windows, mac, and ARM to ensure this isn't unstable in those configs. It is stable on Linux though. If it is important, I could switch to having it in the same line as the general pip install (PIP install autogluon.tabular[all,skex]).

Collaborator Author

Basically the alternative is to shift it to be specified in two locations:

for pypi install:

instead of:
PIP install autogluon

it would be:

PIP install autogluon.tabular[all,skex]

and for source install:

instead of

PIP install -e tabular/[all]

it would be:

PIP install -e tabular/[all,skex]

Collaborator

I was just trying to consider ways to make the relationship of the installation script to current versions less hard-coded. In principle, we would like to be able to provide backward compatibility to older versions of the frameworks when reasonable.

Collaborator Author

This is hard currently because PIP is not the same as pip in AutoMLBenchmark. I'm unsure of the reasoning for this, but it is the reason I can't use AutoGluon's default build-from-source logic mentioned in our install instructions: https://auto.gluon.ai/stable/index.html

This is the file I'd like to call to build AG from source: https://github.com/awslabs/autogluon/blob/master/full_install.sh

However, I can't because this file uses pip while I need to use PIP in AutoMLBenchmark.

If this can be resolved then it should be do-able to have backwards compatibility

Collaborator

@Innixma see my suggestion below.

> I'm unsure the reasoning for this in AutoMLBenchmark

It was mainly to hide the automatic creation of the venv while still being able to access both the default Python executable and the venv Python executable if needed.

The executable is stored in the $py_exec variable if you need it, for example to set a custom PATH as I suggest below. I think this should be enough for your needs.

Collaborator Author

Thanks for this! I wasn't so savvy with bash so this was very helpful.

Collaborator Author

Re scikit-learn-intelex: I've resolved this by not hardcoding the version and instead using the version defined via the skex extra_dependencies key.

Comment on lines +39 to +53
AutoGluon_hq:
  extends: AutoGluon
  description: |
    AutoGluon with 'high_quality' preset provides a very strong predictor with 8x+ faster inference speed than 'best_quality'.
    Refer to https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-quickstart.html#presets for a description of the presets.
  params:
    presets: high_quality

AutoGluon_gq:
  extends: AutoGluon
  description: |
    AutoGluon with 'good_quality' preset provides a strong predictor with 16x+ faster inference speed than 'best_quality'.
    Refer to https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-quickstart.html#presets for a description of the presets.
  params:
    presets: good_quality
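
For readers unfamiliar with `extends`, the two entries above inherit everything from `AutoGluon` and only override `params`. A rough sketch of how such inheritance could be resolved is below. It assumes a per-key merge in which nested dicts like `params` are combined, which may differ from what AutoMLBenchmark actually does, and `resolve_framework` is a hypothetical helper, not part of the codebase.

```python
import copy


def resolve_framework(name, definitions):
    """Return the definition for `name` with inherited values filled in."""
    definition = definitions[name]
    parent_name = definition.get("extends")
    if parent_name is None:
        return copy.deepcopy(definition)
    resolved = resolve_framework(parent_name, definitions)
    for key, value in definition.items():
        if key == "extends":
            continue
        if isinstance(value, dict) and isinstance(resolved.get(key), dict):
            resolved[key] = {**resolved[key], **value}  # merge nested params
        else:
            resolved[key] = copy.deepcopy(value)
    return resolved


definitions = {
    "AutoGluon": {"version": "stable", "params": {}},
    "AutoGluon_hq": {"extends": "AutoGluon", "params": {"presets": "high_quality"}},
    "AutoGluon_gq": {"extends": "AutoGluon", "params": {"presets": "good_quality"}},
}

hq = resolve_framework("AutoGluon_hq", definitions)
gq = resolve_framework("AutoGluon_gq", definitions)
```
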
Collaborator

@sebhrusen it feels like this is getting unwieldy. Should we extend the configuration format to more easily allow for different presets?

Collaborator

Well, I don't have the feeling that the configuration format is the problem here. Or is it? Do you have something in mind, like a specific CLI syntax only for presets, something simple enough to work for various frameworks?

I mean, users can create as many framework definitions as they like in their user frameworks.yaml; we don't have to accept all of those in the default frameworks.yaml.

Collaborator

@sebhrusen sebhrusen May 10, 2022

Currently, the CLI syntax allows passing framework params, e.g. -Xf.presets=good_quality, but this works only for local runs: it is not forwarded to EC2 instances or Docker images, which need a custom frameworks.yaml. Maybe this could be improved (whitelisting?), although I'm not sure I like the idea of having params coming from multiple sources when executed remotely.
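
To make the `-Xf.presets=good_quality` form concrete, here is a small sketch of how such dotted overrides could be collected into a nested dict. It assumes the `f.` prefix denotes framework params, as the comment above implies. `parse_overrides` is a hypothetical function written for this illustration and is not the parser AutoMLBenchmark uses.

```python
def parse_overrides(argv):
    """Collect `-Xdotted.key=value` arguments into a nested dict."""
    overrides = {}
    for arg in argv:
        if not arg.startswith("-X") or "=" not in arg:
            continue  # not an override argument
        dotted_key, value = arg[2:].split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = overrides
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return overrides


overrides = parse_overrides(["-Xf.presets=good_quality", "--mode", "local"])
framework_params = overrides.get("f", {})
```
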

Collaborator

I don't really know what I had in mind at the time. But if every framework does this, it will blow up the configuration file many times over. MLJar has 4 presets, GAMA has more presets, and I suspect (and hope) more frameworks might follow.

A custom parameter which is specifically for the mode/preset might be reasonable, considering it's a "special" hyperparameter in the sense that it's the only one we allow for benchmarking. Consider the last three additional entries in the framework definition:

AutoGluon:
  version: "stable"
  description: |
    AutoGluon-Tabular: Unlike existing AutoML frameworks that primarily focus on model/hyperparameter selection,
    AutoGluon-Tabular succeeds by ensembling multiple models and stacking them in multiple layers.
  project: https://auto.gluon.ai
  refs: [https://arxiv.org/abs/2003.06505]
  presets: ['best_quality', 'high_quality', 'good_quality']
  preset: best_quality    # could also be the first of `modes`  by convention
  param_name: presets  # the framework name for the hyperparameter, so that we can easily convert internally to add it to `params` to avoid changing integration scripts

This could be complemented by a CLI argument which overrides the mode argument. It wouldn't solve the issue of forwarding.

On the other hand, it requires additional logic for something that works well already (even if it's minimal).
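
As a sketch of the "convert internally" step in the proposal above: the selected preset would be written into `params` under the framework's own hyperparameter name, so integration scripts stay unchanged. This only illustrates the idea from this comment and is not existing AutoMLBenchmark code; `apply_preset` and its `override` argument are hypothetical.

```python
def apply_preset(definition, override=None):
    """Fold the selected preset into `params` under the framework's own name."""
    allowed = definition["presets"]
    # Fall back to the declared default, or to the first preset by convention.
    selected = override or definition.get("preset", allowed[0])
    if selected not in allowed:
        raise ValueError(f"unknown preset {selected!r}; expected one of {allowed}")
    params = dict(definition.get("params", {}))
    params[definition["param_name"]] = selected
    return params


autogluon = {
    "presets": ["best_quality", "high_quality", "good_quality"],
    "preset": "best_quality",
    "param_name": "presets",
}

default_params = apply_preset(autogluon)
cli_params = apply_preset(autogluon, override="good_quality")
```
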

Collaborator

@sebhrusen sebhrusen Jul 19, 2022

@PGijsbers maybe the first mistake was to allow non-default presets in resources/frameworks.yaml.
The tool lets anyone add custom configurations, and we're not going to limit that. For the paper's benchmarks, we use the frameworks_yyyyQn.yaml files, where we can decide to use non-default presets after agreeing with each team ("was it a good idea to allow multiple configs for those?" is a separate question).
The problem with making preset special is that we would be officially encouraging this approach (which I like, by the way) and, as we already see, it creates issues on our side: multiplication of entries, and a way to fine-tune a framework only for the benchmark, in contradiction with the original approach of benchmarking all tools with their default setup.

That's why, personally, I'd go backwards and accept only the default setup in frameworks.yaml, rather than creating a new convoluted syntax. We would decide on "customizations" only for the frameworks_yyyyQn.yaml files.
My second choice would be to add a param/section to the framework description whitelisting framework params that are always forwarded. It's relatively simple to do and would allow the usual CLI syntax for docker and aws as well.
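
A rough sketch of what the second choice could look like: only whitelisted framework params survive the hop to a remote run, and they are re-emitted in the usual `-Xf.` form. The whitelist, the helper names and the `some_local_only_flag` param are all hypothetical; nothing here exists in AutoMLBenchmark as written.

```python
def forwarded_params(cli_params, whitelist):
    """Keep only the CLI framework params that may be sent to remote runs."""
    return {key: value for key, value in cli_params.items() if key in whitelist}


def to_cli_args(params):
    """Rebuild `-Xf.key=value` arguments for the remote command line."""
    return [f"-Xf.{key}={value}" for key, value in sorted(params.items())]


whitelist = {"presets"}  # hypothetical per-framework whitelist
cli_params = {"presets": "good_quality", "some_local_only_flag": "1"}
remote_args = to_cli_args(forwarded_params(cli_params, whitelist))
```
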

Collaborator

> That's why personally, I'd go backwards and accept only default setup in frameworks.yaml, rather than creating a new convoluted syntax. And we would decide "customizations" only for the frameworks_yyyyQn.yaml ones.

Agree. Since there are already "duplicate" entries (varying only in mode) in the other files, I suggest we OK it for this PR and refactor them all out at once afterwards.

> My second choice would be to just add a param/section to the framework description whitelisting framework params that are always forwarded, it's relatively simple to do and will just allow the usual CLI syntax also for docker and aws.

Let's hold off on that for now. I guess we should put more emphasis on people using their local (user) configuration file to keep the variations of frameworks/configurations that they are interested in.

Collaborator

@PGijsbers PGijsbers left a comment

Thanks for the update and sorry for the inactivity. I will be defending my dissertation in two weeks, and I've been very busy with all the necessary preparations. Things should return to normal (for me) starting in June.

@Innixma
Collaborator Author

Innixma commented May 6, 2022

No worries @PGijsbers , and thanks for your review :). Hope your dissertation goes well!

Comment on lines 33 to 37
# Note: Normally we would just call `./full_install.sh` but because `pip` and `PIP` are not the same,
# the script does not work here. Therefore, we have to explicitly install each submodule below.
# This has the downside that source installs of old versions of the package might not follow the below steps.
# It is recommended to instead use pip install of older AG versions to ensure it works correctly.
# https://github.com/awslabs/autogluon/blob/master/full_install.sh
Collaborator

@sebhrusen sebhrusen May 10, 2022

What about something like:

env PATH="$(dirname "$py_exec"):$PATH" bash -c ./full_install.sh

haven't tried, but shouldn't it work?
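
The reason this can work is that a plain `pip` inside the script resolves through `PATH`, so putting the venv's directory first makes `pip` point at the venv. The sketch below shows the same lookup in Python on a POSIX system, with a throwaway directory standing in for the venv's bin directory. `env_with_venv_first` is a hypothetical helper and the fake `pip` file exists only for the demonstration.

```python
import os
import shutil
import stat
import tempfile


def env_with_venv_first(py_exec, base_env=None):
    """Return a copy of the environment with py_exec's directory first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    venv_bin = os.path.dirname(py_exec)
    env["PATH"] = venv_bin + os.pathsep + env.get("PATH", "")
    return env


# Throwaway directory standing in for the venv's bin/ directory.
venv_bin = tempfile.mkdtemp()
fake_pip = os.path.join(venv_bin, "pip")
with open(fake_pip, "w") as handle:
    handle.write("#!/bin/sh\necho fake pip\n")
os.chmod(fake_pip, os.stat(fake_pip).st_mode | stat.S_IXUSR)

env = env_with_venv_first(os.path.join(venv_bin, "python"))
resolved_pip = shutil.which("pip", path=env["PATH"])
# A script could then be launched with:
#   subprocess.run(["bash", "./full_install.sh"], env=env)
```
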

Collaborator Author

Thanks! This makes things a lot nicer and allows us to support AutoGluon v0.1.0+ all at once. Refer to the updated PR.

@Innixma
Collaborator Author

Innixma commented Jul 13, 2022

@PGijsbers @sebhrusen thanks for the comments! I have updated the PR to reflect them and this version should now be compatible with AutoGluon versions >=0.1.0 (including v0.4, v0.5, and soon to be released v0.5.1).

Collaborator

@PGijsbers PGijsbers left a comment

I can't test it currently (I can't install fairscale locally and right now have no access to aws), but in principle it looks good to me.

@Innixma
Collaborator Author

Innixma commented Jul 31, 2022

Any other action on my end to take for this PR?

@PGijsbers
Collaborator

I was waiting to see if @sebhrusen agrees. We don't (yet) have a policy in place that determines what to do if someone does not check in.

Collaborator

@sebhrusen sebhrusen left a comment

@PGijsbers do we agree to create a follow-up ticket to clean up the frameworks.yaml?

@sebhrusen sebhrusen merged commit 88186b3 into openml:master Aug 1, 2022
PGijsbers added a commit that referenced this pull request Jun 20, 2023
* Add a workflow to tag latest `v*` release as `stable` (#399)

Currently limited to alphabetical ordering which means that any one number in the version cannot exceed one digit.

* Bump auto-sklearn to 0.14.0 (#400)

* Update version to 2.0

* Revert "Update version to 2.0"

This reverts commit 9e0791a.

* Fix/docker tag (#404)

* Add the version tag to the image name if present

* Fix casing for MLNet framework definition

* Sync stable-v2 and master (#407)

* Update version to 2.0.2

* Revert version change

* Add support for the OpenML test server (#423)

* Add support for the OpenML test server

* change domain from openmltestserver to test.openml

* update error message

* Apply suggestions from code review

Co-authored-by: seb. <sebastien@h2o.ai>

* fix syntax error due to online merging

Co-authored-by: seb. <sebastien@h2o.ai>

* Switch from release:created to release:published (#429)

* Added support for dataset files stored on s3 (#420)

* s3 functionality

* Update amlb/datasets/fileutils.py

Co-authored-by: Pieter Gijsbers <p.gijsbers@tue.nl>

* OOD

* add s3n

* move boto3 import

Co-authored-by: Weisu Yin <weisuyin96@gmail.com>
Co-authored-by: Pieter Gijsbers <p.gijsbers@tue.nl>

* Respect TMP, TMPDIR, TEMP (#442)

* Respect tmpdir

* Fixed submodule

* feat: retain environment vars for framework venv

* minor fix on compatibility (#454)

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>

* Ignore decoding errors on Windows (#459)

By default it can use cp1252 decoding which sometimes raises an error
and halts the process.

* Fix a typo (#462)

will used -> will be used

* Merge back stable-v2 to master (#472)

* Add `stable` tag workflow, bump auto-sklearn (#401)

* Add a workflow to tag latest `v*` release as `stable` (#399)

Currently limited to alphabetical ordering which means that any one number in the version cannot exceed one digit.

* Bump auto-sklearn to 0.14.0 (#400)

* Fix/docker tag (#404)

* Add the version tag to the image name if present

* Fix casing for MLNet framework definition

* Changed latest from master to main

* Update version to 2.0.1

* Improv/aws meta (#413)

* Add volume meta data to aws meta info

* Add constraints for v2 benchmark (#415)

* Add constraints for v2 benchmark

For ease of reproducibility, we want to include our experimental setup
in the constraints file. For our experiments we increase the volume size
to 100gb and require gp3 volumes (general purpose SSD).

* Update version to 2.0.2

* Fix AWS random cancel issue (#422)

* let the job runner handle the rescheduling logic to ensure that the job can never be acted upon by the current worker after being rescheduled

* remove commented code

* Add a GAMA configuration intended for benchmarking (#426)

Made the previous version abstract to avoid accidentally running the
wrong version of GAMA for the benchmark.

* Unsparsify target variables for (Tuned)RF (#425)

* Unsparsify target variables for (Tuned)RF

Sparse targets are not supported in scikit-learn 0.24.2, and are used
with tasks 360932 and 360933 (QSAR) in the benchmark.

* cosmetic change to make de/serialization easier to debug

Co-authored-by: Sebastien Poirier <sebastien@h2o.ai>

* ensure that openml is configured when loading the tasks (#427)

* Expect a possible `NoSuchProcess` error (#428)

Since it's entirely possible that the processes were already
terminating, but only completed termination between the process.children
call and the proc.terminate/kill calls.

* Reset version for versioning workflow

* Update version to 2.0.3

* ensure that the docker images can be built from linux (#437)

* Avoid querying terminated instance with CloudWatch (#438)

* fixes #432 add precision to runtimes in results.csv (#433)

* fixes #432 add precision to runtimes in results.csv

* Update amlb/results.py

Co-authored-by: seb. <sebastien@h2o.ai>

Co-authored-by: seb. <sebastien@h2o.ai>

* Iteratively build the forest to honor constraints (#439)

* Iteratively build the forest to honor constraints

In particular depending on the dataset size either memory or time
constraints can become a problem which makes it unreliable as a
baseline. Gradually growing the forest sidesteps both issues.

* Make iterative fit default, parameterize execution

* Step_size as script parameter, safer check if done

When final_forest_size is not an exact multiple of step_size,
randomforest should still terminate. Additionally step_size is escaped
with an underscore as it is not a RandomForestEstimator hyperparameter.

* Iterative fit for TunedRandomForest to meet memory and time constraints (#441)

* Iterative fit to meet memory and time constraints

Specifically for each value of `max_features` to try, an equal time
budget is allotted, with one additional budget being reserved for the
final fit. This does mean that different `max_features` can lead to
different number of trees, but it keeps it simple.

* Abort tuning when close to total time budget

The first fit of each iterative fit for a `max_features` value was not
guarded, which can lead to exceeding the total time budget. This adds a
check before the first fit to estimate whether the budget will be
exceeded, and if so aborting further tuning and continue with the final
fit.

* Make k_folds configurable

* Add scikit-learn code with explanation

* Modify cross_validate, allow 1 estimator per split

This is useful when we maintain a warm_started model for each individual
split.

* Use custom cv function to allow warm-start

By default estimators are cloned in any scikit-learn cross_validate
function (which stops warm-start) and it is not possible to specify a
specific estimator-object per fold (which stops warm-start). The added
custom_validate module makes changes to the scikit-learn code to allow
warm-starting to work in conjunction with the cross-validate
functionality. For more info see scikit-learn#22044 and
scikit-learn#22087.

* Add parameter to set tune time, rest is for fit

The previous iteration where the final fit was treated as an equivalent
budget to any other optimization sometimes left too little time to train
the final forest, in particular when the last fit took longer than
expected. This would often lead to very small forests for the final
model. The new system guarantees roughly 10% of budget for the final
forest, guaranteeing a better final fit.

* Revert version to _dev_version to prepare release (#444)

* Update version to 2.0.4

* Signal to encode predictions as proba now works (#447)

In a previous iteration it was encoded as a numpy file, but now it's
serialized to JSON which means that results.probabilities is simply a
string if imputation is required.

* Monkeypatch openml to keep whitespace in features (#446)

Technically monkeypatch xmltodict function used by openml when reading the features xml

* fix for mlr3automl (#443)

* Reset version for Github workflow (#448)

* Update version to 2.0.5

* Update mlr3automl to latest

Was supposed to be included with #443

* Update MLR3 (#461)

* Reset version for version bump

* Update version because GA failed

* Issue 416: fixing versioning workflow for releases and merges to master (#468)

* change workflow to correctly modify the app version on releases and when forcing merged version back to master

* protect main branch from accidental releases

* fix stress test

Co-authored-by: PGijsbers <p.gijsbers@tue.nl>
Co-authored-by: eddiebergman <eddiebergmanhs@gmail.com>
Co-authored-by: github-actions <github-actions@github.com>
Co-authored-by: Erin LeDell <erin@h2o.ai>
Co-authored-by: Stefan Coors <stefan.coors@gmx.net>

* useless workflow reintroduced during merge (#475)

* tag all AWS entities (#469)

* fixed parsing of int targets when loading file in CSV format (#467)

* Avoid root owned files from docker (#464)

* New site (#479)

* First draft of new website

* Add framework descriptions, papers and logos

* Update footer with Github link

* Remove under construction banner

* Add redirect from old page to new one

* Update page title

* Add text links to new paper to be added later

* Move static site to /docs

* Whitelist documentation images

* Remove temporary work directory

* Add documentation images

* Place holder for mobile

* Move old notebooks and visualizations

To make sure they are not confusing for new users, as these will no longer work out-of-the-box.
New notebooks will be added soon but I don't have the files available right now.

* Tell github this is not Jekyll

* Update minimal responsiveness (#480)

* Make results responsive (hacky)

* Make Frameworks page more responsive

* Make Home more responsive

* Bare minimum mobile navbar

* Make sure phones report fake width

* Link to arxiv paper (#481)

* Update to support AutoGluon v0.4 (#455)

* Update to support AutoGluon v0.4

* Address comments

* Updated setup.py for `hyperoptsklearn` as it no longer uses PyPi (also now accepts shas) (#410)

* Updated hyper opt not to use PyPi and accept shas

* case-sensitive PIP command in setup

Co-authored-by: Sebastien Poirier <sebastien@h2o.ai>

* AutoGluon TimeSeries Support (first version) (#494)

* Add AutoGluon TimeSeries Prototype

* AutoMLBenchmark TimeSeries Prototype. (#6)

* fixed loading test & train, changed pred.-l. 5->30

* ignore launch.json of vscode

* ensuring timestamp parsing

* pass config, save pred, add results

* remove unused code

* add readability, remove slice from timer

* ensure autogluonts has required info

* add comments for readability

* setting defaults for timeseries task

* remove outer context manipulation

* corrected spelling error for quantiles

* adding mape, correct available metrics

* beautify config options

* fixed config for public access

* Update readme

* Autogluon timeseries, addressed comments by sebhrusen (#7)

* fixed loading test & train, changed pred.-l. 5->30

* ignore launch.json of vscode

* ensuring timestamp parsing

* pass config, save pred, add results

* remove unused code

* add readability, remove slice from timer

* ensure autogluonts has required info

* add comments for readability

* setting defaults for timeseries task

* remove outer context manipulation

* corrected spelling error for quantiles

* adding mape, correct available metrics

* beautify config options

* fixed config for public access

* no outer context manipulation, add dataset subdir

* add more datasets

* include error raising for too large pred. length.

* merging AutoGluonTS framework folder into AutoGluon

* renaming ts.yaml to timeseries.yaml, plus ext.

* removing presets, correct latest config for AGTS

* move dataset timeseries ext to datasets/file.py

* dont bypass test mode

* move quantiles and y_past_period_error to opt_cols

* remove whitespaces

* deleting merge artifacts

* delete merge artifacts

* renaming prediction_length to forecast_range_in_steps

* use public dataset, reduced range to maximum

* fix format string works

* fix key error bug, remove magic time limit

* Addressed minor comments, and fixed version call for tabular and timeseries modularities (#8)

* fixed loading test & train, changed pred.-l. 5->30

* ignore launch.json of vscode

* ensuring timestamp parsing

* pass config, save pred, add results

* remove unused code

* add readability, remove slice from timer

* ensure autogluonts has required info

* add comments for readability

* setting defaults for timeseries task

* remove outer context manipulation

* corrected spelling error for quantiles

* adding mape, correct available metrics

* beautify config options

* fixed config for public access

* no outer context manipulation, add dataset subdir

* add more datasets

* include error raising for too large pred. length.

* merging AutoGluonTS framework folder into AutoGluon

* renaming ts.yaml to timeseries.yaml, plus ext.

* removing presets, correct latest config for AGTS

* move dataset timeseries ext to datasets/file.py

* dont bypass test mode

* move quantiles and y_past_period_error to opt_cols

* remove whitespaces

* deleting merge artifacts

* delete merge artifacts

* renaming prediction_length to forecast_range_in_steps

* use public dataset, reduced range to maximum

* fix format string works

* fix key error bug, remove magic time limit

* swapped timeseries and tabular to set version

* make warning message more explicit

* remove outer context manipulation

* split timeseries / tabular into functions

Co-authored-by: Leo <LeonhardSommer96@gmail.com>

* Add workflow to manually run `runbenchmark.py` on Github Actions (#516)

* Add workflow for manually running a test benchmark

* Use built-in context for getting the branch

* Add more info to step names

* Add ability to specify options

* Fixed user and sudo under docker (#495)

* Fixed user and sudo under docker

* Reverted format

* Update docker.py

* Addressing #497

#497

* Keep wget quiet

* Use :, . is deprecated

Co-authored-by: seb. <sebastien@h2o.ai>

* Set username and userid in Dockerfile generation

* Install HDF5 to Docker for tables

* Avoid using unix-specific workarounds on Windows

* Re-enable caching for building docker images

---------

Co-authored-by: seb. <sebastien@h2o.ai>
Co-authored-by: PGijsbers <p.gijsbers@tue.nl>

* [no-ci] Fix broken link (#514)

* Remove autoxgboost, add `removed` field for frameworks (#519)

* Add redirect for dataset page (#521)

* Upgrade Python version and dependencies (#520)

* Remove usage of np.float alias and just use float

* Bump to Py3.9

* Update requirements for March 2023, Py3.9

* Pin packaging, since LegacyVersion was removed.

Also remove scipy pin, since later autosklearn needs higher scipy.

* Install packages to ranger/lib

* Set secret PAT used when installing with R remotes

Specifically for mlr3automl integration

* Update usage for oct 21 release

* Disable custom installed packages

* Remove installation of requirements altogether

* Insert oboe example

* Add monkeypatch

* Make error matrix numpy array

* Upgrade to Ubuntu 22.04 from 18.04

* Update pip cache to look at 3.9 directory

* Add Github PAT to run_all_frameworks script

* bump github action versions

* Adding tarfile member sanitization to extractall() (#508)

* Included lightautoml in frameworks_stable (#412)

* Included lightautoml in frameworks_stable

* Added MLNet to frameworks_latest

* Added mlr3 to both stable and latest

* copy/paste fix

* Remove travis file (#529)

* Remove travis file since it is not used

* Update readme to reflect Python 3.9 support

* Add github action workflow to replace old travis file

* Add job id, improve name

* Fix bug where task inference would lead to KeyError

* Update type data for new openml/pandas

Probably ought to remove the specific check if we don't enforce it.

* Write numeric categories as str, see renatopp/liac-arff/issues/126

* [Open for review] Store results after each job completion (#526)

* ensure that results are saved progressively in all situations instead of only when all jobs are completed

* rename config flag

* don't forget to cleanup job runner exec thread

* Improve type hints

* Adding file lock on global results file (#453)

* adding file lock on global results file

* fix imports

* fix amlb.utils export

* cosmetic

* cleanup util imports (also magic strings) + remove ruamel dependency in subprocesses

---------

Co-authored-by: Sebastien Poirier <sebastien@h2o.ai>

* Update the requirements files to exclude yaml and include filelock

The remainder of dependencies are not re-generated to avoid
additional changes in the PR.

* Add missing import

* Add fallback for when job is not started

* Return an empty dataframe if dataframe is empty

This avoids a bug where an empty dataframe is indexed.

* Inform the user result summary is not available in AWS mode

As results are processed in a different manner (files are directly
copied over from S3). This avoids a bug where a benchmark
results.csv file tries to be accessed.

* Separate scoreboard generation to two lines instead

Which makes it easier to tell which part of the generation generates
an error, if any.

* re-enable logging

* Provide a warning and return early if no process output is detected

This avoids potentially crashing if the logging is configured incorrectly.
In the future, we should expand this to first check how logging is
configured in order to see whether or not the issue should be reported
and possibly give a more detailed warning if it is likely the cause
of an error.

---------

Co-authored-by: Sebastien Poirier <sebastien@h2o.ai>
Co-authored-by: seb <sebastien.poirier@h2o.ai>

* maint: upgrade AMI to Ubuntu 22.04 #512 (#525)

* Add `flaml_benchmark` (#528)

* dont discard setup_args if it already is a list

* Add flaml and flaml_benchmark

It is not added to latest since install from latest seems to be broken

* Set up alternative way for benchmark mode of flaml

This is only temporarily allowed - we expect an easily configurable
algorithm, instead of having to carefully install specific
dependencies.

* limit install, since >2 incompatible

* Measure inference time (#532)

Add the option to measure inference time (disabled by default) for most frameworks.
For those frameworks, inference time is measured capturing both the data loading and the inference.
This is done to make things more equal between the different frameworks (as some _need_ to read the file if they don't operate in Python). Inference time is measured multiple times for different batch sizes (configurable). By default, the median is reported in the results file (as it is less sensitive to e.g., cold-starts) but all measured inference times are stored in the predictions folder of a run.
For Python frameworks, inference time for in-memory single row predictions is also measured.

* Upload to OpenML (#523)

Adds a script that allows uploading run results to openml.
Additional metadata is stored in the task information to be able to provide a complete description for openml upload.
Additional parameters are added to `run_benchmark` to allow runs to automatically be tagged, and to connect to the test server.
Also fixes TPOT integration for newer versions, where if a model has no `predict_proba` an `AttributeError` is raised instead of a `RuntimeError`.

* Fix a race condition of checking vs adding results (#535)

Specifically, adding results was queued in a job executor, while
checking results was directly called by the worker threads.
If the worker thread checks before the executor had added results,
it is possible to get into a deadlock condition. The deadlock
arises from the fact that the `stop` condition is never called
and the main thread will continue to wait for its END_Q signal.

* Add scikit_safe inference time measurement files (#537)

* Add scikit_safe inference time measurement files

These files have categorical values numerically encoded and missing
values imputed, which makes them usable for any scikit-learn algo.

* Only generate inference measurement files if enabled

* Optionally limit inference time measurements by dataset size (#538)

* Add versions 2023 q2 (#539)

* Fix versions for June 2023 benchmark

* Add 2023Q2 framework tag

* Use encoded values for inference

* Add us-east-2 AMI

* Run docker as root on AWS

* Add option to add build options for docker build command

* Remove 'infer_speed' artifact as it is not supported in main repo

* Fix pandas 2 not compatible with autosklearn 2 see askl#1672

---------

Co-authored-by: github-actions <github-actions@github.com>
Co-authored-by: Matthias Feurer <feurerm@informatik.uni-freiburg.de>
Co-authored-by: seb. <sebastien@h2o.ai>
Co-authored-by: Weisu Yin <weisy@amazon.com>
Co-authored-by: Weisu Yin <weisuyin96@gmail.com>
Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com>
Co-authored-by: Qingyun Wu <qw2ky@virginia.edu>
Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
Co-authored-by: Robinnibor <robinksskss@gmail.com>
Co-authored-by: Erin LeDell <erin@h2o.ai>
Co-authored-by: Stefan Coors <stefan.coors@gmx.net>
Co-authored-by: Alan Silva <3899850+alanwilter@users.noreply.github.com>
Co-authored-by: Nick Erickson <neerick@amazon.com>
Co-authored-by: Leo <LeonhardSommer96@gmail.com>
Co-authored-by: TrellixVulnTeam <112716341+TrellixVulnTeam@users.noreply.github.com>
Co-authored-by: seb <sebastien.poirier@h2o.ai>