Question Paper #583
-
When running experiments, sometimes AutoML frameworks experience failures. You may encounter something like this in your results file (simplified for readability):
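For illustration, a failed run might appear as a row roughly like the one below; the column names and error message here are assumptions sketched from the shape of the benchmark's results file, not the actual contents:

```
framework    task  fold  metric  result  info
NaiveAutoML  kick  1     auc     NaN     Framework failed to produce predictions
```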
Here, NaiveAutoML failed for whatever reason to create predictions, so no result is available. However, when we compare results across tasks and folds, such as when creating the critical difference diagrams, we need some performance measure for every (framework, task, fold). Concretely, when creating critical difference plots we first rank each framework by its mean score on each task, so we have to decide how to calculate a mean score for NaiveAutoML on kick.

To get scores for our constant predictor, we use scikit-learn's DummyClassifier and DummyRegressor. It simply predicts the mean response of the training data (regression) or its empirical class probabilities (classification). We run that on each (task, fold), and can then look up the score of the constant predictor on fold 1 of kick.
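As a minimal sketch of how such a baseline score can be computed for a single fold (the data below is synthetic; in the benchmark the train/test split comes from the OpenML task):

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for one (task, fold) split of 'kick'.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(100, 5)), rng.integers(0, 2, size=100)
X_test, y_test = rng.normal(size=(50, 5)), rng.integers(0, 2, size=50)

# strategy="prior" predicts the empirical class probabilities of the
# training data, i.e. the same probability vector for every row.
clf = DummyClassifier(strategy="prior").fit(X_train, y_train)

# Identical scores for all rows carry no ranking information, so AUC is 0.5.
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(auc)  # 0.5
```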
The constant predictor has an AUC of 0.5 for fold 1 of kick (as expected), so we impute the result of the NaiveAutoML experiment on fold 1 of kick with that score (0.5).
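Putting the two together, the imputation itself amounts to filling each missing score with the constant predictor's score on the same (task, fold). A rough sketch, with column names assumed for illustration:

```python
import pandas as pd

# 'results' holds framework scores per (task, fold), with NaN where a
# framework failed; 'baseline' holds the constant predictor's score for
# every (task, fold).
results = pd.DataFrame({
    "framework": ["NaiveAutoML", "AutoGluon"],
    "task": ["kick", "kick"],
    "fold": [1, 1],
    "result": [float("nan"), 0.78],
})
baseline = pd.DataFrame({"task": ["kick"], "fold": [1], "result": [0.5]})

# Fill each missing score with the constant predictor's score on the same
# (task, fold); per-task mean scores can then be ranked as usual.
merged = results.merge(baseline, on=["task", "fold"], suffixes=("", "_constant"))
merged["result"] = merged["result"].fillna(merged["result_constant"])
```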
-
Thanks! So one more question: if even the constant predictor keeps producing errors, did you impute the value for that task as 0?
-
Hello, I didn't quite understand this excerpt from the article:

"Instead, we impute missing values with the constant predictor, or prior. This baseline returns the empirical class distribution for classification and the empirical mean for regression. This is a very penalizing imputation strategy, as the constant predictor is often much worse than results obtained by the AutoML frameworks that produce predictions for the task or fold. However, we feel this penalty for ill-behaved systems is appropriate and fairer towards the well-behaved frameworks and hope that it encourages a standard of robust, well-behaved AutoML frameworks."

Could you explain in more detail how this imputation was performed? Thank you in advance. I'm analyzing my experiments, and some failures resulted in missing values, so I'd like to understand better how you handled them.