Only dummy predictions in custom metric #1639

Open
konstantin-doncov opened this issue Jan 6, 2023 · 1 comment

Comments

@konstantin-doncov

konstantin-doncov commented Jan 6, 2023

I want to use my own metric, but I have run into a lot of trouble implementing it. Many of the problems are related to each other, so I hope I can solve all of them.
For example, if I use this code with a 5-minute max runtime (time_left_for_this_task=5*60):

import autosklearn as askl
import autosklearn.classification  # makes askl.classification available
import autosklearn.metrics         # makes askl.metrics available

# Dummy metric that just prints what it receives, to inspect the predictions.
def metric_which_needs_x(solution, prediction, X_data):
    print(prediction)
    print(len(X_data))
    return 1

accuracy_scorer = askl.metrics.make_scorer(
    name="accu_X",
    score_func=metric_which_needs_x,
    optimum=1,
    greater_is_better=True,
    needs_proba=True,
    needs_X=True,
    needs_threshold=False,
)

# logo, groups, x and y are defined earlier (omitted here).
automl = askl.classification.AutoSklearnClassifier(
    ensemble_size=1,
    time_left_for_this_task=5*60,
    per_run_time_limit=5*60,
    metric=accuracy_scorer,
    resampling_strategy=logo,
    resampling_strategy_arguments={"groups": groups},
)
automl.fit(x, y)

Then everything is fine and my metric function eventually gets real predictions (not 0.5 0.5):

:3: DeprecationWarning: ensemble_size has been deprecated, please use ensemble_kwargs = {'ensemble_size': 1}. Inserting ensemble_size into ensemble_kwargs for now. ensemble_size will be removed in auto-sklearn 0.16.
automl = askl.classification.AutoSklearnClassifier(
[WARNING] [2023-01-06 15:18:13,967:Client-AutoML(1):52f808ae-8dd5-11ed-840e-0242ac1c000c] Time limit for a single run is higher than total time limit. Capping the limit for a single run to the total time given to SMAC (294.777947)
[WARNING] [2023-01-06 15:18:13,967:Client-AutoML(1):52f808ae-8dd5-11ed-840e-0242ac1c000c] Capping the per_run_time_limit to 147.0 to have time for a least 2 models in each process.
[WARNING] [2023-01-06 15:18:14,003:Client-AutoMLSMBO(1)::52f808ae-8dd5-11ed-840e-0242ac1c000c] Could not find meta-data directory /usr/local/lib/python3.8/dist-packages/autosklearn/metalearning/files/accu_X_binary.classification_dense
[[0.5 0.5]
[0.5 0.5]
[0.5 0.5]
...
[0.5 0.5]
[0.5 0.5]
[0.5 0.5]]
227226
[WARNING] [2023-01-06 15:20:20,378:Client-EnsembleBuilder] No runs were available to build an ensemble from
[[0.5 0.5]
[0.5 0.5]
[0.5 0.5]
...
[0.5 0.5]
[0.5 0.5]
[0.5 0.5]]
227226
[[0.5 0.5]
[0.5 0.5]
[0.5 0.5]
...
[0.5 0.5]
[0.5 0.5]
[0.5 0.5]]
227226
[[0.2602794 0.7397206 ]
[0.2947102 0.7052898 ]
[0.26641977 0.73358023]
...
[0.8857727 0.11422727]
[0.83059156 0.16940844]
[0.8350615 0.16493851]]
227226
[WARNING] [2023-01-06 15:22:11,886:Client-EnsembleBuilder] No models better than random - using Dummy losses!
Models besides current dummy model: 0
Dummy models: 1
[WARNING] [2023-01-06 15:22:11,930:smac.runhistory.runhistory2epm.RunHistory2EPM4LogCost] Got cost of smaller/equal to 0. Replace by 0.000010 since we use log cost.
[[0.2602794 0.7397206 ]
[0.2947102 0.7052898 ]
[0.26641977 0.73358023]
...
[0.8857727 0.11422727]
[0.83059156 0.16940844]
[0.8350615 0.16493851]]
227226
[WARNING] [2023-01-06 15:22:53,545:Client-EnsembleBuilder] No models better than random - using Dummy losses!
Models besides current dummy model: 0
Dummy models: 1
[WARNING] [2023-01-06 15:22:53,608:smac.runhistory.runhistory2epm.RunHistory2EPM4LogCost] Got cost of smaller/equal to 0. Replace by 0.000010 since we use log cost.

But if I use a 4-minute max runtime, then I get only dummy predictions (only 0.5 0.5).

You may say "Well, then just use more time", but this is not a cure: when I use more complicated and time-consuming metrics (1-2 minutes for a single metric evaluation), even one hour is not enough (and I don't know how much time it would take).
So, how can I fix this?

@eddiebergman
Contributor

Sorry for the delay. I think the solution here is actually to remove it altogether: the logs say it only gets to try two models, and both are worse than the dummy model, so it seems like it needs to try more of them.
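For reference, you can check how many runs actually finished and how they scored with something like this (a sketch, assuming automl is the fitted AutoSklearnClassifier from your snippet):

print(automl.sprint_statistics())  # counts of successful / crashed / timed-out runs
print(automl.leaderboard())        # evaluated models with their validation cost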

It could be that the logo resampling strategy (which I guess is Leave One Group Out) is creating many subsets of the data, which means there is simply too much data to fit if the number of groups is high. Say, for example, you have 1_000_000 samples with 10 groups. My impression of logo is that you would need to fit 10 models, each on 900_000 samples, i.e. 9_000_000 data points in total, just to get one model evaluation. This gets amplified further as the number of groups increases. Have you tried simple holdout just to test this hypothesis?
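A minimal sketch of that check, keeping your scorer but swapping the resampling strategy for a plain holdout split (the train_size and the shorter per_run_time_limit here are just illustrative values):

# Same setup as above, but with a single holdout split instead of logo,
# so each configuration is evaluated with one fit instead of one fit per group.
automl_holdout = askl.classification.AutoSklearnClassifier(
    time_left_for_this_task=5*60,
    per_run_time_limit=60,
    metric=accuracy_scorer,
    resampling_strategy="holdout",
    resampling_strategy_arguments={"train_size": 0.67},
)
automl_holdout.fit(x, y)
print(automl_holdout.sprint_statistics())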
