AutoMLRegressor does not support task binary #1603

Open
dadangsetio opened this issue Nov 6, 2022 · 7 comments

@dadangsetio

dadangsetio commented Nov 6, 2022

I can't fit a model with AutoSklearnRegressor:

from autosklearn.regression import AutoSklearnRegressor
reg = AutoSklearnRegressor(time_left_for_this_task=5*60, per_run_time_limit=30, n_jobs=8)
reg.fit(X=X_train, y=y_train)

This is my log:

ValueError                                Traceback (most recent call last)
Input In [25], in <cell line: 2>()
      1 reg = AutoSklearnRegressor(time_left_for_this_task=5*60, per_run_time_limit=30, n_jobs=8)
----> 2 reg.fit(X=X_train, y=y_train)

File ~/miniforge3/lib/python3.10/site-packages/autosklearn/estimators.py:1587, in AutoSklearnRegressor.fit(self, X, y, X_test, y_test, feat_type, dataset_name)
   1576     raise ValueError(
   1577         "Regression with data of type {} is "
   1578         "not supported. Supported types are {}. "
   (...)
   1582         "".format(target_type, supported_types)
   1583     )
   1585 # Fit is supposed to be idempotent!
   1586 # But not if we use share_mode.
-> 1587 super().fit(
   1588     X=X,
   1589     y=y,
   1590     X_test=X_test,
   1591     y_test=y_test,
   1592     feat_type=feat_type,
   1593     dataset_name=dataset_name,
   1594 )
   1596 return self

File ~/miniforge3/lib/python3.10/site-packages/autosklearn/estimators.py:540, in AutoSklearnEstimator.fit(self, **kwargs)
    538 if self.automl_ is None:
    539     self.automl_ = self.build_automl()
--> 540 self.automl_.fit(load_models=self.load_models, **kwargs)
    542 return self

File ~/miniforge3/lib/python3.10/site-packages/autosklearn/automl.py:2394, in AutoMLRegressor.fit(self, X, y, X_test, y_test, feat_type, dataset_name, only_return_configuration_space, load_models)
   2383 def fit(
   2384     self,
   2385     X: SUPPORTED_FEAT_TYPES,
   (...)
   2392     load_models: bool = True,
   2393 ) -> AutoMLRegressor:
-> 2394     return super().fit(
   2395         X,
   2396         y,
   2397         X_test=X_test,
   2398         y_test=y_test,
   2399         feat_type=feat_type,
   2400         dataset_name=dataset_name,
   2401         only_return_configuration_space=only_return_configuration_space,
   2402         load_models=load_models,
   2403         is_classification=False,
   2404     )

File ~/miniforge3/lib/python3.10/site-packages/autosklearn/automl.py:611, in AutoML.fit(self, X, y, task, X_test, y_test, feat_type, dataset_name, only_return_configuration_space, load_models, is_classification)
    609     y_task = type_of_target(y)
    610     if not self._supports_task_type(y_task):
--> 611         raise ValueError(
    612             f"{self.__class__.__name__} does not support" f" task {y_task}"
    613         )
    614     self._task = self._task_type_id(y_task)
    615 else:

ValueError: AutoMLRegressor does not support task binary

System Details (if relevant)

  • 0.15.0
  • MacBook Air M1

@eddiebergman
Contributor

Hi @dadangsetio, we use sklearn.utils.multiclass.type_of_target to identify the task type from the y you pass in. My guess is that your y looks something like [0, 1, 0, 1, 1, ...], which gets identified as a binary classification problem. Is this your intended behaviour? If so, I'm not sure we have any way to override this check, but I can look into it.
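
For illustration, here is a minimal sketch of how type_of_target labels different targets. The sample lists are made up for this example and are not taken from the issue; only the function itself comes from scikit-learn.

```python
from sklearn.utils.multiclass import type_of_target

# Hypothetical targets, chosen only to show how the inferred type changes.
binary_y = [0, 1, 0, 1, 1]
float_binary_y = [0.0, 1.0, 0.0, 1.0]  # integer-valued floats, two distinct values
continuous_y = [0.2, 1.7, 0.9, 3.4]

binary_type = type_of_target(binary_y)
float_binary_type = type_of_target(float_binary_y)
continuous_type = type_of_target(continuous_y)

print(binary_type)        # binary
print(float_binary_type)  # binary
print(continuous_type)    # continuous
```

A 0/1 target is reported as binary even when it is stored as floats, and "binary" is the task name that appears in the error message.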

@dadangsetio
Author

dadangsetio commented Nov 7, 2022

Thank you for the response, @eddiebergman. You are right that the content of y is binary, so how can I solve this?

@eddiebergman
Contributor

You may prefer to use a Classifier instead of a Regressor and take the probability scores from predict_proba.
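
A rough sketch of that idea, under some assumptions: LogisticRegression is only a lightweight stand-in so the example runs quickly (AutoSklearnClassifier exposes the same predict_proba method), and the synthetic data replaces the real X_train and y_train.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data, standing in for the real X_train / y_train.
X, y = make_classification(n_samples=200, random_state=0)

# Stand-in classifier; any estimator with predict_proba is used the same way.
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Column 1 holds the probability of the positive class (label 1),
# which gives a continuous score between 0 and 1 for each sample.
scores = clf.predict_proba(X)[:, 1]
print(scores[:5])
```

The scores can then be used wherever a regression-style output on a 0/1 target was wanted.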

If you really need to skip the type_of_target check, then you'll need to use the AutoML class instead of AutoSklearnRegressor, which is just a fancy wrapper that makes some things simpler. Depending on your use case, this should be okay.

Here's a sample snippet:

from sklearn.datasets import make_classification

from autosklearn.automl import AutoML
from autosklearn.constants import REGRESSION

# Binary labels, which type_of_target would identify as "binary"
X, y = make_classification()
print(y)   # [0, 0, 1, ...]

automl = AutoML(
    time_left_for_this_task=30,
    per_run_time_limit=5,
    ...,
)

# Passing task explicitly skips the type_of_target check
automl.fit(X, y, task=REGRESSION, ...)

Here's the __init__(...) and the fit(...) calls from AutoML for you.

Best,
Eddie

@dadangsetio
Author

I am using the sample AutoML snippet, but I'm getting an error like this:

[ERROR] [2022-11-07 19:18:21,120:Client-AutoML(1):441115fc-5e96-11ed-acf3-363077345c9d] (' Dummy prediction failed with run state StatusType.CRASHED and additional output: {\'error\': \'Result queue is empty\', \'exit_status\': "<class \'pynisher.limit_function_call.AnythingException\'>", \'subprocess_stdout\': \'\', \'subprocess_stderr\': \'Process pynisher function call:\\nTraceback (most recent call last):\\n  File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap\\n    self.run()\\n  File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 108, in run\\n    self._target(*self._args, **self._kwargs)\\n  File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/pynisher/limit_function_call.py", line 108, in subprocess_func\\n    resource.setrlimit(resource.RLIMIT_AS, (mem_in_b, mem_in_b))\\nValueError: current limit exceeds maximum limit\\n\', \'exitcode\': 1, \'configuration_origin\': \'DUMMY\'}.',)
Traceback (most recent call last):
  File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/autosklearn/automl.py", line 765, in fit
    self._do_dummy_prediction()
  File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/autosklearn/automl.py", line 489, in _do_dummy_prediction
    raise ValueError(msg)
ValueError: (' Dummy prediction failed with run state StatusType.CRASHED and additional output: {\'error\': \'Result queue is empty\', \'exit_status\': "<class \'pynisher.limit_function_call.AnythingException\'>", \'subprocess_stdout\': \'\', \'subprocess_stderr\': \'Process pynisher function call:\\nTraceback (most recent call last):\\n  File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap\\n    self.run()\\n  File "/Users/dadangbudi/miniforge3/lib/python3.10/multiprocessing/process.py", line 108, in run\\n    self._target(*self._args, **self._kwargs)\\n  File "/Users/dadangbudi/miniforge3/lib/python3.10/site-packages/pynisher/limit_function_call.py", line 108, in subprocess_func\\n    resource.setrlimit(resource.RLIMIT_AS, (mem_in_b, mem_in_b))\\nValueError: current limit exceeds maximum limit\\n\', \'exitcode\': 1, \'configuration_origin\': \'DUMMY\'}.',)

@eddiebergman
Contributor

eddiebergman commented Nov 7, 2022

You should construct the estimator with the same parameters you used in your original code. My guess is that you had set memory_limit=None there.

The issue is that there is no way to limit the memory of processes on Mac as far as I know.
See https://github.com/automl/pynisher#features

The version of pynisher linked above is actually newer than the one we use, and we need to update to it.
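
To check whether a given machine accepts this kind of limit, a small probe along the following lines may help. It is only a sketch: it repeats the resource.setrlimit(resource.RLIMIT_AS, ...) call shown in the traceback, but in a throwaway child process so the current session is not affected. The helper name probe_memory_limit and the 3072 MB value are assumptions made for this example and are not part of auto-sklearn or pynisher.

```python
import subprocess
import sys

# Child script: try to cap the address space with the same call the
# traceback shows, resource.setrlimit(resource.RLIMIT_AS, (limit, limit)).
PROBE = """
import resource, sys
limit = int(sys.argv[1])
try:
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))
except (ValueError, OSError) as exc:
    print(f"unsupported: {exc}")
else:
    print("supported")
"""


def probe_memory_limit(limit_mb: int = 3072) -> str:
    """Return "supported" or "unsupported: <reason>" for this platform."""
    limit_bytes = limit_mb * 1024 * 1024
    result = subprocess.run(
        [sys.executable, "-c", PROBE, str(limit_bytes)],
        capture_output=True,
        text=True,
        check=False,
    )
    # Empty stdout means the child failed before printing, for example
    # because the resource module is not available on this platform.
    return result.stdout.strip() or f"unsupported: {result.stderr.strip()}"


print(probe_memory_limit())
```

If the probe reports "unsupported", the dummy prediction is likely to crash in the same way whenever a memory limit is requested.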

@ViktorooReps

from autosklearn.experimental.askl2 import AutoSklearn2Classifier

classifier = AutoSklearn2Classifier(
    time_left_for_this_task=15 * 60,
    per_run_time_limit=30,
    memory_limit=None,
    n_jobs=1,
    max_models_on_disc=10,
    ensemble_size=10
).fit(preprocessor.transform(train_x), train_y, preprocessor.transform(valid_x), valid_y)

There is an internal check that prohibits running without a memory limit:

[ERROR] [2024-07-18 15:19:23,002:Client-AutoML(1):5923f702-4508-11ef-82ea-42442fa1d044] '>' not supported between instances of 'NoneType' and 'int'
Traceback (most recent call last):
  File "/Users/Viktor/PycharmProjects/laion-copyright/.venv39/lib/python3.9/site-packages/autosklearn/automl.py", line 680, in fit
    X, y = reduce_dataset_size_if_too_large(
  File "/Users/Viktor/PycharmProjects/laion-copyright/.venv39/lib/python3.9/site-packages/autosklearn/util/data.py", line 430, in reduce_dataset_size_if_too_large
    assert memory_limit > 0
TypeError: '>' not supported between instances of 'NoneType' and 'int'

It's such a shame we cannot use auto-sklearn on Apple Silicon. Hopefully one day you'll find a workaround!

@dadangsetio
Author

Yes, that's true, I used to feel the same way, @ViktorooReps.
