Custom pipelines #379

Open
bmmalone opened this issue Oct 27, 2017 · 3 comments
Labels
enhancement A new improvement or feature

Comments

@bmmalone

Hi,

This is again more of a question than an issue. What is the best way to incorporate new pipelines into autosklearn?

For example, looking at SimpleClassificationPipeline, it seems the main things are to implement _get_hyperparameter_search_space and _get_pipeline (and the other simple methods). Using this and SimpleRegressionPipeline as examples, that part is fairly straightforward.

However, from there, I have difficulty figuring out what to do. In particular, AbstractEvaluator seems hard-coded to use the appropriate SimpleXXXPipeline depending on the task type (Lines ~115-130 of evaluation/abstract_evaluator.py).

Monkey patching the methods on the SimplePipelines seems to work, but that feels extremely brittle. Another option would be to use smac directly, but that sounds like it would necessitate recreating much of the infrastructure already present in autosklearn.

Please let me know if there is a good way to go about this. I can also provide a simple use case if that would be helpful. Thanks.

Have a good day,
Brandon

@mfeurer
Contributor

mfeurer commented Oct 27, 2017

Hi,

This is an excellent question to which there is no satisfactory answer. As you correctly realized, the pipelines do little more than define a search space and what a scikit-learn pipeline does. Ideally, a user would be able to pass in a custom pipeline subclassed from a pipeline base class. The best way to achieve this would probably be to refactor Auto-sklearn to accept a pipeline class, similar to how it accepts a metric class.
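To make the idea of passing in a pipeline class concrete, here is a minimal sketch of what such a refactoring could look like. None of these classes are Auto-sklearn's real API: `Evaluator`, `DEFAULT_PIPELINES`, the `pipeline_class` argument and the `get_hyperparameter_search_space` method are hypothetical stand-ins, and the search spaces are placeholder dictionaries. The only point is that the task-type default is used when nothing is injected.

```python
# Sketch only: every class here is a stand-in, not Auto-sklearn's real API.


class DefaultClassificationPipeline:
    """Stand-in for the pipeline chosen by default for classification."""

    def get_hyperparameter_search_space(self):
        return {"classifier": ["model_a", "model_b"]}


class DefaultRegressionPipeline:
    """Stand-in for the pipeline chosen by default for regression."""

    def get_hyperparameter_search_space(self):
        return {"regressor": ["model_c", "model_d"]}


# Task-type lookup, mirroring the hard-coded choice described in this thread.
DEFAULT_PIPELINES = {
    "classification": DefaultClassificationPipeline,
    "regression": DefaultRegressionPipeline,
}


class Evaluator:
    """Hypothetical evaluator that accepts an optional pipeline class."""

    def __init__(self, task_type, pipeline_class=None):
        # Fall back to the task-type default only when nothing is injected.
        if pipeline_class is None:
            pipeline_class = DEFAULT_PIPELINES[task_type]
        self.model_class = pipeline_class

    def search_space(self):
        # Instantiate whichever class was selected and ask it for its space.
        return self.model_class().get_hyperparameter_search_space()


class MyPipeline:
    """User-defined pipeline exposing the same interface as the defaults."""

    def get_hyperparameter_search_space(self):
        return {"classifier": ["my_custom_model"]}


default_eval = Evaluator("classification")
custom_eval = Evaluator("classification", pipeline_class=MyPipeline)
```

With this shape, a custom pipeline reaches the evaluator through an explicit argument, so no module attributes need to be rebound at runtime.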

Monkey patching only AbstractEvaluator won't help, as you also need to change at least autosklearn.util.pipeline. Two more limitations you should be aware of: Auto-sklearn won't accept FeatureUnion as part of the pipeline, and chaining too many different preprocessors one after another will make the pipeline generation process kind of slow.

I can help you through this process if you want to add this feature to Auto-sklearn.

Cheers,
Matthias

@bmmalone
Author

Hi Matthias,

Okay, thanks for letting me know. By monkey patching, I mean doing something like this:

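import autosklearn.pipeline.classification
import autosklearn.util.pipeline

# MyPipeline is a user-defined class with the same interface as
# SimpleClassificationPipeline. Rebind the name in both modules that hold it.
autosklearn.pipeline.classification.SimpleClassificationPipeline = MyPipeline
autosklearn.util.pipeline.SimpleClassificationPipeline = MyPipeline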

where MyPipeline implements the same interface as SimpleClassificationPipeline. Since AbstractEvaluator refers to the fully qualified class (i.e., self.model_class = autosklearn.pipeline.classification.SimpleClassificationPipeline), that covers all current uses. Obviously, this is quite fragile.
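The reason both names have to be rebound can be shown with a small stand-alone sketch. It builds two throwaway modules rather than importing Auto-sklearn: `pipeline_mod` and `util_mod` are hypothetical stand-ins for autosklearn.pipeline.classification and autosklearn.util.pipeline, and it assumes the latter pulls the class in with a `from ... import` statement, which is what the second assignment above suggests.

```python
import types

# Throwaway modules so the example runs without Auto-sklearn installed.
# Both names are hypothetical stand-ins for the real modules.
pipeline_mod = types.ModuleType("pipeline_mod")
util_mod = types.ModuleType("util_mod")


class SimpleClassificationPipeline:
    name = "default"


class MyPipeline:
    name = "custom"


pipeline_mod.SimpleClassificationPipeline = SimpleClassificationPipeline
# Equivalent to `from pipeline_mod import SimpleClassificationPipeline`
# executed inside util_mod: a second, independent reference to the class.
util_mod.SimpleClassificationPipeline = pipeline_mod.SimpleClassificationPipeline

# Patch only the defining module.
pipeline_mod.SimpleClassificationPipeline = MyPipeline

# Code that looks the class up through the module at call time sees the patch...
seen_via_module = pipeline_mod.SimpleClassificationPipeline.name
# ...but the name copied into util_mod still points at the original class.
seen_via_copy = util_mod.SimpleClassificationPipeline.name

# Rebinding the second name as well closes the gap.
util_mod.SimpleClassificationPipeline = MyPipeline
seen_after_second_patch = util_mod.SimpleClassificationPipeline.name
```

Any further module that imports the class by name would need the same treatment, which is one reason this approach breaks easily when the library's internals change.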

As you described, directly passing pipelines would be much more robust. I do have some interest in doing this, and I will contact you offline on Monday to discuss some more details about what we are doing and some options (e.g., maybe there is an appropriate BMBF call or something).

I don't quite have the monkey patching approach working yet, though I believe there are just some coding issues to work through. I will post back here with how that goes.

Have a good day,
Brandon

@github-actions
Contributor

github-actions bot commented May 5, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs for the next 7 days. Thank you for your contributions.
