Description
As far as I can tell, auto-sklearn currently has no support for reading from or writing to S3. Adding S3 support would open the door to running auto-sklearn pipelines in distributed mode, either in the cloud or on an on-prem cluster.
After a quick code walkthrough, I can see many places where auto-sklearn interacts with the filesystem directly through the shutil, os, and lockfile modules.
This means we need to tackle this issue in two steps.
- Create an abstraction layer for filesystem access and refactor the code to route all filesystem-related operations through it.
- Add S3 support by providing a concrete implementation of that abstraction for S3.
What are your thoughts?
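To make the two steps more concrete, here is a minimal sketch of what such an abstraction layer could look like. All names (`StorageBackend`, `LocalBackend`, `InMemoryBackend`, `save_and_reload`) are hypothetical and are not part of auto-sklearn. The in-memory class only stands in for an object store so the example runs without cloud credentials; a real S3 backend would implement the same interface on top of an S3 client library.

```python
import os
import shutil
import tempfile
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical interface that calling code would use instead of os/shutil."""

    @abstractmethod
    def read_bytes(self, path: str) -> bytes: ...

    @abstractmethod
    def write_bytes(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def exists(self, path: str) -> bool: ...

    @abstractmethod
    def remove(self, path: str) -> None: ...


class LocalBackend(StorageBackend):
    """Step 1: wrap the existing direct filesystem calls behind the interface."""

    def __init__(self, root: str) -> None:
        self.root = root

    def _full(self, path: str) -> str:
        return os.path.join(self.root, path)

    def read_bytes(self, path: str) -> bytes:
        with open(self._full(path), "rb") as fh:
            return fh.read()

    def write_bytes(self, path: str, data: bytes) -> None:
        full = self._full(path)
        # Create parent directories on demand, as object stores need none.
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "wb") as fh:
            fh.write(data)

    def exists(self, path: str) -> bool:
        return os.path.exists(self._full(path))

    def remove(self, path: str) -> None:
        full = self._full(path)
        if os.path.isdir(full):
            shutil.rmtree(full)
        else:
            os.remove(full)


class InMemoryBackend(StorageBackend):
    """Step 2 stand-in: a flat key -> bytes mapping, similar in shape to an
    object store. A real S3 backend is assumed to follow the same pattern."""

    def __init__(self) -> None:
        self._objects = {}

    def read_bytes(self, path: str) -> bytes:
        return self._objects[path]

    def write_bytes(self, path: str, data: bytes) -> None:
        self._objects[path] = data

    def exists(self, path: str) -> bool:
        prefix = path.rstrip("/") + "/"
        return path in self._objects or any(
            key.startswith(prefix) for key in self._objects
        )

    def remove(self, path: str) -> None:
        # Treat the path as either an exact key or a "directory" prefix.
        prefix = path.rstrip("/") + "/"
        for key in [k for k in self._objects if k == path or k.startswith(prefix)]:
            del self._objects[key]


def save_and_reload(backend: StorageBackend, path: str, payload: bytes) -> bytes:
    """Caller code depends only on the interface, not on the storage type."""
    backend.write_bytes(path, payload)
    return backend.read_bytes(path)


def demo() -> list:
    """Run the same caller against both backends and collect the results."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        results.append(save_and_reload(LocalBackend(tmp), "models/m.pkl", b"abc"))
    results.append(save_and_reload(InMemoryBackend(), "models/m.pkl", b"abc"))
    return results
```

The point of the sketch is that `save_and_reload` never touches `os` or `shutil` itself, so swapping the backend does not require changes in calling code. File locking (currently handled by lockfile) is left out here and would need its own abstraction.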