S3 support for auto-sklearn to store and load models and configurations for each run #986

Open
@pkvprakash

Currently, I see no support in auto-sklearn for reading from or writing to S3. Supporting S3 would open the door to running auto-sklearn pipelines in distributed mode, either in the cloud or on an on-prem cluster.

After a quick code walkthrough, I can see many places where auto-sklearn interacts with the filesystem directly through the shutil, os, and lockfile modules.

This means we need to tackle this issue in two steps.

  1. Create an abstraction layer for filesystem access and refactor the code to use this layer for all filesystem-related activities.
  2. Add support for S3 by providing a concrete implementation of those abstractions for S3.
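
To make step 1 concrete, here is a minimal sketch of what such an abstraction layer could look like. Everything in it is hypothetical: `StorageBackend`, `LocalBackend`, and the method names are illustrative only and do not exist in auto-sklearn today. The local implementation uses only the standard library. An S3 implementation (step 2) would implement the same interface on top of an S3 client and is deliberately left out here.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from typing import List


class StorageBackend(ABC):
    """Hypothetical storage interface; not part of auto-sklearn."""

    @abstractmethod
    def write_bytes(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def read_bytes(self, key: str) -> bytes: ...

    @abstractmethod
    def exists(self, key: str) -> bool: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...

    @abstractmethod
    def list_keys(self, prefix: str = "") -> List[str]: ...


class LocalBackend(StorageBackend):
    """Stores each key as a file under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key: str) -> str:
        # Keys use "/" as separator regardless of the host OS.
        return os.path.join(self.root, *key.split("/"))

    def write_bytes(self, key: str, data: bytes) -> None:
        path = self._path(key)
        directory = os.path.dirname(path)
        os.makedirs(directory, exist_ok=True)
        # Write to a temporary file first, then rename it into place,
        # so readers never observe a partially written file.
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        os.replace(tmp_path, path)

    def read_bytes(self, key: str) -> bytes:
        with open(self._path(key), "rb") as fh:
            return fh.read()

    def exists(self, key: str) -> bool:
        return os.path.isfile(self._path(key))

    def delete(self, key: str) -> None:
        os.remove(self._path(key))

    def list_keys(self, prefix: str = "") -> List[str]:
        keys = []
        for dirpath, _, filenames in os.walk(self.root):
            for name in filenames:
                rel = os.path.relpath(os.path.join(dirpath, name), self.root)
                key = rel.replace(os.sep, "/")
                if key.startswith(prefix):
                    keys.append(key)
        return sorted(keys)
```

The interface is key-based rather than path-based on purpose: S3 has no real directories or file locks, so anything that currently depends on lockfile or directory semantics would need a separate answer during the refactor.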

What are your thoughts?
