
RFC Should we document how people can deploy their models? #277

Open
adrinjalali opened this issue Jan 23, 2023 · 5 comments

Comments

@adrinjalali
Member

I started writing a document explaining how people can deploy their models, and soon I was overwhelmed by the number of options out there.

One could simply implement a REST endpoint, like what is done in the https://github.com/huggingface/api-inference-community repo, but that approach has its issues: it requires feature names, for instance, and it transfers data as JSON, which is not the fastest option for numerical data.
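For illustration, here is a minimal sketch of such an endpoint, assuming FastAPI and a model persisted with skops (the file name and request schema are made up, not part of any existing example):

```python
# A hypothetical sketch of a JSON-over-REST endpoint for a scikit-learn model
# persisted with skops. Not an official example; names are made up.
from fastapi import FastAPI
from pydantic import BaseModel
from skops.io import load

app = FastAPI()
# pass trusted=... if the file contains types skops does not trust by default
model = load("model.skops")


class PredictRequest(BaseModel):
    # every request spells out the values as JSON, which is verbose and slow
    # for large numerical payloads; pipelines that expect a DataFrame would
    # additionally need the feature names sent along
    data: list[list[float]]


@app.post("/predict")
def predict(request: PredictRequest):
    return {"predictions": model.predict(request.data).tolist()}
```

It works (run it with `uvicorn app:app` and POST to `/predict`), but every payload is plain JSON, which is where the performance concern comes from.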

Another option would be a simple gRPC service using protobuf, but that requires defining the structure of the data up front. If the developer is okay with that, they can follow one of the many guides out there on the internet, but the approach isn't general and would look different for each use case, since it involves manual work to define the data and the model.
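To make that manual work concrete, this is roughly the kind of schema one would have to hand-write and compile with grpcio-tools for each model; the message, service, and field names below are made up for a hypothetical 4-feature tabular model:

```python
# Hypothetical illustration of the per-model schema work gRPC entails.
# The .proto below has to be written (and kept in sync with the model's
# features) for every deployment, then compiled into Python stubs, e.g.:
#   python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. predict.proto
PREDICT_PROTO = """
syntax = "proto3";

package sklearn_serving;

// One message per model/feature set -- this is the part that is not general.
message PredictRequest {
  repeated float sepal_length = 1;
  repeated float sepal_width = 2;
  repeated float petal_length = 3;
  repeated float petal_width = 4;
}

message PredictResponse {
  repeated int64 predictions = 1;
}

service Predictor {
  rpc Predict(PredictRequest) returns (PredictResponse);
}
"""
```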

There are also plenty of companies with their own solutions for deploying sklearn-based models, and a number of them are open source. Should we have a list of them?

Right now, a user/developer who is new to deploying models would be quite overwhelmed by the ecosystem when trying to pick a solution, and I think one contribution of this library could be to ease that burden, but I'm not sure how.

WDYT @skops-dev/maintainers
Also cc'ing some folks who might have better ideas: @ogrisel @thomasjpfan @mrocklin @jnothman @betatim

@BenjaminBossan
Collaborator

That sounds like a big expansion of scope for skops, since, as you mentioned, there are so many options out there. I have used only a handful, so making a fair comparison sounds almost impossible. There are also a bunch of articles out there that do some of the work you mentioned (one example that comes to mind).

I have a slightly different proposal: maybe we can contact the providers/maintainers/companies behind those solutions and have them write those docs? Maybe not even within skops, but in a separate repo that they can open PRs against to add/update the docs for their solution. We could start with a typical sklearn deployment (using, say, flask/fastAPI/Starlette on EC2/Lambda) as the base and then accept contributions that show how to achieve something similar with solution X. We could also provide a sort of template to be filled in, e.g. how does X handle infra, logging, versioning, etc.

@merveenoyan
Collaborator

merveenoyan commented Jan 31, 2023

@BenjaminBossan this is being done with transformers already. I think it would be good, but I'm also in favor of a standalone doc for a REST or gRPC endpoint. (From what I see, I really don't think people use gRPC that often, though.) @adrinjalali

@BenjaminBossan
Collaborator

> this is being done with transformers already

Can you give a link?

> but I'm also in favor of a standalone doc for REST or gRPC endpoint

Yeah, we could provide basic examples for REST and gRPC (of course using skops) but without tying them to any specific framework. We could also add an example using the HF inference API. But it would be a looot of work to give examples for all the existing model serving solutions out there, like SageMaker.

@tuscland
Contributor

tuscland commented Dec 5, 2023

Hi,

gRPC requires a schema definition and the subsequent generation of client stubs, so ongoing maintenance is required just to serialize data between the two systems. That effort is typically accepted for internal communication because it brings performance and reliability. However, you don't often see gRPC in public-facing APIs because of the lack of flexibility imposed by its type checking.

What is the volume of numerical data? Do you need to stream the data? Can gzip compression over JSON representation help?
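As a rough, hypothetical check of that last question, one could measure it directly; the numbers below depend entirely on the data, this is just to get a feel:

```python
# Rough sketch: how much does gzip shrink a JSON payload of numerical data?
import gzip
import json
import random

# a made-up batch: 10,000 rows of 20 float features
rows = [[random.random() for _ in range(20)] for _ in range(10_000)]
payload = json.dumps({"data": rows}).encode("utf-8")
compressed = gzip.compress(payload)

print(f"raw JSON: {len(payload) / 1e6:.1f} MB")
print(f"gzipped:  {len(compressed) / 1e6:.1f} MB")
```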

My 2 cents,
Cam

@adrinjalali
Member Author

Yes, in my investigation of gRPC I came to the same conclusion. There are efforts out there to generalize it for arbitrary data, but it takes a bit of work to make it more usable.
