Serializable Fonduer model #259
Comments
This is a great idea! Really excited to see/chat about how we can approach it to make Fonduer much easier to use.
I did some research on how to serialize a dynamically created class (e.g., the mention/candidate subclasses in Fonduer) and found that cloudpickle or dill can serialize such a class.
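To illustrate the point about dynamically created classes, here is a minimal sketch. It assumes a hypothetical factory, `make_mention_class`, that builds a class at runtime with `type()`, loosely mimicking how Fonduer creates mention subclasses; it is not Fonduer's actual code. The standard `pickle` module stores classes by reference (module plus name), so it fails when that name cannot be looked up, whereas cloudpickle can store the class definition by value. cloudpickle is a third-party package, so the sketch treats it as optional.

```python
import pickle

try:
    import cloudpickle  # third-party; assumed installed via `pip install cloudpickle`
except ImportError:
    cloudpickle = None


def make_mention_class(name):
    """Hypothetical factory: build a class at runtime, as a stand-in for
    dynamically created mention/candidate subclasses."""

    def __init__(self, text):
        self.text = text

    return type(name, (object,), {"__init__": __init__})


# The class is named "Part" but bound to a different variable, so a
# by-reference lookup of "Part" in this module cannot find it.
DynamicPart = make_mention_class("Part")


def pickle_by_reference_fails(cls):
    """Return True if the standard pickle module cannot serialize `cls`."""
    try:
        pickle.dumps(cls)
    except (pickle.PicklingError, AttributeError, ImportError):
        return True
    return False


def roundtrip_with_cloudpickle(cls):
    """Serialize `cls` by value with cloudpickle and restore it.
    Returns None when cloudpickle is not available."""
    if cloudpickle is None:
        return None
    return pickle.loads(cloudpickle.dumps(cls))
```

Calling `roundtrip_with_cloudpickle(DynamicPart)` should return a working copy of the class, while `pickle_by_reference_fails(DynamicPart)` shows why plain `pickle` is not enough here. dill offers a similar `dumps`/`loads` pair and could presumably be swapped in the same way.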
Hello,
I've created a custom MLflow model to package a (trained) Fonduer model. Currently this custom MLflow model includes some hard-coded parts and hence needs cleanup, but I'd love to contribute it to the community if it is useful for other people.
@HiromuHota Awesome! We'd definitely love to have it since more and more people want to use it! Happy to chat and contribute as well! We should have a tutorial for that as well. 👍
@HiromuHota great! I am happy to contribute to this milestone for Fonduer.
@senwu @trungtv I've created a new repository (https://github.com/HiromuHota/fonduer-mlflow) for this custom MLflow model for Fonduer.
I think fonduer-mlflow is now in good shape and ready to be submitted as a PR against fonduer.
Users would like to be able to develop a Fonduer-based application locally on their workstation, and then package the whole pipeline (parsing, extraction, featurization, and classification) to be deployed somewhere remote to serve. However, Fonduer-based applications are not easily packaged. This commit changes that.
Specifically, this creates a serializable Fonduer model that is capable of executing any phase of the Fonduer pipeline. It is manageable by MLflow [1], a platform for managing machine learning pipelines, using the MLflow Model as a storage format.
Further documentation has been added in docs/user/packaging.rst, and usage is also shown in the additional tests.
[1]: https://mlflow.org/
Closes #259.
Is your feature request related to a problem? Please describe.
I develop a Fonduer-based app locally on my laptop.
Once it's done, I'd like to package the whole Fonduer pipeline (parsing, extraction, featurization, and classification) and deploy it to a remote place to serve.
However, a Fonduer-based app is not easy to package and hence not easy to deploy.
Describe the solution you'd like
Add a Fonduer model class that is serializable (i.e., has `save` and `load` member methods).
Describe alternatives you've considered
I can create one or more Python scripts that run all the phases, package them, and deploy them.
This is cumbersome because the Python scripts have to include many things (matchers, mention_classes, mention_spaces, candidate_classes, etc.) and it is not obvious what should be included for serving.
Additional context
I'd like to make Fonduer more deployable and servable.
I've been testing MLflow to package a Fonduer-based app and found it was difficult to do so when there is no serializable Fonduer model.
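As a rough illustration of the kind of class requested above, here is a minimal sketch of a model object with `save` and `load` member methods that bundles the pieces needed for serving. Everything in it is an assumption made for illustration: the class name `PipelineModel`, its fields, and the JSON file format are hypothetical and are not Fonduer's API. Real matchers and mention/candidate classes are Python objects rather than plain strings, so an actual implementation would need a richer serializer than JSON.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class PipelineModel:
    """Hypothetical bundle of the settings a served pipeline would need."""

    # Placeholder fields named after the pieces listed in the issue.
    matcher_patterns: Dict[str, str] = field(default_factory=dict)
    mention_class_names: List[str] = field(default_factory=list)
    candidate_class_names: List[str] = field(default_factory=list)

    def save(self, path: str) -> None:
        """Write the model state to `path` as JSON."""
        with open(path, "w", encoding="utf-8") as f:
            json.dump(asdict(self), f)

    @classmethod
    def load(cls, path: str) -> "PipelineModel":
        """Rebuild a model from a file written by `save`."""
        with open(path, "r", encoding="utf-8") as f:
            return cls(**json.load(f))


def roundtrip(model: PipelineModel) -> PipelineModel:
    """Save `model` to a temporary file and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.json")
        model.save(path)
        return PipelineModel.load(path)
```

With a single object like this, the deployment side only needs to call `PipelineModel.load(path)` instead of re-declaring each matcher and class in a separate script.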