BentoML-0.7.2
Introducing 2 Major New Features
- Adaptive micro-batching mode in API server
- Web UI for model and deployment management
Adaptive Micro Batching
Adaptive micro-batching is a technique used in advanced serving systems, where incoming prediction requests are grouped into small batches for inference. With version 0.7.2, we've implemented a micro-batching mode for the API server, and all existing BentoServices can benefit from it by simply enabling it via the --enable-microbatch flag, or the BENTOML_ENABLE_MICROBATCH environment variable when running the API server docker image:
$ bentoml serve-gunicorn IrisClassifier:latest --enable-microbatch
$ docker run -p 5000:5000 -e BENTOML_ENABLE_MICROBATCH=True iris-classifier:latest
Currently, the micro-batch mode is only effective for DataframeHandler, JsonHandler, and TensorflowTensorHandler. We are working on support for ImageHandler, along with a few new handler types coming in the next release.
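As a rough illustration (the endpoint path and payload format depend on the service's API definition), a request to the IrisClassifier example's DataframeHandler endpoint might look like the following; with micro-batching enabled, concurrent requests of this kind are grouped into small batches before they reach the model:
$ curl -X POST "http://localhost:5000/predict" \
    -H "Content-Type: application/json" \
    --data '[[5.1, 3.5, 1.4, 0.2]]'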
Model Management Web UI
BentoML has a standalone component, YataiService, that handles model storage and deployment via gRPC calls. By default, BentoML launches a local YataiService instance when it is imported. This local YataiService instance saves BentoService files to the ~/bentoml/repository/ directory and other metadata to ~/bentoml/storage.db.
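For example, assuming a BentoService such as IrisClassifier has already been saved, the contents of this local repository can typically be browsed from the CLI (command names as of the 0.7.x releases):
$ bentoml list
$ bentoml get IrisClassifier:latest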
In release 0.7.x, we introduced a new CLI command for running YataiService as a standalone service that can be shared by multiple BentoML clients. This makes it easy to share, use, and discover models and serving deployments created by others on your team.
To play with the YataiService gRPC & Web server, run one of the following commands:
$ bentoml yatai-service-start
$ docker run -v ~/bentoml:/bentoml -p 3000:3000 -p 50051:50051 bentoml/yatai-service:0.7.2 --db-url=sqlite:///bentoml/storage.db --repo-base-url=/bentoml/repository
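Once the service is up, the model management web UI is served on the mapped port 3000 and the gRPC endpoint on port 50051 (per the port mappings above). Assuming the default local setup, a quick reachability check might look like:
$ # confirm the web UI is responding, then open http://localhost:3000 in a browser
$ curl -I http://localhost:3000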
For team settings, we recommend using a remote database instance and cloud storage such as S3 for the repository. For example:
$ docker run -p 3000:3000 -p 50051:50051 \
-e AWS_SECRET_ACCESS_KEY=... -e AWS_ACCESS_KEY_ID=... \
bentoml/yatai-service:0.7.2 \
--db-url postgresql://scott:tiger@localhost:5432/bentomldb \
--repo-base-url s3://my-bentoml-repo/
Documentation Updates
- Added a new section that works through all the main concepts and best practices for using BentoML; we recommend it as a must-read for new BentoML users
- BentoML Core Concepts: https://docs.bentoml.org/en/latest/concepts.html#core-concepts
Versions 0.7.0 and 0.7.1 are not recommended due to an issue with the Benchmark directory being included in their PyPI distributions. Other than that, they are identical to version 0.7.2.