Releases: bentoml/BentoML
BentoML-0.7.5
What's new:
- Added FastAI2 support, contributed by @HenryDashwood
Documentation updates:
- Added Kubeflow deployment guide
- Added Kubernetes deployment guide
- Added Knative deployment guide
BentoML-0.7.4
- Added support for Fasttext models, contributed by @GCHQResearcher83493
- Fixed Windows compatibility when packaging models, contributed by @codeslord
- Added a benchmark using a TensorFlow-based BERT model
- Fixed an issue with pip installing a BentoService saved bundle with the new release of pip (pip==20.1)
Documentation:
- AWS ECS deployment guide: https://docs.bentoml.org/en/latest/deployment/aws_ecs.html
- Heroku deployment guide: https://docs.bentoml.org/en/latest/deployment/heroku.html
- Knative deployment guide: https://docs.bentoml.org/en/latest/deployment/knative.html
BentoML-0.7.3
Improvements:
- Added --timeout option to the SageMaker deployment creation command
- Fixed an issue with the new grpcio PyPI release when deploying to AWS Lambda
Documentation:
- Revamped the Core Concepts walk-through documentation
- Added notes on using micro-batching and deploying YataiService
BentoML-0.7.2
Introducing 2 Major New Features
- Adaptive micro-batching mode in API server
- Web UI for model and deployment management
Adaptive Micro Batching
Adaptive micro-batching is a technique used in advanced serving systems, where incoming prediction requests are grouped into small batches for inference. With version 0.7.2, we've implemented a micro-batching mode for the API server, and every existing BentoService can benefit from it by simply enabling the --enable-microbatch flag, or setting the BENTOML_ENABLE_MICROBATCH environment variable when running the API server docker image:
$ bentoml serve-gunicorn IrisClassifier:latest --enable-microbatch
$ docker run -p 5000:5000 -e BENTOML_ENABLE_MICROBATCH=True iris-classifier:latest
Currently, the micro-batch mode is only effective for DataframeHandler, JsonHandler, and TensorflowTensorHandler. We are working on support for ImageHandler, along with a few new handler types coming in the next release.
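As a quick way to see the effect, here is a minimal client sketch (not part of the release itself; the predict endpoint name and port 5000 are taken from the IrisClassifier examples elsewhere in these notes) that fires concurrent requests at a locally running API server so the micro-batching layer can group them into batches:
from concurrent.futures import ThreadPoolExecutor
import requests

URL = "http://localhost:5000/predict"  # API name "predict", port as in the docker example above

def predict_one(row):
    # DataframeHandler accepts a JSON list of rows; each call here sends a single row
    return requests.post(URL, json=[row]).json()

rows = [[5.1, 3.5, 1.4, 0.2]] * 100
with ThreadPoolExecutor(max_workers=20) as pool:
    # Concurrent requests arriving together are what the batching layer groups server-side
    results = list(pool.map(predict_one, rows))
print(results[:3])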
Model Management Web UI
BentoML has a standalone component, YataiService, that handles model storage and deployment via gRPC calls. By default, BentoML launches a local YataiService instance when it is imported. This local instance saves BentoService files to the ~/bentoml/repository/ directory and other metadata to ~/bentoml/storage.db.
In release 0.7.x, we introduced a new CLI command for running YataiService as a standalone service that can be shared by multiple bentoml clients. This makes it easy to share, use and discover models and serving deployments created by others in your team.
To play with the YataiService gRPC & Web server, run one of the following commands:
$ bentoml yatai-service-start
$ docker run -v ~/bentoml:/bentoml -p 3000:3000 -p 50051:50051 bentoml/yatai-service:0.7.2 --db-url=sqlite:///bentoml/storage.db --repo-base-url=/bentoml/repository
For team settings, we recommend using a remote database instance and cloud storage such as S3. For example:
$ docker run -p 3000:3000 -p 50051:50051 \
-e AWS_SECRET_ACCESS_KEY=... -e AWS_ACCESS_KEY_ID=... \
bentoml/yatai-service:0.7.2 \
--db-url postgresql://scott:tiger@localhost:5432/bentomldb \
--repo-base-url s3://my-bentoml-repo/
Documentation Updates
- Added a new section that works through all the main concepts and best practices of using BentoML; we recommend it as a must-read for new BentoML users
- BentoML Core Concepts: https://docs.bentoml.org/en/latest/concepts.html#core-concepts
Versions 0.7.0 and 0.7.1 are not recommended due to an issue with the benchmark directory being included in their PyPI distributions; other than that, they are identical to version 0.7.2.
BentoML-0.6.3
New Features:
- Automatically discover all pip dependencies via @env(auto_pip_dependencies=True) (see the sketch after this list)
- CLI command auto-completion support
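As a rough sketch of how the new flag is used in a service definition (the IrisClassifier and SklearnModelArtifact names mirror the examples further down in these notes, and the import paths reflect the 0.6-era API):
from bentoml import BentoService, api, env, artifacts
from bentoml.artifact import SklearnModelArtifact
from bentoml.handlers import DataframeHandler

@env(auto_pip_dependencies=True)  # pip dependencies are inferred from the imports the service uses
@artifacts([SklearnModelArtifact('model')])
class IrisClassifier(BentoService):

    @api(DataframeHandler)
    def predict(self, df):
        # scikit-learn (and any other imported packages) are added to the saved
        # bundle's pip requirements automatically
        return self.artifacts.model.predict(df)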
Beta Features:
Contact us via Slack for early access and documentation related to these features.
- Adaptive micro-batching in BentoML API server, including performance tracing and benchmark
- Standalone YataiService gRPC server for model management and deployment
Improvements & Bug Fixes:
- Improved end-to-end tests, covering the entire BentoML workflow
- Fixed issues with using YataiService with PostgreSQL databases as storage
- The bentoml delete command now supports deleting multiple BentoServices at once; see bentoml delete --help
BentoML-0.6.2
Improvements:
- [ISSUE-505] Make "application/json" the default Content-Type in DataframeHandler #507
- CLI improvement - Add bento service as column for deployment list #514
- SageMaker deployment - error reading Azure user role info #510 by @HenryDashwood
- BentoML CLI improvements #520, #519
- Add handler configs to BentoServiceMetadata proto and bentoml.yml file #517
- Add support for list by labels #521
BentoML-0.6.1
- Bugfix: the bentoml serve-gunicorn command was broken in 0.6.0, which also broke the API Server docker container. This is a minor release that includes a fix for this issue (#499)
BentoML-0.6.0
The biggest change in release 0.6.0 is the revamped BentoML CLI, which introduces new model/deployment management commands and a new syntax for CLI inference.
- New commands for managing your model repository:
> bentoml list
BENTO_SERVICE CREATED_AT APIS ARTIFACTS
IrisClassifier:20200123004254_CB6865 2020-01-23 08:43 predict::DataframeHandler model::SklearnModelArtifact
IrisClassifier:20200122010013_E0292E 2020-01-22 09:00 predict::DataframeHandler clf::PickleArtifact
> bentoml get IrisClassifier
> bentoml get IrisClassifier:20200123004254_CB6865
> bentoml get IrisClassifier:latest
- Add support for using saved BentoServices by name:version tag instead of {saved_path}; here are some example commands:
> bentoml serve {saved_path}
> bentoml serve IrisClassifier:latest
> bentoml serve IrisClassifier:20200123004254_CB6865
> bentoml run IrisClassifier:latest predict --input='[[5.1, 3.5, 1.4, 0.2]]'
> bentoml get IrisClassifier:latest
- Separated deployment commands into sub-commands:
AWS Lambda model serving deployment:
https://docs.bentoml.org/en/latest/deployment/aws_lambda.html
AWS SageMaker model serving deployment:
https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html
- Breaking Change: Improved the bentoml run command for running inference from the CLI
Changing from:
> bentoml {API_NAME} {saved_path} {run_args}
> bentoml predict {saved_path} --input=my_test_data.csv
To:
> bentoml run {BENTO/saved_path} {API_NAME} {run_args}
> bentoml run IrisClassifier:latest predict --input='[[1,2,3,4]]'
Previously, users could directly use the API name as the command to load and run a model API from the CLI, e.g. bentoml predict {saved_path} --input=my_test_data.csv. The problem is that API names are loaded dynamically, which makes it hard for the bentoml command to provide useful --help docs, and the default-command workaround with Click makes it very confusing when the user types a wrong command. So we decided to make this change.
- Breaking Change: the position of the --quiet and --verbose options has changed
Previously, both the --quiet and --verbose options had to follow immediately after the bentoml command; now they are added to the options list of all subcommands.
If you are using these two options, you will need to change your CLI from:
> bentoml --verbose serve ...
To:
> bentoml serve ... --verbose
BentoML-0.5.8
- Fixed an issue with the API server docker image build, where updating conda to the newly released version caused the build to fail
- Documentation updates
- Removed the option to configure API endpoint output format by setting the HTTP header
BentoML-0.5.7
- SageMaker model serving deployment improvements:
- Added num_of_gunicorn_workers_per_instance deployment option
- Gunicorn worker count can now be set automatically based on the host CPU
- Improved testing for SageMaker model serving deployment