OpenVINO™ Model Server 2024.1
The 2024.1 release brings improvements in serving functionality, demo enhancements, and bug fixes.
Changes and improvements
- Updated OpenVINO Runtime backend to 2024.1 Link
- Added support for OpenVINO models with string data type on output. Together with the features introduced in 2024.0, OVMS can now serve models with string inputs and string outputs. That way you can take advantage of tokenization built into the model as its first layer, and rely on any post-processing embedded in the model that returns plain text. Check the universal sentence encoder demo and the image classification with string output demo
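As a sketch of how a string-output model could be queried over the KServe v2 REST API, the helper below builds a request carrying text in a `BYTES` tensor and decodes string data from the JSON response. The tensor name `text_input` and the endpoint path are illustrative assumptions, not taken from the demos; check your model's metadata (`GET /v2/models/<name>`) for the real names.

```python
import json

def build_infer_request(texts):
    """Build a KServe v2 REST inference body for a list of strings."""
    return {
        "inputs": [
            {
                "name": "text_input",   # assumed tensor name
                "shape": [len(texts)],
                "datatype": "BYTES",    # KServe v2 datatype used for strings
                "data": texts,
            }
        ]
    }

def decode_string_output(response):
    """Extract string data from the first output of a KServe v2 JSON response."""
    return response["outputs"][0]["data"]

# Serialize the body; POST it to http://<host>:<port>/v2/models/<model>/infer
payload = json.dumps(build_infer_request(["What is OVMS?"]))
```

With a string-output model, the response `data` field already contains readable text, so no client-side detokenization is needed.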
- Updated MediaPipe python calculators to support relative paths for all related configuration and Python code files. The complete graph configuration folder can now be deployed at an arbitrary path without any code changes. It is demonstrated in the updated text generation demo.
- Extended KServe REST API support for MediaPipe graph endpoints. You can now send the data in a KServe JSON body. Check how it is used in the text generation use case.
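A minimal example of a KServe JSON body that can be posted to a MediaPipe graph endpoint (`http://<host>:<port>/v2/models/<graph name>/infer`); the input name and data here are illustrative assumptions:

```json
{
  "inputs": [
    {
      "name": "in",
      "shape": [1],
      "datatype": "BYTES",
      "data": ["hello world"]
    }
  ]
}
```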
- Added a demo showcasing a full RAG pipeline delegated entirely to the model server Link
- Added a RedHat UBI based Dockerfile for the python demos; usage is documented in the python demos
Breaking changes
No breaking changes.
Bug fixes
- Improvements in error handling for invalid requests and incorrect configuration
- Fixes in the demos and documentation
You can use the OpenVINO Model Server public Docker images based on Ubuntu via the following commands:
docker pull openvino/model_server:2024.1
- CPU device support with the image based on Ubuntu 22.04
docker pull openvino/model_server:2024.1-gpu
- GPU and CPU device support with the image based on Ubuntu 22.04
Alternatively, use the provided binary packages.
The prebuilt image is also available on the RedHat Ecosystem Catalog.