@sphinxdirective
.. toctree::
   :maxdepth: 1
   :hidden:

   ovms_docs_dag
   ovms_docs_binary_input
   ovms_docs_text
   ovms_docs_model_version_policy
   ovms_docs_shape_batch_layout
   ovms_docs_online_config_changes
   ovms_docs_stateful_models
   ovms_docs_metrics
   ovms_docs_dynamic_input
   ovms_docs_c_api
   ovms_docs_advanced
   ovms_docs_mediapipe

@endsphinxdirective
Connect multiple models in a pipeline and reduce data transfer overhead with the Directed Acyclic Graph (DAG) Scheduler. Implement model inference and data transformations using a custom node C/C++ dynamic library.
Send data in JPEG or PNG format to reduce network traffic and offload data pre-processing to the server.
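
As a minimal sketch of what such a request body can look like, the snippet below assumes the TensorFlow Serving-style REST API, where binary data is passed as a base64 string under a `b64` key. The helper name, the model name `resnet`, and port `8000` are hypothetical, and the bytes are a stand-in rather than a real image.

```python
import base64
import json


def build_binary_predict_request(image_bytes: bytes) -> str:
    """Wrap encoded image bytes (JPEG or PNG) in a TensorFlow Serving-style REST body."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # The encoded file is sent as-is; decoding it into a tensor is left to the server.
    return json.dumps({"instances": [{"b64": encoded}]})


# Hypothetical endpoint: the model name and port depend on your deployment.
PREDICT_URL = "http://localhost:8000/v1/models/resnet:predict"

# Placeholder bytes standing in for the contents of a real JPEG file.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 16
body = build_binary_predict_request(fake_jpeg)
```

The resulting `body` string would be sent as the POST payload to `PREDICT_URL` with any HTTP client.
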
The model repository structure lets you add or delete numerical version directories, and the server automatically adjusts which model versions are served.
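
To illustrate the layout, the sketch below builds a temporary model directory with numeric version subdirectories and lists them before and after one is deleted. It is not the server's implementation; the `list_versions` helper and the model name `resnet` are hypothetical.

```python
import tempfile
from pathlib import Path


def list_versions(model_dir: Path) -> list[int]:
    """Return the numeric version directories under a model directory, sorted ascending."""
    return sorted(
        int(entry.name)
        for entry in model_dir.iterdir()
        if entry.is_dir() and entry.name.isdecimal()
    )


with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "resnet"       # hypothetical model name
    for version in ("1", "2", "10"):
        (model_dir / version).mkdir(parents=True)
    (model_dir / "notes").mkdir()           # non-numeric, so not treated as a version
    before = list_versions(model_dir)
    (model_dir / "2").rmdir()               # deleting a version directory
    after = list_versions(model_dir)

print(before)  # [1, 2, 10]
print(after)   # [1, 10]
```
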
Control which model versions are served by setting the model version policy to serve all versions, a specific version or set of versions, or only the latest version (the default).
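
The three policies can be sketched as `model_version_policy` entries in the server configuration file. The snippet below generates such a configuration; the model name, base path, and version numbers are placeholders, and the exact schema should be checked against the model version policy documentation.

```python
import json

# Assumed model name and path; adjust to your own model repository.
MODEL_NAME = "resnet"
BASE_PATH = "/models/resnet"

# The three policy forms described above, as they would appear in config.json.
POLICIES = {
    "all": {"all": {}},
    "specific": {"specific": {"versions": [1, 3]}},
    "latest": {"latest": {"num_versions": 1}},
}


def build_config(policy_name: str) -> dict:
    """Return a single-model configuration using the chosen version policy."""
    return {
        "model_config_list": [
            {
                "config": {
                    "name": MODEL_NAME,
                    "base_path": BASE_PATH,
                    "model_version_policy": POLICIES[policy_name],
                }
            }
        ]
    }


print(json.dumps(build_config("specific"), indent=2))
```
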
Change the batch size, shape and layout of the model at runtime to achieve high throughput and low latency.
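
As a hedged illustration, these settings can be expressed as per-model `batch_size`, `shape`, and `layout` entries in the configuration file. The model names, paths, and values below are placeholders, and `model_entry` is a hypothetical helper; each entry sets only one of the three parameters to keep the examples independent.

```python
import json


def model_entry(name: str, base_path: str, **settings: str) -> dict:
    """Build one entry for "model_config_list" with optional per-model settings."""
    return {"config": {"name": name, "base_path": base_path, **settings}}


config = {
    "model_config_list": [
        # Serve with a fixed batch size of 8.
        model_entry("resnet_batched", "/models/resnet", batch_size="8"),
        # Serve with the input reshaped to a single 448x448 image.
        model_entry("resnet_large", "/models/resnet", shape="(1,3,448,448)"),
        # Accept NHWC data from clients for a model whose input is NCHW.
        model_entry("resnet_nhwc", "/models/resnet", layout="NHWC:NCHW"),
    ]
}

print(json.dumps(config, indent=2))
```
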
OpenVINO Model Server regularly checks for changes to the configuration file and applies them during runtime. This means that you can change model configurations (for example, change the device where a model is served), add a new model or completely remove one that is no longer needed. These changes will be applied without any disruption to the service.
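
Because the file is re-read while the server is running, one cautious way to edit it is to write the new content to a temporary file and rename it over the original, so that a reader never sees a half-written document. This is a general precaution rather than a documented requirement; the helper names are hypothetical, and `GPU` is only an example device value.

```python
import json
import os
import tempfile


def update_config_atomically(config_path: str, mutate) -> None:
    """Load config.json, apply `mutate` to it, and replace the file in one step."""
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)
    mutate(config)
    # Write to a temporary file in the same directory, then rename it over the
    # original so the replacement happens as a single filesystem operation.
    directory = os.path.dirname(os.path.abspath(config_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    os.replace(tmp_path, config_path)


def set_device(config: dict) -> None:
    # Change the device of the first configured model; "GPU" is an example value.
    config["model_config_list"][0]["config"]["target_device"] = "GPU"


# Demonstration on a throwaway config file with a placeholder model.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    initial = {
        "model_config_list": [
            {"config": {"name": "resnet", "base_path": "/models/resnet", "target_device": "CPU"}}
        ]
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(initial, f)
    update_config_atomically(path, set_device)
    with open(path, encoding="utf-8") as f:
        updated = json.load(f)
```
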
Serve models that operate on sequences of data and maintain their state between inference requests.
Use the Prometheus-compatible metrics endpoint to access performance and utilization statistics.
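
Prometheus-compatible endpoints return plain text with one sample per line. The sketch below is a simplified parser for that text format; it ignores comment lines, does not handle escaped quotes in label values or trailing timestamps, and the sample text (including the metric names and labels) is illustrative rather than captured from a real server.

```python
import re

# Illustrative sample only; real metric names and labels may differ.
SAMPLE = """\
# HELP ovms_requests_success Number of successful requests.
# TYPE ovms_requests_success counter
ovms_requests_success{name="resnet",method="ModelInfer"} 12
ovms_requests_fail{name="resnet",method="ModelInfer"} 1
"""

LINE_RE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$"
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text: str) -> list[tuple[str, dict, float]]:
    """Parse exposition-format text into (metric name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines this simplified parser does not understand
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


samples = parse_metrics(SAMPLE)
for name, labels, value in samples:
    print(name, labels, value)
```
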
Configure served models to accept data with variable batch sizes and input shapes.
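
To make the idea of variable dimensions concrete, the sketch below checks whether a concrete input shape fits a shape specification. It assumes a specification syntax in which `-1` means any size and `min:max` is an inclusive range; both that syntax and the helper names are assumptions for illustration, and this is not the server's own validation logic.

```python
def parse_shape_spec(spec: str) -> list[tuple]:
    """Parse a spec such as "(1,3,-1,200:500)" into per-dimension (min, max) bounds.

    A bound of None means the dimension is unbounded on that side.
    """
    dims = []
    for token in spec.strip().strip("()").split(","):
        token = token.strip()
        if token == "-1":
            dims.append((None, None))               # any size (assumed syntax)
        elif ":" in token:
            low, high = token.split(":")
            dims.append((int(low), int(high)))      # inclusive range (assumed syntax)
        else:
            dims.append((int(token), int(token)))   # fixed size
    return dims


def shape_matches(spec: str, shape: tuple) -> bool:
    """Return True if every dimension of `shape` falls within the bounds of `spec`."""
    bounds = parse_shape_spec(spec)
    if len(bounds) != len(shape):
        return False
    return all(
        (low is None or low <= size) and (high is None or size <= high)
        for (low, high), size in zip(bounds, shape)
    )


print(shape_matches("(1,3,-1,-1)", (1, 3, 224, 300)))
print(shape_matches("(1:8,3,224,224)", (16, 3, 224, 224)))
```

The first call accepts the shape because both spatial dimensions are unconstrained; the second rejects it because a batch of 16 falls outside the `1:8` range.
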
Use in-process inference via the model server to leverage the model management and model pipeline functionality of OpenVINO Model Server within an application. This lets you reuse existing OVMS functionality to execute inference locally without network overhead.
Use CPU extensions, the model cache feature, or a custom model loader.