Bert Serving Rust

A crate for serving Transformer models, written by a noob rustacean for practice but meant to actually be useful and robust.

Background

This crate depends heavily on rust_bert, which is itself inspired by HuggingFace Transformers. It is essentially a platform for building rust_bert pipelines into actix_web microservices.

The near-term goal of this project is to support all of the pipelines currently supported by rust_bert. The long-term goal is not limited to these, because both this crate and rust_bert are flexible and extensible.

In Progress Features (4 Oct 2024)

  • Working on performance/load benchmarks
  • Added BIO span tagging to NER service (4 Oct 2024)
  • Used rayon to parallelize easily targeted code paths (4 Oct 2024)
  • Reranking added (16 Mar 2024)
  • Sequence Classification added (17 Oct 2023)
  • NER support added (15 Oct 2023)
  • Working on support for remote models from the HF Hub (paused 14 Oct 2023)

Current Services (16 Mar 2024)

Building a service image

All service images use the same Dockerfile. You can select which service to build using the SERVICE build arg, e.g.:

docker build --build-arg SERVICE=bert_embedding_service .
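If you plan to launch the image manually or reuse it later, you may also want to tag the build; the tag name below is just an illustration:

# Tag name (embedding_test) is illustrative; pick whatever you like
docker build --build-arg SERVICE=bert_embedding_service -t embedding_test .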

Launching a service image

The models are not included in the image and must currently be provided via a volume mount. Support for remote models on the HF Hub will be added very soon, followed by S3 and then other cloud buckets/blobstores.

Once you have downloaded your HF model package, you can share it with the container by mounting it as a volume and setting the MODEL_PATH environment variable to the mounted path.
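As a rough sketch, a manual launch could look like the following; the image tag, published port, and model paths here are assumptions, and the launch-in-docker script described next handles all of this for you:

# Illustrative manual launch; image tag (embedding_test), port, and paths are assumptions
docker run --rm -p 5000:5000 \
    -v "$(pwd)/notebooks/models/all-MiniLM-L12-v2:/opt/ml/models/all-MiniLM-L12-v2" \
    -e MODEL_PATH=/opt/ml/models/all-MiniLM-L12-v2 \
    embedding_test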

There is a convenience script, launch-in-docker, which lets you pass a local_model_path and a container_model_path and automatically takes care of both sharing the volume and setting MODEL_PATH correctly. The launch-in-docker script takes two additional optional parameters for your convenience:

  • rebuild: Whether or not to rebuild the image even if it already exists
  • image_tag: The tag given to the local image; if the image doesn't exist, or if rebuild=yes, it will be built with this tag

Example

# Replace these with paths that match your model
local_path="$(pwd)/notebooks/models/all-MiniLM-L12-v2"
container_path="/opt/ml/models/all-MiniLM-L12-v2"
./launch-in-docker \
    service=bert_embedding_service \
    image=embedding_test \
    local_model_path=$local_path \
    container_model_path=$container_path

This would serve all-MiniLM-L12-v2 embeddings at localhost:5000/embeddings/encode
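Once the container is running, you can smoke-test the endpoint with a request like the one below; the JSON field names in the request body are assumptions made for illustration, so check the service source for the exact schema:

# The request body shape ("sequences") is an assumption; consult the service handler for the exact field names
curl -s -X POST http://localhost:5000/embeddings/encode \
    -H "Content-Type: application/json" \
    -d '{"sequences": ["Hello world", "Bert serving in Rust"]}'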
