A crate for serving Transformer models, written by a noob rustacean for practice but meant to actually be useful and robust.
Heavily dependent on `rust_bert`, which is itself inspired by HuggingFace Transformers. This crate is essentially a platform for building the `rust_bert` pipelines into `actix_web` microservices.
The near-term goal of this project is to support all the pipelines currently offered by `rust_bert`. The long-term goal is not limited to these, because both this crate and `rust_bert` offer flexibility and extensibility.
- Working on performance/load benchmarks
- Added BIO span tagging to NER service (4 Oct 2024)
- Used rayon to parallelize easily targeted code paths (4 Oct 2024)
- Reranking added (16 Mar 2024)
- Sequence classification added (17 Oct 2023)
- NER support added (15 Oct 2023)
- (Paused 14 Oct 2023) Working on support for remote models from the HF Hub
All service images use the same `Dockerfile`. You can select which service to build using the `SERVICE` build arg, e.g.:

```sh
docker build --build-arg SERVICE=bert_embedding_service .
```
The models are not included in the image and must currently be provided as a volume mount. Support for remote models on the HF Hub will be added very soon, with S3 next and other cloud buckets/blobstores after that.
Once you have downloaded your HF model package, you can share it with the container by mounting it as a volume and setting the `MODEL_PATH` environment variable to the mounted path.
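For example, a `docker run` along these lines should work. This is a sketch rather than a command from the repo: the image tag `embedding_test` and the internal port are assumptions (the port mapping matches the `localhost:5000` endpoint shown below).

```sh
# A sketch: image tag and port mapping are assumptions, adjust them to your build
docker run --rm -p 5000:5000 \
    -v "$(pwd)/notebooks/models/all-MiniLM-L12-v2:/opt/ml/models/all-MiniLM-L12-v2" \
    -e MODEL_PATH=/opt/ml/models/all-MiniLM-L12-v2 \
    embedding_test
```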
There is a convenience script, `launch-in-docker`, which lets you pass a `local_model_path` and a `container_model_path` and automatically takes care of both sharing the volume and setting `MODEL_PATH` correctly. The `launch-in-docker` script takes two additional optional parameters for your convenience:

- `rebuild`: Whether or not to rebuild the image even if it exists
- `image_tag`: The tag that will be given to the local image; if the image doesn't exist, or if `rebuild=yes`, it will be built
```sh
# You would replace these with paths that match your model
local_path="$(pwd)/notebooks/models/all-MiniLM-L12-v2"
container_path="/opt/ml/models/all-MiniLM-L12-v2"

./launch-in-docker \
    service=bert_embedding_service \
    image=embedding_test \
    local_model_path="$local_path" \
    container_model_path="$container_path"
```
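If you need to force a rebuild of an existing image, the optional parameters described above can be passed the same way. A sketch, reusing the paths defined above (`rebuild=yes` follows the parameter description; other accepted values aren't documented here):

```sh
# rebuild=yes forces the image to be rebuilt even if it already exists
./launch-in-docker \
    service=bert_embedding_service \
    image=embedding_test \
    rebuild=yes \
    local_model_path="$local_path" \
    container_model_path="$container_path"
```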
Either invocation would serve `all-MiniLM-L12-v2` embeddings at `localhost:5000/embeddings/encode`.
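As a quick smoke test, a request along these lines should get a response. This is a sketch only: the request body shape is an assumption, not the service's documented schema.

```sh
# Hypothetical request body; check the embedding service's handler for the real schema
curl -X POST localhost:5000/embeddings/encode \
    -H 'Content-Type: application/json' \
    -d '{"text": "Hello, world!"}'
```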