
Commit aa3e947

jianshen92, aarnphm, and parano authored
docs: update general README structure (#5)

Co-authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
Co-authored-by: Chaoyu <paranoyang@gmail.com>

1 parent c106a49 · commit aa3e947

File tree: 5 files changed, +80 −148 lines changed


.github/workflows/docker-push.yml (2 additions, 2 deletions)

````diff
@@ -45,10 +45,10 @@ jobs:
       run: |
         if [ "${{ matrix.tag }}" == 'gpu' ]; then
           BENTOFILE='bentofile.gpu.yaml'
-          TAG='multi-tasks-nlp-gpu'
+          TAG="$(basename ${{ steps.repository.outputs.lowercase }})-gpu"
         else
          BENTOFILE='bentofile.yaml'
-          TAG='multi-tasks-nlp'
+          TAG="$(basename ${{ steps.repository.outputs.lowercase }})"
         fi

        bentoml build -f "${BENTOFILE}" && bentoml containerize "$TAG" --opt progress=plain --image-tag ${{ env.REGISTRY }}/${{ steps.repository.outputs.lowercase }}:${{ matrix.tag }}
````
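The workflow now derives the image tag from the repository name instead of hard-coding it. A minimal Python sketch of that `basename`-based branching, useful for sanity-checking outside CI (the repository slug below is an assumed example value, not read from the workflow):

```python
import posixpath

def derive_tag(repo_lowercase: str, matrix_tag: str) -> tuple[str, str]:
    """Mirror the workflow's BENTOFILE/TAG selection for a lowercased
    "owner/name" repository slug."""
    # posixpath.basename behaves like the shell's `basename` here
    name = posixpath.basename(repo_lowercase)
    if matrix_tag == "gpu":
        return "bentofile.gpu.yaml", f"{name}-gpu"
    return "bentofile.yaml", name

# Example slug matching this repository's lowercased name:
print(derive_tag("bentoml/transformers-nlp-service", "gpu"))
print(derive_tag("bentoml/transformers-nlp-service", "cpu"))
```

This keeps the containerize tag in lockstep with the repository rename, so future renames no longer require touching the workflow.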

README.md (73 additions, 141 deletions)
````diff
@@ -1,67 +1,87 @@
 <div align="center">
-  <h1 align="center">NLP-multi-tasks Service</h1>
+  <h1 align="center">Transformers NLP Service</h1>
   <br>
-  <strong>A modular, composable, and scalable solution for building NLP services<br></strong>
+  <strong>A modular, composable, and scalable solution for building NLP services with Transformers<br></strong>
   <i>Powered by BentoML 🍱 + HuggingFace 🤗</i>
   <br>
 </div>
-
 <br>
 
-## Shortcuts
-
-* [Clone me 🤗](#git-clone--recommended-)
-* [Running this project from a container](#container)
-* [Interacting with the service](#interacting-with-the-service)
-* [Sending requests in Python](#in-python-🐍-)
-* [How about calling service in JS?](#in-javascript)
-* [Server, Client and Inference Python API](#i-want-to-use-python-api)
-* [How about gRPC?](#grpc-)
-* [NLP tasks support](#what-if-i-want-to-add-tasks--x--)
-* [Container deployment](#container-deployment)
-* [Serverless with BentoCloud](#serverless)
-* [Kubernetes with Yatai](#kubernetes)
-
-
-## Let's see it in action!
-
-### Git clone (recommended)
+## 📖 Introduction 📖
+- This project showcase how one can serve HuggingFace's transformers models for various NLP with ease.
+- It incorporates BentoML's best practices, from setting up model services and handling pre/post-processing to deployment in production.
+- User can explore the example endpoints such as summarization and categorization via an interactive Swagger UI.
 
+## 🏃‍♂️ Running the Service 🏃‍♂️
 To fully take advantage of this repo, we recommend you to clone it and try out the service locally.
+
+### BentoML CLI
 This requires Python3.8+ and `pip` installed.
 
 ```bash
-git clone https://github.com/bentoml/NLP-multi-task-service.git && cd NLP-multi-task-service
+git clone https://github.com/bentoml/transformers-nlp-service.git && cd transformers-nlp-service
 
 pip install -r requirements/tests.txt
 
 bentoml serve
 ```
 
-You can then open your browser at http://127.0.0.1:3000 to view the Swagger UI to send requests.
+You can then open your browser at http://127.0.0.1:3000 and interact with the service through Swagger UI.
 
-### Container
+### Containers
 
 We also provide two pre-built container to run on CPU and GPU respectively.
 This requires any container engine, such as docker, podman, ...
 You can then quickly try out the service by running the container:
 
 ```bash
 # cpu
-docker run -p 3000:3000 ghcr.io/bentoml/nlp-multi-task-service:cpu
+docker run -p 3000:3000 ghcr.io/bentoml/transformers-nlp-service:cpu
 
 # gpu
-docker run --gpus all -p 3000:3000 ghcr.io/bentoml/nlp-multi-task-service:gpu
+docker run --gpus all -p 3000:3000 ghcr.io/bentoml/transformers-nlp-service:gpu
 ```
 
 > Note that to run with GPU, you will need to have [nvidia-docker](https://github.com/NVIDIA/nvidia-docker) setup.
 
+### Python API
+One can also use the BentoML Python API to serve their models.
 
-## Interacting with the service
+Run the following to build a Bento within the Bento Store:
+```bash
+bentoml build
+```
+Then, start a server with `bentoml.HTTPServer`:
+
+```python
+import bentoml
+
+# Retrieve Bento from Bento Store
+bento = bentoml.get("transformers-nlp-service")
 
-### CURL
+server = bentoml.HTTPServer(bento, port=3000)
+server.start(blocking=True)
+```
 
-You can send requests to serivce with curl. The following example shows how to send a request to the service to summarize a text:
+### gRPC?
+If you wish to use gRPC, this project also include gRPC support. Run the following:
+
+```bash
+bentoml serve-grpc
+```
+
+To run the container with gRPC, do
+
+```bash
+docker run -p 3000:3000 -p 3001:3001 ghcr.io/bentoml/nlp:cpu serve-grpc
+```
+
+To find more information about gRPC with BentoML, refer to [our documentation](https://docs.bentoml.org/en/latest/guides/grpc.html)
+
+## 🌐 Interacting with the Service 🌐
+The default mode of BentoML's model serving is via HTTP server. Here, we showcase a few examples of how one can interact with the service:
+### cURL
+The following example shows how to send a request to the service to summarize a text via cURL:
 
 ```bash
 curl -X 'POST' \
````
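The cURL example the hunk introduces (its body is collapsed in this diff view) posts raw text to the service's `/summarize` endpoint. A small standard-library sketch of the same request, assuming the default `bentoml serve` address and that the endpoint accepts plain text as the cURL flags suggest; the request is built but not sent, since no server runs here:

```python
import urllib.request

# Assumed default host/port from `bentoml serve`; endpoint name from service.py.
url = "http://127.0.0.1:3000/summarize"
text = "Try to summarize this text"

req = urllib.request.Request(
    url,
    data=text.encode("utf-8"),
    headers={"accept": "text/plain", "Content-Type": "text/plain"},
    method="POST",
)

# Sending is left to the reader, e.g.:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
print(req.get_method(), req.full_url)
```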
````diff
@@ -83,13 +103,7 @@ Celebrity stylist Law Roach on dressing Zendaya and '\''faking it '\''till you m
 A quill strapped across her chest, Schafer let us know she is still writing her narrative — and defining herself on her own terms. There'\''s an entire story contained in those two garments. As De Saint Sernin said in the show notes: "Thirty-six looks, each one a heartfelt sentence."
 The powerful ensemble may become one of Law Roach'\''s last celebrity styling credits. Roach announced over social media on Tuesday that he would be retiring from the industry after 14 years of creating conversation-driving looks for the likes of Zendaya, Bella Hadid, Anya Taylor-Joy, Ariana Grande and Megan Thee Stallion.'
 ```
-
-You can also see the OpenAPI UI at http://127.0.0.1:3000
-
-![OpenAPI UI](./images/openapi.png)
-
-### in Python 🐍
-
+### Via BentoClient 🐍
 To send requests in Python, one can use ``bentoml.client.Client`` to send requests to the service:
 
 ```python
````
````diff
@@ -106,9 +120,14 @@ Run `python client.py` to see it in action.
 
 > Checkout the [`client.py`](./client.py) file for more details.
 
+Note that all API endpoints defined in `service.py` can be access through client through its sync and async methods. For example, the [`service.py`](./service.py) contains three endpoints: `/summarize`, `/categorize` and `/make_analysis`, and hence the following
+methods are available on the client instance:
 
-### in Javascript
+- `client.async_summarize` | `client.summarize`
+- `client.async_categorize` | `client.categorize`
+- `client.async_make_analysis` | `client.make_analysis`
 
+### Via Javascript
 You can also send requests to this service with `axios` in JS.
 The following example sends a request to make analysis on a given text and categories:
 
````
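The sync/async pairing listed in the new README text follows a mechanical naming rule: each service endpoint gets a sync method plus an `async_`-prefixed twin on the client. A hypothetical stand-in class sketching that convention only — this is not BentoML's actual client implementation, where the methods are generated from the service's endpoints:

```python
import asyncio

class DemoClient:
    """Illustrates the endpoint -> method naming convention only."""

    def _call(self, endpoint: str, data: str) -> str:
        # Placeholder for an HTTP round-trip to the service
        return f"{endpoint}({data!r})"

    def __getattr__(self, name: str):
        # async_<endpoint> resolves to an awaitable twin of <endpoint>
        if name.startswith("async_"):
            endpoint = name[len("async_"):]

            async def method(data: str) -> str:
                return self._call(endpoint, data)

            return method
        return lambda data: self._call(name, data)

client = DemoClient()
print(client.summarize("hello"))                     # sync call
print(asyncio.run(client.async_summarize("hello")))  # async twin
```

Both calls reach the same endpoint; only the calling convention differs, which is why every endpoint in `service.py` shows up twice on the client.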
````diff
@@ -152,125 +171,38 @@ igner Ludovic de Saint Sernin, who is renowned for his eponymous label .",
 
 > Checkout the [`client.js`](./client.js) for more details.
 
-## I want to use Python API.
-
-BentoML also provides a Python API for serving models.
-
-To start a server, use ``bentoml.HTTPServer``:
-
-```python
-
-import bentoml
-
-bento = bentoml.get("multi-task-nlp")
-
-server = bentoml.HTTPServer(bento, production=True, port=3000)
-server.start()
-```
-
-To interact with this server, one can also create a client with `bentoml.client.Client`:
-
-```python
-
-client = bentoml.client.Client.from_url("http://127.0.0.1:3000")
-
-result = client.summarize("Try to summarize this text")
-```
-
-Note that all API endpoints defined in `service.py` can be access through client through its sync and async methods. For example, the [`service.py`](./service.py) contains three endpoints: `/summarize`, `/categorize` and `/make_analysis`, and hence the following
-methods are available on the client instance:
-
-- `client.async_summarize` | `client.summarize`
-- `client.async_categorize` | `client.categorize`
-- `client.async_make_analysis` | `client.make_analysis`
-
-## gRPC?
-
-If you wish to use gRPC, this project also include gRPC support. To serve gRPC, do
-
-```bash
-bentoml serve-grpc
-```
-
-To run the container with gRPC, do
-
-```bash
-docker run -p 3000:3000 -p 3001:3001 ghcr.io/bentoml/nlp:cpu serve-grpc
-```
-
-## Testing
-
-To run the tests, use `pytest`:
-
-```bash
-pytest tests
-```
-
-## Seems nice, but how can I customize this?
-
+## ⚙️ Customization ⚙️
 ### What if I want to add tasks *X*?
 
 This project is designed to be used with different [NLP tasks](https://huggingface.co/tasks) and its corresponding models:
 
 | Tasks | Example model |
 |------------------------------------------------------------------------------------- |----------------------------------------------------------------------------------------------------------------------------- |
-| - [Conversational](https://huggingface.co/tasks/conversational) | [`facebook/blenderbot-400M-distill`](https://huggingface.co/facebook/blenderbot-400M-distill) |
-| - [Fill-Mask](https://huggingface.co/tasks/fill-mask) | [`distilroberta-base`](https://huggingface.co/distilroberta-base) |
-| - [Question Answering](https://huggingface.co/tasks/question-answering) | [`deepset/roberta-base-squad2`](https://huggingface.co/deepset/roberta-base-squad2) |
-| - [Sentence Similarity](https://huggingface.co/tasks/sentence-similarity) | [`sentence-transformers/all-MiniLM-L6-v2`](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) |
-| - [Summarisation](https://huggingface.co/tasks/summarization) | [`sshleifer/distilbart-cnn-12-6`](https://huggingface.co/sshleifer/distilbart-cnn-12-6) [included] |
-| - [Table Question Answering](https://huggingface.co/tasks/table-question-answering) | [`google/tapas-base-finetuned-wtq`](https://huggingface.co/google/tapas-base-finetuned-wtq) |
-| - [Text Classification](https://huggingface.co/tasks/text-classification) | [`distilbert-base-uncased-finetuned-sst-2-english`](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) |
-| - [Text Generation](https://huggingface.co/tasks/text-generation) | [`bigscience/T0pp`](https://huggingface.co/bigscience/T0pp) |
-| - [Token Classification](https://huggingface.co/tasks/token-classification) | [`dslim/bert-base-NER`](https://huggingface.co/dslim/bert-base-NER) |
-| - [Zero-Shot Classification](https://huggingface.co/tasks/zero-shot-classification) | [`facebook/bart-large-mnli`](https://huggingface.co/facebook/bart-large-mnli) [included] |
-| - [Translation](https://huggingface.co/tasks/translation) | [`Helsinki-NLP/opus-mt-en-fr`](https://huggingface.co/Helsinki-NLP/opus-mt-en-fr) |
+| [Conversational](https://huggingface.co/tasks/conversational) | [`facebook/blenderbot-400M-distill`](https://huggingface.co/facebook/blenderbot-400M-distill) |
+| [Fill-Mask](https://huggingface.co/tasks/fill-mask) | [`distilroberta-base`](https://huggingface.co/distilroberta-base) |
+| [Question Answering](https://huggingface.co/tasks/question-answering) | [`deepset/roberta-base-squad2`](https://huggingface.co/deepset/roberta-base-squad2) |
+| [Sentence Similarity](https://huggingface.co/tasks/sentence-similarity) | [`sentence-transformers/all-MiniLM-L6-v2`](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) |
+| [Summarisation](https://huggingface.co/tasks/summarization) | [`sshleifer/distilbart-cnn-12-6`](https://huggingface.co/sshleifer/distilbart-cnn-12-6) [included] |
+| [Table Question Answering](https://huggingface.co/tasks/table-question-answering) | [`google/tapas-base-finetuned-wtq`](https://huggingface.co/google/tapas-base-finetuned-wtq) |
+| [Text Classification](https://huggingface.co/tasks/text-classification) | [`distilbert-base-uncased-finetuned-sst-2-english`](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) |
+| [Text Generation](https://huggingface.co/tasks/text-generation) | [`bigscience/T0pp`](https://huggingface.co/bigscience/T0pp) |
+| [Token Classification](https://huggingface.co/tasks/token-classification) | [`dslim/bert-base-NER`](https://huggingface.co/dslim/bert-base-NER) |
+| [Zero-Shot Classification](https://huggingface.co/tasks/zero-shot-classification) | [`facebook/bart-large-mnli`](https://huggingface.co/facebook/bart-large-mnli) [included] |
+| [Translation](https://huggingface.co/tasks/translation) | [`Helsinki-NLP/opus-mt-en-fr`](https://huggingface.co/Helsinki-NLP/opus-mt-en-fr) |
 
 ### Where can I add models?
-
 You can add more tasks and models by editing the `download_model.py` file.
 
 ### Where can I add API logics?
-
 Pre/post processing logics can be set in the `service.py` file.
 
-### Where can I find more docs about Transformers and BentoML?
 
+### Where can I find more docs about Transformers and BentoML?
 BentoML supports Transformers models out of the box. You can find more details in the [BentoML support](https://docs.bentoml.org/en/latest/frameworks/transformers.html) for [Transformers](https://huggingface.co/docs/transformers/index).
 
+## 🚀 Bringing it to Production 🚀
+BentoML offers a number of options for deploying and hosting online ML services into production, learn more at [Deploying Bento Docs](https://docs.bentoml.org/en/latest/concepts/deploy.html).
 
-## How can I deploy this to production?
-
-We have a few options for you to deploy this service to production:
-
-### Container deployment
-
-If you wish to deploy this as a container, you can use the following:
-
-```bash
-bentoml build && bentoml containerize multi-tasks-nlp --opt platform=linux/amd64
-```
-
-To build the container with GPU support, you can use the following:
-
-```bash
-bentoml build -f bentofile.gpu.yaml && bentoml containerize multi-tasks-nlp-gpu --opt platform=linux/amd64
-```
-
-### Serverless
-
-Checkout [☁️ BentoCloud](https://www.bentoml.com/bento-cloud/), the serverless cloud for AI applications
-
-### Kubernetes
-
-For Kubernetes, [🦄️ Yatai](https://github.com/bentoml/Yatai) gets the best performance and scalability for your AI workloads
-
-### Cloud platforms
-
-To deploy on cloud services, such as EC2, Sagemaker or Azure function, checkout [🚀 bentoctl](https://github.com/bentoml/bentoctl)
-
-
-## Community 💬
-
+## 👥 Community 👥
 BentoML has a thriving open source community where thousands of ML/AI practitioners are
-contributing to the project, helping other users and discussing the future of AI. 👉 [Pop into our Slack community!](https://l.bentoml.com/join-slack)
+contributing to the project, helping other users and discussing the future of AI. 👉 [Pop into our Slack community!](https://l.bentoml.com/join-slack)
````

bentofile.gpu.yaml (2 additions, 2 deletions)

````diff
@@ -1,8 +1,8 @@
 service: 'service.py:svc'
-name: multi-tasks-nlp-gpu
+name: transformers-nlp-service-gpu
 labels:
   owner: bentoml-team
-  project: multi-tasks-nlp
+  project: transformers-nlp-service
 include:
   - '*.py'
   - '/tests'
````

bentofile.yaml (2 additions, 2 deletions)

````diff
@@ -1,8 +1,8 @@
 service: 'service.py:svc'
-name: multi-tasks-nlp
+name: transformers-nlp-service
 labels:
   owner: bentoml-team
-  project: multi-tasks-nlp
+  project: transformers-nlp-service
 include:
   - '*.py'
   - '/tests'
````

service.py (1 addition, 1 deletion)

````diff
@@ -18,7 +18,7 @@
 categorizer_runner = get_runner("zero-shot-classification", classification_model)
 
 svc = bentoml.Service(
-    name="multi-tasks-nlp", runners=[summarizer_runner, categorizer_runner]
+    name="transformers-nlp-service", runners=[summarizer_runner, categorizer_runner]
 )
 
 
````
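The service rename in `service.py` matters because the name becomes the Bento's identifier: `bentoml.get("transformers-nlp-service")` and the containerize step both resolve it as a tag. A small sketch of the tag convention, assuming the standard `name:version` form with `latest` as the default (the version string below is hypothetical):

```python
def parse_tag(tag: str) -> tuple[str, str]:
    """Split a Bento tag of the assumed "name:version" form,
    defaulting to "latest" when no version is given."""
    name, _, version = tag.partition(":")
    return name, version or "latest"

print(parse_tag("transformers-nlp-service"))         # no version given
print(parse_tag("transformers-nlp-service:abc123"))  # hypothetical version
```

Keeping this name identical across `service.py`, both bentofiles, and the workflow's derived tag is what makes the rename in this commit consistent.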