You can then open your browser at http://127.0.0.1:3000 and interact with the service through Swagger UI.
### Containers
We also provide two pre-built containers to run on CPU and GPU respectively.
This requires a container engine such as Docker or Podman.
You can then quickly try out the service by running the container:
```bash
# cpu
docker run -p 3000:3000 ghcr.io/bentoml/transformers-nlp-service:cpu

# gpu
docker run --gpus all -p 3000:3000 ghcr.io/bentoml/transformers-nlp-service:gpu
```
> Note that to run with GPU, you will need to have [nvidia-docker](https://github.com/NVIDIA/nvidia-docker) set up.
### Python API

One can also use the BentoML Python API to serve their models.
Run the following to build a Bento and save it to the Bento Store:
```bash
bentoml build
```

Then, start a server with `bentoml.HTTPServer`:
```python
import bentoml

# Retrieve Bento from Bento Store
bento = bentoml.get("transformers-nlp-service")

server = bentoml.HTTPServer(bento, port=3000)
server.start(blocking=True)
```
### gRPC

If you wish to use gRPC, this project also includes gRPC support. Run the following:
```bash
bentoml serve-grpc
```
To run the container with gRPC, do:

```bash
docker run -p 3000:3000 -p 3001:3001 ghcr.io/bentoml/transformers-nlp-service:cpu serve-grpc
```
To find more information about gRPC with BentoML, refer to [our documentation](https://docs.bentoml.org/en/latest/guides/grpc.html).
## 🌐 Interacting with the Service 🌐

The default mode of serving with BentoML is via an HTTP server. Here, we showcase a few examples of how one can interact with the service:

### cURL

The following example shows how to send a summarization request to the service via cURL:
```bash
curl -X 'POST' \
  'http://127.0.0.1:3000/summarize' \
  -H 'Content-Type: text/plain' \
  -d 'A quill strapped across her chest, Schafer let us know she is still writing her narrative — and defining herself on her own terms. There'\''s an entire story contained in those two garments. As De Saint Sernin said in the show notes: "Thirty-six looks, each one a heartfelt sentence."
The powerful ensemble may become one of Law Roach'\''s last celebrity styling credits. Roach announced over social media on Tuesday that he would be retiring from the industry after 14 years of creating conversation-driving looks for the likes of Zendaya, Bella Hadid, Anya Taylor-Joy, Ariana Grande and Megan Thee Stallion.'
```
### Via BentoClient 🐍

To send requests in Python, one can use `bentoml.client.Client`:
```python
import bentoml

# Create a client for the running service
client = bentoml.client.Client.from_url("http://localhost:3000")

result = client.summarize("Some long text to summarize...")
```

Run `python client.py` to see it in action.

> Check out the [`client.py`](./client.py) file for more details.
Note that all API endpoints defined in `service.py` can be accessed through the client via its sync and async methods. For example, [`service.py`](./service.py) contains three endpoints, `/summarize`, `/categorize` and `/make_analysis`, hence the following works:

```python
result = client.summarize("Try to summarize this text")
```
You can add more tasks and models by editing the `download_model.py` file.
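For a sense of what that could look like, here is a rough sketch that registers one more pipeline with the model store; the task name, model ID, and saved-model name are assumptions for illustration, not what the repo ships:

```python
# Hypothetical addition to download_model.py: save one more Transformers
# pipeline to the BentoML model store. Task, model ID, and name are assumptions.
import bentoml
import transformers

translator = transformers.pipeline("translation_en_to_de", model="t5-small")
bentoml.transformers.save_model("translation-pipeline", translator)
```

Any pipeline saved this way can then be loaded in `service.py` and wrapped in a runner.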
### Where can I add API logic?
Pre/post-processing logic can be added in the `service.py` file.
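To illustrate where that logic could live, below is a minimal sketch of an endpoint with pre- and post-processing around a Transformers runner; the runner tag, service name, and output shape are assumptions rather than the repo's actual `service.py`:

```python
import bentoml
from bentoml.io import JSON, Text

# Assumed model tag; the actual tags are created by download_model.py.
summarizer_runner = bentoml.transformers.get("summarization-pipeline:latest").to_runner()

svc = bentoml.Service("transformers-nlp-service", runners=[summarizer_runner])

@svc.api(input=Text(), output=JSON())
async def summarize(text: str) -> dict:
    # Pre-processing: normalize whitespace before inference.
    cleaned = " ".join(text.split())
    raw = await summarizer_runner.async_run(cleaned)
    # Post-processing: reshape the pipeline output into a small JSON payload.
    return {"summary": raw[0]["summary_text"]}
```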
### Where can I find more docs about Transformers and BentoML?

BentoML supports Transformers models out of the box. You can find more details in BentoML's [framework documentation](https://docs.bentoml.org/en/latest/frameworks/transformers.html) for [Transformers](https://huggingface.co/docs/transformers/index).
## 🚀 Bringing it to Production 🚀

BentoML offers a number of options for deploying and hosting online ML services in production. Learn more in the [Deploying Bento docs](https://docs.bentoml.org/en/latest/concepts/deploy.html).
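For instance, one common path is to containerize the Bento built earlier and ship that image. The sketch below assumes a BentoML release that exposes the `bentoml.container` Python API (the `bentoml containerize` CLI is the equivalent) and an assumed `latest` tag:

```python
import bentoml

# Build an OCI image for the Bento created by `bentoml build`.
# The tag and the Docker backend are assumptions; adjust to your setup.
bentoml.container.build("transformers-nlp-service:latest", backend="docker")
```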