Adding Triton backend support #537
Conversation
lgtm. just a few comments/questions. Lemme know when it's ready for review.
Took a first pass. Overall, I like the structure that we went with here!
Nits aside, my main comments here are around error handling (and my own questions about what kind of assumptions are fair to make about the input).
let's merge this and use it, then iterate on smaller PRs
🤩 🤩 🤩
Overview
This PR adds support for Triton as a backend for Truss. Specifically, this PR contains the logic for `config.yaml` and `model.py`.
Logic around testing this flow will be in a follow-up PR; the testing suite is significant and requires running tests within the Triton Docker container.
Quickstart
Quickstart repo: https://github.com/aspctu/bert-triton-truss

```sh
# Clone the example repo and generate a Docker build context from the truss
git clone https://github.com/aspctu/bert-triton-truss
truss image build-context ./bert-truss-context ./bert-truss

# Build the image and run it with GPU access; 8080 is the Truss server port
# and 8000 is Triton's default HTTP port
cd ./bert-truss-context
docker build ./
docker run --gpus=all -p8080:8080 -p8000:8000 -it (image id)
```
Follow the README.md in the repo above to invoke the model.
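As a rough sketch of what invocation could look like, assuming the container follows the standard Truss predict endpoint on port 8080 (the payload shape here is an assumption and depends on the `Input` class defined in `model.py`; the repo's README.md is authoritative):

```python
# Hypothetical invocation sketch: the endpoint path follows the standard
# Truss serving convention, but the payload shape is an assumption.
import requests

response = requests.post(
    "http://localhost:8080/v1/models/model:predict",
    json={"inputs": [{"text": "hello world"}]},
)
print(response.json())
```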
Introduction
Triton is a high-performance model-serving backend developed by NVIDIA. For most models (outside of LLMs), it's advantageous to use Triton as the backend server, thanks to various server features that help maximize GPU utilization and memory efficiency.
This PR introduces a simplified developer experience that enables users to tap into some of this functionality within Truss. It's worth noting that a lot of Triton functionality is not supported here (such as decoupled mode or ensemble models).
To enable Triton, a user needs to do a couple of things:
- Update their `config.yaml` to contain the required Triton configuration (automatically done if the truss is created via `truss init`)
- Define classes in their `model.py` that correspond to the `Input` and `Output` of their model (example below)
- Update their `predict` function to accept a `List[Input]` and produce a `List[Output]`
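As a rough illustration of the shape described above, here is a minimal sketch of a conforming `model.py`. The use of pydantic and the specific field names (`text`, `scores`) are assumptions for illustration, not necessarily this PR's actual API; only the `List[Input] -> List[Output]` predict signature comes from the description above.

```python
# Minimal sketch of the model.py structure described above. pydantic and
# the field names are assumptions; only the List[Input] -> List[Output]
# predict signature comes from the PR description.
from typing import List

from pydantic import BaseModel


class Input(BaseModel):
    # Assumed field: a single text input to the BERT model
    text: str


class Output(BaseModel):
    # Assumed field: per-class scores produced by the model
    scores: List[float]


class Model:
    def load(self):
        # Load weights here (e.g. a transformers pipeline); called once
        # at server startup
        ...

    def predict(self, inputs: List[Input]) -> List[Output]:
        # predict receives a list of Input objects and must return one
        # Output per input
        return [Output(scores=[0.0]) for _ in inputs]
```

The list-in/list-out signature presumably exists so that Triton's dynamic batching can group concurrent requests before they reach `predict`.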
TODOs
- Fix `truss image build` failing to do anything