Update README.md
AgrawalAmey authored Aug 5, 2024
1 parent e8d0781 commit 28d84e0
README.md (2 additions, 2 deletions):

@@ -1,12 +1,12 @@
# Sarathi-Serve

- This is the official OSDI'24 artifact submission for paper #444, "Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve".
+ Sarathi-Serve is a high-throughput, low-latency LLM serving framework. Please refer to our [OSDI'24 paper](https://www.usenix.org/conference/osdi24/presentation/agrawal) for more details.

## Setup

### Setup CUDA

- Sarathi-Serve has been tested with CUDA 12.1 on A100 and A40 GPUs.
+ Sarathi-Serve has been tested with CUDA 12.3 on H100 and A100 GPUs.

### Clone repository

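As a minimal sketch of checking the CUDA prerequisite the updated README states (CUDA 12.3): the snippet below reads the installed toolkit version via `nvcc`, assuming the CUDA toolkit is on `PATH`, and falls back gracefully when it is not. This is an illustrative check, not part of the Sarathi-Serve repository.

```shell
# Report the installed CUDA toolkit version; Sarathi-Serve was tested
# against CUDA 12.3 on H100/A100 GPUs per the README change above.
if command -v nvcc >/dev/null 2>&1; then
  # Extract "12.3" from a line like "Cuda compilation tools, release 12.3, V12.3.107"
  CUDA_VERSION=$(nvcc --version | sed -n 's/.*release \([0-9.]*\).*/\1/p')
else
  CUDA_VERSION="not installed"
fi
echo "CUDA toolkit: ${CUDA_VERSION}"
```

A mismatch does not necessarily mean the framework will fail, only that the configuration is untested.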
