llm-serving

skypilot-org / skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.

Updated Nov 24, 2024
Python

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

cuda inference pytorch transformer moe llama vlm llm llm-serving llava llama2 llama3 llama3-1

Updated Nov 24, 2024
Python

superduper-io / superduper

Superduper: Build end-to-end AI applications and agent workflows on your existing data infrastructure and preferred tools - without migrating your data.

Updated Nov 22, 2024
Python

predibase / lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

transformers pytorch llama gpt lora model-serving fine-tuning llm llmops llm-serving llm-inference

Updated Nov 23, 2024
Python

microsoft / aici

AICI: Prompts as (Wasm) Programs

rust ai wasm inference transformer language-model model-serving wasmtime llm llmops llm-serving llm-inference llm-framework

Updated Nov 10, 2024
Rust

ray-project / ray-llm

RayLLM - LLMs on Ray

distributed-systems transformers ray serving large-language-models llm llmops llm-serving llm-inference

Updated May 28, 2024
Python

mosecorg / mosec

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

python rust machine-learning deep-learning mxnet tensorflow gpu cv pytorch tts hacktoberfest model-serving nerual-network machine-learning-platform jax mlops llm llm-serving

Updated Nov 23, 2024
Python

efeslab / Nanoflow

A throughput-oriented high-performance serving framework for LLMs

cuda inference model-serving llm llm-serving llama2

Updated Sep 21, 2024
Cuda

alibaba / rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

inference llama gpt model-serving llm llmops llm-serving

Updated Oct 14, 2024
C++

rohan-paul / LLM-FineTuning-Large-Language-Models

LLM (Large Language Model) FineTuning

pytorch gpt-3 large-language-models llm llm-serving gpt3-turbo llm-training llm-inference open-source-llm llama2 llm-finetuning mistral-7b

Updated May 19, 2024
Jupyter Notebook

hpcaitech / SwiftInfer

Efficient AI Inference & Serving

deep-learning inference artificial-intelligence llama gpt llm-serving llm-inference llama2

Updated Jan 8, 2024
Python

helixml / helix

Multi-node production GenAI stack. Run the best of open source AI easily on your own servers. Easily add knowledge from documents and scrape websites. Create your own AI by fine-tuning open source models. Integrate LLMs with APIs. Run gptscript securely on the server.

Updated Nov 24, 2024
Go

ray-project / ray-educational-materials

This is a suite of hands-on training materials that show how to scale CV, NLP, and time-series forecasting workloads with Ray.

deep-learning ray distributed-machine-learning ray-tune ray-train ray-distributed llm generative-ai ray-serve ray-data llm-serving llm-inference

Updated Feb 13, 2024
Jupyter Notebook

galeselee / Awesome_LLM_System-PaperList

Since the emergence of ChatGPT in 2022, accelerating large language models has become increasingly important. Here is a list of papers on accelerating LLMs, currently focusing mainly on inference acceleration; related works will be gradually added in the future. Contributions are welcome!

system papers paperlist llm-serving llm-inference

Updated Nov 21, 2024

substratusai / runbooks

Finetune LLMs on K8s by using Runbooks

kubernetes kubernetes-operator mlops ml-platform llmops llm-serving llm-training llm-inference

Updated Aug 28, 2024
Go

Here are 73 public repositories matching this topic...

ray-project / ray

vllm-project / vllm

liguodongiot / llm-action

bentoml / OpenLLM

bentoml / BentoML

skypilot-org / skypilot

sgl-project / sglang

superduper-io / superduper

predibase / lorax

microsoft / aici

ray-project / ray-llm

mosecorg / mosec

efeslab / Nanoflow

alibaba / rtp-llm

rohan-paul / LLM-FineTuning-Large-Language-Models

hpcaitech / SwiftInfer

helixml / helix

ray-project / ray-educational-materials

galeselee / Awesome_LLM_System-PaperList

substratusai / runbooks