Deep Infra

22 repositories

tensorrtllm_backend
Public
The Triton TensorRT-LLM Backend
Python
•
Apache License 2.0
•107•0•0•0•Updated Nov 19, 2024
TensorRT-LLM
Public
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
C++
•
Apache License 2.0
•996•0•0•0•Updated Nov 19, 2024
vllm
Public
A high-throughput and memory-efficient inference and serving engine for LLMs
Python
•
Apache License 2.0
•4.6k•0•0•5•Updated Nov 6, 2024
Pyramid-Flow
Public
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Python
•
MIT License
•232•0•0•0•Updated Oct 21, 2024
llama-stack
Public
Model components of the Llama Stack APIs
Python
•
MIT License
•586•0•0•0•Updated Oct 10, 2024
ngx-http-auth-jwt-module
Public
Secure your NGINX locations with JWT
Shell
•
MIT License
•123•0•0•0•Updated Jun 17, 2024
deepctl
Public
Command line tool for Deep Infra cloud ML inference service
Rust
•
Apache License 2.0
•1•26•1•0•Updated Jun 10, 2024
langchainjs
Public
🦜🔗 Build context-aware reasoning applications 🦜🔗
TypeScript
•
MIT License
•2.2k•0•0•0•Updated May 31, 2024
deepinfra-node
Public
Official TypeScript wrapper for DeepInfra Inference API
javascript api wrapper typescript deep-learning api-client llm llm-inference
TypeScript
•
MIT License
•0•8•4•1•Updated May 13, 2024
lm-evaluation-harness
Public
A framework for few-shot evaluation of language models.
Python
•
MIT License
•1.9k•0•0•0•Updated Apr 29, 2024
langchain
Public
⚡ Building applications with LLMs through composability ⚡
Python
•
MIT License
•15k•1•0•0•Updated Jan 22, 2024
litellm
Public
Call all LLM APIs using the OpenAI format. Use Azure, OpenAI, Cohere, Anthropic, Ollama, VLLM, Sagemaker, HuggingFace, Replicate (100+ LLMs)
Python
•
MIT License
•1.7k•0•0•0•Updated Jan 8, 2024
text-generation-inference
Public
Large Language Model Text Generation Inference
Python
•
Apache License 2.0
•1.1k•9•0•6•Updated Dec 15, 2023
fetch-stream-parser
Public
fetch-stream
JavaScript
•
Apache License 2.0
•1•0•0•0•Updated Nov 6, 2023
fetch-event-source
Public
A better API for making Event Source requests, with all the features of fetch()
TypeScript
•
MIT License
•141•0•0•0•Updated Aug 18, 2023
cog
Public
Containers for machine learning
Go
•
Apache License 2.0
•564•0•0•0•Updated Aug 1, 2023
cog-llama-2
Public
A cog for running llama-2 using llama.cpp server
Python
•0•0•0•0•Updated Aug 1, 2023
transformers
Public
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Python
•
Apache License 2.0
•27k•0•0•0•Updated Jul 24, 2023
superfans-gpu-controller
Public
NVIDIA GPU-based FAN controller for SUPERMICRO server
Python
•
MIT License
•3•0•0•0•Updated Apr 25, 2023
whisper-timestamped
Public
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Python
•
GNU Affero General Public License v3.0
•156•0•0•0•Updated Mar 7, 2023
sentence-transformers
Public
Multilingual Sentence & Image Embeddings with BERT
Python
•
Apache License 2.0
•2.5k•0•0•0•Updated Feb 28, 2023
full-stack-deep-learning-website
Public
Source for https://fullstackdeeplearning.com
HTML
•204•0•0•0•Updated Feb 14, 2023