    Repositories list

    • The Triton TensorRT-LLM Backend
      Python
      Apache License 2.0
      107000 Updated Nov 19, 2024
    • TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
      C++
      Apache License 2.0
      996000 Updated Nov 19, 2024
    • vllm (Public)
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      4.6k005 Updated Nov 6, 2024
    • Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
      Python
      MIT License
      232000 Updated Oct 21, 2024
    • Model components of the Llama Stack APIs
      Python
      MIT License
      586000 Updated Oct 10, 2024
    • Secure your NGINX locations with JWT
      Shell
      MIT License
      123000 Updated Jun 17, 2024
    • deepctl (Public)
      Command line tool for Deep Infra cloud ML inference service
      Rust
      Apache License 2.0
      12610 Updated Jun 10, 2024
    • 🦜🔗 Build context-aware reasoning applications 🦜🔗
      TypeScript
      MIT License
      2.2k000 Updated May 31, 2024
    • Official TypeScript wrapper for DeepInfra Inference API
      TypeScript
      MIT License
      0841 Updated May 13, 2024
    • A framework for few-shot evaluation of language models.
      Python
      MIT License
      1.9k000 Updated Apr 29, 2024
    • langchain (Public)
      ⚡ Building applications with LLMs through composability ⚡
      Python
      MIT License
      15k100 Updated Jan 22, 2024
    • litellm (Public)
      Call all LLM APIs using the OpenAI format. Use Azure, OpenAI, Cohere, Anthropic, Ollama, VLLM, Sagemaker, HuggingFace, Replicate (100+ LLMs)
      Python
      MIT License
      1.7k000 Updated Jan 8, 2024
    • Large Language Model Text Generation Inference
      Python
      Apache License 2.0
      1.1k906 Updated Dec 15, 2023
    • fetch-stream
      JavaScript
      Apache License 2.0
      1000 Updated Nov 6, 2023
    • A better API for making Event Source requests, with all the features of fetch()
      TypeScript
      MIT License
      141000 Updated Aug 18, 2023
    • cog (Public)
      Containers for machine learning
      Go
      Apache License 2.0
      564000 Updated Aug 1, 2023
    • A cog for running llama-2 using llama.cpp server
      Python
      0000 Updated Aug 1, 2023
    • 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
      Python
      Apache License 2.0
      27k000 Updated Jul 24, 2023
    • NVIDIA GPU-based FAN controller for SUPERMICRO server
      Python
      MIT License
      3000 Updated Apr 25, 2023
    • Multilingual Automatic Speech Recognition with word-level timestamps and confidence
      Python
      GNU Affero General Public License v3.0
      156000 Updated Mar 7, 2023
    • Multilingual Sentence & Image Embeddings with BERT
      Python
      Apache License 2.0
      2.5k000 Updated Feb 28, 2023
    • HTML
      204000 Updated Feb 14, 2023