    Repositories list

    • vllm
      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      4.7k001 Updated Nov 22, 2024
    • [ICML 2024] CLLMs: Consistency Large Language Models
      Python
      Apache License 2.0
      1835570 Updated Nov 16, 2024
    • vllm-ltr
      Public
      [NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank
      Python
      Apache License 2.0
      31400 Updated Nov 4, 2024
    • [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
      Python
      Apache License 2.0
      671.2k330 Updated Oct 14, 2024
    • HTML
      7010 Updated Oct 13, 2024
    • MuxServe
      Public
      Jupyter Notebook
      34620 Updated Jun 13, 2024
    • dsc291-PA
      Public
      Jupyter Notebook
      2200 Updated Jun 6, 2024
    • Website for DSC 291, Spring 2024
      SCSS
      Other
      26000 Updated Jun 5, 2024
    • Website for DSC 204a, Winter 2024
      SCSS
      Other
      26801 Updated Mar 24, 2024