Stars
Accelerating the development of large multimodal models (LMMs) with the one-click evaluation module lmms-eval.
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
GPT-4V-level open-source multi-modal model based on Llama3-8B
Advanced Quantization Algorithm for LLMs/VLMs.
A framework for few-shot evaluation of language models.
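For a sense of how this harness is typically driven, here is a minimal sketch using its Python entry point; it assumes a recent release that exposes simple_evaluate, and the checkpoint and task names are placeholders.

```python
# Minimal sketch: evaluate a Hugging Face checkpoint on one benchmark with
# lm-evaluation-harness (assumes a recent release exposing simple_evaluate).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                        # Hugging Face backend
    model_args="pretrained=gpt2",      # placeholder checkpoint
    tasks=["hellaswag"],               # placeholder task
    num_fewshot=0,
    batch_size=8,
)
print(results["results"]["hellaswag"])  # per-task metrics
```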
[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLMs
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
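A small sketch of the OpenVINO path in 🤗 Optimum Intel, assuming the OVModelForCausalLM wrapper and a placeholder checkpoint; export=True converts the PyTorch weights to OpenVINO IR on load.

```python
# Sketch: run a causal LM through Optimum Intel's OpenVINO backend
# (model name is a placeholder).
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(model_id, export=True)  # convert to OpenVINO IR

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```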
Awesome LLM compression research papers and tools.
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow
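For context, a self-contained example of XGBoost's scikit-learn interface on a toy dataset; the hyperparameters are illustrative, not tuned settings.

```python
# Train and score a gradient-boosted classifier with XGBoost's sklearn API.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```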
A tool from the MLNLP community for easier paper searching. Fully-automated scripts for collecting AI-related papers
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
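A sketch of post-training static INT8 quantization with Intel Neural Compressor's 2.x Python API; the toy PyTorch model and random calibration data are placeholders.

```python
# Post-training static quantization sketch (Neural Compressor 2.x API assumed).
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor import PostTrainingQuantConfig, quantization

# Placeholder FP32 model and calibration data.
fp32_model = torch.nn.Sequential(
    torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2)
)
calib_set = TensorDataset(torch.randn(64, 16), torch.zeros(64, dtype=torch.long))
calib_loader = DataLoader(calib_set, batch_size=8)

conf = PostTrainingQuantConfig(approach="static")  # post-training static INT8
q_model = quantization.fit(model=fp32_model, conf=conf, calib_dataloader=calib_loader)
q_model.save("./int8_model")
```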
⚡ Build your chatbot within minutes on your favorite device; offers SOTA compression techniques for LLMs; runs LLMs efficiently on Intel platforms ⚡
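A hedged sketch of the low-bit loading path this project advertises, assuming its drop-in AutoModelForCausalLM wrapper; the chat model name is a placeholder.

```python
# Sketch: load a chat model with 4-bit weight-only quantization
# (intel_extension_for_transformers API assumed; model name is a placeholder).
from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "Intel/neural-chat-7b-v3-1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)

inputs = tokenizer("What is model compression?", return_tensors="pt").input_ids
outputs = model.generate(inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```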
A curated list of neural network pruning resources.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
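A minimal 🤗 Transformers example using the pipeline API, which falls back to a default model when none is specified.

```python
# One-line inference with the Transformers pipeline API.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # default model is downloaded on first use
print(classifier("Open-source tooling makes model evaluation much easier."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```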
Awesome Knowledge Distillation