View n1ck-guo's full-sized avatar

Showing results

Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.

Python 2,191 218 Updated Mar 9, 2025

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

Python 1,984 287 Updated Mar 11, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 43,760 5,352 Updated Mar 11, 2025

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 2,299 151 Updated Mar 3, 2025

Advanced Quantization Algorithm for LLMs/VLMs.

Python 388 30 Updated Mar 11, 2025

A framework for few-shot evaluation of language models.

Python 8,203 2,185 Updated Mar 11, 2025

[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

Python 432 54 Updated Aug 1, 2024

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM

Python 157 16 Updated Jul 12, 2024

🤗 Optimum Intel: Accelerate inference with Intel optimization tools

Jupyter Notebook 447 124 Updated Mar 10, 2025

Awesome LLM compression research papers and tools.

1,410 89 Updated Mar 11, 2025

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ 26,681 8,752 Updated Mar 11, 2025

A tool used by the MLNLP community for better paper searching. Fully-automated scripts for collecting AI-related papers

Python 1,140 120 Updated Dec 16, 2023

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python 2,350 263 Updated Mar 11, 2025

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Python 2,161 211 Updated Oct 8, 2024

A curated list of neural network pruning resources.

2,420 330 Updated Apr 4, 2024

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 140,986 28,239 Updated Mar 11, 2025

Awesome Knowledge Distillation

3,601 507 Updated Mar 11, 2025