Stars
Production-ready LLM compression/quantization toolkit with accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
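Llama Chinese community. The Llama 3 online demo and fine-tuned models are now available, the latest Llama 3 learning resources are collected in real time, and all code has been updated for Llama 3. Building the best Chinese Llama large model, fully open source and available for commercial use.
Chinese LLaMA & Alpaca LLM project, phase three (Chinese Llama-3 LLMs), developed from Meta Llama 3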
The official code for "Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning"
🚀LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Sparsity-aware deep learning inference runtime for CPUs
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…
This repository contains demos I made with the Transformers library by HuggingFace.
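Many container images are hosted overseas, such as gcr, and downloads within China are slow and need acceleration. Committed to providing a stable, reliable, and secure container image service that connects the whole world.
Code for training a sparsity predictor on LLaMA.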
[NeurIPS'24 Spotlight] Speeds up long-context LLM inference by computing attention approximately with dynamic sparsity, which reduces inference latency by up to 10x for pre-filling on an A100 whil…
Port of OpenAI's Whisper model in C/C++
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
This is the official code for the MobileSAM project, which makes SAM lightweight for mobile applications and beyond!
LostRuins / koboldcpp
Forked from ggerganov/llama.cpp. Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
🪀 Lobe CLI Toolbox - AI CLI Toolbox, enhancing git commit and i18n workflow efficiency
Zero-dependency native Node.js screenshot library for Mac, Windows, and Linux.
🚀 Software for screenshots, text-selection lookup, OCR, AI, and translation
SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
MemFree - Hybrid AI Search Engine & AI Page Generator