Popular repositories
- bitsandbytes (Python; forked from bitsandbytes-foundation/bitsandbytes): 8-bit CUDA functions for PyTorch. (Usage sketch below.)
- vllm-fork (Python; forked from HabanaAI/vllm-fork): a high-throughput and memory-efficient inference and serving engine for LLMs. (Usage sketch below.)
- neural-compressor (Python; forked from intel/neural-compressor): SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) and sparsity; leading model compression techniques for TensorFlow, PyTorch, and ONNX Runtime. (Usage sketch below.)
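bitsandbytes is commonly used through its Hugging Face transformers integration. A minimal sketch, assuming the upstream `transformers` `BitsAndBytesConfig` API; the model id is a placeholder chosen for illustration, not something pinned by this fork:

```python
# Minimal sketch: load a causal LM with 8-bit bitsandbytes weights via transformers.
# The model id is a hypothetical example.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "facebook/opt-350m"  # placeholder model
quant_config = BitsAndBytesConfig(load_in_8bit=True)  # replace Linear layers with 8-bit kernels

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers on available accelerators
)

inputs = tokenizer("Quantization reduces memory by", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0], skip_special_tokens=True))
```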
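vLLM also offers an offline inference API alongside its server. A minimal sketch, assuming the upstream `vllm` package (the HabanaAI fork targets Intel Gaudi accelerators); the model id is again a placeholder:

```python
# Minimal sketch of vLLM offline batched inference; model id is a placeholder.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # placeholder model
params = SamplingParams(temperature=0.8, max_tokens=64)

for out in llm.generate(["What does a serving engine do?"], params):
    print(out.outputs[0].text)
```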
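For neural-compressor, a minimal post-training dynamic quantization sketch, assuming the 2.x Python API (`PostTrainingQuantConfig` and `quantization.fit`); the toy model is an assumption for illustration and the exact API can differ between releases:

```python
# Minimal sketch: post-training dynamic quantization with Intel Neural Compressor.
# The model and config are illustrative assumptions, not taken from this repository.
import torch
from neural_compressor import PostTrainingQuantConfig
from neural_compressor.quantization import fit

# Placeholder FP32 model; dynamic PTQ mainly targets Linear layers.
fp32_model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)

conf = PostTrainingQuantConfig(approach="dynamic")  # dynamic PTQ needs no calibration data
q_model = fit(model=fp32_model, conf=conf)          # returns a wrapper around the quantized model
print(type(q_model))
```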