@compressa-ai

Compressa.ai

Popular repositories

  1. llm-awq Public

    Forked from mit-han-lab/llm-awq

    AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

    Python 1

  2. compressa-perf Public

    Python 1

  3. neural-compressor Public

    Forked from intel/neural-compressor

    Intel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool), targeting to provide unified APIs for network compression technologies, such as low precision quantization, spar…

    Python

  4. qlora Public

    Forked from artidoro/qlora

    QLoRA: Efficient Finetuning of Quantized LLMs

    Jupyter Notebook

  5. peft Public

    Forked from huggingface/peft

    🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

    Python

  6. OmniQuant Public

    Forked from OpenGVLab/OmniQuant

    OmniQuant is a simple and powerful quantization technique for LLMs.

    Python

Repositories

Showing 10 of 13 repositories
  • compressa-ai/compressa-ai.github.io
    HTML 0 0 0 0 Updated Dec 13, 2024
  • compressa-ai/compressa-perf
    Python 1 MIT 0 0 0 Updated Nov 6, 2024
  • compressa-ai/compressa-deploy
    0 1 0 0 Updated Oct 31, 2024
  • vllm Public Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 0 Apache-2.0 4,999 0 0 Updated Oct 26, 2024
  • compressa-ai/langchain_compressa
    Python 0 MIT 1 0 0 Updated Jul 18, 2024
  • qlora Public Forked from artidoro/qlora

    QLoRA: Efficient Finetuning of Quantized LLMs

    Jupyter Notebook 0 MIT 885 0 0 Updated Nov 20, 2023
  • llm-awq Public Forked from mit-han-lab/llm-awq

    AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

    Python 1 MIT 218 0 0 Updated Nov 20, 2023
  • OmniQuant Public Forked from OpenGVLab/OmniQuant

    OmniQuant is a simple and powerful quantization technique for LLMs.

    Python 0 56 0 0 Updated Nov 8, 2023
  • rulm Public Forked from IlyaGusev/rulm

    Language modeling and instruction tuning for Russian

    Jupyter Notebook 0 Apache-2.0 51 0 0 Updated Oct 18, 2023
  • AutoAWQ Public Forked from casper-hansen/AutoAWQ

    AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

    C++ 0 MIT 224 0 0 Updated Oct 16, 2023
