Blog post on bitsandbytes integration on Hugging Face (#463)
* first commit

* add new thumbnails

* add more content

* add new gif

* Update _blog.yml

* rename files

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Apply suggestions from code review

* change content a bit

- add more details and adapt from stas suggestions

* re-write text: part 1

* few modifs

- add credits
- add image
- modify the content a bit

* modify a bit

* add more content

* add image

* paraphrase a bit

* add more content

* add more content

* some improvements

* add thumbnail

* add more text + fix table

* fix table

* fix tables

* add stas as author

* add a last sentence

* edit some more

* few modifs

* modify thumbnail

* add thumbnail

* add removed comment

* add photos

* add more info

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Add files via upload

* add steven to the credits!

* edits

* edits

* edits

* edits

* add script

* change to std err

* refactor the tables a bit

* add Tim's comments

* remove separators

* explain why it is slow

* Update hf-bitsandbytes-integration.md

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Add links to paper

* delete dummy file

* add correct link to paper

* add more explanation on speed

* update figure

* replace authors by we

* add freezed image

* remove old table

* Update hf-bitsandbytes-integration.md

Some slight edits.

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Tim Dettmers <TimDettmers@users.noreply.github.com>
5 people authored Aug 17, 2022
1 parent 822183c commit d339f36
Showing 23 changed files with 642 additions and 1 deletion.
11 changes: 10 additions & 1 deletion _blog.yml
@@ -1120,7 +1120,6 @@
  - guide



- local: skops
  title: Introducing Skops
  author: merve
@@ -1132,3 +1131,13 @@
  - announcement
  - guide


- local: hf-bitsandbytes-integration
  title: "A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes"
  author: ybelkada
  thumbnail: /blog/assets/96_hf_bitsandbytes_integration/thumbnail_blue.png
  date: August 17, 2022
  tags:
  - nlp
  - llm
  - quantization
Binary file added assets/96_hf_bitsandbytes_integration/BF16.png
Binary file added assets/96_hf_bitsandbytes_integration/FP16.png
Binary file added assets/96_hf_bitsandbytes_integration/FP32.png
Binary file added assets/96_hf_bitsandbytes_integration/LLM.png
Binary file added assets/96_hf_bitsandbytes_integration/LLM3.png
Binary file added assets/96_hf_bitsandbytes_integration/Matmul.png
Binary file added assets/96_hf_bitsandbytes_integration/TF32.png
Binary file added assets/96_hf_bitsandbytes_integration/byte.png
41 changes: 41 additions & 0 deletions assets/96_hf_bitsandbytes_integration/example.py
@@ -0,0 +1,41 @@
import torch
import torch.nn as nn

from bitsandbytes.nn import Linear8bitLt

# Utility function

def get_model_memory_footprint(model):
    r"""
    Partially copied and inspired from: https://discuss.pytorch.org/t/gpu-memory-that-model-uses/56822/2
    """
    return sum([param.nelement() * param.element_size() for param in model.parameters()])

# Main script

fp16_model = nn.Sequential(
    nn.Linear(64, 64),
    nn.Linear(64, 64)
).to(torch.float16)

# Train and save your model!

torch.save(fp16_model.state_dict(), "model.pt")

# Define your int8 model!
# has_fp16_weights=False stores the weights in int8 instead of keeping an fp16 copy,
# which is what gives the memory savings at inference time.

int8_model = nn.Sequential(
    Linear8bitLt(64, 64, has_fp16_weights=False),
    Linear8bitLt(64, 64, has_fp16_weights=False)
)

int8_model.load_state_dict(torch.load("model.pt"))
int8_model = int8_model.to(0)  # Quantization happens here, when the weights are moved to the GPU

input_ = torch.randn(8, 64, dtype=torch.float16)
hidden_states = int8_model(input_.to(0))

mem_int8 = get_model_memory_footprint(int8_model)
mem_fp16 = get_model_memory_footprint(fp16_model)

# fp16 weights take 2 bytes per parameter vs 1 byte for int8, so the ratio should be close to 2
print(f"Relative difference: {mem_fp16/mem_int8}")
129 changes: 129 additions & 0 deletions assets/96_hf_bitsandbytes_integration/mantissa.svg
Binary file added assets/96_hf_bitsandbytes_integration/tim.jpeg
Binary file added assets/96_hf_bitsandbytes_integration/younes.png
462 changes: 462 additions & 0 deletions hf-bitsandbytes-integration.md

