Blog post on bitsandbytes integration on Hugging Face (#463)
* first commit
* add new thumbnails
* add more content
* add new gif
* Update _blog.yml
* rename files
* Apply suggestions from code review
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
* change content a bit - add more details and adapt from stas suggestions
* re-write text: part 1
* few modifs - add credits - add image - modify a bit the content
* modify a bit
* add more content
* add image
* paraphrase a bit
* add more content
* add more content
* some improvements
* add thumbnail
* add more text + fix table
* fix table
* fix tables
* add stas as author
* add a last sentence
* edit some more
* few modifs
* modify thumbail
* add thumbnail
* add removed comment
* add photos
* add more infos
* Apply suggestions from code review
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add files via upload
* add steven to the credits!
* edits
* edits
* edits
* edits
* add script
* change to std err
* refactor a bit the tables
* add Tim's comments
* remove separators
* explain why it is slow
* Update hf-bitsandbytes-integration.md
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Add links to paper
* delete dummy file
* add correct link to paper
* add more explanation on speed
* update figure
* replace authors by we
* add freezed image
* remove old table
* Update hf-bitsandbytes-integration.md
  Some slight edits.
* Apply suggestions from code review
* Apply suggestions from code review

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Tim Dettmers <TimDettmers@users.noreply.github.com>
1 parent 822183c, commit d339f36
Showing 23 changed files with 642 additions and 1 deletion.
@@ -0,0 +1,41 @@

    import torch
    import torch.nn as nn

    from bitsandbytes.nn import Linear8bitLt

    # Utility function

    def get_model_memory_footprint(model):
        r"""
            Partially copied and inspired from: https://discuss.pytorch.org/t/gpu-memory-that-model-uses/56822/2
        """
        return sum([param.nelement() * param.element_size() for param in model.parameters()])

    # Main script

    fp16_model = nn.Sequential(
        nn.Linear(64, 64),
        nn.Linear(64, 64)
    ).to(torch.float16)

    # Train and save your model!

    torch.save(fp16_model.state_dict(), "model.pt")

    # Define your int8 model!

    int8_model = nn.Sequential(
        Linear8bitLt(64, 64, has_fp16_weights=False),
        Linear8bitLt(64, 64, has_fp16_weights=False)
    )

    int8_model.load_state_dict(torch.load("model.pt"))
    int8_model = int8_model.to(0) # Quantization happens here

    input_ = torch.randn(8, 64, dtype=torch.float16)
    hidden_states = int8_model(input_.to(0))

    mem_int8 = get_model_memory_footprint(int8_model)
    mem_fp16 = get_model_memory_footprint(fp16_model)

    print(f"Relative difference: {mem_fp16/mem_int8}")
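The memory comparison in the script above can be approximated on a CPU, without bitsandbytes or a GPU. The sketch below reuses the same footprint formula on the fp16 model and then estimates an int8 footprint under a stated assumption: weight matrices take 1 byte per element and biases stay in fp16. That layout is hypothetical and chosen only for illustration; it does not describe how `Linear8bitLt` actually stores its parameters, so the ratio the real script prints may differ.

```python
import torch
import torch.nn as nn


def get_model_memory_footprint(model: nn.Module) -> int:
    """Bytes occupied by the model's parameters (same formula as the script above)."""
    return sum(p.nelement() * p.element_size() for p in model.parameters())


# Same toy model as in the script: two 64x64 linear layers cast to fp16.
fp16_model = nn.Sequential(nn.Linear(64, 64), nn.Linear(64, 64)).to(torch.float16)
fp16_bytes = get_model_memory_footprint(fp16_model)

# ASSUMPTION (illustrative only): 2-D weights are stored at 1 byte per element,
# 1-D biases keep their fp16 size. Not a description of Linear8bitLt internals.
int8_estimate = sum(
    p.nelement() * (1 if p.dim() == 2 else p.element_size())
    for p in fp16_model.parameters()
)

ratio = fp16_bytes / int8_estimate
print(f"fp16: {fp16_bytes} bytes, int8 estimate: {int8_estimate} bytes, ratio: {ratio:.3f}")
```

Under this assumption the ratio lands just below 2, because the biases are not shrunk; for larger layers the bias share becomes negligible and the ratio approaches 2.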
Binary file added (+17.4 KB): assets/96_hf_bitsandbytes_integration/tf32-Mantissa-chart-hi-res-FINAL.png