
Low bit shampoo #1257

Open
msaroufim opened this issue Nov 9, 2024 · 0 comments
Labels: enhancement (New feature or request), optimizer

Comments

@msaroufim (Member) commented Nov 9, 2024

Opening this on behalf of @winglian

An optimizer that many folks have been interested in is Shampoo (https://arxiv.org/abs/1802.09568). Its fans say it converges faster because it uses second-order information, while still managing to keep memory requirements in check.
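To make the second-order idea concrete, here is a toy sketch of Shampoo's core update for a 2-D gradient: accumulate left/right preconditioner statistics and apply their inverse fourth roots. This is a simplified illustration (no grafting, no update intervals, no root solvers from the paper), with all names chosen for this sketch.

```python
import numpy as np

def shampoo_step(G, L, R, lr=0.1, eps=1e-6):
    """One simplified Shampoo step for a 2-D gradient G.

    L and R are the left/right preconditioner statistics; the update
    direction is L^{-1/4} @ G @ R^{-1/4}. Toy sketch only, not the
    full algorithm from the paper.
    """
    L += G @ G.T          # accumulate left statistics
    R += G.T @ G          # accumulate right statistics

    def inv_fourth_root(M):
        # Matrix inverse 4th root of a PSD matrix via eigendecomposition.
        w, V = np.linalg.eigh(M)
        return V @ np.diag((w + eps) ** -0.25) @ V.T

    update = inv_fourth_root(L) @ G @ inv_fourth_root(R)
    return -lr * update, L, R

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 3))
L = np.zeros((4, 4))
R = np.zeros((3, 3))
delta, L, R = shampoo_step(G, L, R)
```

Note that L and R are the memory cost being discussed: they are m×m and n×n per m×n parameter, which is what quantization would shrink.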

To further keep the memory requirements in check, we can quantize the optimizer state. There are existing papers that give good recipes for how this could work with int4: https://arxiv.org/abs/2405.18144
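The usual recipe in these papers is block-wise absmax quantization of the state tensors: split the tensor into fixed-size blocks, store one floating-point scale per block, and keep the values in a low-bit integer format. A minimal sketch (signed 4-bit range [-7, 7], no nibble packing, assuming the tensor size divides the block size):

```python
import numpy as np

def quantize_blockwise(x, block=64, qmax=7):
    """Block-wise absmax quantization to a signed 4-bit range.

    Returns int8-stored codes in [-qmax, qmax] plus one fp scale per
    block. Sketch only: assumes x.size is divisible by `block`.
    """
    flat = x.ravel()
    assert flat.size % block == 0
    blocks = flat.reshape(-1, block)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)  # avoid divide-by-zero
    q = np.clip(np.round(blocks / scales), -qmax, qmax).astype(np.int8)
    return q, scales

def dequantize_blockwise(q, scales, shape):
    # Multiply each block by its scale and restore the original shape.
    return (q * scales).ravel().reshape(shape)
```

A real int4 implementation would also pack two codes per byte; the block-wise scales are what keep the quantization error local to each block.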

As for implementation, we have many reference examples for int8, int4, and fp8 Adam and AdamW at https://github.com/pytorch/ao/tree/main/torchao/prototype/low_bit_optim, and there is an in-progress contribution in #1231
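The pattern those low-bit optimizer examples follow can be sketched generically: keep the optimizer state quantized between steps, dequantize to fp32 inside the step, apply the update, then requantize. Shown here with a hypothetical quantized-momentum SGD (per-tensor int8 scaling for brevity; this is not the torchao API, just the shape of the idea):

```python
import numpy as np

QMAX = 127  # signed int8 range

def quant(x):
    # Per-tensor absmax quantization to int8 with a single fp scale.
    s = max(np.abs(x).max() / QMAX, 1e-12)
    return np.clip(np.round(x / s), -QMAX, QMAX).astype(np.int8), s

def sgd_momentum_step(param, grad, q_buf, scale, lr=0.01, mom=0.9):
    """One SGD-with-momentum step with an int8-quantized momentum buffer.

    Sketch of the low-bit-optimizer pattern: dequantize state, update
    in fp32, requantize state before storing it.
    """
    buf = q_buf.astype(np.float32) * scale   # dequantize state
    buf = mom * buf + grad                   # fp32 momentum update
    q_buf, scale = quant(buf)                # requantize state
    return param - lr * buf, q_buf, scale
```

The same skeleton applies to a quantized Shampoo: the preconditioner statistics L and R would be the buffers held in low precision between steps.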

Ideally, the work above can be turned into a guide on how to implement a new low-bit optimizer, so that someone who already understands the optimizer they're targeting can implement it in about a day's worth of work.

cc @gau-nernst @andrewor14 @vkuzo @janeyx99 @supriyar

@msaroufim added the optimizer and enhancement (New feature or request) labels Nov 9, 2024