
support for mamba #915

Merged — 15 commits merged into main from mamba-llm on Dec 9, 2023
Conversation

@winglian (Collaborator) commented Dec 4, 2023

You'll need to run `pip install git+https://github.com/state-spaces/mamba.git` or `pip install -e .[mamba]`.

This is still a WIP, as the loss explodes to NaN.

@winglian winglian merged commit 40a6362 into main Dec 9, 2023
4 checks passed
@winglian winglian deleted the mamba-llm branch December 9, 2023 17:10
@NanoCode012 (Collaborator)
Just one comment: we should maybe add a warning that safetensors isn't available when saving, rather than crashing.
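A minimal sketch of that suggestion (the helper name and the mamba check are hypothetical, not axolotl's actual code): when safetensors serialization is requested for a model type that doesn't support it, log a warning and fall back to a plain PyTorch checkpoint instead of raising.

```python
import logging

logger = logging.getLogger(__name__)

def resolve_save_format(model_type, safe_serialization=True):
    # Hypothetical helper: safetensors can't serialize mamba checkpoints,
    # so warn and fall back to torch.save instead of crashing.
    if safe_serialization and model_type == "mamba":
        logger.warning(
            "safetensors is not available for mamba models; "
            "falling back to a plain PyTorch checkpoint"
        )
        return "pytorch"
    return "safetensors" if safe_serialization else "pytorch"
```

The caller would then pick `save_file` from safetensors or `torch.save` based on the returned format, so training runs finish with a usable checkpoint either way.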

mkeoliya pushed a commit to mkeoliya/axolotl that referenced this pull request Dec 15, 2023
* support for mamba
* more mamba fixes
* use fork for mamba kwargs fix
* grad checkpointing doesn't work
* fix extras for mamba
* mamba loss fix
* use fp32 and remove verbose logging
* mamba fixes
* fix collator for mamba
* set model_type on training_args
* don't save safetensors for mamba
* update mamba config to disable safetensor checkpoints, install for tests
* no evals for mamba tests
* handle save_pretrained
* handle unused safetensors arg
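As an illustration of the "fix collator for mamba" item, a hedged sketch (function name and details assumed, not the actual axolotl collator): mamba's forward pass takes no attention_mask, so a collator only needs to pad `input_ids` to the batch maximum and supply matching `labels`, dropping the mask entirely.

```python
def mamba_collate(batch, pad_token_id=0):
    # Hypothetical sketch: pad each example's input_ids to the batch max
    # length and reuse them as labels; no attention_mask is built because
    # mamba's forward signature doesn't accept one.
    max_len = max(len(ex["input_ids"]) for ex in batch)
    input_ids = [
        ex["input_ids"] + [pad_token_id] * (max_len - len(ex["input_ids"]))
        for ex in batch
    ]
    return {"input_ids": input_ids, "labels": [row[:] for row in input_ids]}
```

In practice the padded lists would be converted to tensors before reaching the model; the sketch keeps plain lists to show only the padding and key-selection logic.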