The repo has this assertion:

`assert not (train_args.fsdp and train_args.gradient_checkpointing), "currently, we don't support both options. open an issue for details."`

Why is that?

---

The incompatibility comes from Accelerate's FSDP integration. Using pure PyTorch FSDP would offer better compatibility with gradient checkpointing and potentially much faster training, especially on newer PyTorch versions (unfortunately, we're currently pinned to PyTorch 2.0.1 because of Accelerate).

I plan to implement an Accelerate-free version in the future, but I'm currently busy with interviews, so I don't have a clear timeline yet :(
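For anyone who wants to try the pure-PyTorch route in the meantime, here is a minimal sketch (my own example, not this repo's code; `Block` and every other name in it are placeholders) of FSDP combined with non-reentrant activation checkpointing, following the standard PyTorch recipe:

```python
# Hedged sketch, not this repo's implementation: pure PyTorch FSDP plus
# activation checkpointing. `Block` is a made-up placeholder module.
import functools

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.algorithms._checkpoint.checkpoint_wrapper import (
    CheckpointImpl,
    apply_activation_checkpointing,
    checkpoint_wrapper,
)
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy


class Block(nn.Module):  # placeholder "transformer block"
    def __init__(self, dim: int = 256):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.ff(x)


dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = nn.Sequential(*[Block() for _ in range(4)]).cuda()

# Shard each block with FSDP first...
wrap_policy = functools.partial(
    transformer_auto_wrap_policy, transformer_layer_cls={Block}
)
model = FSDP(model, auto_wrap_policy=wrap_policy)

# ...then wrap the same blocks with non-reentrant activation checkpointing.
apply_activation_checkpointing(
    model,
    checkpoint_wrapper_fn=functools.partial(
        checkpoint_wrapper, checkpoint_impl=CheckpointImpl.NO_REENTRANT
    ),
    check_fn=lambda m: isinstance(m, Block),
)
```

Launch with `torchrun --nproc_per_node=<N> script.py`. The key point is that the same modules are first sharded by FSDP and then wrapped for checkpointing, which (per the assert above) the Accelerate-managed setup in this repo currently can't do.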