Bug description
In `FSDPStrategy.save_checkpoint`, the `filepath` variable is transformed in `pytorch-lightning/src/lightning/pytorch/strategies/fsdp.py` (line 562 at commit 3627c5b). This transformation only makes sense for sharded checkpointing, and in fact it mangles any legitimate fsspec path that is passed in.
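For illustration, a minimal sketch of the mangling, assuming the transformation round-trips the filepath through `pathlib.Path` (the exact code is at the line referenced above):

```python
from pathlib import Path

# Round-tripping an fsspec URL through pathlib collapses the "//"
# after the scheme, so the result is no longer a valid remote URL.
mangled = Path("s3://bucket/checkpoints/model.ckpt")
print(mangled)  # s3:/bucket/checkpoints/model.ckpt
```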
When `self._state_dict_type == "full"`, the normal `CheckpointIO` workflow is then invoked, but with the mangled path. The expected behavior is that if the user chooses the full state dict type, `CheckpointIO` and remote paths work as usual; currently, full state dict checkpoints cannot be saved to remote paths.
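A possible interim workaround, sketched under the assumption that plain local paths survive the transformation unchanged (`trainer` is the `Trainer` from the reproduction below; both paths are placeholders):

```python
import fsspec

local_ckpt = "/tmp/model.ckpt"                         # placeholder local path
remote_ckpt = "s3://my-bucket/checkpoints/model.ckpt"  # placeholder remote path

# Saving to a local filesystem path avoids the URL mangling entirely.
trainer.save_checkpoint(local_ckpt)

# Then upload the finished checkpoint to the remote path by hand.
with open(local_ckpt, "rb") as src, fsspec.open(remote_ckpt, "wb") as dst:
    dst.write(src.read())
```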
What version are you seeing the problem on?
v2.4
How to reproduce the bug
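A minimal sketch, assuming a machine with at least two GPUs and an fsspec backend such as `s3fs` installed; `ToyModule` and the bucket URL are placeholders:

```python
import torch
from torch.utils.data import DataLoader
import lightning.pytorch as pl
from lightning.pytorch.strategies import FSDPStrategy


class ToyModule(pl.LightningModule):
    """Stand-in module; any LightningModule reproduces the issue."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        return self.layer(batch).sum()

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


if __name__ == "__main__":
    trainer = pl.Trainer(
        accelerator="gpu",
        devices=2,
        max_steps=1,
        strategy=FSDPStrategy(state_dict_type="full"),
    )
    trainer.fit(ToyModule(), DataLoader(torch.randn(64, 32), batch_size=8))

    # Placeholder remote URL. With state_dict_type="full" this should go
    # through the normal CheckpointIO workflow, but the filepath is mangled
    # inside FSDPStrategy.save_checkpoint before CheckpointIO sees it.
    trainer.save_checkpoint("s3://my-bucket/checkpoints/model.ckpt")
```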
Error messages and logs
Environment
Current environment
#- PyTorch Lightning Version (e.g., 2.4.0):
#- PyTorch Version (e.g., 2.4):
#- Python version (e.g., 3.12):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning (`conda`, `pip`, source):
More info
No response