[core] Fix safetensors serialization for shared tensors #1101
Conversation
The documentation is not available anymore as the PR was closed or merged.
```python
if isinstance(tensor, torch.Tensor):
    ptrs[id_tensor_storage(tensor)].append(name)
else:
    # In the non-tensor case, fall back to the pointer of the object itself
    ptrs[id(tensor)].append(name)
```
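The grouping logic in this snippet can be sketched in plain Python. This is a minimal, dependency-free sketch: the real code keys `torch.Tensor` entries by the transformers helper `id_tensor_storage` (so views of the same storage collide), while this illustration keys everything by `id()`:

```python
from collections import defaultdict

def group_shared_entries(state_dict):
    """Group state-dict names by the identity of their backing object.

    Entries that share the same backing object (aliased tensors in the
    real code) end up in the same list.
    """
    ptrs = defaultdict(list)
    for name, value in state_dict.items():
        # Real code: id_tensor_storage(value) for torch.Tensor values,
        # id(value) as a fallback for non-tensor entries (e.g. bnb state dicts).
        ptrs[id(value)].append(name)
    return ptrs

shared = [1.0, 2.0]  # stand-in for a tensor storage shared by two parameters
sd = {"a.weight": shared, "b.weight": shared, "c.weight": [3.0]}
groups = group_shared_entries(sd)
shared_names = [names for names in groups.values() if len(names) > 1]
print(shared_names)  # [['a.weight', 'b.weight']]
```

Any group with more than one name marks an aliasing conflict that safetensors would reject.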
Not sure if this is needed; I added it in the transformers code to handle the bnb state dict case. It should be safe to keep, but I'm happy to remove it as well.
It's probably better to follow the same steps as transformers.
Thanks for investigating this. Just a minor comment about adding a comment; apart from that, LGTM.
```python
@@ -199,6 +203,25 @@ def save_pretrained(
        os.makedirs(output_dir, exist_ok=True)

        if safe_serialization:
            # Safetensors does not allow tensor aliasing.
```
This block is basically copied from https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py#L2111-L2134, right? Let's add a comment.
Ok perfect!
What does this PR do?
Fixes #1079

Safetensors does not support tensor aliasing. Before saving the model state dict, we need to loop over it, check for aliased tensors, and `clone()` them to avoid any error.

cc @BenjaminBossan @pacman100
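The fix described above can be sketched end to end in plain Python. This is a hedged, dependency-free sketch: `copy.deepcopy` stands in for torch's `tensor.clone()`, and objects are grouped by `id()` rather than by the `id_tensor_storage` helper the real code uses; the function names here are illustrative, not the PR's actual names:

```python
import copy
from collections import defaultdict

def break_aliases(state_dict):
    """Return a state dict in which aliased entries are replaced by copies,
    so every name maps to a distinct backing object (as safetensors requires)."""
    ptrs = defaultdict(list)
    for name, value in state_dict.items():
        ptrs[id(value)].append(name)
    out = dict(state_dict)
    for names in ptrs.values():
        # Keep the first name pointing at the original object;
        # copy ("clone") the remaining aliases.
        for name in names[1:]:
            out[name] = copy.deepcopy(out[name])
    return out

shared = [1.0, 2.0]  # stand-in for a tensor shared by two parameter names
sd = {"a": shared, "b": shared}
fixed = break_aliases(sd)
print(fixed["a"] is fixed["b"])  # False: the alias was broken
print(fixed["a"] == fixed["b"])  # True: the values are unchanged
```

After this pass, each entry owns its data, so a safetensors-style save no longer hits the aliasing error.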