
static cache: RuntimeError: cannot mutate tensors with frozen storage #33178

Closed

dvrogozh opened this issue Aug 28, 2024 · 6 comments

Comments

@dvrogozh (Contributor) commented Aug 28, 2024

With:

On:

  • CPU
  • NVidia A10

The test "static cache works with torch.export()" fails with:

# RUN_SLOW=1 python3 -m pytest --pspec -vv -k CacheTest tests/utils/test_cache_utils.py

RuntimeError: cannot mutate tensors with frozen storage

While executing %index_copy_ : [num_users=0] = call_method[target=index_copy_](args = (%k_out, 2, %l_input_pos_, %k_embed), kwargs = {})
Original traceback:
  File "/home/dvrogozh/git/huggingface/transformers/tests/utils/test_cache_utils.py", line 210, in forward
    outs = self.model(
  File "/home/dvrogozh/git/huggingface/transformers/src/transformers/models/gemma/modeling_gemma.py", line 1076, in forward
    outputs = self.model(
  File "/home/dvrogozh/git/huggingface/transformers/src/transformers/models/gemma/modeling_gemma.py", line 889, in forward
    layer_outputs = decoder_layer(
  File "/home/dvrogozh/git/huggingface/transformers/src/transformers/models/gemma/modeling_gemma.py", line 611, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/dvrogozh/git/huggingface/transformers/src/transformers/models/gemma/modeling_gemma.py", line 521, in forward
    key_states, value_states = past_key_value.update(key_states, value_states, self.layer_idx, cache_kwargs)
  File "/home/dvrogozh/git/huggingface/transformers/src/transformers/cache_utils.py", line 1101, in update
    k_out.index_copy_(2, cache_position, key_states)
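
For context, here is a minimal standalone sketch of the failing pattern. It is illustrative, not the exact transformers test: the module and tensor names are made up, and whether it reproduces depends on the PyTorch version.

# Sketch: an in-place index_copy_ on a module buffer traced through
# torch.export, mirroring the StaticCache update pattern above.
import torch

class TinyStaticCache(torch.nn.Module):  # hypothetical stand-in for StaticCache
    def __init__(self):
        super().__init__()
        # Pre-allocated cache buffer, analogous to a StaticCache.key_cache entry.
        self.register_buffer("k_cache", torch.zeros(1, 1, 8, 4))

    def forward(self, cache_position, key_states):
        k_out = self.k_cache
        # Same in-place update as cache_utils.py line 1101 in the traceback.
        k_out.index_copy_(2, cache_position, key_states)
        return k_out

ep = torch.export.export(
    TinyStaticCache(),
    (torch.tensor([0]), torch.ones(1, 1, 1, 4)),
)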

I observe that adding a .clone() to the following two tensors fixes the issue. This solution was suggested in pytorch/pytorch#127571 (comment); however, I am not sure it is the correct fix. See the #33178 draft PR with this change.

k_out = self.key_cache[layer_idx]
v_out = self.value_cache[layer_idx]
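
For reference, the draft change amounts to roughly the following inside StaticCache.update (a sketch assuming the surrounding method body in cache_utils.py; the index_copy_ calls are the existing ones):

# .clone() yields tensors whose storage is not frozen, so the in-place
# index_copy_ below no longer fails under torch.export. The copies are
# also what makes this fix questionable.
k_out = self.key_cache[layer_idx].clone()
v_out = self.value_cache[layer_idx].clone()

k_out.index_copy_(2, cache_position, key_states)
v_out.index_copy_(2, cache_position, value_states)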

CC: @gante @SunMarc

dvrogozh added a commit to dvrogozh/transformers that referenced this issue Aug 28, 2024
For huggingface#33178

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
@dvrogozh (Contributor, Author)

Another observation: this issue seems to appear after commit 1c36db6 (#32543). @SunMarc

@dvrogozh (Contributor, Author)

I also found this issue on the PyTorch side, which seems relevant:

@guangy10 (Contributor) commented Sep 4, 2024

Workaround in #33287

@SunMarc (Member) commented Sep 4, 2024

The workaround proposed by @guangy10 sounds better as it doesn't involve copying!
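
(Illustration only: one copy-free direction is torch.export's non-strict mode, which may be more permissive about in-place buffer mutation. Whether this is what #33287 actually does is an assumption here, and model / input_ids are placeholder names.)

# Hedged sketch, not necessarily the approach taken in #33287.
exported_program = torch.export.export(model, (input_ids,), strict=False)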


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@dvrogozh (Contributor, Author) commented Oct 1, 2024

This issue was addressed by:

dvrogozh closed this as completed Oct 1, 2024