
CUDA out of memory error. #240

Closed · SVC04 opened this issue Dec 21, 2023 · 2 comments · Fixed by #245
Labels: question (Further information is requested)

Comments

SVC04 commented Dec 21, 2023

Hello,

I am using inseq to generate explanations for a text summarization task with long input articles.
While the code works well with a short article, it throws an error when the article length is increased.
I am running this on a cloud GPU with 48 GB of memory.
The following is the code I have used.

import inseq

LONG_ARTICLE = """anxiety affects quality of life in those living
with parkinson 's disease ( pd ) more so than
overall cognitive status , motor deficits , apathy
, and depression [ 13 ] ."""

model = "google/bigbird-pegasus-large-pubmed"
model = inseq.load_model(model, "attention")

out = model.attribute(LONG_ARTICLE)


I get the following error.

CUDA out of memory. Tried to allocate 15.02 GiB (GPU 0; 47.54 GiB total capacity; 32.75 GiB already allocated; 13.67 GiB free; 32.87 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I have tried setting max_split_size_mb = 256 as an environment variable and clearing the cache as well, but nothing worked.
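For reference, this is roughly how I set it (the allocator option only takes effect if it is set before torch initializes CUDA, so I export it before any import that touches the GPU):

import os

# Must be set before the first CUDA allocation to take effect.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:256"

import torch
import inseq

torch.cuda.empty_cache()  # clearing the allocator cache as well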

SVC04 added the question label on Dec 21, 2023
gsarti (Member) commented Dec 21, 2023

Hi @seemavishal, could you paste the whole stack trace of the error?

From the looks of it, provided that the attention attribution method does not perform any additional computation beyond the forward passes required for generation, I think the error might be raised at the generation stage in transformers.

If this is the case, the only solution would be to use a smaller model, or get access to a machine with more GPU memory!
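To verify this, you could check whether plain generation alone already runs out of memory, without any attribution overhead. A minimal sketch using the underlying transformers classes (the truncation setting here is just illustrative):

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/bigbird-pegasus-large-pubmed"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to("cuda")

# Generation only, with no attention weights kept around:
# if this already OOMs, the problem is upstream of inseq.
inputs = tokenizer(LONG_ARTICLE, return_tensors="pt", truncation=True).to("cuda")
summary_ids = model.generate(**inputs)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])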

SVC04 (Author) commented Dec 21, 2023

Hi @gsarti

I am pasting the error trace below.


/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py:1518: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use and modify the model generation configuration (see https://huggingface.co/docs/transformers/generation_strategies#default-text-generation-configuration )
  warnings.warn(
Attributing with attention...:   1%|          | 1/168 [00:00<?, ?it/s]
---------------------------------------------------------------------------
OutOfMemoryError                          Traceback (most recent call last)
Cell In [10], line 1
----> 1 out = model.attribute(LONG_ARTICLE)

File /usr/local/lib/python3.9/dist-packages/inseq/models/attribution_model.py:445, in AttributionModel.attribute(self, input_texts, generated_texts, method, override_default_attribution, attr_pos_start, attr_pos_end, show_progress, pretty_progress, output_step_attributions, attribute_target, step_scores, include_eos_baseline, attributed_fn, device, batch_size, generate_from_target_prefix, generation_args, **kwargs)
    443     logger.info("Batched attribution currently not supported for LIME. Using batch size of 1.")
    444     batch_size = 1
--> 445 attribution_outputs = attribution_method.prepare_and_attribute(
    446     input_texts,
    447     generated_texts,
    448     batch_size=batch_size,
    449     attr_pos_start=attr_pos_start,
    450     attr_pos_end=attr_pos_end,
    451     show_progress=show_progress,
    452     pretty_progress=pretty_progress,
    453     output_step_attributions=output_step_attributions,
    454     attribute_target=attribute_target,
    455     step_scores=step_scores,
    456     include_eos_baseline=include_eos_baseline,
    457     attributed_fn=attributed_fn,
    458     attribution_args=attribution_args,
    459     attributed_fn_args=attributed_fn_args,
    460     step_scores_args=step_scores_args,
    461 )
    462 attribution_output = merge_attributions(attribution_outputs)
    463 attribution_output.info["input_texts"] = input_texts

File /usr/local/lib/python3.9/dist-packages/inseq/attr/attribution_decorators.py:71, in batched.<locals>.batched_wrapper(self, batch_size, *args, **kwargs)
     68         raise TypeError(f"Unsupported type {type(seq)} for batched attribution computation.")
     70 if batch_size is None:
---> 71     out = f(self, *args, **kwargs)
     72     return out if isinstance(out, list) else [out]
     73 batched_args = [get_batched(batch_size, arg) for arg in args]

File /usr/local/lib/python3.9/dist-packages/inseq/attr/feat/feature_attribution.py:237, in FeatureAttribution.prepare_and_attribute(self, sources, targets, attr_pos_start, attr_pos_end, show_progress, pretty_progress, output_step_attributions, attribute_target, step_scores, include_eos_baseline, attributed_fn, attribution_args, attributed_fn_args, step_scores_args)
    233 # If prepare_and_attribute was called from AttributionModel.attribute,
    234 # attributed_fn is already a Callable. Keep here to allow for usage independently
    235 # of AttributionModel.attribute.
    236 attributed_fn = self.attribution_model.get_attributed_fn(attributed_fn)
--> 237 attribution_output = self.attribute(
    238     batch,
    239     attributed_fn=attributed_fn,
    240     attr_pos_start=attr_pos_start,
    241     attr_pos_end=attr_pos_end,
    242     show_progress=show_progress,
    243     pretty_progress=pretty_progress,
    244     output_step_attributions=output_step_attributions,
    245     attribute_target=attribute_target,
    246     step_scores=step_scores,
    247     attribution_args=attribution_args,
    248     attributed_fn_args=attributed_fn_args,
    249     step_scores_args=step_scores_args,
    250 )
    251 # Same here, repeated from AttributionModel.attribute
    252 # to allow independent usage
    253 attribution_output.info["include_eos_baseline"] = include_eos_baseline

File /usr/local/lib/python3.9/dist-packages/inseq/attr/feat/feature_attribution.py:431, in FeatureAttribution.attribute(self, batch, attributed_fn, attr_pos_start, attr_pos_end, show_progress, pretty_progress, output_step_attributions, attribute_target, step_scores, attribution_args, attributed_fn_args, step_scores_args)
    429 for step in range(attr_pos_start, iter_pos_end):
    430     tgt_ids, tgt_mask = batch.get_step_target(step, with_attention=True)
--> 431     step_output = self.filtered_attribute_step(
    432         batch[:step],
    433         target_ids=tgt_ids.unsqueeze(1),
    434         attributed_fn=attributed_fn,
    435         target_attention_mask=tgt_mask.unsqueeze(1),
    436         attribute_target=attribute_target,
    437         step_scores=step_scores,
    438         attribution_args=attribution_args,
    439         attributed_fn_args=attributed_fn_args,
    440         step_scores_args=step_scores_args,
    441     )
    442     # Add batch information to output
    443     step_output = self.attribution_model.formatter.enrich_step_output(
    444         self.attribution_model,
    445         step_output,
   (...)
    450         contrast_targets_alignments=contrast_targets_alignments,
    451     )

File /usr/local/lib/python3.9/dist-packages/inseq/attr/feat/feature_attribution.py:579, in FeatureAttribution.filtered_attribute_step(self, batch, target_ids, attributed_fn, target_attention_mask, attribute_target, step_scores, attribution_args, attributed_fn_args, step_scores_args)
    577         attribution_args = {**attribution_args, **hidden_states_dict}
    578 # Perform attribution step
--> 579 step_output = self.attribute_step(
    580     attribute_main_args,
    581     attribution_args,
    582 )
    583 # Format step scores arguments and calculate step scores
    584 for step_score in step_scores:

File /usr/local/lib/python3.9/dist-packages/inseq/attr/feat/internals_attribution.py:114, in AttentionWeightsAttribution.attribute_step(self, attribute_fn_main_args, attribution_args)
    109 def attribute_step(
    110     self,
    111     attribute_fn_main_args: Dict[str, Any],
    112     attribution_args: Dict[str, Any],
    113 ) -> MultiDimensionalFeatureAttributionStepOutput:
--> 114     return self.method.attribute(**attribute_fn_main_args, **attribution_args)

File /usr/local/lib/python3.9/dist-packages/inseq/attr/feat/internals_attribution.py:87, in AttentionWeightsAttribution.AttentionWeights.attribute(self, inputs, additional_forward_args, encoder_self_attentions, decoder_self_attentions, cross_attentions)
     85         target_attributions = None
     86         sequence_scores["decoder_self_attentions"] = decoder_self_attentions
---> 87     sequence_scores["encoder_self_attentions"] = encoder_self_attentions.clone().permute(0, 3, 4, 1, 2)
     88     return MultiDimensionalFeatureAttributionStepOutput(
     89         source_attributions=cross_attentions[..., -1, :].clone().permute(0, 3, 1, 2),
     90         target_attributions=target_attributions,
     91         sequence_scores=sequence_scores,
     92         _num_dimensions=2,  # num_layers, num_heads
     93     )
     94 else:

OutOfMemoryError: CUDA out of memory. Tried to allocate 15.02 GiB (GPU 0; 47.54 GiB total capacity; 32.75 GiB already allocated; 13.67 GiB free; 32.87 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
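For what it's worth, the failing 15.02 GiB allocation matches the size of the full encoder self-attention tensor being cloned in internals_attribution.py. A rough back-of-envelope check (the layer and head counts below are assumed for bigbird-pegasus-large, not read from the checkpoint config, and the input length is illustrative):

# Encoder self-attentions stacked as (batch, num_layers, num_heads, src_len, src_len),
# stored in float32 (4 bytes per element).
num_layers, num_heads = 16, 16  # assumed values for bigbird-pegasus-large
src_len = 3968                  # illustrative tokenized input length
total_bytes = 1 * num_layers * num_heads * src_len * src_len * 4
print(f"{total_bytes / 2**30:.2f} GiB")  # -> 15.02 GiB, matching the error

Since the tensor grows quadratically with the input length, this would explain why short articles work while long ones do not.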

gsarti linked a pull request (#245) on Jan 12, 2024 that will close this issue
SVC04 changed the title from "CUDA out of memory error while trying to generate explanation for long text input article for summarization problem on GPU." to "CUDA out of memory error." on Jan 31, 2024