CUDA OOM doing reading comprehension on A10 24GB VRAM GPU #81

Open
tleyden opened this issue Dec 5, 2023 · 0 comments
tleyden commented Dec 5, 2023

Running the reading comprehension pipeline against a subset of the nuclear patent dataset fails with a CUDA out-of-memory error:

12/05/2023 16:14:48 - INFO - dalm.pipelines.reading_comprehension_pipeline - LLM RC dataset generated text of length 2415 from context of length 670
12/05/2023 16:14:48 - INFO - dalm.pipelines.reading_comprehension_pipeline - Writing unprocessed LLM output to context_data_c8307498-165e-49b6-b073-214fbe9bb8e0.csv8_0.json
12/05/2023 16:14:48 - INFO - dalm.pipelines.reading_comprehension_pipeline - Writing Q & A chat completions of length 9 to context_data_c8307498-165e-49b6-b073-214fbe9bb8e0.csv8_0.json
12/05/2023 16:15:17 - INFO - dalm.pipelines.reading_comprehension_pipeline - LLM RC dataset generated text of length 2855 from context of length 11202
12/05/2023 16:15:17 - INFO - dalm.pipelines.reading_comprehension_pipeline - Writing unprocessed LLM output to context_data_c8307498-165e-49b6-b073-214fbe9bb8e0.csv9_0.json
12/05/2023 16:15:17 - INFO - dalm.pipelines.reading_comprehension_pipeline - Writing Q & A chat completions of length 9 to context_data_c8307498-165e-49b6-b073-214fbe9bb8e0.csv9_0.json
/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py:1101: UserWarning: You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset
12/05/2023 16:15:41 - INFO - dalm.pipelines.reading_comprehension_pipeline - LLM RC dataset generated text of length 2240 from context of length 2841
12/05/2023 16:15:41 - WARNING - dalm.datasets.reading_comprehension_generation.utils - Found a question with no answer: {'question': 's and answer task:', 'answer': 'TBD'}.  Skipping.
12/05/2023 16:15:41 - INFO - dalm.pipelines.reading_comprehension_pipeline - Writing unprocessed LLM output to context_data_c8307498-165e-49b6-b073-214fbe9bb8e0.csv10_0.json
12/05/2023 16:15:41 - INFO - dalm.pipelines.reading_comprehension_pipeline - Writing Q & A chat completions of length 7 to context_data_c8307498-165e-49b6-b073-214fbe9bb8e0.csv10_0.json

12/05/2023 16:15:42 - ERROR - root - Training failed with exception: CUDA out of memory. Tried to allocate 2.00 GiB (GPU 0; 22.20 GiB total capacity; 17.54 GiB already allocated; 327.12 MiB free; 20.83 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Traceback (most recent call last):
  File "//train_generator.py", line 153, in <module>
    create_reading_comprehension_dataset_and_train()
  File "//train_generator.py", line 134, in create_reading_comprehension_dataset_and_train
    pipeline(
  File "/opt/conda/lib/python3.10/site-packages/dalm/pipelines/reading_comprehension_pipeline.py", line 146, in pipeline
    for index, text_identifier, context, gen_text in llm_rc_dataset_generator:
  File "/opt/conda/lib/python3.10/site-packages/dalm/datasets/reading_comprehension_generation/synthetic_based.py", line 119, in generate_synthetic_dataset
    gen_text = generate_synthetic_data(model_pipeline, chunk_, generation_params)
  File "/opt/conda/lib/python3.10/site-packages/dalm/datasets/reading_comprehension_generation/synthetic_based.py", line 82, in generate_synthetic_data
    outputs = model_pipeline(prompt, **generation_params)
  File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/text_generation.py", line 208, in __call__
    return super().__call__(text_inputs, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1140, in __call__
    return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
  File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1147, in run_single
    model_outputs = self.forward(model_inputs, **forward_params)
  File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1046, in forward
    model_outputs = self._forward(model_inputs, **forward_params)
  File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/text_generation.py", line 271, in _forward
    generated_sequence = self.model.generate(input_ids=input_ids, attention_mask=attention_mask, **generate_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/generation/utils.py", line 1719, in generate
    return self.sample(
  File "/opt/conda/lib/python3.10/site-packages/transformers/generation/utils.py", line 2801, in sample
    outputs = self(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 1009, in forward
    outputs = self.model(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 897, in forward
    layer_outputs = decoder_layer(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 626, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 286, in forward
    attn_weights = nn.functional.softmax(attn_weights, dim=-1, dtype=torch.float32).to(query_states.dtype)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/functional.py", line 1845, in softmax
    ret = input.softmax(dim, dtype=dtype)
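For reference, two mitigations that might let this fit on a 24 GB A10. This is a rough sketch, not verified against this pipeline: the model name and the 128 MB split size are assumptions, the allocator setting is simply what the error message above suggests, and `load_in_4bit` is a generic transformers/bitsandbytes option rather than anything DALM-specific.

```python
import os

# The allocator reads this at CUDA init, so set it before importing torch.
# 128 MB is an illustrative starting value, not a tuned one.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Assumed model; the traceback only shows it is a Mistral architecture.
model_name = "mistralai/Mistral-7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_4bit=True,   # quantize weights via bitsandbytes (must be installed)
    device_map="auto",
)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
```

That said, the traceback shows the OOM inside the eager-attention softmax (`attn_weights` materialized in float32), whose memory grows quadratically with sequence length, so capping the context chunk size fed to the generator (one of the logged contexts is 11202 characters) may help more than allocator tuning or quantization.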
tleyden self-assigned this Dec 5, 2023