CUDA out of memory #90

Open
TanvirHafiz opened this issue Nov 3, 2024 · 2 comments

Comments

@TanvirHafiz

I have 24 GB of RAM, and I tried to generate a 512 by 512 image, and I still get this:

Traceback (most recent call last):
  File "F:\OmniGen\venv\lib\site-packages\gradio\queueing.py", line 624, in process_events
    response = await route_utils.call_process_api(
  File "F:\OmniGen\venv\lib\site-packages\gradio\route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
  File "F:\OmniGen\venv\lib\site-packages\gradio\blocks.py", line 2018, in process_api
    result = await self.call_function(
  File "F:\OmniGen\venv\lib\site-packages\gradio\blocks.py", line 1567, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "F:\OmniGen\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "F:\OmniGen\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "F:\OmniGen\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "F:\OmniGen\venv\lib\site-packages\gradio\utils.py", line 846, in wrapper
    response = f(*args, **kwargs)
  File "F:\OmniGen\app.py", line 22, in generate_image
    output = pipe(
  File "F:\OmniGen\venv\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "F:\OmniGen\OmniGen\pipeline.py", line 278, in __call__
    samples = scheduler(latents, func, model_kwargs, use_kv_cache=use_kv_cache, offload_kv_cache=offload_kv_cache)
  File "F:\OmniGen\OmniGen\scheduler.py", line 162, in __call__
    pred, cache = func(z, timesteps, past_key_values=cache, **model_kwargs)
  File "F:\OmniGen\venv\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "F:\OmniGen\OmniGen\model.py", line 387, in forward_with_separate_cfg
    temp_out, temp_pask_key_values = self.forward(x[i], timestep[i], input_ids[i], input_img_latents[i], input_image_sizes[i], attention_mask[i], position_ids[i], past_key_values=past_key_values[i], return_past_key_values=True, offload_model=offload_model)
  File "F:\OmniGen\OmniGen\model.py", line 338, in forward
    output = self.llm(inputs_embeds=input_emb, attention_mask=attention_mask, position_ids=position_ids, past_key_values=past_key_values, offload_model=offload_model)
  File "F:\OmniGen\venv\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\OmniGen\venv\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\OmniGen\OmniGen\transformer.py", line 156, in forward
    self.get_offlaod_layer(layer_idx, device=inputs_embeds.device)
  File "F:\OmniGen\OmniGen\transformer.py", line 52, in get_offlaod_layer
    self.evict_previous_layer(layer_idx)
  File "F:\OmniGen\OmniGen\transformer.py", line 43, in evict_previous_layer
    param.data = param.data.to("cpu", non_blocking=True)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
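
For reference, here is a minimal sketch of a lower-memory invocation. It assumes the OmniGenPipeline API matches what the traceback shows (use_kv_cache, offload_kv_cache, offload_model appear in the call chain) and the published "Shitao/OmniGen-v1" weights; exact parameter names and defaults may differ in your checkout:

```python
from OmniGen import OmniGenPipeline

pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")

# Enabling the offload options trades generation speed for lower peak VRAM.
images = pipe(
    prompt="a photo of a cat",   # example prompt
    height=512,
    width=512,
    guidance_scale=2.5,
    use_kv_cache=True,
    offload_kv_cache=True,       # keep the KV cache on CPU between steps
    offload_model=True,          # keep model weights on CPU, load per layer
    seed=0,
)
images[0].save("output.png")
```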

@FurkanGozukara

I don't know how you installed it, because the repo doesn't have proper installation instructions, but this is very likely an installation error.

I made 1-click Python 3.10 installers and it works with as little as 5.5 GB: #86

@TanvirHafiz
Author

TanvirHafiz commented Nov 3, 2024 via email

Rypo added a commit to Rypo/OmniGen that referenced this issue Nov 28, 2024
Removes the non_blocking argument from all device-to-CPU transfers. In certain environments (e.g. WSL), large transfers will throw a CUDA memory error regardless of the VRAM available.

Adjusts stream synchronization for modest performance gains with cpu_offload.

fixes VectorSpaceLab#90, fixes VectorSpaceLab#117
Rypo added a commit to Rypo/OmniGen that referenced this issue Dec 2, 2024
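
Below is a minimal sketch of the kind of change these commits describe, based on the evict_previous_layer frame in the traceback. The function signature, the layers argument, and the placement of the synchronize call are assumptions for illustration; the actual OmniGen code may differ.

```python
import torch
import torch.nn as nn

def evict_previous_layer(layers: nn.ModuleList, layer_idx: int) -> None:
    """Move the previous transformer layer's parameters back to CPU during cpu_offload."""
    if layer_idx > 0:
        for param in layers[layer_idx - 1].parameters():
            # Dropping non_blocking=True avoids the spurious
            # "CUDA error: out of memory" that some environments (e.g. WSL)
            # raise for large asynchronous device-to-CPU copies.
            param.data = param.data.to("cpu")  # was: .to("cpu", non_blocking=True)
    # A single stream synchronize per eviction illustrates the
    # stream-synchronization adjustment the commit message mentions.
    if torch.cuda.is_available():
        torch.cuda.synchronize()
```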