
Model is successfully compiled, but OOM when loading #991

Open
jiangwei221 opened this issue Feb 9, 2024 · 0 comments


Hi AIT team:

I'm working on compiling a generative video model with AIT.

I can successfully compile the model, as you can see here:

2024-02-09 07:20:35,614 INFO <aitemplate.compiler.transform.memory_planning> max_blob=19546740864 constant_offset=7630531776
2024-02-09 07:20:35,939 INFO <aitemplate.backend.codegen> generated 1027 function srcs
2024-02-09 07:20:40,944 INFO <aitemplate.backend.codegen> generated 8 library srcs
2024-02-09 07:20:40,949 INFO <aitemplate.backend.builder> Using 64 CPU for building
2024-02-09 09:40:18,778 INFO <aitemplate.compiler.compiler> compiled the final .so file elapsed time: 2:19:37.829273

However, when I try to load the model on the same GPU, it reports an OOM error:

  Device:
     ASCII string identifying device: NVIDIA GeForce RTX 3090
     Major compute capability: 8
     Minor compute capability: 6
     UUID: GPU-aca8dfe8-0c10-ed38-e488-8117bfbc3566
     Unique identifier for a group of devices on the same multi-GPU board: 0
     PCI bus ID of the device: 46
     PCI device ID of the device: 0
     PCI domain ID of the device: 0
  Memory limits:
     Constant memory available on device in bytes: 65536
     Global memory available on device in bytes: 25438126080
     Size of L2 cache in bytes: 6291456
     Shared memory available per block in bytes: 49152
     Shared memory available per multiprocessor in bytes: 102400
[14:54:05] model_container.cu:87: Init AITemplate Runtime with 1 concurrency
[14:54:05] model_interface.cu:91: Error: DeviceMalloc(&result, n_bytes) API call failed: out of memory at model_interface.cu, line49
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/aitemplate/compiler/model.py", line 238, in __init__
    self.DLL.AITemplateModelContainerCreate(
  File "/usr/local/lib/python3.8/dist-packages/aitemplate/compiler/model.py", line 196, in _wrapped_func
    raise RuntimeError(f"Error in function: {method.__name__}")
RuntimeError: Error in function: AITemplateModelContainerCreate

The file size of test.so is 7.4 GB, and I have 24 GB of memory on my 3090.
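If I'm reading the memory-planning log correctly, the load-time allocation alone already exceeds what the card reports. Here is a back-of-the-envelope check (assuming the runtime allocates the activation blob plus the constant buffer up front when the container is created):

max_blob = 19_546_740_864        # from the memory_planning log line above
constant_offset = 7_630_531_776  # constants/weights, same log line
global_mem = 25_438_126_080      # "Global memory available" in the device dump

required = max_blob + constant_offset
print(f"required:  {required / 2**30:.1f} GiB")   # ~25.3 GiB
print(f"available: {global_mem / 2**30:.1f} GiB")  # ~23.7 GiB

So even before any inputs/outputs are allocated, roughly 25.3 GiB would be requested against 23.7 GiB of global memory, which matches the DeviceMalloc failure at container creation.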
I think it is related to the dynamic shapes: when I compile with a low-resolution height/width range, the model loads fine, but when I compile with a high-resolution range, it gives me OOM.
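For reference, this is roughly how I declare the dynamic dims (a simplified sketch; the names, ranges, and channel count here are placeholders, not my actual model code):

from aitemplate.frontend import IntVar, Tensor

height = IntVar(values=[64, 1024], name="height")  # placeholder range
width = IntVar(values=[64, 1024], name="width")
x = Tensor(shape=[1, height, width, 320], dtype="float16", name="x", is_input=True)

My guess is that memory planning has to size every intermediate buffer for the upper bound of these ranges, so widening them inflates max_blob even if I never actually run at the maximum resolution.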
Do you have any suggestions on this issue? Would removing reshape/permute ops help? Or can you provide some insight into why the dynamic dimension range affects memory consumption?
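For completeness, this is how I check how much memory is actually free right before loading the .so (using torch only because it's handy):

import torch

free, total = torch.cuda.mem_get_info()  # returns (free, total) in bytes
print(f"free: {free / 2**30:.1f} GiB of {total / 2**30:.1f} GiB")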

Thanks and happy lunar new year!
