Hi AIT team,

I'm working on compiling a generative video model with AIT.
I can successfully compile the model, as you can see here:
2024-02-09 07:20:35,614 INFO <aitemplate.compiler.transform.memory_planning> max_blob=19546740864 constant_offset=7630531776
2024-02-09 07:20:35,939 INFO <aitemplate.backend.codegen> generated 1027 function srcs
2024-02-09 07:20:40,944 INFO <aitemplate.backend.codegen> generated 8 library srcs
2024-02-09 07:20:40,949 INFO <aitemplate.backend.builder> Using 64 CPU for building
2024-02-09 09:40:18,778 INFO <aitemplate.compiler.compiler> compiled the final .so file elapsed time: 2:19:37.829273
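Comparing the planner's numbers against my GPU already suggests the problem (my assumption here is that the runtime needs the activation blob plus the constants resident on the device at the same time):

```python
# Back-of-envelope check from the log lines above.
max_blob = 19_546_740_864        # "max_blob" from the memory_planning log
constant_offset = 7_630_531_776  # "constant_offset" from the same log line
global_mem = 25_438_126_080      # "Global memory available" reported below

required = max_blob + constant_offset
print(f"required:  {required / 2**30:.2f} GiB")   # ~25.31 GiB
print(f"available: {global_mem / 2**30:.2f} GiB")  # ~23.69 GiB
print(f"fits: {required <= global_mem}")           # False -> OOM at load time
```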
However, when I try to load the compiled model onto the same GPU, it reports an out-of-memory (OOM) error:
Device:
ASCII string identifying device: NVIDIA GeForce RTX 3090
Major compute capability: 8
Minor compute capability: 6
UUID: GPU-aca8dfe8-0c10-ed38-e488-8117bfbc3566
Unique identifier for a group of devices on the same multi-GPU board: 0
PCI bus ID of the device: 46
PCI device ID of the device: 0
PCI domain ID of the device: 0
Memory limits:
Constant memory available on device in bytes: 65536
Global memory available on device in bytes: 25438126080
Size of L2 cache in bytes: 6291456
Shared memory available per block in bytes: 49152
Shared memory available per multiprocessor in bytes: 102400
[14:54:05] model_container.cu:87: Init AITemplate Runtime with 1 concurrency
[14:54:05] model_interface.cu:91: Error: DeviceMalloc(&result, n_bytes) API call failed: out of memory at model_interface.cu, line49
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.8/dist-packages/aitemplate/compiler/model.py", line 238, in __init__
self.DLL.AITemplateModelContainerCreate(
File "/usr/local/lib/python3.8/dist-packages/aitemplate/compiler/model.py", line 196, in _wrapped_func
raise RuntimeError(f"Error in function: {method.__name__}")
RuntimeError: Error in function: AITemplateModelContainerCreate
The file size of test.so is 7.4 GB, and I have 24 GB on my 3090.

I think it is related to the dynamic shapes: when I compile with a low-resolution height/width range, the model can be loaded, but when I compile with a high-resolution range, it gives me OOM.

Do you have any suggestions for this issue? Would removing reshape/permute ops help? Or can you provide some insight into why the dynamic dimension range affects memory consumption?
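For context, here is the back-of-envelope scaling I'm assuming. My guess is that the memory planner sizes every activation buffer for the *maximum* of each dynamic dimension, so the blob grows with the top of the height/width range regardless of what resolution is actually run. The 1024x1024 maximum below is just an illustrative placeholder, and the linear per-pixel budget is a simplification:

```python
# Rough model (my assumption): the planned blob scales with max_h * max_w.
def blob_estimate(max_h, max_w, bytes_per_pixel_budget):
    # bytes_per_pixel_budget is a hypothetical per-pixel activation budget,
    # calibrated from a single observed compile.
    return max_h * max_w * bytes_per_pixel_budget

# Calibrate from the compile above (hypothetical 1024x1024 maximum):
budget = 19_546_740_864 / (1024 * 1024)

# Halving the maximum resolution range would cut the blob roughly 4x:
low_res = blob_estimate(512, 512, budget)
print(f"{low_res / 2**30:.2f} GiB")  # ~4.55 GiB instead of ~18.2 GiB
```

If that mental model is right, it would explain why compiling with a smaller height/width range loads fine: the planner reserves the worst case up front, not the shape I actually feed in.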
Thanks, and happy Lunar New Year!