
inference was killed due to memory (100GB was used) #2116

Open
yht4work opened this issue Sep 27, 2024 · 2 comments
Labels
question Further information is requested

Comments

@yht4work

Notice: To resolve issues more efficiently, please raise the issue following the template and fill in the details.

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

What is your question?

The audio file is 15 hours long, and when I use the code below for inference, the process is killed after running out of memory (about 100 GB used). Is this behavior normal, or could it be a memory leak?

    from funasr import AutoModel

    model = AutoModel(
        model="paraformer-zh",
        vad_model="fsmn-vad",
        punc_model="ct-punc",
        spk_model="cam++",
    )

    res = model.generate(input=audio_path, batch_size_s=batch_size_s, hotword=hotword)
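For reference (my reading of the FunASR interface, not stated in the original report): batch_size_s is the dynamic-batching budget in seconds of audio, i.e. VAD-segmented clips are grouped until their total duration reaches that value, so larger values raise peak memory. A hypothetical value for illustration:

    # Hypothetical illustration: group up to ~300 seconds of VAD-segmented speech
    # per decoding batch; smaller values lower peak memory at the cost of speed.
    res = model.generate(input=audio_path, batch_size_s=300, hotword=hotword)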

What have you tried?

I have tried shorter audio files, and they are processed normally.
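
Since shorter files go through normally, one workaround sketch (my addition, not an official FunASR recipe; it assumes ffmpeg is installed and reuses audio_path, batch_size_s, and hotword from the snippet above) is to split the long recording into shorter chunks and transcribe them one at a time:

    # Workaround sketch (hypothetical): split the long recording into 10-minute
    # chunks with ffmpeg, then transcribe each chunk separately so that only one
    # chunk's intermediate results are held in memory at a time.
    import glob
    import subprocess

    from funasr import AutoModel

    # "-c copy" keeps the original mp3 stream, so splitting is fast and lossless.
    subprocess.run(
        ["ffmpeg", "-i", audio_path, "-f", "segment", "-segment_time", "600",
         "-c", "copy", "chunk_%04d.mp3"],
        check=True,
    )

    model = AutoModel(
        model="paraformer-zh",
        vad_model="fsmn-vad",
        punc_model="ct-punc",
        spk_model="cam++",
    )

    results = []
    for chunk in sorted(glob.glob("chunk_*.mp3")):
        # Each call processes only one short segment.
        results.extend(model.generate(input=chunk, batch_size_s=batch_size_s, hotword=hotword))

Note that chunk boundaries can cut through speech and speaker labels are not shared across chunks, so this only approximates single-pass results.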

What's your environment?

  • OS (e.g., Linux): Ubuntu 22.04
  • FunASR Version (e.g., 1.0.0): 4294d21
  • ModelScope Version (e.g., 1.11.0): 1.18.1
  • PyTorch Version (e.g., 2.0.0): torch 2.4.1 (torchaudio 2.4.1, torchvision 0.19.1, torch-complex 0.4.4, pytorch-wpe 0.0.1)
  • How you installed funasr (pip, source): from source (git clone, pip install -e .)
  • Python version: 3.10.14
  • GPU (e.g., V100M32): NVIDIA GeForce RTX 4090
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:01:00.0 Off |                  Off |
| 30%   49C    P2             220W / 450W |  10265MiB / 24564MiB |     75%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
yht4work added the question (Further information is requested) label on Sep 27, 2024
@LauraGPT
Collaborator

show me the decoding log.

@yht4work
Author

INFO:root:Logging to file: /opt/repo/data-clean/audio/audio2txt/logs/20240929105547.log
2024-09-29 10:55:47,489 - INFO - Logging to file: /opt/repo/data-clean/audio/audio2txt/logs/20240929105547.log
INFO:root:Found 10 audio files in /opt/data/douyin/recorder
2024-09-29 10:55:47,494 - INFO - Found 10 audio files in /opt/data/douyin/recorder
funasr version: 1.1.7.
Check update of funasr, and it would cost few times. You may disable it by set `disable_update=True` in AutoModel
New version is available: 1.1.8.
Please use the command "pip install -U funasr" to upgrade.
INFO:root:download models from model hub: ms
2024-09-29 10:55:47,717 - INFO - download models from model hub: ms
2024-09-29 10:55:48,433 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
INFO:root:Loading pretrained params from /home/yuhang/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt
2024-09-29 10:55:49,868 - INFO - Loading pretrained params from /home/yuhang/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt
INFO:root:ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt
2024-09-29 10:55:49,877 - INFO - ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt
/opt/repo/FunASR/funasr/train_utils/load_pretrained_model.py:39: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  ori_state = torch.load(path, map_location=map_location)
INFO:root:scope_map: ['module.', 'None']
2024-09-29 10:55:50,478 - INFO - scope_map: ['module.', 'None']
INFO:root:excludes: None
2024-09-29 10:55:50,478 - INFO - excludes: None
INFO:root:Loading ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt, status: <All keys matched successfully>
2024-09-29 10:55:50,545 - INFO - Loading ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt, status: <All keys matched successfully>
INFO:root:Building VAD model.
2024-09-29 10:55:50,745 - INFO - Building VAD model.
INFO:root:download models from model hub: ms
2024-09-29 10:55:50,745 - INFO - download models from model hub: ms
2024-09-29 10:55:51,148 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
INFO:root:Loading pretrained params from /home/yuhang/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt
2024-09-29 10:55:51,393 - INFO - Loading pretrained params from /home/yuhang/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt
INFO:root:ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt
2024-09-29 10:55:51,393 - INFO - ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt
INFO:root:scope_map: ['module.', 'None']
2024-09-29 10:55:51,396 - INFO - scope_map: ['module.', 'None']
INFO:root:excludes: None
2024-09-29 10:55:51,396 - INFO - excludes: None
INFO:root:Loading ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt, status: <All keys matched successfully>
2024-09-29 10:55:51,397 - INFO - Loading ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt, status: <All keys matched successfully>
INFO:root:Building punc model.
2024-09-29 10:55:51,397 - INFO - Building punc model.
INFO:root:download models from model hub: ms
2024-09-29 10:55:51,397 - INFO - download models from model hub: ms
2024-09-29 10:55:51,769 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
Building prefix dict from the default dictionary ...
DEBUG:jieba:Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
DEBUG:jieba:Loading model from cache /tmp/jieba.cache
Loading model cost 0.295 seconds.
DEBUG:jieba:Loading model cost 0.295 seconds.
Prefix dict has been built successfully.
DEBUG:jieba:Prefix dict has been built successfully.
INFO:root:Loading pretrained params from /home/yuhang/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt
2024-09-29 10:56:03,073 - INFO - Loading pretrained params from /home/yuhang/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt
INFO:root:ckpt: /home/yuhang/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt
2024-09-29 10:56:03,074 - INFO - ckpt: /home/yuhang/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt
INFO:root:scope_map: ['module.', 'None']
2024-09-29 10:56:03,676 - INFO - scope_map: ['module.', 'None']
INFO:root:excludes: None
2024-09-29 10:56:03,676 - INFO - excludes: None
INFO:root:Loading ckpt: /home/yuhang/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt, status: <All keys matched successfully>
2024-09-29 10:56:03,739 - INFO - Loading ckpt: /home/yuhang/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt, status: <All keys matched successfully>
INFO:root:Building SPK model.
2024-09-29 10:56:03,916 - INFO - Building SPK model.
INFO:root:download models from model hub: ms
2024-09-29 10:56:03,916 - INFO - download models from model hub: ms
2024-09-29 10:56:04,310 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
Detect model requirements, begin to install it: /home/yuhang/.cache/modelscope/hub/iic/speech_campplus_sv_zh-cn_16k-common/requirements.txt
install model requirements successfully
INFO:root:Loading pretrained params from /home/yuhang/.cache/modelscope/hub/iic/speech_campplus_sv_zh-cn_16k-common/campplus_cn_common.bin
2024-09-29 10:56:05,381 - INFO - Loading pretrained params from /home/yuhang/.cache/modelscope/hub/iic/speech_campplus_sv_zh-cn_16k-common/campplus_cn_common.bin
INFO:root:ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_campplus_sv_zh-cn_16k-common/campplus_cn_common.bin
2024-09-29 10:56:05,383 - INFO - ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_campplus_sv_zh-cn_16k-common/campplus_cn_common.bin
/opt/repo/FunASR/funasr/train_utils/load_pretrained_model.py:39: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  ori_state = torch.load(path, map_location=map_location)
INFO:root:scope_map: ['module.', 'None']
2024-09-29 10:56:05,430 - INFO - scope_map: ['module.', 'None']
INFO:root:excludes: None
2024-09-29 10:56:05,430 - INFO - excludes: None
INFO:root:Loading ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_campplus_sv_zh-cn_16k-common/campplus_cn_common.bin, status: <All keys matched successfully>
2024-09-29 10:56:05,440 - INFO - Loading ckpt: /home/yuhang/.cache/modelscope/hub/iic/speech_campplus_sv_zh-cn_16k-common/campplus_cn_common.bin, status: <All keys matched successfully>
  0%|                                                                                                                                              | 0/10 [00:00<?, ?it/s]
INFO:root:Transcribing file:  /opt/data/xxx/recorder/xxx/xxxx_2024-09-02_10-16-32.mp3
2024-09-29 10:56:05,551 - INFO - Transcribing file:  /opt/data/xxx/recorder/xxx/xxxx_2024-09-02_10-16-32.mp3
Killed        
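
A bare "Killed" like this usually means the kernel OOM killer terminated the process (which can be confirmed in dmesg). A small diagnostic sketch (my addition, not from the thread) that records peak resident memory around the call, using only the standard library:

    # Diagnostic sketch (hypothetical): log peak RSS before and after generate()
    # to see how far host memory grows on the long file. On Linux, ru_maxrss is
    # reported in kilobytes.
    import logging
    import resource

    def peak_rss_gb() -> float:
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024 / 1024

    logging.info("Peak RSS before generate: %.1f GB", peak_rss_gb())
    res = model.generate(input=audio_path, batch_size_s=batch_size_s, hotword=hotword)
    logging.info("Peak RSS after generate: %.1f GB", peak_rss_gb())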

@LauraGPT Thanks for your reply. Is there any suggestion on how to solve or avoid this?
