OOM killed when diarizing 4 hours recording #8235
@tttalshaul can you share more info on the length of the audio and the size of the embeddings in the pkl file you saved?
The audio length is 3:43 hours after cutting silence (using VAD).
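For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes the clustering stage builds a dense N x N float32 affinity matrix over N segment embeddings. The segment shifts in the loop are hypothetical values chosen for illustration, not numbers read from the config in this issue.

```python
def affinity_matrix_gib(audio_seconds: float, shift_seconds: float,
                        bytes_per_value: int = 4) -> float:
    """Approximate size in GiB of a dense N x N matrix (float32 by default)."""
    n_segments = int(audio_seconds / shift_seconds)  # one embedding per shift
    return n_segments * n_segments * bytes_per_value / 2**30


audio_seconds = 3 * 3600 + 43 * 60  # 3:43 hours of speech left after VAD

# Hypothetical segment shifts (assumption, for illustration only).
for shift in (0.25, 0.5, 1.0):
    size = affinity_matrix_gib(audio_seconds, shift)
    print(f"shift={shift}s -> about {size:.1f} GiB for one matrix")
```

Because the size grows with the square of the segment count, halving the shift quadruples the matrix, and intermediate copies during clustering can multiply the total further.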
@nithinraok I move the model with ClusteringDiarizer(cfg=cfg).to(device), and I can see that device is cuda:0. Thank you.
@tango4j what is the maximum audio duration we currently support for single-pass offline diarization?
Sorry for the late reply.
You should increase the |
I have now tried embeddings_per_chunk values of 50000 and 500000. What can I do to debug this further (suppose I can't export the audio)?
Also, what is the supported limit on audio length with long-form audio clustering? (Is it less than 4 hours?)
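One low-effort way to narrow down where host RAM goes is to log the process's peak resident set size between pipeline stages. This is a minimal sketch using only the Python standard library (Unix only). The stage name is a placeholder, and where to place the calls in your own script is up to you.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Peak resident set size of the current process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_peak(stage: str) -> None:
    """Print the peak RSS seen so far, tagged with a stage name."""
    print(f"[{stage}] peak RSS so far: {peak_rss_mib():.0f} MiB", flush=True)


log_peak("start")  # placeholder stage name
```

Calling log_peak before and after the embedding extraction and right before clustering shows which stage pushes the process toward the 40G limit. The flush matters because a process killed with SIGKILL does not get to flush buffered output.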
@tango4j 🙏 |
First check
This depends on your GPU memory size and the portion of detected silence in your audio file (3)
This means the speaker count can be wrong and diarization accuracy can drop.
@tango4j my GPU is a T4 with 14G of vRAM. What else can I do to make sure everything is loaded into GPU RAM?
@tango4j any idea? |
@tttalshaul However, you have another quick solution: the NeMo team is supporting the CHiME-8 Challenge, and we are providing the baseline for everyone.
Hello @tango4j! Are you talking about |
I have also hit a CUDA OOM on long audio samples. Maybe my solution will help you too? #8469 Surprisingly, small changes in the math were enough to stay under the GPU memory limit.
@dkurt |
@khaykingleb The MSDD-v2 model for the CHiME-8 baseline will give you a different DER number for the same audio file, since these CHiME-8 models are tuned on the CHiME-7 train/dev datasets.
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days. |
@tango4j Thank you, |
I believe it has the same interface. For example, you can see But the running doesn't work even if you download I get the following error:
And if I fix it, I get another error. @tango4j, can we really use MSDD V2? |
I managed to run the CHiME-8 diarization on a custom recording I chose, but the results were not good.
Hey, I made a solution that worked for me. Previously the code never reached long-form clustering because it ran out of memory in get_argmin_mat. Let me know if this works for you. I was able to process a 5-hour audio file on a g4.xlarge using this. Merge-Request
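The PR's diff is not reproduced in this thread, so the following is not that patch. It is a general sketch of one way to bound memory in this kind of step, assuming get_argmin_mat maps each base-scale segment center to the index of the nearest segment center at another scale by building a full pairwise distance table. The function names and chunk_size are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import List


def nearest_index_full(base: List[float], other: List[float]) -> List[int]:
    """Reference version: materializes the full len(base) x len(other) table."""
    table = [[abs(o - b) for o in other] for b in base]
    return [row.index(min(row)) for row in table]


def nearest_index_chunked(base: List[float], other: List[float],
                          chunk_size: int = 1024) -> List[int]:
    """Same result, but only chunk_size x len(other) distances exist at once."""
    result: List[int] = []
    for start in range(0, len(base), chunk_size):
        block = base[start:start + chunk_size]
        table = [[abs(o - b) for o in other] for b in block]
        result.extend(row.index(min(row)) for row in table)
    return result


# Hypothetical segment centers: 0.25 s shift at the base scale, 0.75 s at another.
base_centers = [0.25 * i + 0.25 for i in range(40)]
other_centers = [0.75 * i + 0.75 for i in range(14)]
mapping = nearest_index_chunked(base_centers, other_centers, chunk_size=7)
```

Peak memory for this step then scales with chunk_size times the number of segments at the other scale, instead of with the product of both segment counts.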
Thanks for creating the PR. @tango4j, could you please help review it?
This issue was closed because it has been inactive for 7 days since being marked as stale. |
The issue still exists |
@tango4j please add your comments/views for the PR. |
@nithinraok |
@tango4j Have you received the 4-hour audio samples?
Hey @tango4j @nithinraok, please take a look at the PR. It's been waiting for quite a while now.
Sorry for the delay on @AatikaNazneen's PR. |
I'm trying nemo_toolkit 1.22.0 on a T4 (14G vRAM) with 40G of CPU RAM to diarize with ClusteringDiarizer.
After the speaker embeddings are computed, tqdm shows "clustering: 0%" and then the OS kills my script with error 137 (OOM, out of memory) because RAM consumption exceeds 40G.
I installed NeMo with pip install nemo_toolkit==1.22.0 (Python 3.10 x64) on an on-premise K8s cluster.
I'm using diar_infer_meeting.yaml with max_num_speakers=20, chunk_cluster_count=50,
I tried lowering embeddings_per_chunk from the default 10000 to 500, but it didn't help.
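On the "error 137" above: by shell convention, an exit status above 128 means the process was terminated by a signal whose number is the status minus 128. A small sketch of decoding it (the helper name is hypothetical):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a shell-style exit status into a signal name when possible."""
    if code > 128:
        signum = code - 128  # shell convention: 128 + signal number
        try:
            return signal.Signals(signum).name
        except ValueError:
            return f"signal {signum}"
    return f"exit status {code}"


print(describe_exit_code(137))
```

Status 137 corresponds to signal 9, SIGKILL, which is what the kernel OOM killer or a container memory limit sends. That is consistent with host RAM, not GPU memory, being exhausted.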