Disable multiprocessing when dumping features in hubert preprocessing #2311

Closed
nateanl wants to merge 1 commit

nateanl
Member

@nateanl nateanl commented Apr 1, 2022

Multiprocessing works well for MFCC features. However, it sometimes causes the script to hang when dumping HuBERT features. Replacing it with a plain for-loop resolves the issue.
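
For illustration, here is a minimal sketch of the kind of change described above. It is not the actual torchaudio preprocessing code: `dump_features`, `extract_fn`, and `num_workers` are hypothetical names, and the extraction function is assumed to take one item and return its features. With `num_workers <= 1` the work runs in a plain for-loop in the current process, which avoids the worker pool entirely.

```python
from multiprocessing import Pool
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def dump_features(
    items: Iterable[T],
    extract_fn: Callable[[T], R],  # hypothetical per-item feature extractor
    num_workers: int = 0,
) -> List[R]:
    """Apply extract_fn to every item, serially unless workers are requested."""
    if num_workers <= 1:
        # Serial path: a plain for-loop, no child processes involved.
        results = []
        for item in items:
            results.append(extract_fn(item))
        return results
    # Parallel path: extract_fn must be picklable (e.g. a top-level function).
    with Pool(num_workers) as pool:
        return pool.map(extract_fn, list(items))
```

Keeping both paths behind one argument would let the MFCC step stay parallel while the HuBERT step falls back to the loop.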

@mthrok
Collaborator

mthrok commented Apr 1, 2022

> Multiprocessing works well for MFCC features. However, it sometimes causes the script to hang when dumping HuBERT features. Replacing it with a plain for-loop resolves the issue.

Sounds like an oversubscription issue.
What happens if you set the number of threads to 1?

https://jdhao.github.io/2020/07/06/pytorch_set_num_threads/
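
As a rough sketch of what "set the number of threads to 1" can look like in Python: the helper name `limit_native_threads` is hypothetical, and the code assumes the environment variables are set before the numeric libraries are first imported, since those libraries typically read them at initialization. `torch.set_num_threads` is applied only if PyTorch is installed.

```python
import os
from typing import Dict

# Environment variables commonly read by OpenMP, OpenBLAS, and MKL at startup.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_native_threads(num_threads: int = 1) -> Dict[str, str]:
    """Cap native thread pools; call this before importing numpy/torch/sklearn."""
    settings = {name: str(num_threads) for name in THREAD_ENV_VARS}
    os.environ.update(settings)
    try:
        import torch  # optional: only adjusted when PyTorch is available
    except ImportError:
        return settings
    # Limits intra-op parallelism for PyTorch CPU operations.
    torch.set_num_threads(num_threads)
    return settings
```

The same effect for the environment variable alone can be had from the shell by prefixing the command with `OMP_NUM_THREADS=1`.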

@nateanl
Member Author

nateanl commented Apr 1, 2022

That's what I was using: OMP_NUM_THREADS=1.
I set it when training the k-means model on MFCC features, because this warning message was shown:

OpenBLAS Warning : Detect OpenMP Loop and this application may hang. Please rebuild the library with USE_OPENMP=1 option.

With OMP_NUM_THREADS=1, the k-means step works.
However, even with OMP_NUM_THREADS=1, the script still hangs in the step that dumps HuBERT features.

@nateanl
Member Author

nateanl commented Apr 1, 2022

Oops, I misread the option. I used OMP_NUM_THREADS instead of OM_NUM_THREADS; let me try it with multiprocessing.

@mthrok
Collaborator

mthrok commented Apr 1, 2022

OM_NUM_THREADS is a typo. OMP_NUM_THREADS is correct.

@facebook-github-bot
Contributor

@nateanl has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@github-actions

github-actions bot commented Apr 5, 2022

Hey @nateanl.
You merged this PR, but labels were not properly added. Please add a primary and a secondary label (see https://github.com/pytorch/audio/blob/main/.github/process_commit.py).

xiaohui-zhang pushed a commit to xiaohui-zhang/audio that referenced this pull request May 4, 2022
…pytorch#2311)

Summary:
Multiprocessing works well for MFCC features. However, it sometimes causes the script to hang when dumping HuBERT features. Replacing it with a plain for-loop resolves the issue.

Pull Request resolved: pytorch#2311

Reviewed By: mthrok

Differential Revision: D35393813

Pulled By: nateanl

fbshipit-source-id: afdc14557a1102b20ecd5fafba0964a913250a11

3 participants