Clarifications Needed for Reproducing BS-RoFormer SDR Performance #34

Open
EuiYeonKim opened this issue Jul 12, 2024 · 0 comments
Hello,

First, thank you for making the code publicly available. I am trying to reproduce the SDR results reported in the BS-RoFormer paper using your code, but I am seeing significantly lower performance and have a few questions. I would appreciate it if you could answer the following:

The hyperparameters I used are as follows:

dim: 384
depth: 6
stereo: True
num_stems: 1
time_transformer_depth: 2
freq_transformer_depth: 2
dim_head: 64
heads: 8
ff_dropout: 0.1
attn_dropout: 0.1
flash_attn: True
mask_estimator_depth: 2
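Concretely, these settings map onto the model constructor like this (a sketch assuming the `bs_roformer` package API; the import is guarded so the config stands on its own if the package is absent):

```python
# The hyperparameters above as BSRoformer keyword arguments
# (assuming the bs_roformer package API).
config = dict(
    dim=384,
    depth=6,
    stereo=True,
    num_stems=1,
    time_transformer_depth=2,
    freq_transformer_depth=2,
    dim_head=64,
    heads=8,
    ff_dropout=0.1,
    attn_dropout=0.1,
    flash_attn=True,
    mask_estimator_depth=2,
)

try:
    from bs_roformer import BSRoformer
    model = BSRoformer(**config)
except ImportError:
    model = None  # bs_roformer not installed; config stands on its own
```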
Here are my questions:

Could you please share the model and training hyperparameters you used?
The paper describes the input as a complex spectrogram, but I noticed the code uses torch.view_as_real, which stacks the real and imaginary parts as channels (the complex-as-channels, CaC, representation). I believe this differs from the paper. Could you explain the reason for this difference?
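To illustrate what I mean by CaC, here is a minimal sketch (my own toy example, not code from the repo) of how torch.view_as_real turns a complex STFT into stacked real/imaginary channels:

```python
import torch

# Toy complex spectrogram: (channels, freq bins, time frames)
stft = torch.randn(2, 1025, 100, dtype=torch.complex64)

# view_as_real appends a trailing dim of size 2 (real, imag) ...
cac = torch.view_as_real(stft)          # shape: (2, 1025, 100, 2)

# ... which can then be folded into the channel axis, giving the
# complex-as-channels layout instead of a complex-valued input.
cac = cac.permute(0, 3, 1, 2).reshape(4, 1025, 100)
print(cac.shape)  # torch.Size([4, 1025, 100])
```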
I am training on an H100 80GB GPU with the hyperparameters above. Despite slight differences from the paper's settings, a batch size of 4 already fills the 80GB of memory. Could you let me know what batch size you used and whether you took any additional steps to manage memory? For reference, I used 44.1kHz, 8-second audio for both the target and mixture inputs, as in the paper's setup.
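As a workaround for the memory limit I have been considering gradient accumulation, which keeps peak activation memory proportional to the micro-batch size rather than the effective batch size. A generic sketch (my own, not something from this repo):

```python
import torch

# Gradient accumulation: an effective batch of 8 processed as
# 4 micro-batches of 2. Gradients sum in .grad across backward()
# calls until the single optimizer step.
model = torch.nn.Linear(16, 16)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
accum_steps = 4

opt.zero_grad()
for _ in range(accum_steps):
    x = torch.randn(2, 16)                        # micro-batch of 2
    loss = model(x).pow(2).mean() / accum_steps   # scale so grads average
    loss.backward()                               # grads accumulate
opt.step()                                        # one update per 4 micro-batches
opt.zero_grad()
```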
Your answers would be greatly appreciated. Thank you very much!
