Unable to load shard #338

Open
mrjunjieli opened this issue Jul 20, 2024 · 2 comments

Comments

@mrjunjieli

When wespeaker is run with torch>=2.1, it outputs the following warnings:
"

[ WARNING : 2024-07-20 17:11:39,248 ] - error to parse id07100/uUtjsdtDOkQ/00327.wav.wav
[ WARNING : 2024-07-20 17:11:39,248 ] - error to parse id07259/87pXFH7gTZw/00009.wav.wav
[ WARNING : 2024-07-20 17:11:39,248 ] - error to parse id04222/KHa0QWgSUnA/00154.wav.wav
......

"
I tried modifying dataset/processor.py, changing
stream = tarfile.open(fileobj=sample['stream'], mode="r|*")
->
stream = tarfile.open(fileobj=sample['stream'], mode="r:*")
and it works. I haven't tested whether this change also works on torch<2.1.
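
For reference, the two mode strings differ in how Python's tarfile reads the file object: "r|*" treats it as a forward-only stream of tar blocks, while "r:*" opens it for random access and needs a seekable file object. Below is a minimal sketch of that difference on an in-memory archive. It does not use wespeaker's shards and does not reproduce the warnings above; the member names and the helpers `build_tar_bytes` and `list_members` are made up for illustration.

```python
import io
import tarfile


def build_tar_bytes(files):
    """Pack a {name: bytes} mapping into an uncompressed in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_members(raw, mode):
    """Open the archive with the given read mode and return regular-file names."""
    names = []
    with tarfile.open(fileobj=io.BytesIO(raw), mode=mode) as tar:
        for member in tar:
            if member.isfile():
                names.append(member.name)
    return names


# Hypothetical shard contents, not taken from the issue.
files = {"utt1.wav": b"fake-audio", "utt1.spk": b"id00001"}
raw = build_tar_bytes(files)

stream_names = list_members(raw, "r|*")  # forward-only stream mode
seek_names = list_members(raw, "r:*")    # random-access mode, needs seek()
print(stream_names, seek_names)
```

On a well-formed archive backed by a seekable object, both modes should list the same members, so a difference in behaviour in the real pipeline would point at the file object passed as sample['stream'] or at the archive itself.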

@wsstriving
Collaborator

This looks like a problem with tarfile, which is part of the Python standard library. Can you check whether the Python version is the same in both setups?
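
One way to compare setups is to run a short script like the one below in each environment and diff the output. This is only a sketch using the standard library; `report_versions` is a made-up helper name, and torch is looked up through installed package metadata so the script still runs where torch is missing.

```python
import platform
from importlib import metadata


def report_versions(packages=("torch",)):
    """Return the Python version plus the installed version of each package."""
    info = {"python": platform.python_version()}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            info[name] = None
    return info


versions = report_versions()
for key, value in versions.items():
    print(f"{key}: {value}")
```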

@vikcost
Contributor

vikcost commented Dec 7, 2024

I can confirm that I have the same issue with the double .wav extension.
This looks like a bug.
