
Analysis gets stuck with corrupt files #286

Closed
paulijar opened this issue Feb 17, 2021 · 3 comments

@paulijar
Contributor

I got a report from a user that scanning the files is insanely slow, taking an hour or more per file. It turned out that his library contained some corrupted mp3 files which were causing the slowness.

I got some sample files and could reproduce the issue: calling getID3::analyze on these files seemed to cause some kind of busy loop, as CPU load hit 100% and nothing else was happening. I waited for 45 minutes and the analysis still hadn't finished. As mentioned, though, my user reported that the scanning did eventually move on to the next files.

Now, obviously these files are broken and no metadata can be extracted from them. But could getID3 maybe bail out a bit sooner on these? I'll send the sample files by email.
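For reference, the call that hangs is just the standard getID3 usage, roughly like this (paths and filenames here are placeholders for one of the corrupt samples):

```php
<?php
require_once 'getid3/getid3.php';

$getID3 = new getID3();

// On a corrupt sample this call pegs one CPU core and does not return
// within any reasonable time (45+ minutes in my test).
$info = $getID3->analyze('/path/to/corrupt-sample.mp3');

var_dump($info['error'] ?? 'no error reported');
```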

@JamesHeinrich
Owner

I have downloaded the sample files and confirmed there's something wrong (even corrupt MP3s shouldn't take more than a couple of seconds to analyze or be rejected). I have not had time to look in detail at where or how it's getting stuck, but I will look at it in the next day or two. Thanks for the samples.

JamesHeinrich added a commit that referenced this issue Feb 20, 2021
#286
Prevent apparently-mp3 files with large number of 0xFF chars from stalling scanning
@JamesHeinrich
Owner

Should be fixed in 1490b43

Your files are "special" in that they consist almost entirely of 0xFF bytes, which means the code that looks for the next valid MPEG-audio sequence has to examine every single byte (at least within the first 128kB of the file); that is where the immense slowdown was taking place. I have added a failsafe escape route where the loop is broken after examining 1000 false syncs. On my test system those corrupt files now finish scanning (with the appropriate error) in about 0.1s each.
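For illustration, the shape of the fix is roughly the following. This is a minimal sketch, not the actual getID3 code; the function names, the $maxFalseSyncs parameter, and the simplified header check are invented for this example.

```php
<?php
// Sketch of the idea behind the fix: while scanning for the next MPEG-audio
// frame sync, count false syncs and bail out once a limit (e.g. 1000) is
// reached instead of examining every remaining byte of the scanned region.

function isPlausibleMpegHeader($header4) {
    if (strlen($header4) < 4) {
        return false;
    }
    $b1 = ord($header4[1]);
    if (($b1 & 0xE0) !== 0xE0)        return false; // sync bits must continue into the second byte
    if ((($b1 >> 3) & 0x03) === 0x01) return false; // reserved MPEG version
    if ((($b1 >> 1) & 0x03) === 0x00) return false; // reserved layer
    if (((ord($header4[2]) >> 4) & 0x0F) === 0x0F) return false; // invalid bitrate index
    return true;
}

function findNextValidFrame($data, $maxFalseSyncs = 1000) {
    $falseSyncs = 0;
    $length = strlen($data);
    for ($offset = 0; $offset < $length - 4; $offset++) {
        if (ord($data[$offset]) !== 0xFF) {
            continue; // not a sync byte, keep scanning
        }
        if (isPlausibleMpegHeader(substr($data, $offset, 4))) {
            return $offset; // plausible frame header found
        }
        // A file consisting almost entirely of 0xFF bytes hits this branch at
        // nearly every offset; the cap turns an effectively unbounded scan
        // into a quick failure.
        if (++$falseSyncs >= $maxFalseSyncs) {
            return false; // give up; the caller reports the file as unreadable
        }
    }
    return false;
}
```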

@paulijar
Contributor Author

Now I finally had time to test the fix, and it seems to work fine. Thanks!
