Skipping silence sometimes adds delay #884
Comments
Can you check whether those songs happen to be MP3s with variable bitrate? I don't know where you can check this on Windows, but Windows handles MP3 differently from macOS and Linux, and variable bitrate has been known to cause issues. If they are constant bitrate, or not MP3 in the first place, you're going to need to share an affected song somehow. If it was downloaded through the USDB Syncer, just put the USDB link here (plus the audio format you're using); otherwise the best way is probably PM'ing it through Discord (see https://usdx.eu/contact/ for the Discord link, then find my username in the #ultrastar-deluxe channel) or emailing it to my git commit email. (Don't put audio files or real txt files directly on GitHub.)
I checked some MP3s with a variable bitrate and some with a constant bitrate, and it does indeed seem to only be an issue with variable bitrate MP3s. (Edit: For reference, I used
I use my own script to download from USDB. I don't know if USDB Syncer existed when I made it, but I at least didn't know about it....
My script essentially does What audio codecs does USDX support in the |
In the #MP3 field, USDX supports everything supported by FFmpeg. VBR MP3 files are not designed for accurate seeking. When we tell libavformat to seek to position x, the library has to guess where x is inside the MP3 file based on the frame sizes it has already seen. Once it has chosen a position inside the file, it has no way to verify which point in time that position corresponds to, since MP3 frames don't contain timestamps. This works OK if all frames have the same number of bytes, but with VBR that is not the case. The only way to avoid problems like these is to decode the whole audio file from the beginning up to the point you want to skip to. It's not the fault of MP3 itself. The problem is the lack of a container format like Matroska or MP4 around the audio data. You can add one e.g. with
Oh, that's good to know.
Ah, that clears things up, thanks for the explanation!
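The VBR seeking explanation above can be illustrated with a toy model. This is only a sketch and not libavformat's actual seeking code: it assumes MPEG-1 Layer III frames at 44.1 kHz (1152 samples each), ignores padding bytes and headers, and assumes the seeker estimates a byte rate from the first few frames it has seen. The function name `seek_error_seconds` and the `frames_seen` parameter are made up for illustration.

```python
import bisect
from itertools import accumulate

SAMPLES_PER_FRAME = 1152          # MPEG-1 Layer III
SAMPLE_RATE = 44100
FRAME_SECONDS = SAMPLES_PER_FRAME / SAMPLE_RATE


def frame_bytes(bitrate_kbps):
    """Approximate Layer III frame size in bytes (padding ignored)."""
    return 144 * bitrate_kbps * 1000 // SAMPLE_RATE


def seek_error_seconds(bitrates_kbps, target_seconds, frames_seen=50):
    """Return (time actually reached) - (time requested) for a guessed seek.

    Hypothetical model: the seeker only knows the sizes of the first
    `frames_seen` frames and extrapolates a constant byte rate from them.
    """
    sizes = [frame_bytes(b) for b in bitrates_kbps]
    # starts[i] is the byte offset where frame i begins.
    starts = [0] + list(accumulate(sizes))

    seen = sizes[:frames_seen]
    bytes_per_second = sum(seen) / (len(seen) * FRAME_SECONDS)
    guessed_offset = target_seconds * bytes_per_second

    # Find the frame that contains the guessed byte offset.
    index = bisect.bisect_right(starts, guessed_offset) - 1
    index = min(index, len(sizes) - 1)

    actual_seconds = index * FRAME_SECONDS
    return actual_seconds - target_seconds


if __name__ == "__main__":
    cbr = [128] * 4000                     # constant bitrate
    vbr = [64] * 1000 + [320] * 3000       # quiet intro, then loud section
    print("CBR error (s):", seek_error_seconds(cbr, 60.0))
    print("VBR error (s):", seek_error_seconds(vbr, 60.0))
```

In this model the constant-bitrate file lands within one frame of the requested time, while the variable-bitrate file lands many seconds off, because the byte rate guessed from the quiet intro does not hold for the rest of the file.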
I'm not sure I really understand this. A Matroska file contains timestamps that point to a media track number and a frame number? If so, if I make a Matroska file with my VBR MP3 data (like the ffmpeg example you gave), wouldn't it have to decode the whole audio stream to generate those timestamps? If so, why does it take significantly less time than decoding to a PCM stream?
vs
(both fastest of 10 runs) In any case, I feel like decoding the 'whole' audio file in the case of VBR MP3s (in the native MPEG-1/MPEG-2 container) would be reasonable for USDX to do? I can't think of many cases where the extra decoding time would exceed the time saved by skipping.
Matroska files usually contain a KaxCues block that tells FFmpeg where to find certain points in time inside the file. In the test file I created from an MP3, these are in 5-second steps. In addition, Matroska places a timestamp right before every MP3 frame. FFmpeg treats MP3 both as a container format and as an audio codec. Muxing into Matroska only needs the container format code from libavformat. That code looks at each frame to determine how long it is in bytes and in (micro)seconds. The code in libavcodec that converts an MP3 frame into PCM samples is not used. So it is not full decoding, but every frame has to be looked at in the correct order to calculate the timestamps.
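The idea of walking over the frames once, without decoding them, to build a seek index can be sketched roughly like this. It is a toy model and not how FFmpeg or Matroska actually store cues: it assumes a list of frame sizes is already available, uses a fixed frame duration (MPEG-1 Layer III at 44.1 kHz), and borrows the 5-second cue spacing from the test file mentioned above. The names `build_cues` and `seek_with_cues` are made up for illustration.

```python
import bisect

FRAME_SECONDS = 1152 / 44100   # one MPEG-1 Layer III frame at 44.1 kHz


def build_cues(frame_sizes, cue_interval=5.0):
    """Scan frames in order and record (timestamp, byte_offset) cue points.

    Only sizes and durations are accumulated; no audio is decoded.
    """
    cues = []
    offset = 0
    next_cue = 0.0
    for index, size in enumerate(frame_sizes):
        timestamp = index * FRAME_SECONDS
        if timestamp >= next_cue:
            cues.append((timestamp, offset))
            next_cue += cue_interval
        offset += size
    return cues


def seek_with_cues(cues, target_seconds):
    """Return the last cue at or before the target time."""
    times = [t for t, _ in cues]
    i = bisect.bisect_right(times, target_seconds) - 1
    return cues[max(i, 0)]


if __name__ == "__main__":
    # Hypothetical VBR file: small frames first, larger frames afterwards.
    sizes = [208] * 1000 + [1044] * 3000
    cues = build_cues(sizes)
    print(seek_with_cues(cues, 60.0))
```

With such an index, a seek jumps to a known byte offset with a known timestamp and only has to step through at most one cue interval of frames, regardless of how the frame sizes vary.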
Actual behaviour
With some songs, skipping silence introduces delay between output and microphone input.
Expected behaviour
Skipping silence doesn't introduce delay for any song.
Steps to reproduce
S