Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't export audio from master #2457

Closed
softrabbit opened this issue Nov 22, 2015 · 35 comments
Closed

Can't export audio from master #2457

softrabbit opened this issue Nov 22, 2015 · 35 comments
Milestone

Comments

@softrabbit
Copy link
Member

Rendering audio to files is broken in master, the progress bar sticks at 0 and the GUI freezes. Tested with Skiessi-Onion.mmpz and some others, 64-bit builds fail but on my 32-bit system I get one successful export and the second one in the same LMMS instance freezes.

Git bisect points to 8b65ae3 but I might've screwed something up, as that commit looks pretty unlikely to cause this.

@tresf tresf added the bug label Mar 15, 2016
@tresf tresf added this to the 1.2.0 milestone Mar 15, 2016
@tresf tresf added the critical label Mar 15, 2016
@michaelgregorius
Copy link
Contributor

@softrabbit I am not able to reproduce this using the latest master (5d3333b). Are you using any special settings for the export? Have you also tried to reproduce this with a complete fresh rebuild (also remove the build directory and let it create from scratch) and a fresh .lmmsrc.xml?

@softrabbit
Copy link
Member Author

I remove my build and target directories all the time, and now I made sure to delete the rc-file too. Seems like the most consistent way for me is a qt5 build, using ALSA out to the actual hardware device. ALSA to the "default" device, i.e. PulseAudio seems to be less problematic. This is a pretty sure way to reproduce on my box:

  • Load Skiessi-Onion.mmpz
  • export using default settings, the .wav file gets written correctly
  • export again, progress stuck at 0%.

It might have improved a bit, as the first export seems to succeed more or less always now.

On the second attempt I noticed playback is broken as well after the first export, it stays at 0... could be related, maybe something is left in an unexpected state?

$ lsb_release -a |grep Desc
No LSB modules are available.
Description:    Ubuntu 15.10
$ ./target/bin/lmms --version
LMMS 1.1.90-ge98b6a2
(Linux x86_64, Qt 5.4.2, GCC 5.2.1 20151010)

@Fastigium
Copy link
Contributor

Been trying to reproduce this. Lots of funky stuff going on:

  1. If I cancel an export, then start another one, it always gives a SIGSEGV, but in different code positions. This was with the SDL audio backend.
  2. After setting the audio backend to ALSA (and restarting LMMS of course), I got a failed assertion somewhere deep in pulseaudio on starting the first export.
  3. Then I tried again, and finally managed to reproduce what @softrabbit reports. thread apply all bt revealed that both the main thread and the export thread were waiting for m_fifoWriter to finish, while the fifoWriter thread itself was waiting on its m_writer_sem semaphore. A quick read of the code suggests that the latter only occurs if the fifoBuffer is full and nothing is read from it to make room. An interesting deadlock: the threads that are supposed to read from the fifoWriter are waiting for it to finish, and the fifoWriter is waiting to be read from.

To fix the reported bug, more investigation is needed into how this situation can come about. Perhaps the cancel-and-reexport bug can be fixed in the same run. As for the deep pulseaudio crash: I've always hated pulseaudio's ALSA emulation 😁, but if it occurs often it may need a look as well. Or maybe it'll disappear after the main bug has been fixed. One can hope.

Anyway, I'll see if I can uncover more on this. @softrabbit Could you reproduce the bug with LMMS running in gdb and post the output of thread apply all bt somewhere? You may need to press Ctrl + C before you get a prompt to enter it on.

@zonkmachine
Copy link
Member

Git bisect points to 8b65ae3 but...

@softrabbit I tried a bit before that, 9bd4c4f, and it hang on me on onion/export two. I'll try and bisect it again.

@zonkmachine
Copy link
Member

f7e5bf4

f7e5bf4860c3f377d9de5d2757a0a8b1ec04813b is the first bad commit
commit f7e5bf4860c3f377d9de5d2757a0a8b1ec04813b
Author: Andrew Kelley <superjoe30@gmail.com>
Date:   Sun Sep 13 16:54:16 2015 -0700

    core: fix cleanup race condition in mixer

revealed that both the main thread and the export thread were waiting for m_fifoWriter to finish.

Cool. I'm just busy exporting Onion for the second time in Master... >:-)

As you said, fifoWriter. There was a waiting sequence that was removed in f7e5bf4
You change this line back to m_fifoWriter->wait( 1000 ); and it's fixed.

Edit: Just got a crash though so there may be something else going on there...

@zonkmachine
Copy link
Member

After reverting the line above I can do multiple exports with no problems but lmms will crash on exit, and this only after one export, with messag: QThread: Destroyed while thread is still running.
Doctor @Fastigium takes it from here...

@Fastigium
Copy link
Contributor

Someone called for the doctor 💉 😋?

m_fifoWriter->wait( 1000 );

Yeah, adding a timeout argument allows things to continue without the fifoWriter thread actually having stopped. It fixes the symptom, but not the problem. Thanks for the 🚜! Should help in my investigation diagnosis 😁

@Fastigium
Copy link
Contributor

Hm... I made PR #2687 which allows playback again after the first export, though it sounds strangely disfigured here. And the second export still hangs, waiting on some mysterious Qt-internal QMutex that can't be traced back to any LMMS code. More investigation work needed, but I need a break. Weird stuff.

@Fastigium
Copy link
Contributor

After investigating some more, I have a strong suspicion that this bug is at least partly caused by PulseAudio hijacking the alsa output. When a project is exported, the alsa audio thread fully terminates. PulseAudio seems not to like this. It uses a pthread_mutex_lock that fails an assertion at some thread terminations, but at others, the PulseAudio thread appears to keep running while the alsa thread is already gone, preventing the alsa audio thread from respawning correctly after the export.

At any rate, lots and lots of headaches caused by the alsa thread terminating during the export. I'm thinking of solving this by just keeping it running writing zeros. Should be very light on the CPU, and may also prevent future headaches with other sound servers hijacking alsa output. Now how to code this in a way that's not too hacky, hmm...

@michaelgregorius
Copy link
Contributor

To me it sounds like the underlying problem is that the audio export is in some strange way coupled to the audio driver that's used for live playback in the first place. I would have thought that an audio export doesn't use an audio driver (ALSA, Pulse, SDL, etc.) at all and just uses something like a system independent file driver that will collect the audio stream as fast as possible and write the result into a file according to the options that the user has chosen.

Is it correct that the export is coupled to the audio driver? And if yes, why is that so?

@musikBear
Copy link

Is it correct that the export is coupled to the audio driver? And if yes, why is that so?

what i see is that the project is being silently replayed, not only counter, but even playhead moves during export. That would indicate that two processes take place at the same time? A normal, but silent replay, and a file-handling-process.

@Fastigium
Copy link
Contributor

@michaelgregorius The project export uses a special audio interface that writes to a file instead of sending to a soundcard, so that is kind of what you expected. However, that special interface temporarily replaces the normal audio interface, terminating the thread of the normal one. The project renderer attempts to reverse the swap after exporting and restart the thread of the normal audio interface. This gives problems with the alsa audio interface, especially when it's actually backed by PulseAudio.

@musikBear It plays as normal indeed, it just replaces the audio driver with a file driver, so to speak.

@michaelgregorius
Copy link
Contributor

@Fastigium That sounds good in principle. Thanks for the insight!

@Fastigium
Copy link
Contributor

Ah, the SDL audio interface already does what I propose to do for alsa: write zeros when processing is stopped, keeping its thread alive. I was afraid not stopping the thread might make stopProcessing() semantically wrong, but apparently it's already in use that way. Time to change the alsa audio interface!

@musikBear
Copy link

@Fastigium That sounds good in principle. Thanks for the insight!

👍 Great job @Fastigium !

@Fastigium
Copy link
Contributor

Testing please! #2687 now fixes the export problem for me. However, I often get distorted sound when I switch projects after exporting. This distorted sound sometimes turns back to normal after a while (a minute or two). More weird stuff. Not sure how critical that is, though.

@musikBear Thanks 😊! I was going to say insight alone doesn't fix a problem, but it's an essential step toward a fix, so I'll take the compliment and be happy with it 🍻

@zonkmachine
Copy link
Member

I'm testing this now with multiple exports and switching projects in between. Five different projects and counting, this now works. I had a project hanging on me but that's hardly related.

However, I often get distorted sound when I switch projects after exporting. This distorted sound sometimes turns back to normal after a while (a minute or two).

I haven't seen the noise you refer to but I've had it before, properly scrambled sound that is, and I think it's pretty much a known issue but I can't find the report for it right now.

@tresf
Copy link
Member

tresf commented Mar 21, 2016

I think it's pretty much a known issue but I can't find the report for it right now.

I did some extensive testing with various backends for this bug report, not sure if the symptoms are similar or not... Edit: Also #158 #2582

@zonkmachine
Copy link
Member

@tresf I think that's the bug report I was looking for.
Edit: #158 was the Issue I had in mind.

I've now exported ten of the projects two to three times. Fixed for me. 👍

@Fastigium
Copy link
Contributor

@zonkmachine Thanks for the testing!

@tresf Thanks, that's some interesting reading! Places things in perspective.

@softrabbit If you have the time, I'd love to know if #2687 fixes this bug for you as well!

@tresf
Copy link
Member

tresf commented Mar 22, 2016

Probably unrelated, but some interesting output from openbsd build... Pasting it here as it may relate in some way to export, specifically the OGG portion ;)

CMakeFiles/lmmsobjs.dir/core/audio/AudioFileOgg.cpp.o: In function `AudioFileOgg::startEncoding()':
AudioFileOgg.cpp:(.text+0x1b6): warning: warning: strcpy() is almost always misused, please use strlcpy()
CMakeFiles/lmmsobjs.dir/core/DrumSynth.cpp.o: In function `DrumSynth::GetDSFileSamples(char const*, short*&, int, unsigned int)':
DrumSynth.cpp:(.text+0x2b04): warning: warning: rand() may return deterministic values, is that what you want?
/usr/local/lib/gcc/x86_64-unknown-openbsd5.8/4.9.3/../../../libjack.so.1.0: warning: warning: strcat() is almost always misused, please use strlcat()
CMakeFiles/lmmsobjs.dir/core/ProjectRenderer.cpp.o: In function `ProjectRenderer::updateConsoleProgress()':
ProjectRenderer.cpp:(.text+0x691): warning: warning: sprintf() is often misused, please use snprintf()
/usr/local/lib/gcc/x86_64-unknown-openbsd5.8/4.9.3/../../../libglib-2.0.so.4200.1: warning: warning: stpcpy() is dangerous; do not use it
/usr/local/lib/gcc/x86_64-unknown-openbsd5.8/4.9.3/../../../libglib-2.0.so.4200.1: warning: warning: vsprintf() is often misused, please use vsnprintf()
/usr/X11R6/lib/libGL.so.15.0: warning: warning: random() may return deterministic values, is that what you want?

@Fastigium
Copy link
Contributor

@tresf Thanks for the heads-up! I've looked into it, but that use of strcpy is actually fine. No user input involved, so no chance of overflow or overallocation.

@tresf
Copy link
Member

tresf commented Mar 22, 2016

that use of strcpy is actually fine. No user input involved, so no chance of overflow or overallocation.

👍

@softrabbit
Copy link
Member Author

@Fastigium: well done, #2687 seems to fix this issue. Haven't extensively tried to break it yet, but exported some 5 times in a row without LMMS hanging or breaking playback 👍

@Fastigium
Copy link
Contributor

@softrabbit Thanks for testing! I'll close this issue when #2687 is merged.

In the mean time, a comment by Wallacoloo gives me pause. #2687 would introduce another such member that technically requires synchronization. Testing so far hasn't turned up any problems, but since these source files are compiled with C++11 enabled any compiler is free to optimize m_stopped into not working. Hm. On the other hand, synchronization in the audio thread is something you really want to avoid, to prevent underruns. It is a very common pattern to signal a thread with a shared boolean. Perhaps there is a lock-free alternative I don't know about. I'll Google around a bit.

@Fastigium
Copy link
Contributor

Welp, what a mess! I've been investigating the issue even more, and have reached this conclusion: a small redesign is needed in how LMMS stops processing audio. Let me explain:

  1. This bug is a regression caused by f7e5bf4, as @zonkmachine determined.
  2. Prior to f7e5bf4, stopping the processing of audio would be done by inserting a NULL buffer into the mixer's fifoBuffer, then waiting for the AudioDevice to stop on encountering that NULL (stopProcessing() on an AudioDevice had that job). Most AudioDevices have a 1 second timeout on that wait, meaning that that NULL buffer might or might not be processed. If it wasn't, audio would hang on restarting the audio processing, but that bug seldom surfaced. Under default settings, this race condition was unlikely to occur.
  3. After f7e5bf4, m_fifoWriter was set to NULL before stopProcessing() was called on the AudioDevice. When m_fifoWriter is NULL, the AudioDevice stops getting audio buffers from the fifoBuffer and starts getting them from direct rendering. This means that the AudioDevice is very likely never to get a NULL buffer, which caused the SDL shutdown hang reported with f7e5bf4 applied, which was subsequently fixed by modifying AudioSdl not to wait for that NULL buffer anymore. All AudioDevices with a 1 second timeout (including AudioAlsa) did not show issues, as they would terminate the hard way after the timeout had expired. However, restarting audio after such a hard termination is more problematic, as this bug I'm commenting on shows.
  4. Both before and after f7e5bf4, stopping the processing of audio isn't great. Before f7e5bf4, it waited for the whole fifoBuffer to be emptied, and if that took too long, it might terminate the hard way, leaving the AudioDevice unable to restart processing. After f7e5bf4, most AudioDevices are forced to terminate the hard way, and lots of old data is kept in the fifoBuffer.

What I would want to happen is this:

  1. stopProcessing() on an audio device always stops processing audio cleanly at the first possible moment, making shutting down LMMS and starting an export feel quick (they feel slow now) and preventing unnecessary hard termination. Termination should only happen on audio interface malfunction, after the one second timeout.
  2. The fifoBuffer should always be emptied after stopping the processing of audio. This prevents old buffers from being played back after an export has occurred.

Also, there are lots of benign races in the code (unsynchronized booleans used for signalling threads to stop; no, declaring them volatile isn't enough). I intend to get rid of them all in a big commit, so that we're C++11-safe. I was thinking of using std::atomic<bool>, seeing how the code already contains C++11-or-higher constructs (the range-based for loops), so it shouldn't reduce portability too much. On the other hand, Qt's synchronization mechanisms are already in use, and may be slightly more portable. Hm.

TL;DR: Some LMMS code sucks right now; I want to improve it but making the right choices is hard 😓

@Fastigium
Copy link
Contributor

Sorry for spamming this bug, but thinking "aloud" helps my thought process, and this gives other people with thoughts on the matter a chance to chime in 😊

My vision for a redesign is somewhat hampered by an interesting property of the fifoBuffer: it blocks both when reading stops too early or too late and when writing stops too early or too late. Therefore, any synchronization mechanism that circumvents the fifoBuffer is prone to cause deadlocks on stopping the processing of audio. In other words, there's a reason that audio interfaces used to be stopped by a signal transported by the fifoBuffer itself: it guaranteed that the audio device would stop processing at exactly the right moment. Also, at default settings, the fifoBuffer only holds a single sampleframe, so the delay on stopping the processing of audio is negligible (5.8 ms).

At any rate, the pre-f7e5bf4 way isn't as bad as I thought. It is confusing, however, that calling stopProcessing() on an AudioDevice does not actually stop the processing, but waits until it is stopped by the NULL buffer transported through the fifoBuffer. I think that's where the misunderstanding backing f7e5bf4 came from. I've considered renaming stopProcessing() to waitUntilProcessingHasStopped(), but that ignores the fact that AudioDevice::stopProcessing() (the base method that is mostly overridden) does actively contribute to stopping the processing under certain circumstances (though normally, they never occur AFAIK). Sigh.

So if I were to revert f7e5bf4 and 98346f7, modify AudioSoundIo.cpp to stop processing on receiving a NULL buffer and fix the benign races, things would basically be fine. I'd add comments to Mixer::stopProcessing() and perhaps all the AudioDevice::stopProcessing() overrides to prevent a misunderstanding like f7e5bf4 from happening again. The largest stopping delay possible would be 371.5 ms (the max buffer size in settings), which would be acceptable and well under the 1 second hard termination timeout. The fifoBuffer would be emptied before audio processing was stopped, so no need for cleaning up.

Boy, that almost sounds to good to be true, I really hope I understand it correctly now 😝. I'll mull over it for a little while longer, and if it still seems like a good idea, I'll make another PR. The LMMS code base is frighteningly complex sometimes 😰

@tresf
Copy link
Member

tresf commented Mar 28, 2016

In other words, there's a reason that audio interfaces used to be stopped by a signal transported by the fifoBuffer itself: it guaranteed that the audio device would stop processing at exactly the right moment.

What change occurred to make this undesirable (can/should we go back to that original signal method)?

I've considered renaming stopProcessing() to waitUntilProcessingHasStopped(), but that ignores the fact that AudioDevice::stopProcessing() (the base method that is mostly overridden) does actively contribute to stopping the processing under certain circumstances (though normally, they never occur AFAIK). Sigh.

So if I were to revert f7e5bf4 and 98346f7, modify AudioSoundIo.cpp to stop processing on receiving a NULL buffer and fix the benign races, things would basically be fine. I'd add comments to Mixer::stopProcessing() and perhaps all the AudioDevice::stopProcessing() overrides to prevent a misunderstanding like f7e5bf4 from happening again

Yeah that's quite confusing indeed. Perhaps waitUntilProcessingHasStopped could be empty() or something more gulp sounding. :)

@Fastigium
Copy link
Contributor

What change occurred to make this undesirable (can/should we go back to that original signal method)?

It is only undesirable because the whole existing fifoBuffer needs to be processed before the signal reaches the AudioDevice. But I called it undesirable before I learned that by default, the fifoBuffer holds a whopping 5.8 ms of samples 😁

As for going back to the original method, I was proposing to do that. "So if I were to revert f7e5bf4 and 98346f7, modify AudioSoundIo.cpp to stop processing on receiving a NULL buffer" is exactly that: revert to the original method.

Perhaps waitUntilProcessingHasStopped could be empty() or something more gulp sounding. :)

Could you expound on that a little? I'm not sure what you mean now 😅

@tresf
Copy link
Member

tresf commented Mar 29, 2016

Perhaps waitUntilProcessingHasStopped could be empty() or something more gulp sounding. :)

Could you expound on that a little? I'm not sure what you mean now 😅

I was trying to help come up with a new name, that's all. 👍

@Umcaruje
Copy link
Member

Umcaruje commented Jul 23, 2016

Ok, so on the current master, this issue is fixed for me on ALSA and SDL backends. @jasp00's recent PR's probably fixed this. I do however get a crash on exporting with the Jack backend, but that's a different issue.

@softrabbit could you test out to confirm this?

I'll make an new issue for the Jack crash.

@Umcaruje
Copy link
Member

Can anyone still reproduce this bug on current master, or is it fixed?

@jasp00
Copy link
Member

jasp00 commented Aug 24, 2016

I can export projects without freeze, so I think this is fixed.

@jasp00 jasp00 closed this as completed Aug 24, 2016
@follower
Copy link
Contributor

Just a data point on this for Mac OS X 10.8.x:

  • 1.2.0-RC1 (a.k.a. 1.1.90) -- "PortAudio"/"Core Audio" hung on export (a small number of bytes get to the file on disk). Only "SDL" appeared to export successfully.
  • 1.1.91 (from Command + Drag does not work (Apple) #2890) -- "PortAudio"/"Core Audio" and "SDL" both seem to export successfully with no hang.

@zonkmachine
Copy link
Member

@follower This was fixed after 1.2.0-RC1. You need to test the latest source or the next RC when it's available.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants