-
-
Notifications
You must be signed in to change notification settings - Fork 171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
speaker forwarding may cause server to hang #669
Comments
Can you check if maybe we're leaking something every time we start + stop sound? Maybe run it without limits, and / or even as root to try? |
2014-09-23 14:34:29: totaam uploaded file
|
Well, well, maybe I can reproduce this after all.
The trick here was to also have synergy syncing the clipboard between the laptop and the host running the xp vm, as well as virtualbox doing more clipboard synchronization between the host and the xp vm, then finally we have xpra doing clipboard sync between the laptop and xp... full circle. At one point, I saw 90+ kworker threads in my process list - they eventually cleared themselves. But I also managed to lockup the laptop's display, requiring a reboot (killing X11 would have done it too I suppose). Interestingly, killing a stuck server made it spit out:
@onlyjob: can you reproduce your problem with clipboard disabled? (I cannot so far) |
I've now connected well over a hundred times (scripted detach, with attach in a while true loop). No problem as long as the clipboard is disabled. Maybe I made things worse with r7751? As it will cause a timeout on the main loop waiting for the reply to the message that causes us to stop the clipboard. I don't think so, as the main loop should still be running, just nested. But maybe this nesting doesn't play well with gstreamer? @onlyjob: if you can reproduce this problem with the clipboard disabled, please provide a gdb backtrace. |
2014-10-02 18:14:29: onlyjob commented
|
I have finally managed to reproduce this one. I had to It took a laptop running the Fedora 21 alpha as server, and just my regular Fedora 20 client. Just a shame I am going away (hence the laptop upgrade)... this will have to wait until I get back. |
I have an idea: until we rewrite the clipboard using native calls (which would be nice as it would get rid of the ugly nested loop hack - but this is going to take a long time), we can move the sound code to its own process using multiprocessing - which means it will get its own main loop, independent of the clipboard code. It is Python 2.6+ only, which is fine for v0.15+. |
2014-11-19 04:04:30: onlyjob commented
|
multiprocessing is available since python 2.6 Unfortunately, it just doesn't work with pygst, which relies on the gobject main loop and we cannot run two copies of the main loop... The only solution that I can see would be to use a distinct subprocess that we spawn and talk to using pipes: we could use its stdin to send it messages (set_volume and stop? anything else? stop can be done with sigint) and get the compressed sound data from its stdout? Better yet would be to use mmap and re-use our video mmap code. |
2015-02-16 07:08:28: antoine uploaded file
|
The patch above is very hackish and would need cleaning up. It only handles sound capture via a separate process, not playback. It doesn't handle any message exchange, we just assume the data is the compressed sound buffers.. (don't want to add the network layer there if it can be avoided) It doesn't handle mmap transfers which would cave us a lot of unnecessary events and process context switching, if only we had an easy way of signalling the parent that some data is available without writing to the pipe. Can't see how. That said, it works very well and start + stop are very reliable. |
Things left to do:
|
Latest status: the win32 client seems to suffer from clock drift. But maybe this isn't a gstreamer pipeline clock drift but rather a system clock problem?? (added the "time" metadata attribute to see how this changes over time) Notes:
Latest TODO list:
Later:
|
Finally applied in r8786. See long commit message for details. Some new environment variables (which should not be needed - I hope):
@afarr: what needs testing:
Please re-assign to me so that I can try to close the remaining issues (as per comment:13 - though many are likely to be re-scheduled unless critical) |
@afarr: as of r8793, we can tune the minimum queue time (the amount of data we buffer before we start feeding the sound card) as well as the overall buffer size. ie:
shows:
I don't think we want to reduce the buffer size too much, or we will get more overruns. But maybe a little? (better to get overruns than having the sound go too far out of sync)
We should be able to reduce the min queue size. The default was to use half the buffer size, which was 225ms. r8794 reduces this to just a quarter, 112ms. This should be enough so that we don't get underruns (which can cause some crackling), without getting too much out of sync with the picture. Note: a great idea that Lonnie has had: we should be able to see what range of values the queue buffer is using and self tune it. |
r8805 makes the code much more generic and robust. PS: the idea of prioritizing sound packets was a bad one, they are already queued up ahead of the pixel data, and there is very little else anyway - so no changes needed in that area. See also: #790#comment:20 which seems to have a way of causing sound restarts. |
Can you please test on various OSX versions and let me know how it goes? |
If you have problems with stuttering sound and underruns, see #400#comment:15 |
Looks like something broke py3k (#640) support along the way, yay.
But this won't hold up the release, the GTK3 port is not in a great state anyway. |
As of r8870, we lower the min queue time to just 50ms by default. |
@onlyjob: thinking of releasing 0.15.x quite soon, now would be a good time to shout if you still have problems... or wait another 6 months! |
2015-03-31 11:46:51: onlyjob commented
|
2015-04-01 00:47:37: afarr commentedSome testing with win/osx 0.15.0 r8872 clients against a fedora 20 0.15.0 r8872 server, turning the speakers on/off with the tray/menu controls works reliably. Playing with the
Eventually the speakers don't restart without using a tray/menu link.
With low TIME settings starting sound in a second tab (at the same time as some first tab) will also cause stuttering and make it even easier to disable speakers. Testing and tweaking, it looks like Seems very stable otherwise. |
Some minor improvements and fixes in r8905, r8906, r8907, r8908. |
r8956 makes it easier to test overruns without changing the queue settings, just start the client with:
Where N is the number of seconds to wait before faking an overrun which restarts the sound pipeline. Using this patch and printing the call stack from I guess we could also restart the sound when the process dies, but this should never happen, so the log warning message should be enough for now. I am also using the same @afarr: looks ready for more testing. Can you break it again somehow? |
2015-04-08 02:29:45: afarr commented
|
That's the most important thing.
Hopefully, there won't be too many restarts then. We do buffer a tiny bit more than before (the min queue thing), we wait before the pipeline is actually ready before throwing data at it.. (rather than hoping for the best and filling up the queue too quickly, which is what we did before) which does mean that this delay will actually be worse over high latency links. I am working on making the restarts faster (which we should be able to do now in a different way: we don't have to wait for the previous sound process to terminate to start a new one - as long as it does terminate eventually...). The problem is that restarts tell the server to stop and start again, we get the "end-of-stream" message and that kills the new pipeline rather the old one! Will fix. As for the stacktraces above (thanks for posting them), they are caused by us killing the sound process before it has a chance of exiting cleanly - I will try to see if I can just raise the kill delay to allow for a more graceful exit. Taking the ticket back to deal with those small issues. |
Lots more tweaks and subtle fixes in r8958, r8959, r8960, r8961, r8962, 8965, r8966. Some of which should be backported (started in 8963 + 8964 - more needed). Problem is that I get deadlocks on OSX after restarts, sometimes very quickly, sometimes after 20+ restarts. Not sure what causes it, but the UI thread is deadlocked:
Could be related to the auto-release stuff? |
More tweaks and fixes in r8970 + r8971 + r8972 + r8974. Restarts should be quicker now. Though we still have to spawn a sound sink process client side, wait for it, then we tell the server we're ready, the server spawns its sound source process and we eventually get some data to forward... there will always be a delay. (we could shave off some of this setup cost - but not easily, so not doing it unless we have to) Done most of the remaining backporting of the fixes to v0.14.x in 8973. I was struggling to figure out why OSX was being so crap and deadlocking at random... It turns out that fixing the stacktraces from comment:27 also fixes the OSX deadlocks! r8975 gives us a lot more time to shutdown the gstreamer pipeline cleanly: we wait 250 ms in the sound process, 500 ms in the xpra client before we force kill it. (and in the meantime we can start a new process) @Afarrr: ready for more testing, and OSX should be stable too! (I've just done 350+ overrun restarts without any issue) |
2015-04-09 07:02:14: antoine uploaded file
|
More fixes and improvements, including to the child reaper (r8982) and adding hooks for it (r8984), misc + cleanups (r8983). @afarr: the OSX client now exits reliably again, whether you exit from the osx global menu, or using control-C. |
I have published beta builds for all platforms with all these changes. Caveats:
|
r8992 improves crash and error handling and more, see commit message and info below. Here's a recap of the new env vars that can be used for testing (making it easier to test error handling):
We should be handling almost all of these errors gracefully now. In particular, we now tell the other end to stop sending sound when the process died - something that is still a problem in the v0.14.x branch and needs fixing. Note: see also #835: " synchronize sound with video frames " |
2015-04-17 02:31:33: afarr commented
|
That's OK. The sound process and the main client process are talking to each other over pipes, and when we shutdown the sound process, those pipes are closed causing the warning.
Can you provide the
Good catch. That's fixed in r9029 and will need backporting. |
2015-04-17 23:53:53: afarr commented
|
The new wrapper was not firing the emulated signals from the UI thread, which was causing crashes on win32 with GTK3, r9079 fixes that. (and this could have caused other problems elsewhere - a nice find!) |
2015-05-01 23:30:00: afarr commented
|
@afarr: is this related to sound? (ie: does this go away if you turn sound off) |
I have moved the UI thread issue to #855. |
One more thing that needs doing: I can get the win32 client to get stuck on control-c... |
Backports in 9347 + 9350. |
Related tickets:
|
Issue migrated from trac ticket # 669
component: sound | priority: critical | resolution: fixed
2014-09-04 10:42:32: onlyjob created the issue
The text was updated successfully, but these errors were encountered: