Suggestion: Use dedicated pulseaudio monitor #22
Comments
Thanks a bunch! It really is awesome. It's very imprecise for songs with noisy intros and outros, but for videos that are just delayed it's perfect. I currently don't really know how to solve this problem for videos with intros to be honest. The solution you propose was what I was going to do initially, but I would've had to learn how to use the pulseaudio libraries so I left it for the future. The script to obtain the audio data in I don't know how deep you've looked into the code, but I currently get the audio data from ffmpeg with a pipe, in The current solution is of course worse than what you propose: I have to read the data byte by byte (very slow), and I don't have much control over it. I also can't scope specifically the Spotify app like you did, so I have to record the entire desktop. With a pulseaudio module I could also find a more suitable intermediate format that both audio files can be converted to, rather than using only what I mentioned earlier. That way I would also avoid doing unnecessary conversions (if both original formats are in 44100Hz, I'd just leave them like that, rather than converting both to 48000Hz).

I appreciate your post because I have an out-of-the-box pulseaudio setup and I don't know how it actually works for more advanced users. I will use your bash sketch when I start improving this part of the code. I'm currently more focused on the multi-threading part, and being able to control ffmpeg from outside the module. That way, if the Spotify music is paused, ffmpeg won't keep recording. Also for a better synchronization between threads overall.

I currently can't work on what you suggest because it's going to take me a lot of time and I currently don't have it. Also, I'm still a student so my C code is really messy and I want to get it right, which takes time. I'm starting finals soon so there won't be much activity in here until I'm done (by mid-February). I'll still be checking the repos regularly for any issues/PRs made, though.
Thanks again for your help!! |
:O I didn't know that you could do pulseaudio things directly from C, that is indeed a lot better (and faster) than my bash script.
I didn't go very deep, my C is not nearly as good as my python :( Something like this could work, if executed (by vidify's python side) every time spotify starts playing. I have added a part that moves the ffmpeg recording stream to the right monitor at the end. It's entirely bash, so no need to adjust any C code. It doesn't touch the actual audio streams, it just moves them around. This script works for me if I execute it just after vidify starts. However the thing is that as soon as I pause the music the spotify stream gets reset :(
But as you said, using the libraries would be a lot better.
No rush :) , I'm just dumping ideas here and there, otherwise I'll forget them later. Good luck with your finals :D |
Ohhh so I kinda misunderstood your initial comment. You want to run It actually doesn't seem that hard to implement directly in C, so after I have more time I'll first try to do it with the pulseaudio libraries (or ALSA). The hard part is the recording itself because I need to do lots of conversions. But for now, I could just run this script translated to C and afterwards record the audio with ffmpeg (if that's possible). Maybe the issue you have when Spotify is paused can be fixed with this lower level code.
Great! Keep the ideas coming, I love it. I'll keep reading the repos in case someone opens new issues/PRs. Also, thanks! Good luck to you whenever you start finals, too :) |
Exactly, this only has to be done once, when the virtual sink and loopback are available they will stay available until the system reboots.
Indeed, as soon as we want to analyse spotify's audio output we should move that audio stream to the virtual sink. Maybe it is even possible to get the name of the player that is playing from vidify/(py)dbus and use that to look up the index of the audio stream that needs moving. That way there is no need to hardcode it to spotify. Getting ffmpeg to record from the correct monitor is the easy part, just replace 'default' with the monitor of the virtual sink:
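As a rough sketch of that substitution (not the project's actual ffmpeg invocation), the command line can be built with the source name as a parameter. The function name `ffmpeg_record_command`, the sink name `AudioSync`, and the output options (mono, 48000 Hz, raw `s16le` on stdout) are assumptions made for illustration; only `-f pulse -i <source>` is the part the comment above is about.

```python
def ffmpeg_record_command(source="AudioSync.monitor", sample_rate=48000):
    """Build an ffmpeg argument list that records from a PulseAudio source.

    Passing source="default" reproduces the current behaviour of recording
    whatever the default device is. The output options are placeholders
    (assumptions), not what audiosync really uses.
    """
    return [
        "ffmpeg",
        "-f", "pulse",            # ffmpeg's PulseAudio input device
        "-i", source,             # monitor of the virtual sink instead of 'default'
        "-ac", "1",               # assumed: downmix to mono
        "-ar", str(sample_rate),  # assumed: output sample rate
        "-f", "s16le",            # assumed: raw PCM output
        "pipe:1",                 # write the samples to stdout
    ]


if __name__ == "__main__":
    print(" ".join(ffmpeg_record_command()))
```

Keeping the source name as an argument means the same code path can still record from `default` when no dedicated sink is available.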
That would be awesome, using ALSA would probably be even faster because spotify uses ALSA (all spotify streams are always named "ALSA plug-in [spotify]"). That being said Pulseaudio is probably a lot easier than ALSA.
Spotify always outputs audio at 44100 Hz, irrespective of the rate pulseaudio is using (the format is variable though, either s16le or s32le depending on pulseaudio settings). If you use a dedicated monitor you can exploit this and save a bit of CPU. This can be done by hard coding the null sink to match spotify's rate (add the
This prevents pulseaudio from resampling the spotify stream, and as an added benefit you now know for certain what the rate of the input stream from the Here's a more concrete schematic example ($DEF_SAMP_RATE is pulseaudio's 'Default Sample Rate' from
Currently (with audiosync)
With dedicated sink/monitor (without hardcoding rate=44100)
With dedicated sink/monitor (hardcoding rate=44100)
In the last example the resampling to In short, I think it would be a good idea to hardcode |
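To make the rate suggestion concrete, here is a minimal sketch that builds the `pactl load-module module-null-sink` command with an optional fixed rate. The helper name `null_sink_command` and the sink name `AudioSync` are assumptions for illustration; `sink_name` and `rate` are the module's own arguments.

```python
def null_sink_command(sink_name="AudioSync", rate=None):
    """Return the pactl argument list that creates a virtual (null) sink.

    With rate=44100 the sink matches Spotify's output rate, so PulseAudio
    should not need to resample the Spotify stream. Leaving rate=None uses
    PulseAudio's default sample rate instead.
    """
    cmd = ["pactl", "load-module", "module-null-sink", f"sink_name={sink_name}"]
    if rate is not None:
        cmd.append(f"rate={rate}")
    return cmd


# To actually load the module (requires a running PulseAudio server):
#   import subprocess
#   subprocess.run(null_sink_command(rate=44100), check=True)
```

Making the rate optional keeps the door open for players that do not output at 44100 Hz.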
I had some spare time just now, and I brushed up my C-skills a bit and got this "'working'" (sort of). Something like this is very ugly, but it works excellently: Just before the audiosync module starts recording, this will:
This makes the audiosync feature work pretty well inside my non-trivial pulseaudio setup 😄 . [EDIT]
|
Hey Andrew, I'm back. Thanks a lot for your contributions. The script is super useful because I'm not too experienced with PulseAudio and elaborate set-ups, so it'll serve as a great guide when I translate it to C. Your solution seems to work well, but I really would like to take more time to do it correctly with the ALSA or PulseAudio libraries. Your first comment about hardcoding |
So I've started to work on this a bit in the pulseaudio branch and here are a couple things I've noticed:
|
Interesting, I have the same behaviour on my PC when I tried it just now. One thing I noticed is that the spotify stream is now named "Spotify" instead of "Alsa Plugin [Spotify]" (see screenshot in my first post). Maybe spotify got updated or something else changed in my setup. In any case spotify seems to be using pulseaudio directly now, which makes the streams behave more nicely. (Which is good, I'm just confused about what was going on before with the "Alsa Plugin [Spotify]" stuff) In any case
Using this we could create some if statement to only move the sink if it is not already on the correct sink. Because it should still be moved once after spotify has started. (and if spotify is restarted) Executing such an if statement whenever a (new) song starts should make sure that audiosync keeps working at all times.
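A minimal sketch of such a check, assuming the usual layout of `pacmd list-sink-inputs` output (an `index:` line per stream, a `sink: N <name>` line, and quoted `key = "value"` properties). The function names and the `AudioSync` sink name are hypothetical, and the parsing is exactly as fragile as any text scraping of pacmd output.

```python
import re


def parse_sink_inputs(text):
    """Parse `pacmd list-sink-inputs` output into a list of dicts.

    Assumed format (not guaranteed to be stable across versions):
        index: 135
            sink: 0 <some_sink_name>
            properties:
                application.name = "spotify"
    """
    inputs, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"(?:\*\s*)?index:\s*(\d+)", line)
        if m:
            current = {"index": int(m.group(1)), "sink": None, "props": {}}
            inputs.append(current)
            continue
        if current is None:
            continue  # header lines before the first stream
        m = re.match(r"sink:\s*\d+\s*<(.+)>", line)
        if m:
            current["sink"] = m.group(1)
            continue
        m = re.match(r'([\w.\-]+) = "(.*)"', line)
        if m:
            current["props"][m.group(1)] = m.group(2)
    return inputs


def inputs_to_move(text, player="spotify", target="AudioSync"):
    """Indices of the player's streams that are not yet on the target sink."""
    needle = player.lower()
    return [
        entry["index"]
        for entry in parse_sink_inputs(text)
        if entry["sink"] != target
        and any(needle in value.lower() for value in entry["props"].values())
    ]
```

Each returned index could then be passed to `pacmd move-sink-input <index> AudioSync`. Returning a list rather than a single index also covers the case of several Spotify streams existing at once.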
I hadn't noticed this before. In any case the latency is small on my PC (it fluctuates between 0.1ms and 2 ms): Please note that the loopback module's default latency is 200ms, this is why I specified

As a side note: The specified latency is not guaranteed to be the actual latency (see screenshot: set latency is 1ms, yet actual is 2ms). The actual lower limit of the latency is determined by the hardware (in my case apparently 2ms). Setting it to something higher will make pulseaudio (artificially) increase the latency to match the set latency. Setting the latency lower than what your hardware can do has no effect, pulseaudio will just use the lowest latency possible. As far as I know, this has no negative side effects, other than maybe using slightly more CPU.

As another side note: The main use of the loopback module is looping back microphone/Line-In inputs back to speakers/headphones. In such a use case you do want latency, to prevent a feedback loop. I suspect this is the reason that the default latency is 200ms. In our case there is no risk of a feedback loop so we can just set the latency as low as possible. (Which I think is 1ms, I'm not sure what happens if you set it to 0ms)

[EDIT]: I took a more in depth look at the code and noticed:
The rate is indeed optional (though it should in theory speed things up) but without specifying
I don't get this on my PC (or at least I don't hear it), I can switch the spotify stream between virtual sinks more or less seamlessly without hearing any kind of interruption. I suspect that what you are observing is a symptom of the half a second delay you noticed in the previous point. Eliminating the delay (as much as possible) should stop (or at least reduce) this. In any case, since it doesn't seem necessary any more to switch the stream at the beginning of every song, I don't think it will be that annoying.
You can create as many virtual sinks/loopbacks as you want. But every stream can only output into 1 sink. If you'd like to have a stream output to multiple sinks, you can use the 'module-combine-sink' module. It is a virtual sink that sends its input to another sink, and a 'slave sink'. |
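Coming back to the loopback latency discussed earlier in this comment, the sketch below builds a `module-loopback` command with an explicit `latency_msec`. The helper name `loopback_command` and the source name `AudioSync.monitor` are assumptions; `source`, `sink` and `latency_msec` are arguments of the module itself.

```python
def loopback_command(source="AudioSync.monitor", sink=None, latency_msec=1):
    """Return the pactl argument list that loops `source` back to a sink.

    latency_msec is only a request: the value PulseAudio actually reaches
    depends on the hardware. If sink is None, PulseAudio picks the output
    sink itself.
    """
    cmd = [
        "pactl", "load-module", "module-loopback",
        f"source={source}",
        f"latency_msec={latency_msec}",
    ]
    if sink is not None:
        cmd.append(f"sink={sink}")
    return cmd


# To actually load the module (requires a running PulseAudio server):
#   import subprocess
#   subprocess.run(loopback_command(), check=True)
```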
I should investigate more on this, but I'm not sure where exactly their changelog can be found. I use a rolling-release distro, so I'm not sure at what version this happened. If it wasn't too long ago it's possible that those with older versions will experience issues. Anyway, I don't think checking the current sinks whenever a new song starts will worsen the performance. Using a pipe like right now is already quite slow, I imagine.
Ah yes. I commented it out because I wanted to use the default config before anything else. So thanks for explaining it more clearly. I'll use the
I'll have to investigate a bit more about using 44100Hz as the sample rate, though. Many other parts of the program use 48kHz. Also, I'm not sure if it will worsen the audiosync's precision.
Yes, this is probably what you suggested. I haven't heard it again.
Thanks! For now I don't think it's needed, but it's good to know. |
Update: the pulseaudio libs are almost unusable... Terrible docs, no real examples, complex compilation... So I'll try to come up with the best solution I can find to use the CLI if I end up being unable to use C... There seems to be a big difference between the pulseaudio API needed to run CLI commands, and the simple API to record sinks and such. The former is way too low level and undocumented, but I still have hope for the latter. |
I'm on spotify-1.1.10-r1 at the moment, the git logs show that this version was added 2019-08-08. So I don't think any spotify update is the cause of this change after all. Perhaps it was something else. In any case it would be a good idea to check whether the spotify stream is still set to the correct sink whenever audiosync starts recording anyway.
I did some more tests, and it doesn't really seem to matter. This is without rate=44100: The main difference is that the "Resampler" of the spotify stream changes to "copy" instead of "speex-float-5" (which is the default resample-method on my system). This is to be expected, in fact that was the whole point of the rate=44100 in the first place. However, the latency doesn't really change significantly, it seems slightly better with rate=44100. However it fluctuates a lot, and any improvement is overshadowed by this fluctuation, so I dare not draw any hard conclusions. That being said, in theory rate=44100 should still reduce latency, and I would expect to observe a larger change on low-end hardware. Though maybe not as large as I initially expected.

Also, this whole rate=44100 story is only relevant for the spotify client, if you add support for other clients in the future you might not want rate=44100 after all. Because as far as I know spotify is the only client that uses a rate of 44100Hz all the time. Other (music)-clients might use 48000Hz if available, though I have no examples at the moment.

[EDIT] If you'd like to test this yourself: https://github.com/wwmm/pulseeffects
If doing this in C ends up to be too much of a nightmare, you might be interested in https://pypi.org/project/pulsectl . I haven't explored the code here in great detail, but this might just be able to do what needs to be done in python. The advantage would be that we could use the 'Power of Python' (with a capital P because python is awesome :D ), instead of having to use these 'hacky' grep commands. (I have this feeling that using grep might not be a very future-proof solution, because it might break if pactl/pacmd changes the way it outputs things.) |
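As a hedged illustration of what the pulsectl route might look like, the sketch below moves a player's streams using `sink_list()`, `sink_input_list()` and `sink_input_move()` from pulsectl. The function name, the `AudioSync` sink name and the matching on `application.name` are assumptions; matching on another property may turn out to be more reliable.

```python
def move_player_streams(pulse, player="spotify", sink_name="AudioSync"):
    """Move every stream of `player` to `sink_name`; return the moved indices.

    `pulse` is expected to behave like a connected pulsectl.Pulse object.
    Streams are matched on the 'application.name' property (an assumption).
    """
    target = next(s for s in pulse.sink_list() if s.name == sink_name)
    moved = []
    for stream in pulse.sink_input_list():
        app = stream.proplist.get("application.name", "")
        if player.lower() in app.lower() and stream.sink != target.index:
            pulse.sink_input_move(stream.index, target.index)
            moved.append(stream.index)
    return moved


def main():
    import pulsectl  # third-party package: pip install pulsectl

    with pulsectl.Pulse("audiosync-sketch") as pulse:
        print("moved streams:", move_player_streams(pulse))


if __name__ == "__main__":
    main()
```

Because the streams come back as objects with attributes, there is no need to grep through pactl/pacmd output, so a change in how those tools print things would not break the lookup.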
Maybe it was a pulseaudio update? Did you try the scripts on something other than spotify?
Yeah, the thing is that other media players are also supported, so it can't be limited to Spotify. Using 44100Hz is still interesting, though. It might save up memory and CPU in some cases. But I'll leave it for the future (#15).
Thanks for the recommendation! I actually decided to give it another try today and I did get it working! Here's a program you can test for now. The docs are better than I thought, once I understood how pulseaudio handles requests and such. |
Latest pulseaudio update was 2019-09-17 (to 13.0), so I don't think that was it either. I vaguely remember switching around default_samplerate=96000 and alternate_samplerate=48000, but I'm not sure.
Awesome, I just tested it, it works well :) One thing I did notice is that if you run Another thing I found is that I now seem to be experiencing the half-a-second-delay you described earlier when executing One thing I did notice is that just after running [EDIT]:
Maybe the 'load once' property might provide an easy way to prevent the modules from being loaded a second time. Though I can not find on google how it should be set. |
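One possible way to avoid loading the modules twice, independent of any 'load once' property, is to look at the already loaded modules first. The sketch below assumes the tab-separated layout of `pactl list short modules` (index, module name, arguments); the helper name `module_loaded` and the sink name are hypothetical.

```python
def module_loaded(short_modules, name, argument):
    """Return True if `name` is already loaded with `argument`.

    `short_modules` is the text printed by `pactl list short modules`,
    assumed to contain one module per line as: index<TAB>name<TAB>arguments.
    """
    for line in short_modules.splitlines():
        fields = line.split("\t")
        if len(fields) >= 3 and fields[1] == name and argument in fields[2].split():
            return True
    return False


# Usage sketch (requires a running PulseAudio server):
#   import subprocess
#   out = subprocess.run(["pactl", "list", "short", "modules"],
#                        capture_output=True, text=True, check=True).stdout
#   if not module_loaded(out, "module-null-sink", "sink_name=AudioSync"):
#       ...  # load the null sink and the loopback here
```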
Yes I'm aware. For now it should only be used once, because currently it doesn't check for previous sinks. It's just a sketch of how it should look like, after all. The thing is that I'm still thinking about how I could integrate it into the main program. First of all, the Vidify usage of The new version would have to run this set-up either as a standalone function (something like Also, now that you mentioned it, Vidify doesn't currently support multiple MPRIS players in the same session, so technically, if you were to use a different player, it wouldn't be detected. I just opened a new issue to track that: vidify/vidify#59.
This unfortunately doesn't exist AFAIK, but the second idea is definitely a good option to consider.
Same! For some reason, the C program seems to cause this lag, but the script doesn't? Not sure why exactly this happens. It only happens the first time, too. Thanks for the LatencyControl link you provided, I'll read it in detail soon. I should also take note of the differences between what the source code of |
Update:
|
Doesn't that require conversion if your system's format is not float32? e.g. on my system it is set to s32le, and if I remember correctly the default format is s16le. Or is this a ffmpeg/fftw thing?
What I meant is the number that is shown left of 'latency' in my screenshots. I'm not sure how 'buffer' relates to 'latency' but I do know that more buffer means more initial audio delay. The sinks/streams buffer is filled before the audio is sent through, this explains the slight delay we observe when switching audio sinks. The size of the buffer seems variable, just after running your script it is huge, and then it starts decreasing to more reasonable values. It might be possible to force the buffer to smaller values at all times, but I don't know how to do that. |
Yes, it still requires a conversion, but from what I understand, it's already handled by pulseaudio... Which is probably easier & faster to handle? To be honest I'm not really sure. There is still a conversion being performed:
I don't use PulseEffects myself, but the latency can also be checked by listing the inputs with As I was developing this, I've come to the conclusion that this method won't always work. There are so many possible stream names, and they're so unreliable, that the previous method should still be available in case there was an error while trying to set-up the sink. So by default, audiosync will have |
I just checked the output of Perhaps, this is just an unavoidable pulseaudio thing :(
That seems like a good fail-safe. Another idea I just thought of is to determine the index of the stream we want to move not by its name: I'm not sure if this is possible at all, but maybe MPRIS/dbus can return the index of the stream we want. After all, it gives us information on title/artist/etc from the player, it might also be able to return the index of the stream that this metadata belongs to. I couldn't find anything about this in the MPRIS specifications though, so maybe it is not possible to determine the index like this :( |
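The fail-safe itself can be sketched in a few lines: try the dedicated sink setup, and if anything goes wrong, record from the default source as before. The names `choose_record_source` and `setup`, and the `"default"` fallback value, are placeholders for illustration, not the names used in audiosync.

```python
def choose_record_source(setup, default_source="default"):
    """Return the source to record from, falling back if the setup fails.

    `setup` is any callable that prepares the dedicated sink and returns
    the name of its monitor (for example "AudioSync.monitor"), raising an
    exception when something goes wrong.
    """
    try:
        return setup()
    except Exception as err:
        print(f"dedicated sink setup failed ({err}); using {default_source!r}")
        return default_source
```

With this shape the recording code only ever sees a source name, so it does not need to know which of the two methods ended up being used.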
I'll have to ask in their mailing lists for advice because I'm clueless. I'll update when/if they answer.
Yes, that's what I was going to do. I'd always call Update: the setup sketch can now be run more than once! Here's a quick benchmark:
Not only is the C version faster, it's also fully integrated in the code and more customizable, although it's way more complex. Soon I will remove the C sketch and turn it into a test. The bash script is enough as a sketch. |
This is pretty much finished. All it needs is a couple tweaks and fixes, but that can be done in master. I still ended up using I should also improve the documentation. I still have to update Vidify to support this new version, too, but I'll open up a new issue for that. |
Hello,
I've been checking out the audiosync feature, and I've gotta say it is absolutely awesome. It works pretty well, though not always on videos with a very long intro/outro. But overall it's pretty impressive how well this works.
I do have one suggestion though.
I noticed that you are using 'Monitor of {default device}' to record output. Now I've been experimenting a bit, and something as simple as a Notification Sound or some other audio stream can make the process fail: `INFO: Audiosync module failed to return the lag`. Also, for non-standard pulseaudio setups (like mine) this requires some manual pavucontrol work to get it working. Furthermore, pulseaudio's loopback module can introduce white noise in the default output device which also leads to problems.
[My setup: I use Pulseeffects for fancy effects, and I use the loopback module to forward input from my PC's line-in to line-out (--> speakers). Pulseeffects makes the setup for AudioSync non-trivial, I have to set AudioSync to `Monitor of PulseEffects.Apps`. The line-in loopback introduces some white noise that is usually inaudible but breaks audiosync.]
To circumvent all these problems, and make the audiosync process a bit more rigid, I would suggest the following:
pacmd list-sink-inputs
find index of player named spotify (or other supported player), use some grep magic or something on this output (in this case 135):
[EDIT] This grep magic does the trick:
echo ${$(pacmd list-sink-inputs | grep -B 20 spotify | grep index)#"index:"}
Though we should probably find a way to deal with the case that there are multiple spotify clients playing (unlikely, but we don't want everything to crash if that does happen).
pacmd move-sink-input 135 AudioSync
`AudioSync.monitor` instead of `$DEFAULT_SINK.monitor` and does its thing.
This is more robust, and will work on more complex setups without manual configuration. Because the loopback module creates an audio stream that acts as if it is a regular application. And there will be no more need to manually move audiosync to the correct monitor because the correct monitor will always be `AudioSync.monitor`. In short, this will make audiosync work 'out-of-the-box'. Furthermore, audiosync will record audio from spotify and only spotify, thus audiosync will work even when another stream or notification sound is playing, or when there is other noise on the default output.
Proof of principle:
Spotify outputs into AudioSync
AudioSync.monitor is looped back to the default device and vidify-audiosync records from AudioSync.monitor
Note:
Spotify starts a new stream whenever a new song is played (Actually, this happens whenever the play button is pressed), this means that whenever a new song is started this has to be run again:
pacmd move-sink-input 135 AudioSync
However, the `pactl load-module` commands only have to be run once, at the start of `vidify --audiosync`.
EDIT: In summary
Untested patch
This should work to set ffmpeg to use `AudioSync.monitor` instead of `${DEFAULT_SINK}.monitor`, and move spotify streams to the `AudioSync` sink.