Experimental microphone support #19106
Conversation
Note: this will fix #13268 once merged.
Special thanks to @marcelofg55 for providing a patch which seems to have more or less fixed the audio clipping problems.
8419761 to 6b9ddd5
As discussed on IRC a few days ago, it would be good to convert this to use a single microphone device at a time, since it's really not necessary to have multiple mics active at once, and it will make the code simpler. So instead of a Vector of microphone_device_outputs it can be a single object.
@SaracenOne I solved the audio clipping by adding a lock-free buffer. I'm only familiar with C++11, so I can't provide a proper multi-producer single-consumer implementation. I found it simpler to just make the audio output a bindable signal.
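For reference, a minimal sketch of the kind of lock-free buffer mentioned above, assuming the simpler single-producer single-consumer case (one capture thread, one mixer thread); the class and member names are illustrative, not taken from the PR:

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical single-producer single-consumer lock-free ring buffer.
// push() is only ever called from the capture thread, pop() only from
// the mixer thread, so plain acquire/release atomics are sufficient.
class SpscRingBuffer {
public:
    explicit SpscRingBuffer(size_t capacity)
        : buf_(capacity + 1), head_(0), tail_(0) {}

    // Producer side: returns false (drops the sample) when full,
    // rather than blocking the audio driver callback.
    bool push(float sample) {
        size_t head = head_.load(std::memory_order_relaxed);
        size_t next = (head + 1) % buf_.size();
        if (next == tail_.load(std::memory_order_acquire))
            return false; // full
        buf_[head] = sample;
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side: returns false when empty.
    bool pop(float &out) {
        size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire))
            return false; // empty
        out = buf_[tail];
        tail_.store((tail + 1) % buf_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<float> buf_;
    std::atomic<size_t> head_; // write index, owned by the producer
    std::atomic<size_t> tail_; // read index, owned by the consumer
};
```

A true multi-producer variant would need a CAS loop on the write index, which is where most of the extra complexity comes from.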
@marcelofg55 Sorry, I've been a bit busy over the past few weeks. I'll return to this PR soon. Interestingly, an earlier attempt actually did use this naming convention.
What's the status of this? :)
I could continue with this if @SaracenOne is busy, but I don't want to step on his toes if he wants to continue it by himself :)
Just saying: right now there's still a little time left to get this into a 3.1 release ;)
Sorry, I actually put this to one side while I started working on a high level networking layer which would actually incorporate the VOIP aspect of this eventually. I do intend to return to this soon, but I have no problem if someone else wants to, say for example, write more driver backends in the meantime.
@SaracenOne I'll work on the CoreAudio driver then.
@marcelofg55 That would be awesome! I actually don't have immediate access to a Mac at the moment, so that would be especially useful. One thing though: I'm not sure if iOS has a similar permissions model to Android, but I know that on Android at least, it will be necessary to poll for permission to access the mic.
Mind if I convert the code to use a single receiver instead of multiple receivers?
@marcelofg55 I guess if you want to do that too, sure thing ;)
@SaracenOne If you want, I can help with the VOIP stuff. Somebody needs to refactor the high-level networking to allow packets for different purposes. Currently, the SceneTree basically takes control of the p2p networking.
@hungrymonkey The multiplayer API does take control of the SceneTree and deals primarily with direct RPC calls on nodes, with path caching. This system makes it very easy to get networking up and running, but hard to do anything more complicated than that. For example, there's no way to do replication, meaning that players can't join servers where the game has already started. However, it is actually still possible to send generic packets through this system too, which can be parsed however you like. The way I'm approaching networking myself is writing a higher-level library on top of the already existing multiplayer API which is more specific to a particular type of game logic, but still pretty broad (basically any kind of game which involves one map per server and a bunch of networked entities moving around, but which doesn't cover more specific cases like fighting or RTS games). It's currently up and running, but my intention is to go further and implement the networked occlusion system described in this Halo talk (https://www.gdcvault.com/play/1014345/I-Shot-You-First-Networking) to make it highly scalable. Personally, I'm happy to just manually append extra data to the packets to send VOIP data, but it's @reduz's call whether he might want to make VOIP data part of the standard multiplayer API, which would still be pretty easy to implement and would probably make it easier for developers working with the vanilla multiplayer API who simply want to get VOIP up and running as soon as possible. The main thing we need at this point, though, is code to get audio data out of the mixer and back into the game logic so we can start prepping it for going over the network. We will also need to integrate the Opus codec for properly compressing the audio.
@SaracenOne You really don't need to make it part of the multiplayer API. I just need control of an extra few channels in which I can send and receive packets. If that is possible, then I can remove my crash connect code, because the VOIP will be self-contained in its own module. I am currently using protobufs to create my own Mumble-like VOIP protocol. Oh, OK, you want a generic VOIP protocol in the network? Use something like protobufs or flatbuffers. Edit: if you guys fix the networking issue, I would donate code for VOIP.
@hungrymonkey In which case, I think it should already be possible, but you might still have to run that by @reduz. I know you can manipulate the channel you want to use by directly interacting with the ENet peer (since the concept of 'channels' is not universal to all networking models; Bluetooth, for example), but I think one of the reasons you might still have to go through the MultiplayerAPI's send_bytes method is that I believe the MultiplayerAPI looks at the first few bytes to determine whether or not the incoming command is an RPC call. Beyond that, the only other alternative I can think of would be to establish a secondary connection on a completely different port.
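To illustrate the "first few bytes" dispatch idea in the abstract (this is a sketch of the general technique, not the actual MultiplayerAPI code, and the tag values are made up): a single leading command byte is enough to let VOIP payloads share a connection with RPC traffic.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical command tags; a real protocol would reserve values
// matching whatever the engine already uses for RPC dispatch.
enum PacketTag : uint8_t {
    TAG_RPC = 0,       // engine-level remote call (illustrative)
    TAG_USER_VOIP = 1, // application-defined VOIP payload (illustrative)
};

// Prepend the tag so the receiver can route the packet.
std::vector<uint8_t> frame_voip_packet(const std::vector<uint8_t> &payload) {
    std::vector<uint8_t> packet;
    packet.reserve(payload.size() + 1);
    packet.push_back(TAG_USER_VOIP);
    packet.insert(packet.end(), payload.begin(), payload.end());
    return packet;
}

// Returns true and strips the tag if this is a VOIP packet;
// anything else falls through to the normal RPC path.
bool try_parse_voip_packet(const std::vector<uint8_t> &packet,
                           std::vector<uint8_t> &payload_out) {
    if (packet.empty() || packet[0] != TAG_USER_VOIP)
        return false;
    payload_out.assign(packet.begin() + 1, packet.end());
    return true;
}
```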
@SaracenOne I don't need to manipulate channels. I just need an extra channel to mess around with. I attempted to do it with the current implementation and it didn't work. It seems like @Faless made a few changes lately. The largest issue is that the SceneTree seems to dequeue all packets: when I try to grab the packet I need, the SceneTree has already dequeued it.
@hungrymonkey Hmm, in which case I can't be sure. I'll see if I can find any more info.
@hungrymonkey By the way, I'm having a look through your VOIP implementation right now...
@SaracenOne Look into happytree_master; some of the branches are just old code. I should delete them, but I am procrastinating. I would have to refactor it into a Godot server.
cc @Faless ^ networking VOIP discussion, see above
Renamed AudioDriver audio_input_* vars to input_*
Added support for single channel inputs for PulseAudio
I think @marcelofg55 and I decided that this PR is probably in a suitable enough state to get merged now. There might be issues that could still be ironed out, but we're somewhat limited in what we can collectively test due to not having access to a wide array of capture hardware. At the very least, this now covers at least one driver for each of the three major supported desktop platforms and seems to be working now. The only thing I would now like to know is how to get the audio data out of the mixer and back into the code so I can start looking into integrating @hungrymonkey's VOIP example.
@akien-mga This is OK with me, but is the PR format in the right shape for you?
Seems good to me too; quite a few commits compared to our usual standard, but I like that it shows the collaboration between two contributors :)
@SaracenOne I took a look at the mic; I guess I will have to refactor my code again, since I need to add an explicit poll to my VOIP tree.
Is mic support already usable from GDScript in beta 9? Any docs?
(*) beta 11
@marcelofg55 Would you document this feature? If not, I would like to write a simple tutorial for this section, but I would like to know whether I can record from the microphone directly in the editor. I looked at the demo and understand that it can record in-game.
If you want to document it, go ahead. The feature is meant for in-game recording.
Microphone recording doesn't work on iOS. I've tried the demo project from https://github.com/godotengine/godot-demo-projects/tree/master/audio/mic_record, but the audio data that is recorded is all zeros. I added "Privacy - Microphone Usage Description", aka NSMicrophoneUsageDescription, to the app's ...-Info.plist, which is required on iOS before one can use the microphone. While that did trigger a permission request to the user, as expected, the recorded data is still all zeros even after I grant the app permission to use the microphone. Should I create a new bug report for this, or would it be tracked here? There are two issues that need to be fixed for iOS:
P.S. I'm assuming I haven't forgotten a step. If you can get the demo project working on iOS, please let me know. By the way, the demo works fine on macOS, but iOS and macOS still have slightly different CoreAudio implementations as far as I know, despite recent attempts to make them more similar (though I'm not an expert on macOS).
I would create a specific bug report for iOS's microphone not working.
Okay, following @reduz's advice, I decided to post a PR for this feature I was working on, even though it is not complete and definitely not ready for merging. Given that I knew next to nothing about audio encoding before starting this, this has largely been a learning experience, but given some walls I have run up against in trying to make this feature work correctly, I feel I should put out what I have already done in an attempt to get feedback, and also perhaps support from people with more experience in this particular field. I'm led to understand that @marcelofg55 in particular might be able to help out with this.
In my discussion with @reduz, the idea behind this particular approach was to add microphone support directly through the AudioStream interface, which could then in theory go through AudioMixer for processing. This idea actually introduced somewhat more complexity than my original attempt at the interface, and necessitated the introduction of a kind of proxy interface between the audio driver, server, and individual streams. This comes in the form of MicrophoneRecievers and MicrophoneDeviceOutputs. The MicrophoneDeviceOutputs are the primary endpoints that the actual drivers feed for each capture device. MicrophoneRecievers are invisibly assigned by the new AudioStreamMicrophone classes when playback is requested, which then allows these classes to decode the microphone buffers themselves and go through the standard audio mixing process. The presence of receivers also automatically controls whether a capture device should be active or not. These classes are also heavily virtualised, as they support non-physical capture devices such as the ‘default’ endpoint, which can be changed while the engine is already running.
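The receiver/device-output relationship described above can be sketched roughly as follows. This is a simplified illustration under assumed names (MicReceiver, MicDeviceOutput), not the PR's actual classes: the device output tracks its attached receivers, and capture is active exactly while at least one receiver exists.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Stand-in for the stream-side endpoint that an AudioStreamMicrophone
// would hold while playing back.
class MicReceiver {};

// Stand-in for the driver-side endpoint of one capture device.
class MicDeviceOutput {
public:
    // The device should be capturing only while someone is listening.
    bool is_active() const { return !receivers_.empty(); }

    void add_receiver(std::shared_ptr<MicReceiver> r) {
        receivers_.push_back(std::move(r));
        // In the real engine, this is where the driver would start capture.
    }

    void remove_receiver(const std::shared_ptr<MicReceiver> &r) {
        receivers_.erase(
            std::remove(receivers_.begin(), receivers_.end(), r),
            receivers_.end());
        // ...and stop capture once the last receiver detaches.
    }

private:
    std::vector<std::shared_ptr<MicReceiver>> receivers_;
};
```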
I will stress further that this is NOT a fully functional implementation of microphones; it is merely a proof of concept which has severe audio clipping and which will actually crash 10 seconds after starting a microphone stream due to a buffer overflow. It is only meant to demonstrate the underlying design of the interface and only currently supports one audio driver. It also probably requires some further cleanup and refactoring.
While most of the issues can be addressed fairly easily, the main thing I am having trouble with right now is how to solve the audio clipping problem. The main issue lies in synchronising the audio capture with the audio stream processing: audio capture packets currently seem to come in at a larger size than they get decoded at, meaning that the only way to keep them in sync so far has been to only process part of them in the AudioStream, clipping off the end; otherwise, greater and greater latency gets introduced between the capture and playback. If anyone can figure out how to handle this issue, please let me know, or even send me another pull request.
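One common way to handle this kind of capture/playback size mismatch (a sketch of a general technique, not the PR's code) is to buffer incoming capture frames in a FIFO and, when the backlog exceeds a latency bound, drop the oldest samples instead of clipping the newest: latency stays bounded, and the audible artifact becomes an occasional skip rather than ever-growing delay.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical capture FIFO with a hard cap on buffered samples.
class BoundedLatencyBuffer {
public:
    explicit BoundedLatencyBuffer(size_t max_samples)
        : max_samples_(max_samples) {}

    // Driver side: append a capture packet, shedding the oldest
    // data whenever the backlog would exceed the latency bound.
    void write(const float *samples, size_t count) {
        for (size_t i = 0; i < count; i++)
            fifo_.push_back(samples[i]);
        while (fifo_.size() > max_samples_)
            fifo_.pop_front();
    }

    // Mixer side: reads up to `count` samples; returns how many
    // were actually available.
    size_t read(float *out, size_t count) {
        size_t n = 0;
        while (n < count && !fifo_.empty()) {
            out[n++] = fifo_.front();
            fifo_.pop_front();
        }
        return n;
    }

    size_t size() const { return fifo_.size(); }

private:
    std::deque<float> fifo_;
    size_t max_samples_;
};
```

A more polished fix would resample slightly (adjusting playback rate based on the FIFO fill level) so the drop never becomes audible, but the bounded FIFO alone already prevents unbounded latency growth.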
I’ll list a couple of the things I feel still need to be addressed before this PR would be ready for merging:
While there are many local applications for microphone support, the main motivation for this feature is to provide a way of supporting VOIP. This would likely be achieved by compressing audio packets through Opus, sending them through the networking interface, and then decoding them for the clients. While it would likely take the form of another interface sitting atop the basic microphone input interface, it would be nice to have a sample implementation of such a feature that developers could easily integrate into their own games to instantly have VOIP support.
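One detail worth noting for anyone picking this up: Opus (RFC 6716) only accepts frames of 2.5, 5, 10, 20, 40, or 60 ms, so captured PCM has to be sliced into fixed-size frames before encoding. The sketch below only shows the packetisation arithmetic; a real implementation would hand each slice to libopus's opus_encode(), which is omitted here to keep the example self-contained.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int SAMPLE_RATE = 48000; // Opus' native rate
constexpr int FRAME_MS = 20;       // a typical VOIP frame duration
constexpr size_t FRAME_SAMPLES =
    SAMPLE_RATE * FRAME_MS / 1000; // = 960 samples per mono frame

// Splits mono PCM into whole 20 ms frames; a trailing partial frame
// is simply left over (real code would keep it in a pending buffer
// until the next capture packet arrives).
std::vector<std::vector<int16_t>> slice_into_frames(
        const std::vector<int16_t> &pcm) {
    std::vector<std::vector<int16_t>> frames;
    for (size_t i = 0; i + FRAME_SAMPLES <= pcm.size(); i += FRAME_SAMPLES)
        frames.emplace_back(pcm.begin() + i,
                            pcm.begin() + i + FRAME_SAMPLES);
    return frames;
}
```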
Lastly, I have also included an extremely basic sample project to demonstrate this feature. Remember that this currently only works with the drivers listed as supported.
godot_mic_test.zip