Add MicrophoneFeed with direct access to the microphone input buffer #108773
Conversation
```cpp
	return buf.size() / 2;
}

PackedVector2Array MicrophoneFeed::get_frames(int p_frames) {
```
This is wrong, since it mimics the audio capture API but does not use a ring buffer. You must use a ring buffer.
The AudioDriver input_buffer is already a ring buffer, so I'm not sure where another ring buffer would come into play.

The current audio capture APIs (the capture effect and record effect) use separate ring buffers because they receive a fixed, transient batch of data from the AudioDriver input_buffer (pulled off it in AudioStreamMicrophone as part of the audio server _mix_step) and need to store it until the user can request it in their own code, outside the audio server thread/loop. This implementation bypasses that, allowing users to pull from the audio driver ring buffer directly. Each feed can have its own 'buffer_ofs', so multiple places in the code can retrieve microphone data from the same device without stepping on each other's toes.
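For illustration, a minimal sketch of the per-feed read pattern described above (not the PR's actual implementation; it assumes the existing AudioDriver accessors get_input_buffer() and get_input_position(), and a per-feed buffer_ofs member):

```cpp
// Sketch only: each feed keeps a private read cursor (buffer_ofs) into the
// shared AudioDriver ring buffer; only the driver advances the write index,
// so several feeds can read the same device without interfering.
PackedVector2Array MicrophoneFeed::get_frames(int p_frames) {
	AudioDriver *ad = AudioDriver::get_singleton();
	Vector<int32_t> buf = ad->get_input_buffer();
	unsigned int write_pos = ad->get_input_position();

	PackedVector2Array frames;
	if (buf.is_empty()) {
		return frames;
	}
	while (p_frames > 0 && buffer_ofs != write_pos) {
		// Samples are interleaved left/right signed 32-bit integers.
		float l = (buf[buffer_ofs] >> 16) / 32768.f;
		buffer_ofs = (buffer_ofs + 1) % buf.size();
		float r = (buf[buffer_ofs] >> 16) / 32768.f;
		buffer_ofs = (buffer_ofs + 1) % buf.size();
		frames.push_back(Vector2(l, r));
		p_frames--;
	}
	return frames;
}
```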
The issue I have with this is that it needs to lock the AudioDriver, since input_buffer is the single buffer being written to by the driver in the AudioDriver thread. As far as I can tell, not locking will lead to race conditions in multithreaded environments.
So, the locking in the current implementation is inconsistent.

There is no locking applied when samples are added to the buffer by AudioDriver::input_buffer_write(int32_t sample), but there is a lock applied when samples are taken from the buffer by AudioStreamPlaybackMicrophone::_mix_internal(AudioFrame *p_buffer, int p_frames).

That means the locking isn't doing anything useful, since it covers only one half of the transaction. Assuming it was meant to do something, I added a corresponding lock to the input_buffer_write() function in my first PR. But there were complaints that I was adding a lock to the very time-sensitive audio thread, which could be potentially harmful.
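Presumably the corresponding lock looked something like this (a sketch, not necessarily the code from the earlier PR; it assumes the driver's lock()/unlock() pair is recursive, since platform capture threads may already hold it around their chunk loop):

```cpp
// Sketch: pair the write side with the lock already taken on the read side
// in AudioStreamPlaybackMicrophone::_mix_internal().
void AudioDriver::input_buffer_write(int32_t sample) {
	lock();
	input_buffer.write[input_position++] = sample;
	if ((int)input_position >= input_buffer.size()) {
		input_position = 0;
	}
	unlock();
}
```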
The lack of a lock in input_buffer_write() evidently caused an index out-of-range crash that was mitigated by inserting an extra boundary test on the index, instead of finding the root cause, which could only be two threads executing input_position++ at the same time:
```cpp
void AudioDriver::input_buffer_write(int32_t sample) {
	if ((int)input_position < input_buffer.size()) {
		input_buffer.write[input_position++] = sample;
		if ((int)input_position >= input_buffer.size()) {
			input_position = 0;
		}
	} else {
		WARN_PRINT("input_buffer_write: Invalid input_position=" + itos(input_position) + " input_buffer.size()=" + itos(input_buffer.size()));
	}
}
```

In any case, there should never be two threads entering this function, since it would result in choppy out-of-order audio chunks being pulled from the operating system and buffered. I've only seen it happen in circumstances when AudioDriverPulseAudio::input_start() was called a second time. Some of the code in my current PR protects this from happening again on the various different platforms.
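To make the race concrete, here is one hypothetical interleaving of two threads inside this function (illustrative, not an observed trace):

```cpp
// buffer size = 1024, input_position = 1023
//
// Thread A: checks (int)input_position < 1024    -> true
// Thread B: checks (int)input_position < 1024    -> true (still 1023)
// Thread A: input_buffer.write[input_position++] -> writes [1023], position = 1024
// Thread B: input_buffer.write[input_position++] -> writes [1024], out of range
//
// The extra boundary test only narrows the window; the unsynchronized
// input_position++ remains the root cause.
```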
With regards to the race condition, I think it is safe, since the input_buffer is never reallocated, and the MicrophoneFeed::get_frames() function doesn't write to any shared values and can tolerate an out-of-date input_position value.
Indeed, there is no point in adding a lock to this function without adding the corresponding lock to the input_buffer_write() function. Unfortunately, we don't know what the consequences of acquiring a lock at a rate of 88.2 kHz in the audio thread would be, and the code has persisted this long without locking being a problem, so it would be a risk to change it.
The drivers lock themselves in their own threads before making changes to input_buffer (except maybe the Android driver):

- `ad->lock();` (…)
- `ad->lock();` in godot/drivers/wasapi/audio_driver_wasapi.cpp, line 765 at 71a9948
- `driver->lock();` in godot/platform/web/audio_driver_web.cpp, line 405 at 71a9948
You may be right that accessing input_buffer from multiple threads is fine (due to no reallocation), but it really doesn't feel correct.
I stand corrected. Note that the lock is per chunk of audio, not per sample.
Yes, it was the Android version which had all the bugs I have been most interested in fixing.
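For reference, the per-chunk pattern in the platform capture threads looks roughly like this (a simplified sketch; AudioDriverExample and read_chunk_from_os() are hypothetical names modeled loosely on the WASAPI and PulseAudio drivers, not verbatim engine code):

```cpp
// Sketch: the driver thread takes the lock once per captured chunk and
// releases it before blocking again, so the cost is per buffer of samples,
// not per sample.
void AudioDriverExample::thread_func(void *p_udata) {
	AudioDriverExample *ad = static_cast<AudioDriverExample *>(p_udata);
	while (!ad->exit_thread) {
		int32_t chunk[512];
		int read = ad->read_chunk_from_os(chunk, 512); // hypothetical OS capture call

		ad->lock();
		for (int i = 0; i < read; i++) {
			ad->input_buffer_write(chunk[i]);
		}
		ad->unlock();
	}
}
```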
Many GH checks failed.
Work to do: Do the docs again using https://docs.godotengine.org/en/stable/contributing/documentation/updating_the_class_reference.html#updating-class-reference-when-working-on-the-engine

Try using: …

Remove the vector& lock in the Android version and the locking in the get buffer.
Force-pushed: aff6747 to a807709.
The deeper proposal is to separate the Audio Input (microphone code) from Audio Output (speakers) and move this code into a default microphone feed class. Unfortunately the Godot codebase does not make this separation easy to do. Here is why. The Godot AudioDriver has a separate implementation for each platform. Each of these classes manages the single input and single output for that platform -- sometimes with the same function (for example, in …).

Here are the options of what can be done.

Option 1: Separate the Audio Input and Output at a deep level

This requires me to write a new … The case of …

Option 2 (what I have implemented): Touch the AudioDriver code as little as humanly possible so nothing breaks

Leave all the code and its technical debt (in relation to the feature of multiple microphones) in place, including its single microphone input buffer on which the 7 different platform implementations depend, and simply access this single buffer from the one and only MicrophoneFeed (see the sketch after this comment).

Option 3: Something in between

Leave the … Since none of the platforms implement multiple microphones, the value of …

In my opinion, the appropriate time to implement the code that can manage multiple MicrophoneFeeds is when at least one of the platforms' AudioDrivers has been extended to support it, because otherwise its implementation will be speculative and likely to be wrong.
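A rough sketch of the Option 2 shape, modeled on the CameraServer/CameraFeed pairing mentioned in the PR description (names beyond MicrophoneServer, MicrophoneFeed, get_frames(), and get_buffer_length_frames() are hypothetical, and GDCLASS boilerplate is omitted):

```cpp
// Sketch: one server singleton owning one default feed; the feed reads the
// existing AudioDriver::input_buffer instead of introducing a new driver.
class MicrophoneFeed : public RefCounted {
	unsigned int buffer_ofs = 0; // private read cursor into the shared ring buffer

public:
	PackedVector2Array get_frames(int p_frames);
	int get_buffer_length_frames() const; // capacity, for overflow prediction
};

class MicrophoneServer : public Object {
	static MicrophoneServer *singleton;
	Ref<MicrophoneFeed> default_feed; // a single feed until drivers support more

public:
	static MicrophoneServer *get_singleton() { return singleton; }
	Ref<MicrophoneFeed> get_default_feed() { return default_feed; } // hypothetical accessor
};
```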
I made a similar implementation of a … My implementation has the … However, since only one real microphone exists in the …
I talked to @goatchurchprime at GodotFest and found out about this PR that way 🙂 I implemented a VoIP GDExtension for our next game too (still very much WIP, but it can be found here: https://github.com/mrTag/listenclosely ), and I share the pain of the currently very cumbersome and delay-inducing way of accessing the microphone stream in a GDExtension. With the current wave of indie "Friend Slop" games (Peak, R.E.P.O. and the like), it is very important for Godot to have a good VoIP solution ready, and for that we need a good way to access the microphone stream directly. I think the …

All this to say: please don't forget about this PR, as the issue is more important than you might think. Thank you!
Although this PR is related to a number of existing issues that are linked in the main description, it does not fix or close most of them; instead, it simply provides a workaround. The primary purpose of this PR is to close the proposal for direct access to microphone input for VoIP. The other issues will still remain open. (Said differently, this PR description should not use the "fixes #…" keyword for those issues.)
I had other words there originally, but then someone on RocketChat told me: …

Would you like me to change them to "Doesn't exactly fix #…"?
Yes, that was me; my intent was that any issues specifically fixed by this PR should be linked as "fixes #…". My mistake! Sorry!
Direct access to the microphone buffer is required to reduce audio latency for the use-case of VoIP, and to repair the fundamental design flaw of connecting the audio input stream to the audio output stream under the mistaken assumption that they will always proceed at exactly the same rate for all time on every platform. A diagram of the current design and the improved situation can be seen here.
The following issues are illustrative of the flaw in the current design.
This PR is a re-implementation of two previous attempts to find an appropriate place for the get_microphone_frames() function:

- … AudioStreamPlaybackMicrophone to access the microphone AudioDriver::input_buffer independently of the rest of the Audio system #100508
- … Input #105244

This also resolves godotengine/godot-proposals#11347 (proposal).
A comprehensive demo is at:
https://github.com/goatchurchprime/godot-demo-projects/tree/gtch/micplotfeed/audio/mic_feed
As discussed at length in the Audio Group meetings, this API is designed to future-proof for the case when there might be more than one microphone data stream, even though only a single microphone input has been hard-coded throughout the AudioDriver code on all platforms.
Accordingly, I have modeled a framework based on the CameraServer and CameraFeed objects and created the new object MicrophoneServer that has a single MicrophoneFeed.

This MicrophoneFeed contains the following functions: …

In the class AudioEffectCapture, the function that is equivalent to get_frames() is known as get_buffer().

The function get_buffer_length_frames() is needed to predict when the overflow condition is likely.
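To illustrate the overflow-prediction use, here is a hypothetical polling loop; only get_frames() and get_buffer_length_frames() come from this PR, while get_frames_available() and process_voip_packet() are invented names for whatever counterpart functions a caller would use:

```cpp
// Sketch: drain the feed before unread frames can be overwritten. If we poll
// less often than capacity / mix_rate seconds, the ring buffer will wrap over
// frames we have not read yet.
void poll_microphone(Ref<MicrophoneFeed> feed) {
	int capacity = feed->get_buffer_length_frames();
	int available = feed->get_frames_available(); // hypothetical counterpart
	if (available > capacity / 2) {
		PackedVector2Array frames = feed->get_frames(available);
		process_voip_packet(frames); // hypothetical consumer
	}
}
```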