Raw audio recording not supported #2391
It's also unlikely WebAudio will fix this.
Thanks for the reply. I'll use ScriptProcessorNode for now. I would still like to see an AudioPassThroughNode, which should be very easy to implement, or maybe even just add a timestamp to the AnalyserNode so we can know the exact time frame of the samples. (AudioContext may not provide a sample-accurate time.)
What is an AudioPassThroughNode and what does it do? You can file a feature request for this node, if you like. We could add a timestamp to the AnalyserNode. Most likely this would tell you the context time of the first sample in the time-domain data.
An AudioWorklet or any kind of processor node is too extreme for such a simple requirement (and it is not supported in Safari anyway). An AudioPassThroughNode would be a very basic node that allows data to pass through it unmodified. It would simply have an event that provides AudioBuffers sequentially (gapless/seamless). This would allow an app to extract the raw audio data from any point within the graph.
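For concreteness, here is a purely hypothetical sketch of how such a node might be used. Nothing in it exists in any specification or browser: AudioPassThroughNode, its ondata event, and appendToRecording are invented names used only to illustrate the idea.

```js
// Hypothetical usage sketch only: AudioPassThroughNode and ondata do not exist.
const ctx = new AudioContext();
const source = ctx.createMediaStreamSource(micStream); // micStream from getUserMedia
const tap = new AudioPassThroughNode(ctx);             // invented node

tap.ondata = (e) => {
  // e.buffer would be a gapless, sequential AudioBuffer of the audio flowing through,
  // with e.startTime / e.endTime locating it on the context timeline.
  appendToRecording(e.buffer);                         // app-specific, invented helper
};

source.connect(tap).connect(ctx.destination);          // audio passes through unmodified
```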
Thanks for the suggestions. MediaRecorder and decodeAudioData are not an option. Any encoding/decoding of the audio must be avoided for audio editing software. MediaRecorder does not support unencoded audio on most web browsers. Also, live access to the audio data is required for peak meters and waveform drawing. decodeAudioData resamples to the AudioContext's sampling rate, which is also undesirable if getUserMedia uses a different sampling rate. Maybe AudioPassThroughNode is not the best name. Perhaps AudioBufferNode or AudioDataEventNode would be better. Right now there is no node to passively access audio data passing through the graph in a seamless/gapless way. For now I'll have to use ScriptProcessorNode until an alternative becomes more widely available.
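For reference, a minimal sketch of the deprecated ScriptProcessorNode workaround described here. It assumes `stream` was obtained from getUserMedia; the buffer size and channel counts are arbitrary choices for illustration.

```js
// Sketch of the deprecated ScriptProcessorNode approach; `stream` is assumed to come
// from navigator.mediaDevices.getUserMedia({ audio: true }).
const ctx = new AudioContext();
const source = ctx.createMediaStreamSource(stream);
const processor = ctx.createScriptProcessor(4096, 1, 1); // bufferSize, input/output channels
const chunks = [];

processor.onaudioprocess = (e) => {
  // Copy the data: the underlying buffer may be reused between callbacks.
  const samples = new Float32Array(e.inputBuffer.getChannelData(0));
  chunks.push(samples);
  // Live access: update peak meters / waveform drawing here.
};

source.connect(processor);
processor.connect(ctx.destination); // some browsers only fire the callback while connected
```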
I should have stated in the OP that access to the live recorded data is required. Sorry about that.
That is the problem, and why ScriptProcessorNode is the only solution at the moment. Yes, AudioWorklet could be used on most browsers, but it has its own disadvantages: it is not supported in Safari, it requires moving data between threads, and it takes far more code than such a simple task warrants.
Every new developer who needs to examine or copy the raw audio in a graph (there seem to be quite a few of them, judging by my search for a solution) would have to re-implement this simple functionality as an AudioWorklet. It would be easier and better to have a dedicated node for this purpose. Instead of making developers search for how to do it with an AudioWorklet, they'd just create the dedicated node and use it. Making developers write many lines of code for something this trivial is not developer friendly. Requiring developers to learn about AudioWorklet (and moving data between threads) for something this trivial isn't either. I can see many new developers struggling with this in the future. A dedicated node would have made things easier for me. I had to invest way too much time in figuring this out, then resort to a deprecated feature. Something like this should already be there. AudioWorklets are great for many different things. This isn't one of them.
So, you basically want a node that fires an event for every block of audio that passes through it. That could be a lot of events, generating lots of garbage. Introspection like this was never part of the original design. I can see it being useful for looking at various parts of the graph to see what is happening. Kind of like an oscilloscope probe at desired points in the graph.
That "garbage" is exactly the audio data we're using for our applications. :-) Yes, that's a lot of events, and yes they are useful/required for many use cases. Sure, there's a lot of overhead, and yet doing audio is exactly what we need to do. In my own usage of the Web Audio API, almost everything I've built requires the ScriptProcessorNode to capture data precisely because there is no other way to capture raw audio data. Even once AudioWorklet becomes viable, I think there will be a lot of overhead in shuffling buffers around to get that data back on the main thread/context. Even if MediaRecorder were to support WAV, it still only emits chunks when it feels like it rather than being locked to the audio processing time. (You can specify a time, but it can't be guaranteed as it is dependent on the container and such. And realistically, we don't always want a container. WAV container support would be great, but there are plenty of use cases where we just want raw PCM samples.) |
Basically, this is a synchronous version of an AnalyserNode.
AudioWG call:
As someone who has been dealing with Wave files for over 25 years, my advice about adding RIFF Wave to MediaRecorder, and potentially ending up with malformed Wave files with zero-length RIFF and 'data' chunks, is: don't do it! Also, it is not safe to assume the Wave header is always 44 bytes. If the file contains 24-bit or multichannel audio, WAVE_FORMAT_EXTENSIBLE must be used, which has a completely different chunk size. Forcing developers to skip the Wave header just to get to the raw data is not a good idea. Trust me. Just give us the raw data, please.

I will reiterate that many developers (myself included) will continue to use ScriptProcessorNode because it does exactly what we need: we have some control over latency/block size, we get real-time raw data, and (very important) it is very easy to use (much easier than setting up an AudioWorklet). If raw audio support were mandatory for MediaRecorder, that would help.

Good to know that Safari will eventually support AudioWorklet, but I'm still waiting for SharedArrayBuffer support. :)
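To illustrate why the fixed 44-byte assumption only holds for the plain PCM case, here is a minimal sketch of the canonical header for 16-bit integer PCM. It deliberately does not handle 24-bit or multichannel audio, which, as noted above, requires WAVE_FORMAT_EXTENSIBLE and a larger "fmt " chunk.

```js
// Sketch: build the canonical 44-byte header for plain 16-bit PCM only.
// Anything needing WAVE_FORMAT_EXTENSIBLE breaks the fixed 44-byte assumption.
function pcm16WavHeader(numSamples, numChannels, sampleRate) {
  const bytesPerSample = 2;
  const dataSize = numSamples * numChannels * bytesPerSample;
  const buf = new ArrayBuffer(44);
  const v = new DataView(buf);
  const writeStr = (off, s) => {
    for (let i = 0; i < s.length; i++) v.setUint8(off + i, s.charCodeAt(i));
  };

  writeStr(0, 'RIFF');
  v.setUint32(4, 36 + dataSize, true);                              // RIFF chunk size
  writeStr(8, 'WAVE');
  writeStr(12, 'fmt ');
  v.setUint32(16, 16, true);                                        // fmt chunk size (plain PCM)
  v.setUint16(20, 1, true);                                         // WAVE_FORMAT_PCM
  v.setUint16(22, numChannels, true);
  v.setUint32(24, sampleRate, true);
  v.setUint32(28, sampleRate * numChannels * bytesPerSample, true); // byte rate
  v.setUint16(32, numChannels * bytesPerSample, true);              // block align
  v.setUint16(34, 16, true);                                        // bits per sample
  writeStr(36, 'data');
  v.setUint32(40, dataSize, true);                                  // data chunk size
  return buf;
}
```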
It's absurd that this still hasn't been fixed. The amount of work that needs to be done just to get raw samples from the microphone is absurd, and very prone to introducing bugs and wasting everyone's time. There is simply no reason why WebAudio doesn't directly support getting raw sample chunks.
This has been possible for years, many web apps are doing it. https://ringbuf-js.netlify.app/example/audioworklet-to-worker/ is a full example that is suited for high-performance workloads, heavily commented, and does Web Audio API -> WAV file. It does so without touching the main thread, so it is robust against load and real-time safe.
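In condensed form, and under the assumption that it mirrors the linked example's approach, the pattern is: the processor writes samples into a SharedArrayBuffer ring and only advances an atomic write index, while a Worker drains the ring, so there is no per-block postMessage and nothing touches the main thread. The sketch below is simplified (single channel, no overflow handling), and SharedArrayBuffer requires a cross-origin-isolated page.

```js
// Simplified sketch of the shared-memory recording pattern; the two SharedArrayBuffers
// are created elsewhere and handed to the node via processorOptions (or its port).
registerProcessor('recorder', class extends AudioWorkletProcessor {
  constructor(options) {
    super();
    this.index = new Uint32Array(options.processorOptions.indexSab); // [writeIndex]
    this.ring = new Float32Array(options.processorOptions.ringSab);  // sample storage
  }
  process(inputs) {
    const ch = inputs[0][0]; // first channel of first input, if connected
    if (ch) {
      let w = Atomics.load(this.index, 0);
      for (let i = 0; i < ch.length; i++) this.ring[(w + i) % this.ring.length] = ch[i];
      Atomics.store(this.index, 0, (w + ch.length) % this.ring.length);
    }
    return true; // a Worker polls the write index and copies out the new samples
  }
});
```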
Web Codecs is also now available in Chromium, and soon in other browsers, so sample-type conversion and encoding (to lossless and lossy audio formats) is supported. It's a few lines of code to (e.g.) get real-time microphone data, encode it in real time to (e.g.) MP3, AAC, Opus, or FLAC, and do something with it. I'm closing this because the Web Audio API doesn't really deal with encoding: it's a processing API, and there are solutions already.
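For orientation only, a hedged sketch of that Web Codecs route as it works in Chromium at the time of writing; run it inside an async function. MediaStreamTrackProcessor is Chromium-specific, and codec and sample-rate support varies (Opus generally expects 48 kHz capture), so feature-detect before relying on any of it.

```js
// Hedged sketch of the Web Codecs route (Chromium-only pieces noted above).
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const track = stream.getAudioTracks()[0];
const frames = new MediaStreamTrackProcessor({ track }).readable.getReader();

const encoder = new AudioEncoder({
  output: (chunk) => { /* collect EncodedAudioChunks for upload or muxing */ },
  error: (e) => console.error(e),
});
encoder.configure({
  codec: 'opus',
  sampleRate: track.getSettings().sampleRate || 48000,
  numberOfChannels: 1,
});

for (;;) {
  const { value: audioData, done } = await frames.read();
  if (done) break;
  // audioData.copyTo(...) would give the raw samples directly, without any encoding.
  encoder.encode(audioData);
  audioData.close();
}
```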
AudioWorklet is an abysmal "solution", and not simple at all. I do not want any encoding, I want raw samples, and there is simply no reason why MediaRecorder can't be required to support raw samples, or why an analyser-style node on the microphone input can't hand out chunks via a callback without dropping any of them. The absurd thing is that sending raw samples has been made super easy with buffers, but getting raw samples is put behind absurd complexity, and even more absurd is that one can get time-domain samples, but not via callbacks with a guarantee that no samples are dropped.
It's so absurdly complex that an entire example to get raw samples is 25 lines of code, with about 50% of the lines being boilerplate.

```html
<script type="worklet">
  registerProcessor('test', class extends AudioWorkletProcessor {
    process(inputs, outputs, parameters) {
      // Forward the first input (an array of per-channel Float32Arrays) to the main thread.
      this.port.postMessage(inputs[0]);
      return true;
    }
  });
</script>
<script type="module">
  const ac = new AudioContext();
  // Load the inline worklet source above through a Blob URL.
  const workletSrc = document.querySelector('script[type=worklet]');
  const blob = new Blob([workletSrc.innerText], { type: 'application/javascript' });
  const url = URL.createObjectURL(blob);
  await ac.audioWorklet.addModule(url);

  const worklet = new AudioWorkletNode(ac, 'test', {});
  const osc = new OscillatorNode(ac);
  osc.start();
  osc.connect(worklet);
  worklet.connect(ac.destination);
  worklet.port.onmessage = (e) => {
    console.log(e.data[0]); // Float32Array of raw samples for channel 0
  };
</script>
```
And the reason it couldn't be one line?
Callbacks on the main thread are a terrible idea for performance, and you'll just lock up your app. You'll be getting more callbacks than you need, whereas an AudioWorklet gives you a dedicated thread just for processing the audio, from which you can send back only the information you actually need.
That's based literally on nothing. At 48 kS/s and 1024 samples per callback, that's only 47 calls per second, much less than a typical requestAnimationFrame, which usually runs at 60 fps, or even 120/144 fps on modern phones, and usually does far more work than analysing audio requires. Patronizing or spreading FUD is not OK.
Yes, unconditionally firing events isochronously to the main thread from a real-time audio thread, with an expectation of real-time guarantees and no dropped audio buffers, with fixed buffering and no way to handle main-thread overload or any other form of control, is simply bad design and doesn't work, in the same way that ScriptProcessorNode doesn't work.

If the main thread isn't responsive for some time, you suddenly have a large number of events queued on its event loop. When it's processing events again, it now has to process all of those events, loading the main thread again, delaying more events from being delivered, and so on. This is the same reason ScriptProcessorNode was deprecated.

In the case of the Web Audio API, developers can instead devise their own mechanism to suit their use case better, using the lower-level primitive that is the AudioWorklet. If it's for recording and not dropping buffers is important, use a worker and add a buffer there. If it's for visualization, maybe compute the values needed on the real-time audio thread and send only the desired characteristics to the main thread, etc. For that, it's possible to use message passing (postMessage on the worklet's MessagePort) or a SharedArrayBuffer.
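As one illustration of the visualization approach just described (a sketch, not code from this thread): compute summary values inside process() and post only those, so each message is a couple of numbers rather than the audio itself. The processor name 'meter' and the peak/RMS choice are arbitrary.

```js
// Sketch: compute peak and RMS on the audio thread and send only the summary.
// Register this in a module loaded via audioWorklet.addModule(...).
registerProcessor('meter', class extends AudioWorkletProcessor {
  process(inputs) {
    const ch = inputs[0][0]; // first channel of first input, if connected
    if (ch) {
      let peak = 0;
      let sumSq = 0;
      for (let i = 0; i < ch.length; i++) {
        const s = Math.abs(ch[i]);
        if (s > peak) peak = s;
        sumSq += ch[i] * ch[i];
      }
      this.port.postMessage({ peak, rms: Math.sqrt(sumSq / ch.length) });
    }
    return true;
  }
});
```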
To follow up on my previous post and give some real-world feedback: Safari finally supports AudioWorklets, so I redesigned playback and recording in my app to use them. Unfortunately, Safari had a major bug that caused distortion, but that has since been fixed. However, audio playback on Safari is still very poor, with frequent crackling and glitches when simply tapping the screen, even in the most basic app. Using AudioWorklets requires more coding and has a steeper learning curve than should be required for such a simple task. It is really just a workaround for the real problem.
Those are simply Safari bugs, which have nothing to do with whether you instantiate an AudioWorkletNode.
Agreed, but the difficulty Apple is having with playing defect-free audio might suggest that WebAudio is more complicated than it should be.
In real-world apps, AudioWorklets have to interact with Workers or the main thread, where real-time priority is not available. Without careful implementation, you end up with the same problems as ScriptProcessorNode.
Of course, but the lack of a better API after all of this time is disappointing.
Thanks so much for posting this code. It's funny, I just spent an hour googling how to accomplish THE most basic, fundamental task of an audio recording API: RECORD RAW AUDIO 😂
I am creating an audio editor app. To allow editing of newly recorded audio, raw audio needs to be obtained. Unless I am missing something, this basic functionality seems to be missing from the specification. Here is what I have found so far:

- ScriptProcessorNode can provide raw samples, but it is deprecated.
- MediaRecorder does not support unencoded audio on most web browsers.
- decodeAudioData resamples to the AudioContext's sampling rate and does not give live access to the data.
Is there another option I have not discovered yet? If not, could an AudioPassThroughNode be considered? It would have an ondata event that provides seamless, raw audio data passing through the node, perhaps with a starting time and an ending time and other details. Alternatively, requiring support for raw audio in MediaRecorder would work.