-
Notifications
You must be signed in to change notification settings - Fork 143
Rename AudioFrame->AudioData, drop AudioBuffer, add ref counting semantics. #162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Some context for usage of "acquire the content" in https://github.com/WebAudio/web-audio-api-v2/issues/119#issuecomment-812238655 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What we need again, but a few important questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broadly: we are all in agreement about having clone()/close(). |
The mutability of AudioBuffer was undesirable. Also, we like having mor sample formats. See discussion in #179.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of types, and some real questions, but generally I think this goes in the right direction.
index.src.html
Outdated
readonly attribute unsigned long long timestamp; | ||
readonly attribute AudioBuffer? buffer; | ||
|
||
undefined copyFromChannel(BufferSource destination, unsigned long channelNumber); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something that occurred to me right after the call last Friday is that this API is not very good for interleaved data, it's really wasteful: each byte is touched twice to copy the full buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I would expect this to be something like copyFromPlane and interleaved data would only have one plane.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now copyTo(buffer, planeNumber). I've renamed to copyTo() to match direction in this thread.
:: 32-bit signed integer samples with planar channel arrangement. | ||
|
||
: <dfn enum-value for=AudioSampleFormat>FLTP</dfn> | ||
:: 32-bit float samples with planar channel arrangement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll need to explain what interleaved and planar are. Also how does those data types hold audio: for example, 16-bit is linear audio from in [-65536, 65536]
, but the FLT
variants only use [-1.0, 1.0] in general (but it can easily go outside this). And also S32
is only expected to use 24-bits probably?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree about more explanation. Filed #215 to track that work.
And also S32 is only expected to use 24-bits probably?
@dalecurtis - true for ffmpeg? I'm not savvy enough to say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressing feedback.
index.src.html
Outdated
readonly attribute unsigned long long timestamp; | ||
readonly attribute AudioBuffer? buffer; | ||
|
||
undefined copyFromChannel(BufferSource destination, unsigned long channelNumber); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now copyTo(buffer, planeNumber). I've renamed to copyTo() to match direction in this thread.
:: 32-bit signed integer samples with planar channel arrangement. | ||
|
||
: <dfn enum-value for=AudioSampleFormat>FLTP</dfn> | ||
:: 32-bit float samples with planar channel arrangement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree about more explanation. Filed #215 to track that work.
And also S32 is only expected to use 24-bits probably?
@dalecurtis - true for ffmpeg? I'm not savvy enough to say.
{{AudioData}}'s [=media resource=], as described by the getter steps of | ||
{{AudioData/allocationSize}}. | ||
3. If |allocationSize| is greater than `destination.byteLength`, throw a | ||
{{TypeError}}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that usually RangeError
from ES ? It would have been IndexSizeError
, but that's deprecated.
readonly attribute unsigned long long timestamp; | ||
readonly attribute AudioBuffer? buffer; | ||
|
||
undefined copyTo([AllowShared] BufferSource destination, unsigned long planeNumber); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking maybe an offset and a frame count would make it a lot easier to send data to the WASM heap, and would help to address this feedback from game developers (while not exactly that if I remember our conversation, it's really close).
After demuxing properly (with all the edge cases, easier said than done!), you have quite often a decoder delay value in frames, and a padding value in frames (is WAV the only codec that doesn't have this? Certainly among the codecs available on the Web). Then imagine you're doing all your work in a custom audio engine written in C++, so the immediate thing you're doing is to copy the audio frames out to the WASM heap.
Say you're doing AAC, commonly it's 2112 frames of padding (not aligned to the common number of 1024 frames per packets), so you want to drop the two first packet entirely, and then copy to your heap with an offset of 64 frames starting on the third packet. Similarly, for the last packet, you only need to copy part of the packet, that's a function of a duration you find in the mp4 container usually (not always).
Here with the current API, you need to copy out to some intermediary buffer, and then copy from an offset to end final location, which is wasteful.
I believe adding two parameters
undefined copyTo([AllowShared] BufferSource destination, unsigned long planeNumber, optional unsigned offsetFrame, optional unsigned long frameCount);
with following meaning:
offset
is an offset in frames in the plane - throwRangeError
if bigger than thisAudioData
'snumberOfFrames
frameCount
is the number of frames to copy - throwRangeError
if bigger thannumberOfFrames - offset
ordestination.length
. The Web Audio API clamps in a similar situation, but I'd rather be explicit in this situation. When omitted, it can be decided to meanmin(numberOfFrames - offset, destination.length)
if we want, that's ergonomic.
Most of the time you're not using those parameters though, hence the meaning of the default values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of APIs which take an index like `planeNumber'. It sounds instead like we should have a planes array on the AudioData object, so folks can say data.planes[i].copyTo(dest, offset, frameCount). This would provide some symmetry with the VideoFrame API as well. This also makes it clear that you're not simply indexing into a channel array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@padenot , re offset + framecount + RangeErrors: all SGTM. Out of time tonight, but I'll send a new commit ASAP. I vote we create a AudioDataReadToOptions
dict that lists these members alongside a required member for planeNumber
. This also should help w/ @dalecurtis's concern that the simple argument looks like a channel index.
Re: planes
VideoFrame symmetry... not sure on that. We may yet delete planes
from VideoFrame ([still up in the air])(#157 (comment)), and adding a new Plane interface with a single copyTo() method feels like a lot of ceremony IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WFM, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#223 opened for followup, linking to the valuable informations in this thread.
This PR still needs some work to address the open issues, but I'd like to go ahead and merge because its blocking 2 other VideoFrame PRs that have approval and a 3rd that is under discussion. I commit to send a new PR to address @padenot's last review tomorrow. |
SHA: 9f9da6d Reason: push, by @chcunningham Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
SHA: 9f9da6d Reason: push, by @chcunningham Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Partially addresses #129. I'll send a follow up shortly to do the same for VideoFrame (will land at the same time, but trying to keep the diffs in my PRs small for easy review).
Fixes #168 (Rename AudioFrame -> AudioData).
Progresses #179 (Drop dependency on AudioBuffer).
Progresses #184 (Back attributes with slots).
Preview | Diff