Skip to content

Rename AudioFrame->AudioData, drop AudioBuffer, add ref counting semantics. #162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
May 5, 2021

Conversation

chcunningham
Copy link
Collaborator

@chcunningham chcunningham commented Apr 2, 2021

Partially addresses #129. I'll send a follow up shortly to do the same for VideoFrame (will land at the same time, but trying to keep the diffs in my PRs small for easy review).

Fixes #168 (Rename AudioFrame -> AudioData).
Progresses #179 (Drop dependency on AudioBuffer).
Progresses #184 (Back attributes with slots).


Preview | Diff

chcunningham added a commit that referenced this pull request Apr 5, 2021
Making it symmetric w/ AudioFrame.
Adds clone/close
Adds [[resource reference]]
Updates constructors accordingly (and fix cruft/obsolete steps).
Removes destroy (replaced by close).

Fixes #129 (in comboniation w/ PR #162).
Notes issues #165 and #166 for follow up.
@chcunningham
Copy link
Collaborator Author

Some context for usage of "acquire the content" in https://github.com/WebAudio/web-audio-api-v2/issues/119#issuecomment-812238655

FYI @hoch @rtoy

Copy link
Collaborator

@padenot padenot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we need again, but a few important questions.

Copy link
Collaborator Author

@chcunningham chcunningham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @padenot @aboba

@chcunningham
Copy link
Collaborator Author

@padenot @aboba I think we at least have agreement on the API shape and its intended meaning. Would you mind approving the merge here and we can keep pushing on wording/style stuff in follow up GH issues?

@chcunningham
Copy link
Collaborator Author

@padenot @aboba I think we at least have agreement on the API shape and its intended meaning. Would you mind approving the merge here and we can keep pushing on wording/style stuff in follow up GH issues?

@padenot @aboba friendly ping

@chcunningham
Copy link
Collaborator Author

Broadly: we are all in agreement about having clone()/close().

The mutability of AudioBuffer was undesirable. Also, we like having mor
sample formats. See discussion in #179.
@chcunningham chcunningham changed the title Make AudioFrame immutable and add reference counting semantics. Rename AudioFrame->AudioData, drop AudioBuffer, add ref counting semantics. Apr 30, 2021
@chcunningham
Copy link
Collaborator Author

@padenot @aboba I believe the latest commits address all outstanding comments.

Copy link
Collaborator

@padenot padenot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of types, and some real questions, but generally I think this goes in the right direction.

index.src.html Outdated
readonly attribute unsigned long long timestamp;
readonly attribute AudioBuffer? buffer;

undefined copyFromChannel(BufferSource destination, unsigned long channelNumber);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that occurred to me right after the call last Friday is that this API is not very good for interleaved data, it's really wasteful: each byte is touched twice to copy the full buffer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I would expect this to be something like copyFromPlane and interleaved data would only have one plane.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now copyTo(buffer, planeNumber). I've renamed to copyTo() to match direction in this thread.

:: 32-bit signed integer samples with planar channel arrangement.

: <dfn enum-value for=AudioSampleFormat>FLTP</dfn>
:: 32-bit float samples with planar channel arrangement.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to explain what interleaved and planar are. Also how does those data types hold audio: for example, 16-bit is linear audio from in [-65536, 65536], but the FLT variants only use [-1.0, 1.0] in general (but it can easily go outside this). And also S32 is only expected to use 24-bits probably?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree about more explanation. Filed #215 to track that work.

And also S32 is only expected to use 24-bits probably?
@dalecurtis - true for ffmpeg? I'm not savvy enough to say.

Copy link
Collaborator Author

@chcunningham chcunningham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressing feedback.

index.src.html Outdated
readonly attribute unsigned long long timestamp;
readonly attribute AudioBuffer? buffer;

undefined copyFromChannel(BufferSource destination, unsigned long channelNumber);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now copyTo(buffer, planeNumber). I've renamed to copyTo() to match direction in this thread.

:: 32-bit signed integer samples with planar channel arrangement.

: <dfn enum-value for=AudioSampleFormat>FLTP</dfn>
:: 32-bit float samples with planar channel arrangement.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree about more explanation. Filed #215 to track that work.

And also S32 is only expected to use 24-bits probably?
@dalecurtis - true for ffmpeg? I'm not savvy enough to say.

{{AudioData}}'s [=media resource=], as described by the getter steps of
{{AudioData/allocationSize}}.
3. If |allocationSize| is greater than `destination.byteLength`, throw a
{{TypeError}}.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that usually RangeError from ES ? It would have been IndexSizeError, but that's deprecated.

readonly attribute unsigned long long timestamp;
readonly attribute AudioBuffer? buffer;

undefined copyTo([AllowShared] BufferSource destination, unsigned long planeNumber);
Copy link
Collaborator

@padenot padenot May 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking maybe an offset and a frame count would make it a lot easier to send data to the WASM heap, and would help to address this feedback from game developers (while not exactly that if I remember our conversation, it's really close).

After demuxing properly (with all the edge cases, easier said than done!), you have quite often a decoder delay value in frames, and a padding value in frames (is WAV the only codec that doesn't have this? Certainly among the codecs available on the Web). Then imagine you're doing all your work in a custom audio engine written in C++, so the immediate thing you're doing is to copy the audio frames out to the WASM heap.

Say you're doing AAC, commonly it's 2112 frames of padding (not aligned to the common number of 1024 frames per packets), so you want to drop the two first packet entirely, and then copy to your heap with an offset of 64 frames starting on the third packet. Similarly, for the last packet, you only need to copy part of the packet, that's a function of a duration you find in the mp4 container usually (not always).

Here with the current API, you need to copy out to some intermediary buffer, and then copy from an offset to end final location, which is wasteful.

I believe adding two parameters

 undefined copyTo([AllowShared] BufferSource destination, unsigned long planeNumber, optional unsigned offsetFrame, optional unsigned long frameCount);

with following meaning:

  • offset is an offset in frames in the plane - throw RangeError if bigger than this AudioData's numberOfFrames
  • frameCount is the number of frames to copy - throw RangeError if bigger than numberOfFrames - offset or destination.length. The Web Audio API clamps in a similar situation, but I'd rather be explicit in this situation. When omitted, it can be decided to mean min(numberOfFrames - offset, destination.length) if we want, that's ergonomic.

Most of the time you're not using those parameters though, hence the meaning of the default values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of APIs which take an index like `planeNumber'. It sounds instead like we should have a planes array on the AudioData object, so folks can say data.planes[i].copyTo(dest, offset, frameCount). This would provide some symmetry with the VideoFrame API as well. This also makes it clear that you're not simply indexing into a channel array.

Copy link
Collaborator Author

@chcunningham chcunningham May 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@padenot , re offset + framecount + RangeErrors: all SGTM. Out of time tonight, but I'll send a new commit ASAP. I vote we create a AudioDataReadToOptions dict that lists these members alongside a required member for planeNumber. This also should help w/ @dalecurtis's concern that the simple argument looks like a channel index.

Re: planes VideoFrame symmetry... not sure on that. We may yet delete planes from VideoFrame ([still up in the air])(#157 (comment)), and adding a new Plane interface with a single copyTo() method feels like a lot of ceremony IMO.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WFM, thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#223 opened for followup, linking to the valuable informations in this thread.

@chcunningham
Copy link
Collaborator Author

This PR still needs some work to address the open issues, but I'd like to go ahead and merge because its blocking 2 other VideoFrame PRs that have approval and a 3rd that is under discussion. I commit to send a new PR to address @padenot's last review tomorrow.

@chcunningham chcunningham merged commit 9f9da6d into main May 5, 2021
github-actions bot added a commit that referenced this pull request May 5, 2021
SHA: 9f9da6d
Reason: push, by @chcunningham

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions bot added a commit that referenced this pull request May 5, 2021
SHA: 9f9da6d
Reason: push, by @chcunningham

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AudioFrame needs another name
4 participants