wav: add ambisonic bformat subtypes #263
Hi @aentity, Thanks for this. The Aside from that, wouldn't this need a decoder to decode the ambisonics into proper PCM channels?
@pdeljanov hello. i am happy to add these changes to riff. i don't know or use your library (yet) :)) but i pushed these at the same time as the hound push, ruuda/hound#72. i may already have answered your question in that thread, but it is a little complicated. if by PCM channel you mean a channel in the wave file: most ambisonic wav files you encounter are 4-channel wav files (with the .amb extension) carrying the guids i added. the extension and the guid tell users that the file holds 4 channels of listenable sound (you can listen, it just sounds a little strange), but that each sample across channels (a b-format sample) needs further processing before listening. as said, this format is called b-format. it is speaker-layout independent. this means the raw b-format must be decoded to a specific speaker target: perhaps binaural stereo, regular stereo, quadraphonic speakers, spherical, 5.1 surround, or anything else humans can imagine. as i said, i do not know how a user reads files in symphonia, how channels are represented, etc., but as a user of the library, i would expect to be able to read in a .wav or .amb file and collect the 4 separate channel samples into a b-format sample (usually represented as w, x, y, z), which can be either integer pcm or float. in an ambisonic pipeline, we can then perform manipulations on the signal, or decode it to a speaker layout. so yes, the user needs a decoder to transform the raw b-format into something the listener can listen to. i think this is out of scope for symphonia? i do not know. i am working on decoder implementations right now for another project, but a decoder needs an understanding of b-formats, decoder types, complicated math like the pseudo-inverse, and other 3-dimensional manipulations (if desired).
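To make the comment above concrete, here is a minimal sketch of grouping interleaved 4-channel samples into B-format frames and decoding one frame to plain stereo. It is not Symphonia's or hound's API: `BFormat`, `frames_from_interleaved`, and `decode_stereo` are hypothetical names, the channel order W, X, Y, Z is assumed, and the decode assumes the traditional convention in which W is stored at 1/√2 gain. The decoder is the simplest one possible (two virtual cardioid microphones), not the pseudo-inverse approach mentioned in the comment.

```rust
/// One first-order B-format frame: the four channels of a .amb/.wav file.
/// (Hypothetical type for illustration.)
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BFormat {
    pub w: f32,
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// Group interleaved samples into frames, assuming W, X, Y, Z channel order.
/// A trailing partial frame is ignored.
pub fn frames_from_interleaved(samples: &[f32]) -> Vec<BFormat> {
    samples
        .chunks_exact(4)
        .map(|c| BFormat { w: c[0], x: c[1], y: c[2], z: c[3] })
        .collect()
}

/// Decode one frame to (left, right) with two virtual cardioid microphones
/// pointing at +azimuth (left) and -azimuth (right), in radians.
/// Assumes W carries a 1/sqrt(2) gain, so it is scaled back up first.
pub fn decode_stereo(frame: BFormat, azimuth: f32) -> (f32, f32) {
    let w = frame.w * std::f32::consts::SQRT_2;
    let (sin_a, cos_a) = azimuth.sin_cos();
    let left = 0.5 * (w + cos_a * frame.x + sin_a * frame.y);
    let right = 0.5 * (w + cos_a * frame.x - sin_a * frame.y);
    (left, right)
}
```

The Z channel is unused here because a horizontal stereo pair has no height information; a decoder for a layout with elevated speakers would use it.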
i have added the types to riff, thank you for telling me. i should also mention that, in future, it would be nice for the user to know that they are reading a file with a bformat guid. is there a way to retrieve this tag information programmatically?
Hi @aentity, Thanks for the explanation. The current set of changes will be sufficient for you to access the 4 channels, however, there are some areas where support could be improved. This PR is probably good enough for now, but I'll list them below for future consideration:
There is no way currently as of 0.5.4. As mentioned in 2, assigning a codec type of ambisonics could be one way of detecting this.
Another option could be introducing ambisonic channels. This may generalize better across different formats. For example, AAC compressed ambisonic audio channels in MP4.
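Until the library exposes this, one workaround is to inspect the `fmt ` chunk yourself. The sketch below reads the 16-byte sub-format GUID from a WAVE_FORMAT_EXTENSIBLE `fmt ` chunk body and classifies it. All names are hypothetical and nothing here is Symphonia API. The two GUID byte arrays are an assumption based on the commonly documented .amb convention (`00000001-0721-11d3-8644-c8c1ca000000` for PCM and `00000003-...` for float); check them against the PR diff before relying on them.

```rust
/// Assumed on-disk bytes of the ambisonic B-format sub-format GUIDs
/// (first three GUID fields little-endian). Verify before use.
pub const AMBISONIC_B_FORMAT_PCM: [u8; 16] = [
    0x01, 0x00, 0x00, 0x00, 0x21, 0x07, 0xd3, 0x11,
    0x86, 0x44, 0xc8, 0xc1, 0xca, 0x00, 0x00, 0x00,
];
pub const AMBISONIC_B_FORMAT_IEEE_FLOAT: [u8; 16] = [
    0x03, 0x00, 0x00, 0x00, 0x21, 0x07, 0xd3, 0x11,
    0x86, 0x44, 0xc8, 0xc1, 0xca, 0x00, 0x00, 0x00,
];

/// Sample encoding of a B-format stream. (Hypothetical type.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BFormatKind {
    Pcm,
    Float,
}

/// Extract the sub-format GUID from the body of a `fmt ` chunk.
/// Returns None unless the chunk is WAVE_FORMAT_EXTENSIBLE (tag 0xFFFE)
/// and long enough (40 bytes) to contain the GUID at offset 24.
pub fn sub_format_from_fmt_chunk(body: &[u8]) -> Option<[u8; 16]> {
    if body.len() < 40 {
        return None;
    }
    let format_tag = u16::from_le_bytes([body[0], body[1]]);
    if format_tag != 0xFFFE {
        return None;
    }
    let mut guid = [0u8; 16];
    guid.copy_from_slice(&body[24..40]);
    Some(guid)
}

/// Classify a sub-format GUID as one of the B-format subtypes, if it is one.
pub fn bformat_kind(sub_format: &[u8; 16]) -> Option<BFormatKind> {
    if *sub_format == AMBISONIC_B_FORMAT_PCM {
        Some(BFormatKind::Pcm)
    } else if *sub_format == AMBISONIC_B_FORMAT_IEEE_FLOAT {
        Some(BFormatKind::Float)
    } else {
        None
    }
}
```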
hello. i don't understand the format CI failure; i had to turn off format-on-save, which is a small annoyance. i have re-pushed though, and it should be ready, thank you! some responses:
yes sounds more correct. and yes, please do ping me for design, but please note, i am not familiar with your library details :)
i see. this differs from simpler libraries like hound, which do not have a real notion of codec like this. (e.g., i am writing something like this). i am not sure how far you want to take symphonia's responsibility there. i think it may draw in extra dependencies for you, but it could be interesting. i do not know. i am just used to reading a file, extracting the interleaved channels, representing them as i want, and then performing operations, for example. one thing i must stress:
one purpose of ambisonic b-format is that it is useful to perform manipulations on this raw format (transforms, mixing, etc.) and then, at the very end, decode into a real speaker layout. it is more of a late-stage plugin. as they say, it is 'speaker agnostic'. similarly, there is the inverse, where the user wants to encode into b-format, perhaps from a mono signal (or from a-format, which is what microphones usually pick up, but this is another story), to be mixed with other b-format signals. what i mean by all this: forcing the user to pick the decoder target (speaker layout) when reading a file, if this is what is proposed, may be premature. it is nice to work in the "b-format space" until the very end, then decode to the layout. so if reading a .wav or .amb file in symphonia forces the user to immediately pick a decoder layout, this might not be an optimal design (not flexible enough), if you understand my point. ok, sorry for the long text, thank you!
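As a rough illustration of working in "b-format space", the sketch below encodes a mono sample into a first-order frame, mixes frames, and rotates the sound field, all without choosing a speaker layout. The names `Frame`, `encode_mono`, `mix`, and `rotate_yaw` are hypothetical, and the math assumes the traditional first-order convention (W at 1/√2 gain, X pointing front, Y pointing left, Z pointing up, azimuth counter-clockwise from front).

```rust
/// A first-order B-format frame stored as [W, X, Y, Z]. (Hypothetical alias.)
pub type Frame = [f32; 4];

/// Encode a mono sample as if it arrived from the given direction.
/// Angles are in radians; azimuth is counter-clockwise from front.
pub fn encode_mono(sample: f32, azimuth: f32, elevation: f32) -> Frame {
    let (sin_a, cos_a) = azimuth.sin_cos();
    let (sin_e, cos_e) = elevation.sin_cos();
    [
        sample * std::f32::consts::FRAC_1_SQRT_2,
        sample * cos_a * cos_e,
        sample * sin_a * cos_e,
        sample * sin_e,
    ]
}

/// Mix two frames. B-format is linear, so mixing is a per-channel sum.
pub fn mix(a: Frame, b: Frame) -> Frame {
    [a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]]
}

/// Rotate the whole sound field counter-clockwise about the vertical axis.
/// Only X and Y change; W (omnidirectional) and Z (height) are unaffected.
pub fn rotate_yaw(f: Frame, angle: f32) -> Frame {
    let (s, c) = angle.sin_cos();
    [f[0], c * f[1] - s * f[2], s * f[1] + c * f[2], f[3]]
}
```

Decoding to stereo, quad, 5.1, or binaural would be applied to the result of such a chain as the last step, which is why tying the layout choice to file reading would be restrictive.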
Hi @aentity, Once again, thanks for your detailed explanations. Quite a bit to consider on my side. I'll see what changes we can incorporate into the 0.6 API to support this use-case better. I'm leaning towards adding an Ambisonic channel map. I'll make sure to @ you when I'm collecting feedback on the audio module rewrite changes. For now, I'll merge this PR as-is. Thanks!
A nightly toolchain is required for rustfmt to support the brace style we use.
Example ambisonic wav files can be found here: https://www.ambisonia.com/