High Level API - Phase 1 #196

Closed · 34 tasks done
mackron opened this issue Sep 5, 2020 · 54 comments
mackron commented Sep 5, 2020

This issue is for tracking development and gathering feedback on the new high level API coming to miniaudio. This API sits on top of the existing low level API and can be disabled at compile time for those who don't need or want it, so it imposes no significant overhead in build size. This is not a replacement for the low level API; it's just an optional high level layer sitting on top of it.

The high level API is for users who don't want or need to do their own mixing and effect processing via the low level API. The problem with the low level API is that as soon as you want to mix multiple sounds you need to do it all manually. The high level API is intended to lift this burden from those who don't need or want to handle it themselves. This will be useful for game developers in particular.

There's a lot to the high level API, and for practicality I've decided to break it down into parts. This issue relates to the first phase. This phase will be completed before the next begins. The sections below summarise the main features of the high level API. If you have ideas or feedback on features feel free to leave a comment and I'll consider it.

You can use the high level API by doing something like this:

#define MINIAUDIO_IMPLEMENTATION
#include "miniaudio.h"
#include "research/miniaudio_engine.h"

In your header files, just do something like this:

#include "miniaudio.h"
#include "research/miniaudio_engine.h"

Everything is being developed and updated in the dev branch.

Look at miniaudio_engine.h for the code and preliminary documentation. You can probably figure out a lot of features by looking at the header section.

The checklists below are what I'm planning on adding to the high level API. If you have suggestions, leave a comment.

Resource Management

The resource management system is responsible for loading audio data and delivering it to the audio engine.

  • Synchronous decoding into in-memory buffers
  • Asynchronous decoding into in-memory buffers
  • Asynchronous streaming (music, etc.)
  • Asynchronous API working with Web/Emscripten (no threading allowed)
  • Wide file path support (wchar_t)

The Web/Emscripten backend does not support multithreading. The application therefore needs to periodically call a function to process the next pending async job, if any, manually, as in the sketch below.
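A rough sketch of what that looks like (the function name here is an assumption based on the resource manager API and may change while this is still in research):

/* Call this once per frame from the application's main loop on Web/Emscripten.
   This is a hypothetical helper, not part of miniaudio itself. */
void my_app_process_audio_jobs(ma_resource_manager* pResourceManager)
{
    /* Processes the next pending async job, if any. With no job thread available
       on this platform, the application is responsible for driving this. */
    ma_resource_manager_process_next_job(pResourceManager);
}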

Engine

The engine is responsible for the management and playback of sounds.

  • Sound groups
  • Mixing
  • Effects
  • Volume control
  • Panning
  • Pitching
  • Fading
  • Start and stop delays
  • Loop points
  • Sound cloning for resource managed sounds
  • Sound chaining (No gaps allowed. Similar to OpenAL's buffer queue.)

Simple 3D Spatialization

The spatialization model will be simple to begin with. Advanced features such as HRTF will come later once we get an initial implementation done.

  • Sound positioning
  • Sound orientation / cone attenuation
  • Listener positioning
  • Listener orientation
  • Doppler effect
  • Attenuation model (linear and exponential)
  • Multiple listeners

Standard Effects Suite

Support for custom effects is implemented, but it would be useful to have a suite of standard effects so applications don't need to implement them themselves.

  • Biquad
  • LPF
  • HPF
  • BPF
  • Notch Filter
  • Peaking Filter
  • High / Low Shelf Filters
  • Delay / Echo
  • Reverb
  • Fader
  • Stereo Panner

For examples on how to get started, see the _examples folder, starting with engine_hello_world. The general idea is that you have an ma_engine object which you initialize using miniaudio's standard config/init process. You then load sounds, which are represented by ma_sound objects. The resource manager can be used independently of ma_engine and is called ma_resource_manager. See the resource_manager example for how to use the resource manager manually.
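A minimal sketch of that flow, loosely based on engine_hello_world ("my_sound.wav" is a placeholder, and the ma_sound_init_from_file() parameter list shown here is an assumption that may not match the research header exactly):

#define MINIAUDIO_IMPLEMENTATION
#include "miniaudio.h"
#include "research/miniaudio_engine.h"

#include <stdio.h>

int main(void)
{
    ma_engine engine;
    ma_sound sound;

    if (ma_engine_init(NULL, &engine) != MA_SUCCESS) {
        return -1;  /* Failed to initialize the engine. */
    }

    /* Loads via the resource manager. Flags, group and fence are left at their defaults. */
    if (ma_sound_init_from_file(&engine, "my_sound.wav", 0, NULL, NULL, &sound) != MA_SUCCESS) {
        ma_engine_uninit(&engine);
        return -1;
    }

    ma_sound_start(&sound);

    printf("Press Enter to quit...");
    getchar();

    ma_sound_uninit(&sound);
    ma_engine_uninit(&engine);
    return 0;
}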

If you're interested in the high level API and its progress, consider subscribing to this issue. I'll be updating this as we go. Feedback is welcome, and I encourage you to play around with it - everything is on the table for review.


r-lyeh commented Sep 5, 2020

Awesome as usual! :D


frink commented Sep 6, 2020

Might also add these few more features/effects:

  • Basic Limiter (Always need a limiter for normalizing or protecting speakers from stray audio....)
  • Basic Compressor (Specifically allowing side chaining for audio ducking...)
  • Parametric Notch Filter (You've got LPF and HPF already; these round out EQ for any music player...)
  • IR Convolution (Most realistic reverbs end up doing something with convolution anyway...)
  • Feedback Suppression (Any sort of VOIP app is going to want echo/feedback suppression...)
  • Paulstretch (Everyone seems to expect playback speed to be variable...)

Don't think you need the kitchen sink. But these are the basics I would want in the base toolkit...


mackron commented Sep 6, 2020

I've already got a notch filter (in addition to high/low shelf filters and a peaking filter), unless you're talking about something different? http://miniaud.io/docs/manual/#Filtering. There's no reason I can't support them with the ma_effect infrastructure. Will add that to the list.

The limiter, compressor and convolution stuff I'll consider, but may delay it to a later stage. The feedback suppression and Paulstretch stuff I'll consider, but will definitely not be included in this phase.


raysan5 commented Sep 6, 2020

Wow! Really impressive! I'm not an audio expert but the set of intended features looks really complete!


frink commented Sep 6, 2020

There's no reason I can't support them with the ma_effect infrastructure. Will add that to the list.

That sounds like a great idea to wrap all the existing low level APIs with the flexible ma_effect routing. It also makes sense to push off other effects (limiter, compressor, convolution, feedback suppression and Paulstretch) until Phase 2. The only thing that you might want to consider now is how sidechains are implemented in your effects routing.

Look at miniaudio_engine.h for the code...

I looked but didn't see enough of the ma_effect stuff yet. I'm probably jumping the gun here. The LPF implementation mentioned above isn't pushed. Please let us know when you get the effects routing a little more fleshed out. I'm very excited to play with your implementation!!!


QUESTION: Are you planning to keep miniaudio_engine.h as a separate module, or are you planning to merge it into miniaudio.h once everything is implemented? There seem to be some code management and readability advantages to smaller pieces over a single monolithic file (e.g. sokol or stb...), but you probably have a clear understanding of what you want to do and why.

P.S. - Thanks for your devotion to a great cross platform audio experience!!!


mackron commented Sep 6, 2020

It'll all be merged into the one file, otherwise it's defeating one of the major features of miniaudio.

I've had other people mention the sidechaining stuff in the past so I'll at least look at it for this phase, but not guaranteeing anything.


r-lyeh commented Sep 7, 2020

Would it be possible to create a basic occlusion system based on this? I'd pull 3d geometry, throw raycasts from/to audio sources, and cutoff some frequencies at the mixing stage for a given listener.

To give a brief recap, I worked on a game where I had to implement sound occlusion too. Dual occlusion specifically, where adjacent walls and/or roofs would make subtle or big differences while listening to random conversations in a house. Game designers would assign occlusion properties to building materials, so during game sessions the engine would raycast between audio listeners (players) and audio sources, and would process the audio samples every time a ray hit one of those materials. Basically, at every hit, and depending on the material, the engine would decrease the volume and apply a frequency cutoff; the raycast would end on reaching the target, or when the volume hit 0 because of multiple hits. Other basic features like audio bouncing and/or material reflectance were just omitted for simplicity (see attached pic). It worked pretty well overall, given the simplicity of such a system.

[attached image]
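A rough sketch of that per-hit attenuation pass (purely hypothetical engine-side code, not a miniaudio API; the raycast query is assumed to come from the game's physics system):

#include <math.h>

/* Hypothetical engine-side types. */
typedef struct { float x, y, z; } vec3;
typedef struct { float volumeLoss; float lpfCutoffHz; } material;
typedef struct { const material* pMaterial; } ray_hit;

/* Assumed to be provided by the game's physics/geometry system. */
extern int raycast_all(vec3 from, vec3 to, ray_hit* pHits, int capacity);

/* Returns the occluded volume in [0, 1] and writes a low-pass cutoff to apply at mix time. */
float compute_occlusion(vec3 listenerPos, vec3 sourcePos, float* pLPFCutoffHz)
{
    float volume   = 1.0f;
    float cutoffHz = 20000.0f;  /* Fully open to start with. */
    ray_hit hits[16];
    int i;
    int hitCount = raycast_all(listenerPos, sourcePos, hits, 16);

    for (i = 0; i < hitCount; i += 1) {
        volume   -= hits[i].pMaterial->volumeLoss;                    /* Each wall/roof eats some volume... */
        cutoffHz  = fminf(cutoffHz, hits[i].pMaterial->lpfCutoffHz);  /* ...and darkens the sound. */
        if (volume <= 0.0f) {
            volume = 0.0f;  /* Fully occluded - stop early. */
            break;
        }
    }

    *pLPFCutoffHz = cutoffHz;
    return volume;
}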


mackron commented Sep 7, 2020

Yes, a geometric occlusion system is something I want to build, but I'm planning on doing that at a later stage. But yes, that's totally on the table. I should add a section to the bottom of my original post to list the stuff that's planned but won't make it in the first phase.


mackron commented Oct 11, 2020

A new fading API has been implemented: d8aa619

I've also added sound cloning to the list of things to implement for phase 1, which has now been requested a few times.


prime31 commented Oct 16, 2020

I've also added sound cloning to the list of things to implement for phase 1 which has now been requested a few times now.

Maybe a better name for the cloning would be “sound instancing”? The idea being you preload all your ma_sounds at level load time then anytime you want to play a sound you grab an instance of it: ma_sound_spawn_instance or something like that. When a sound instance reaches the end of playback it gets auto-recycled.


mackron commented Oct 17, 2020

The cloning thing probably needs a bit more explanation. Sound cloning will only be available for sounds that were created with ma_sound_init_from_file(). It's basically just a way of taking advantage of the resource manager's instancing system to create a ma_sound without needing to keep track of the file name. If you want to use an instancing system with a custom data source, you need to do all of that yourself.

I realise that might seem a bit unintuitive, so to go into a bit more detail: the ma_sound and ma_data_source objects are the instance. Think of the ma_sound object as the thing that forms the connection between the ma_data_source (the delivery of raw audio data) and ma_engine (mixing, effect processing, spatialization, etc.). When you create an instance, it's actually the responsibility of whatever creates the ma_data_source object to do the actual instancing and reference counting work, which in the case of the high level API is the resource manager.

There are two ways to initialize a sound: 1) via the resource manager (ma_sound_init_from_file()) and 2) via a custom data source (ma_sound_init_from_data_source()). Neither of these APIs actually duplicates any raw audio data. When you load via the resource manager with ma_sound_init_from_file(), miniaudio will use the ma_resource_manager object to create a ma_data_source object, which is where the real instancing happens. When loading via a custom ma_data_source, the caller needs to do any kind of reference counting themselves because the engine has no knowledge of how to instance those.

Auto-recycling will not be happening. Each cloned sound will still need a manual ma_sound_uninit() call. Not only is that more flexible, it's also The miniaudio Way: push memory management of objects up the pipeline and make it the responsibility of the caller. It also simplifies the implementation.


prime31 commented Oct 17, 2020

That all makes sense. As does cloning only existing for loaded audio files. I wouldn’t expect miniaudio to be able to clone a data source it doesn’t fully own.


mackron commented Nov 14, 2020

@frink In case you were interested, I'm experimenting with some ideas for supporting sidechaining in effects, or more generally, supporting multiple input streams when processing an effect.

Basically all I've done is repurpose ma_effect_process_pcm_frames_ex() by adding an inputStreamCount parameter and changing the const void* pFramesIn parameter to const void** ppFramesIn. The declaration looks like this:

MA_API ma_result ma_effect_process_pcm_frames_ex(ma_effect* pEffect, ma_uint32 inputStreamCount, const void** ppFramesIn, ma_uint64* pFrameCountIn, void* pFramesOut, ma_uint64* pFrameCountOut);
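For illustration, a two-input (sidechain) call might look something like this. pDucker is an imaginary ma_effect implementation (e.g. a compressor keyed off a voice bus), not a stock miniaudio effect:

/* Hypothetical wrapper: duck music using the voice signal as the sidechain key. */
static ma_result duck_music_by_voice(ma_effect* pDucker, const float* pMusicIn, const float* pVoiceKey, float* pMusicOut, ma_uint64 frameCount)
{
    const void* ppFramesIn[2];
    ma_uint64 frameCountIn[2];
    ma_uint64 frameCountOut = frameCount;

    ppFramesIn[0]   = pMusicIn;    /* Input stream 0: the signal being processed. */
    ppFramesIn[1]   = pVoiceKey;   /* Input stream 1: the sidechain key. */
    frameCountIn[0] = frameCount;
    frameCountIn[1] = frameCount;

    return ma_effect_process_pcm_frames_ex(pDucker, 2, ppFramesIn, frameCountIn, pMusicOut, &frameCountOut);
}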

Your thoughts on that design?


frink commented Nov 17, 2020

I think this will work. But I'm having a little trouble visualizing a full effects chain...

Could you post something showing both the single stream and multi-stream (sidechain) scenarios? Maybe like:

  • chan1 -> highpass -> sidechain
  • chan2 -> sidechain(chan1) -> lowpass

Pseudo-code is fine. Just need to understand a little better how the full effects chain might work in practice.

Thanks! 😄


mackron commented Nov 17, 2020

I'm still figuring out that part :). Indeed, I actually needed to remove the built-in chaining stuff (might add it back, not sure) because it just didn't work with this multi-input stuff. Will think about this more later and report back.


frink commented Nov 18, 2020

I see...
Sad to hear about scrapping chaining. But that just means we don't have the data model quite right yet...

I think we need to think of the chaining as the main case and the sidechaining as the secondary case. Both need to be easy and expandable. Ideally, all control inputs should also be consumed at the same granularity of time even if their values do not change. This way you can automate any parameter with anything else at the same sample rate. That will produce very impressive synthesis but it's a very hard data structure to model.

I know a veteran coder (doing audio synthesis since hardware in the 60s) who might be better equipped to handle these architectural questions. I'll reach out and see if he wants to get involved in the discussion...


mackron commented Nov 18, 2020

You can still chain effects together, but you'll just need to do it manually at the moment. I'm going to sit down and have a proper think about this problem and how to properly handle stream routing - probably in a few weekends from now.

@MichealReed

I think it's important to consider how other languages will implement bindings that allow for modular addition/subtraction of chained filters/effects. Web Audio has an interesting approach with its node system that I recommend reviewing for consideration. It keeps the concept of a node as a separate object, then requires connecting node by node, allowing the API user to specify the order. Maybe bitwise flags could be used to allow for concatenation of effects/filters? I have seen a similar concept in the algorithmic trading space for signal concatenation.


meshula commented Nov 26, 2020

If you investigate the WebAudio project, the LabSound project is derived from the WebKit implementation of WebAudio and is boiled down to a lightweight C++ library. We've extensively rewritten most of it at this point for performance, audio quality, or new features. https://github.com/LabSound/LabSound Our current implementation treats node inputs as summing junctions, and uses buffer forwarding for things like pass through nodes, such as gain nodes with a unity value. The code's much easier to follow than the original sources :) The dev branch uses miniaudio as a backend.


Gargaj commented Dec 31, 2020

How about position-retrieval? (i.e. how many samples / seconds have passed since the first sample being played)


mackron commented Dec 31, 2020

@Gargaj Yep, already implemented. ma_sound_get_cursor_in_pcm_frames().
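If you want it in seconds, a small sketch like this works (the helper name is hypothetical; it assumes ma_engine_get_sample_rate() is available and that the sound plays at the engine's sample rate):

/* Converts the playback cursor to seconds. Sound-level resampling would change this. */
static double my_sound_get_cursor_in_seconds(ma_engine* pEngine, ma_sound* pSound)
{
    ma_uint64 cursorInFrames = 0;
    if (ma_sound_get_cursor_in_pcm_frames(pSound, &cursorInFrames) != MA_SUCCESS) {
        return 0;
    }
    return (double)cursorInFrames / ma_engine_get_sample_rate(pEngine);
}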


mackron commented Jan 3, 2021

For anybody who's interested, I've pushed an update to the dev branch which adds some advanced routing infrastructure that is going to sit as the foundation for the engine API and replaces the old ma_effect stuff completely. A description is at the top of miniaudio_engine.h in the dev branch.

To summarize, it's basically just a node graph system. You create nodes and connect them together. Each node can have a custom effect applied to it by implementing a callback (similar to the old ma_effect stuff). The ma_sound and ma_sound_group objects are going to become nodes in the graph which you're going to be able to connect between each other however you like within certain limits (ma_sound nodes will not allow any inputs). You'll be able to connect groups to other groups, attach effects to the ends of sounds and groups, split groups and route them into other nodes in the graph for some kind of effect processing or whatnot, etc., etc.. This system should be very modular and miniaudio will ship with some stock nodes for standard effects. Currently there's just a few, but I want to expand on this library going forward.

This (hopefully) solves the sidechaining problem mentioned by @frink a while ago and significantly improves on the ma_effect chaining system (it's now a graph rather than a simple chain). The ma_effect API will almost certainly be removed. The ma_mixer stuff will likely be removed as well.

Also, here's a usage example for those who are curious: https://github.com/mackron/miniaudio/blob/dev/research/miniaudio_routing.c
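To give a feel for it, here's a rough sketch of a tiny graph where a splitter sends the same signal to both the endpoint and a user-supplied effect node. The signatures shown follow the dev branch at the time of writing and may still change; error handling is omitted:

/* Builds: (something) -> splitter -> [endpoint, custom effect node]. */
static ma_result build_split_graph(ma_node_graph* pGraph, ma_splitter_node* pSplitter, ma_node* pEffectNode)
{
    ma_node_graph_config graphConfig = ma_node_graph_config_init(2);    /* Stereo graph. */
    ma_node_graph_init(&graphConfig, NULL, pGraph);

    ma_splitter_node_config splitterConfig = ma_splitter_node_config_init(2);
    ma_splitter_node_init(pGraph, &splitterConfig, NULL, pSplitter);

    /* Output bus 0 goes straight to the endpoint; output bus 1 feeds the effect node. */
    ma_node_attach_output_bus(pSplitter, 0, ma_node_graph_get_endpoint(pGraph), 0);
    ma_node_attach_output_bus(pSplitter, 1, pEffectNode, 0);

    return MA_SUCCESS;
}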


mackron commented Jan 14, 2021

An update for those who are following the development of this high level stuff. The engine has now been ported over to the new routing infrastructure. With this change, ma_engine becomes a node graph (it has an "is-a" relationship with ma_node_graph). ma_sound and ma_sound_group are now nodes with an "is-a" relationship with ma_node and are compatible with all ma_node_*() APIs. These can be plugged into any other ma_node in the graph, not just other ma_sound_group nodes. The ma_sound object is a node with 0 input buses, whereas ma_sound_group has 1 input bus. Both have 1 output bus which can be plugged into the input bus of any other node. This is how you'll do advanced effects like ducking, limiters, compressors, etc.

Currently there aren't any effect nodes other than a splitter (1 input bus; 2 output buses each containing the duplicated signal from the input bus). These will be developed further as we go.

There have been a few API changes:

  • ma_mixer has been removed.
  • ma_effect has been removed.
  • ma_sound_set_start/stop_delay() has been renamed to ma_sound_set_start/stop_time() and now takes an absolute time in frames rather than a relative time in milliseconds. To use a relative time, add the time to ma_engine_get_time() to make it absolute. To use milliseconds, just do a standard sample-rate-to-milliseconds conversion (see the sketch after this list).
  • There might be some more changes which I've forgotten off the top of my head.
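The millisecond conversion mentioned above looks roughly like this (the engine and sound are assumed to be initialized, and ma_engine_get_sample_rate() and the exact ma_sound_set_start_time() signature are assumptions that may differ in the dev branch):

/* Starts a sound the given number of milliseconds from now using the renamed API. */
static void start_sound_after_ms(ma_engine* pEngine, ma_sound* pSound, ma_uint32 delayInMilliseconds)
{
    ma_uint64 delayInFrames = ((ma_uint64)ma_engine_get_sample_rate(pEngine) * delayInMilliseconds) / 1000;

    /* The start time is absolute, so offset it from the engine's current time. */
    ma_sound_set_start_time(pSound, ma_engine_get_time(pEngine) + delayInFrames);
    ma_sound_start(pSound);
}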

If you're wanting to play around with this just keep in mind that the integration with the routing system is hot off the press and there might be some regressions. I might also be making some API changes here and there. Documentation for the routing system is near the top of miniaudio_engine.h (might need to scroll down a bit). Feel free to post any questions or issues here if you find anything. As usual, always open to feedback no matter how insignificant you think it might be.


r-lyeh commented Jan 14, 2021

Maybe a small diagram in markdeep would help on the documentation :D


mackron commented Jan 14, 2021

Yeah some diagrams to visualise some node setups would help. Will look into that when I get a chance.


mackron commented Jan 16, 2021

A few examples for the routing stuff:

These examples will show how multiple input buses work and how you'd handle sidechaining. When I get the high level stuff out, ma_vocoder_node will be moved to the extras folder so it can be integrated into your own graphs. Not sure yet if it'll make it into miniaudio.h.

This is the library I used for the vocoder effect: https://github.com/blastbay/voclib


mackron commented Jan 26, 2021

@frink Just re-reading your comment from before about those filters you suggested: the feedback suppression and Paulstretch stuff won't ever be making it into the main library, I'm sorry. These can be implemented as custom nodes if they need to be used in conjunction with the routing system. The notch filter is a definite yes for the first phase (assuming you're talking about the ma_notch2 filter that already exists). The compressor/limiter is something I'm going to try and get done for the first phase, but I'm not making any guarantees.


Jaytheway commented Feb 17, 2021

Amazing work! I have a question: how are different channels of a sound spatialized? Any plans on implementing a setting for "size" of a sound, or some equivalent to set the distance/positioning for different channels of spatialized sound source?


mackron commented Feb 19, 2021

@Jaytheway No plans for that right now, sorry. It would require a completely different (and more complex) approach to the spatialization model which I'm not super enthusiastic about.

Spatialization works with sounds of all channel counts and works by first calculating an overall gain based on the distance of the sound as a whole to the listener, and then applying a pan to each channel based on the direction of the sound relative to the listener.

@Jaytheway

This is good to know! I guess I could try writing a custom panner effect node that would handle 3D source channel mapping based on channel emitter positions calculated with some vector math. Or, because it would need the endpoint speaker map, it might be better to move this somewhere outside of individual source pan node effects... 🤔

@Jaytheway

Hey @mackron There seems to be a typo in ma_spatializer_set_max_distance(). It sets min distance instead of the max:
[attached screenshot]


mackron commented Mar 5, 2021

@Jaytheway Thanks! Just had another report from another user at the same time! Fixed in the dev branch.


tycho commented Mar 15, 2021

@mackron I know you're aware of these two issues, but I don't want these issues to get lost, so I'm adding them here for tracking. Hope you don't mind!

First of all, there's a memory leak in the resource manager for ma_sounds with multiple instances, backed by the resource manager and initialized with ma_sound_init_from_file. The reference counting for the data buffer API is supposed to be guarding the ma_data_buffer_node rather than the ma_data_buffer itself, but the way the current code works, it skips doing ma_resource_manager_data_buffer_uninit_connector on the ma_data_buffer unless the reference count hits zero. This causes the decoder to leak memory for any sound that has more than a single instance. This fix currently works for me and valgrind/ASAN agree there aren't any memory leaks afterward:

index 17c2ef2..58d3004 100644
--- a/research/miniaudio_engine.h
+++ b/research/miniaudio_engine.h
@@ -6429,6 +6429,8 @@ static ma_result ma_resource_manager_data_buffer_uninit_nolock(ma_resource_manag
             ma_resource_manager_inline_notification_wait(&notification);
             ma_resource_manager_inline_notification_uninit(&notification);
         }
+    } else {
+        ma_resource_manager_data_buffer_uninit_connector(pDataBuffer->pResourceManager, pDataBuffer);
     }
 
     return MA_SUCCESS;

But it's probable I've missed other parts of the ma_data_buffer that require cleanup as well.

The second problem is that the spatialization calculations aren't working correctly. We've talked about this at length already, but something's very wrong with the matrix calculations it does. I ended up cheating and figuring out how openal-soft does the math and came up with this patch, even though I don't understand what the math is doing exactly:

index efc033f..7eb2606 100644
--- a/research/miniaudio_engine.h
+++ b/research/miniaudio_engine.h
@@ -9314,6 +9314,7 @@ MA_API ma_result ma_spatializer_process_pcm_frames(ma_spatializer* pSpatializer,
         ma_vec3f relativePosNormalized;
         ma_vec3f relativePos;   /* The position relative to the listener. */
         ma_vec3f relativeDir;   /* The direction of the sound, relative to the listener. */
+        ma_vec3f relativeVel;
         ma_vec3f listenerVel;   /* The volocity of the listener. For doppler pitch calculation. */
         float speedOfSound;
         float distance = 0;
@@ -9354,8 +9355,8 @@ MA_API ma_result ma_spatializer_process_pcm_frames(ma_spatializer* pSpatializer,
             a cross product.
             */
             axisZ = ma_vec3f_normalize(pListener->direction);                               /* Normalization required here because we can't trust the caller. */
-            axisX = ma_vec3f_normalize(ma_vec3f_cross(axisZ, pListener->config.worldUp));   /* Normalization required here because the world up vector may not be perpendicular with the forward vector. */
-            axisY = ma_vec3f_cross(axisX, axisZ);                                           /* No normalization is required here because axisX and axisZ are unit length and perpendicular. */
+            axisX = ma_vec3f_normalize(pListener->config.worldUp);                          /* Normalization required here because the world up vector may not be perpendicular with the forward vector. */
+            axisY = ma_vec3f_cross(axisZ, axisX);                                           /* No normalization is required here because axisX and axisZ are unit length and perpendicular. */
 
             /*
             We need to swap the X axis if we're left handed because otherwise the cross product above
@@ -9366,50 +9367,50 @@ MA_API ma_result ma_spatializer_process_pcm_frames(ma_spatializer* pSpatializer,
                 axisX = ma_vec3f_neg(axisX);
             }
 
-            #if 1
             {
-                m[0][0] = axisX.x; m[0][1] = axisY.x; m[0][2] = -axisZ.x; m[0][3] = -ma_vec3f_dot(axisX, pListener->position);
-                m[1][0] = axisX.y; m[1][1] = axisY.y; m[1][2] = -axisZ.y; m[1][3] = -ma_vec3f_dot(axisY, pListener->position);
-                m[2][0] = axisX.z; m[2][1] = axisY.z; m[2][2] = -axisZ.z; m[2][3] = -ma_vec3f_dot(axisZ, pListener->position);
+                m[0][0] = axisY.x; m[0][1] = axisX.x; m[0][2] = -axisZ.x; m[0][3] = 0;
+                m[1][0] = axisY.y; m[1][1] = axisX.y; m[1][2] = -axisZ.y; m[1][3] = 0;
+                m[2][0] = axisY.z; m[2][1] = axisX.z; m[2][2] = -axisZ.z; m[2][3] = 0;
                 m[3][0] = 0;       m[3][1] = 0;       m[3][2] = 0;        m[3][3] = 1;
             }
-            #else
+
+            v = pListener->position;
             {
-                m[0][0] = axisX.x; m[1][0] = axisY.x; m[2][0] = -axisZ.x; m[3][0] = -ma_vec3f_dot(axisX, pListener->position);
-                m[0][1] = axisX.y; m[1][1] = axisY.y; m[2][1] = -axisZ.y; m[3][1] = -ma_vec3f_dot(axisY, pListener->position);
-                m[0][2] = axisX.z; m[1][2] = axisY.z; m[2][2] = -axisZ.z; m[3][2] = -ma_vec3f_dot(axisZ, pListener->position);
-                m[0][3] = 0;       m[1][3] = 0;       m[2][3] = 0;        m[3][3] = 1;
+                m[3][0] = -(m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z + m[3][0] * 1);
+                m[3][1] = -(m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z + m[3][1] * 1);
+                m[3][2] = -(m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z + m[3][2] * 1);
             }
-            #endif
 
-            /*
-            Multiply the lookat matrix by the spatializer position to transform it to listener
-            space. This allows calculations to work based on the sound being relative to the
-            origin which makes things simpler.
-            */
+            v = pListener->velocity;
+            {
+                listenerVel.x = m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z + m[3][0] * 0;
+                listenerVel.y = m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z + m[3][1] * 0;
+                listenerVel.z = m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z + m[3][2] * 0;
+            }
+
+
+            /* Now that we have all the listener parameters calculated, translate the spatializer into listener space. */
             v = pSpatializer->position;
-            #if 1
             {
                 relativePos.x = m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z + m[3][0] * 1;
                 relativePos.y = m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z + m[3][1] * 1;
                 relativePos.z = m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z + m[3][2] * 1;
             }
-            #else
+
+            v = pSpatializer->velocity;
             {
-                relativePos.x = m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z + m[0][3] * 1;
-                relativePos.y = m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z + m[1][3] * 1;
-                relativePos.z = m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z + m[2][3] * 1;
+                relativeVel.x = m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z + m[3][0] * 0;
+                relativeVel.y = m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z + m[3][1] * 0;
+                relativeVel.z = m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z + m[3][2] * 0;
             }
-            #endif
 
-            /*
-            The direction of the sound needs to also be transformed so that it's relative to the
-            rotation of the listener.
-            */
             v = pSpatializer->direction;
-            relativeDir.x = m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z;
-            relativeDir.y = m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z;
-            relativeDir.z = m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z;
+            {
+                relativeDir.x = m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z + m[3][0] * 0;
+                relativeDir.y = m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z + m[3][1] * 0;
+                relativeDir.z = m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z + m[3][2] * 0;
+            }
+            relativeDir = ma_vec3f_normalize(relativeDir);
 
             #if defined(MA_DEBUG_OUTPUT)
             {
@@ -9615,7 +9616,7 @@ MA_API ma_result ma_spatializer_process_pcm_frames(ma_spatializer* pSpatializer,
         source.
         */
         if (pSpatializer->config.dopplerFactor > 0) {
-            pSpatializer->dopplerPitch = ma_doppler_pitch(ma_vec3f_neg(relativePos), pSpatializer->velocity, listenerVel, speedOfSound, pSpatializer->config.dopplerFactor);
+            pSpatializer->dopplerPitch = ma_doppler_pitch(ma_vec3f_neg(relativePos), relativeVel, listenerVel, speedOfSound, pSpatializer->config.dopplerFactor);
         } else {
             pSpatializer->dopplerPitch = 1;
         }

This does make the spatialization work perfectly for me (in an OpenGL game), but it's probably doing more work than it needs to.


mackron commented Mar 20, 2021

@tycho I've pushed an experimental fix for the memory leak to the dev branch. For the spatialization stuff, there are a few things in there that aren't making sense to me from a high level, so I want to give that a bit more thought. I'll go through this with you on Discord.


mackron commented May 3, 2021

A quick update on this since I haven't posted in a while:

  • The memory leak and spatialization bugs pointed out by @tycho are now fixed.
  • Sound cloning has been added. Note that this only works for sounds whose underlying data source was created via the resource manager and is non-streaming. If you want to clone a sound with a custom data source, you need to implement your own reference counting system at the data source level and then call ma_sound_init_from_data_source() directly. New APIs (a usage sketch follows this list):
    • ma_sound_init_copy()
    • ma_resource_manager_data_buffer_init_copy()
    • ma_resource_manager_data_source_init_copy()
  • The ma_sound and ma_sound_group objects have been unified. Separate API groups still exist for both ma_sound and ma_sound_group, but under the hood they are identical. Previously ma_sound nodes were restricted to a ma_data_source as their input, but now you can use another node for the input. This means you can now use an ma_sound node as a group if that suits you. In addition, the input and output channels of an ma_sound node can now be configured rather than being restricted to the engine's native channel count. This enables more optimization opportunities by allowing you to set up effects to run before converting to the engine's native channel count.
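A rough usage sketch of ma_sound_init_copy() (the parameter list shown here is an assumption based on the current dev code and may change):

/* Spawns and starts a new instance of a resource-managed, non-streaming template sound. */
static ma_result spawn_instance(ma_engine* pEngine, ma_sound* pTemplateSound, ma_sound* pInstance)
{
    ma_result result = ma_sound_init_copy(pEngine, pTemplateSound, 0, NULL, pInstance);  /* flags = 0, no group. */
    if (result != MA_SUCCESS) {
        return result;
    }

    ma_sound_start(pInstance);

    /* There's no auto-recycling: the caller must ma_sound_uninit() the instance when done. */
    return MA_SUCCESS;
}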

@Jaytheway

Hi @mackron, I noticed that with distance volume attenuation the volume level just stops attenuating at max distance and continues to play at that level (unless you use the linear model with a rolloff of 1.0). Just wondering if this is how you intend to keep it, or if there are any plans for some sort of fade past the max distance mark.

I've set up a desmos graph to check all the curves and parameters, and it illustrates the "leftover" volume level after the max distance. You can check it out by this link if you'd like: https://www.desmos.com/calculator/zo6irujfjv

@Jaytheway

To add to my previous comment, my initial thought was that minGain and maxGain define the volume level multiplier at maxDistance and minDistance respectively. But by the looks of it they just clamp the gain after the distance-based and cone-angle-based attenuation.


mackron commented May 3, 2021

That's the point of the max distance setting, so no plans to change that. If you want to continue using that attenuation model you'll need to increase the max distance to the point at which it works for you. If you need it to be completely silent (gain = 0) at max distance you need to use linear with a falloff of 1 like you suggested.
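For reference, that setup looks something like this (the sound is assumed to be initialized and spatialized; the setter names follow the current dev code and may change, and the distances are placeholders):

/* Silence at max distance: linear attenuation with a rolloff of 1. */
ma_sound_set_attenuation_model(&sound, ma_attenuation_model_linear);
ma_sound_set_rolloff(&sound, 1.0f);
ma_sound_set_min_distance(&sound, 1.0f);    /* Full volume at or inside this distance. */
ma_sound_set_max_distance(&sound, 100.0f);  /* Fully attenuated at and beyond this distance. */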


Jaytheway commented May 8, 2021

@mackron I'm trying to get splitter_node to work with a custom node, but I'm having some issues with a custom periodSizeInFrames. When the splitter is connected to the endpoint and to my custom node (which is also connected to the endpoint), and I have set periodSizeInFrames to a higher value than the device's native size, I get crackling and sped-up sound.

  1. I initialize the engine:
ma_engine_config config = ma_engine_config_init_default();
config.periodSizeInFrames = 1024;
result = ma_engine_init(&config, &engine); 
  2. Then I initialize my custom node with this vtable:
static ma_node_vtable reverb_node_vtable = {
    reverb_node_process_pcm_frames,                                     
    nullptr, // get_num_samples_required,                                         
    2, // 2 input buses.
    1, // 1 output bus.
    MA_NODE_FLAG_CONTINUOUS_PROCESSING | MA_NODE_FLAG_ALLOW_NULL_INPUT  
};

...which I also tried with get_num_samples_required implemented to return the same number of samples as I initialize the engine with.

  3. Then I initialize the splitter_node and connect it to the endpoint and to the custom node:
ma_uint8 numChannels = sound.engineNode.baseNode.outputBuses[0].channels;
ma_splitter_node_config splitterNodeConfig = ma_splitter_node_config_init(numChannels);
ma_splitter_node_init(&engine.nodeGraph, &splitterNodeConfig, NULL, &splitterNode);

ma_node_attach_output_bus(&sound, 0, &splitterNode, 0);
ma_node_attach_output_bus(&splitterNode, 0, ma_node_graph_get_endpoint(&engine.nodeGraph), 0);
ma_node_attach_output_bus(&splitterNode, 1, reverb.GetNode(), 0); 

Upon some investigation it occurred to me that my custom node's processing block is still running at the native 480 samples, instead of the 1024 I set the engine to initialize with.

In all of the following cases there are no audible issues with the playback:

  1. sound node -> my custom node -> endpoint
  2. sound node -> splitter node -> endpoint
  3. sound node -> splitter node -> my custom node -> endpoint
    In all of these cases my custom node is running at the native 480, instead of the 1024 I set the engine to.

But it's glitching out when the splitter node is connected to both the endpoint and my custom node.
It seems as if one branch of the splitter is trying to run at the engine's 1024 block size, while the other is pushing the same number to the custom node, which runs at the native 480 block size. Or something like that.

If I don't change the periodSizeInFrames the engine initializes with, or set it to less than the device's native size, everything works fine.

Am I missing something, or could there be an issue with splitter node, or custom nodes and specified fixed periodSizeInFrames?


mackron commented May 8, 2021

@Jaytheway The onGetRequiredInputFrameCount variable in the node vtable (get_num_samples_required in your sample) is only needed when your custom node processes input at a different rate to output - i.e. when you're doing resampling - and even then it's entirely optional. If you're not doing any kind of resampling just set this to null and forget about it.

In all of these cases my custom node is running at native 480, instead of 1024 I set the engine to.

Nodes are not guaranteed to process frames in chunks of the engine's period size due to how things are processed and cached internally, and because backends don't always use the period size you request. Your custom nodes need to be built with the assumption that the frame counts passed into the onProcess callback can be anything. You need to use the variables passed into onProcess to determine how many input frames need processing. If your custom node implementation requires a fixed number of frames, you'll need to implement your own caching logic in your custom onProcess to handle that cleanly.

It seems as if one branch of the splitter is trying to run engine's 1024 block size, while the other pushing the same number to the custom node which runs native 480 block size.

I'm sceptical of this. The splitter just copies samples and is very simple - this is it:

/* Splitting is just copying the first input bus and copying it over to each output bus. */
for (iOutputBus = 0; iOutputBus < ma_node_get_output_bus_count(pNodeBase); iOutputBus += 1) {
    ma_copy_pcm_frames(ppFramesOut[iOutputBus], ppFramesIn[0], *pFrameCountOut, ma_format_f32, channels);
}

I think looking at the splitter is leading you down the wrong path - I'm suspecting you're making an incorrect assumption on how the frame counts work. So a few things to consider:

  • There's no connection between the engine's period size and the number of frames processed by a node. Don't tie your custom nodes to the engine's period size.
  • Use the value passed into pFrameCountIn in onProcess to determine how many input frames to process, not the engine's period size. A sketch follows below.
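To make that concrete, here's a sketch of a pass-through onProcess that makes no assumptions about chunk size. It's a hypothetical node with the channel count hard-coded to 2 for brevity, and the callback shape is assumed to match the vtable shown above:

static void my_node_process_pcm_frames(ma_node* pNode, const float** ppFramesIn, ma_uint32* pFrameCountIn, float** ppFramesOut, ma_uint32* pFrameCountOut)
{
    const ma_uint32 channels = 2;   /* Assumed stereo for brevity. */
    ma_uint32 frameCount = *pFrameCountOut;
    ma_uint32 iSample;

    (void)pNode;

    /* Never assume the engine's period size - just process what's requested and available. */
    if (frameCount > *pFrameCountIn) {
        frameCount = *pFrameCountIn;
    }

    for (iSample = 0; iSample < frameCount * channels; iSample += 1) {
        ppFramesOut[0][iSample] = ppFramesIn[0][iSample];   /* A real effect would do its processing here. */
    }

    *pFrameCountIn  = frameCount;   /* Frames consumed from input bus 0. */
    *pFrameCountOut = frameCount;   /* Frames written to output bus 0. */
}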

@Jaytheway

@mackron Thank you for the explanation. It is strange, though, that the issue occurs only when the splitter node is connected on both ends and the periodSizeInFrames is set to a higher number than the device's default, which in my case is 480 with WASAPI.

I'm going to do more testing with fixed buffer size from custom data callback and ring buffer, just to make sure.

What is the purpose of the engine's period size? Looking through the engine initialization code, it seems that it passes it to the device initialization, but where is it actually used?


mackron commented May 9, 2021

It's passed through to the backend and defines the size of the internal buffer. Higher values mean more time to do processing and fewer wakeups (less resource usage), but at the cost of more latency. Different programs with different requirements will require different values, but typically the default is fine, which I think is 10 milliseconds.

If you can write up a simple sample program with no dependencies that demonstrates your splitter issue I can take a look in case there might be a bug somewhere.

@Jaytheway

It's passed through to the backend and defines the size of the internal buffer. Higher values means more time to do processing and fewer wakeups (less resource usage), but at the cost of more latency.

Which is why I'm trying to use a custom value for periodSizeInFrames, expecting to have more time in the processing block to do the processing. I was also under the assumption that the block size would be fixed.

Here's a simple program that demonstrates the issue I'm having. Test file included.
SplitterTest.zip

@Jaytheway

Correct me if I'm wrong, but, as far as I understand, connecting to the graph's endpoint means connecting to the last (or first) node that's being pulled from by the engine data callback.

Or does it mean that anything connected to the graph's "endpoint" is going to be pulled directly by the initialized device, bypassing the engine's data callback? (This would explain why custom nodes are not being pulled in chunks of the engine's periodSizeInFrames.)


mackron commented May 9, 2021

The backend is in control of what is actually chosen for the period size - it's just a hint to tell the backend what you want, but you won't necessarily get what you ask for. The engine itself may break down processing into smaller chunks for caching purposes. A fixed-size processing chunk is an incorrect assumption. You need to assume that it can be anything, and that each call to the processing callback may pass in a different number of frames.

The engine will pull data from the endpoint, which pulls data from its inputs, which pull data from their inputs, etc., etc. There's not really any engine data callback - it just pulls data directly from the underlying device's data callback using whatever frame count is specified, which can be anything.

@Jaytheway

..it's just a hint to tell the backend what you want, but you won't necessarily get what you ask for.

This clarifies it perfectly!

The engine itself may break down processing into smaller chunks for caching purposes.

And this is very interesting 🤔

Thank you for the explanation 🙂


mackron commented May 16, 2021

For those interested, custom loop points and data source chaining have been implemented. Unfortunately they're not compatible with version 0.10 without some API changes that would break custom data sources. It's in the dev branch, but needs to be enabled with this compile-time option, which must be specified on the command line or before the header and implementation of miniaudio.h:

#define MA_EXPERIMENTAL__DATA_LOOPING_AND_CHAINING

Note that this will cause custom data sources to break unless they're updated as per the rules in the revision history at the bottom of miniaudio.h (copied here for your convenience):

  • Change your base data source object from ma_data_source_callbacks to ma_data_source_base.
  • Call ma_data_source_init() for your base object in your custom data source's initialization routine. This takes a config object which includes a pointer to a vtable which is now where your custom callbacks are defined.
  • Call ma_data_source_uninit() in your custom data source's uninitialization routine. This doesn't currently do anything, but it's a placeholder in case some uninitialization code needs to be added at a later date.

Loop points and chaining are done at the data source level. The following APIs have been added (only available with the aforementioned option):

MA_API ma_result ma_data_source_set_range_in_pcm_frames(ma_data_source* pDataSource, ma_uint64 rangeBeg, ma_uint64 rangeEnd);
MA_API ma_result ma_data_source_set_current(ma_data_source* pDataSource, ma_data_source* pCurrentDataSource);
MA_API ma_result ma_data_source_set_next(ma_data_source* pDataSource, ma_data_source* pNextDataSource);
MA_API ma_result ma_data_source_set_next_callback(ma_data_source* pDataSource, ma_data_source_get_next_proc onGetNext);

See the data_source_chaining example for usage. Use ma_sound_get_data_source() to retrieve the data source of a ma_sound object.
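For instance, gapless intro-into-loop playback might look roughly like this (file names and the range values are placeholders; the engine is assumed to be initialized, and the ma_sound_init_from_data_source() parameter list is an assumption):

ma_decoder intro, loop;
ma_decoder_init_file("intro.flac", NULL, &intro);   /* Decoders are data sources. */
ma_decoder_init_file("loop.flac",  NULL, &loop);

/* When the intro finishes, playback continues into the loop section with no gap. */
ma_data_source_set_next(&intro, &loop);

/* Optionally restrict the loop to a sub-range of the file. */
ma_data_source_set_range_in_pcm_frames(&loop, 0, 44100 * 60);

ma_sound sound;
ma_sound_init_from_data_source(&engine, &intro, 0, NULL, &sound);
ma_sound_start(&sound);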

With this change we're another step closer to getting this high level stuff out. Soon I'll be creating a dev-0.11 branch at which point I'll be integrating the engine code into miniaudio.h. Once that's done, the dev-0.11 branch will be where all future updates to the high level API will be happening. I'll be posting an update here when this happens.


mackron commented Jul 4, 2021

The high level stuff is pretty much done now 🥳. For those following this development, I've created branch dev-0.11 which is where development will continue, but be warned that I'll be making quite a few API changes to the main library on that branch.


cshenton commented Aug 3, 2021

This feature set sounds fantastic; it would make miniaudio a great alternative to SoLoud for those preferring more portable source code.

On emscripten, I can see a full audio engine clogging up the main thread pretty quickly though. Hopefully by the time it's ready for release, browser maintainers will have sorted their stuff out with SharedArrayBuffer.

@PlatoSoft

Looking forward to getting my hands on this once it's on the master branch. Thank you so much for your work.


mackron commented Dec 18, 2021

This has been released as part of version 0.11. Thanks to everyone who tested and provided feedback and suggestions!

@GZGavinZhao

@mackron Just some really trivial stuff, but could you please consider tagging your releases? For wrapper library authors, this would make it a bit easier and more straightforward to manage versions.
