High Level API - Phase 1 #196
Awesome as usual! :D
Might also add these few more features/effects:
Don't think you need the kitchen sink. But these are the basics I would want in the base toolkit...
I've already got a notch filter (in addition to high/low shelf filters and a peaking filter), unless you're talking about something different? http://miniaud.io/docs/manual/#Filtering. There's no reason I can't support them with the high level API. The limiter, compressor and convolution stuff I'll consider, but may delay it to a later stage. The feedback suppression and Paulstretch stuff I'll consider, but it will definitely not be included in this phase.
Wow! Really impressive! I'm not an audio expert but the set of intended features looks really complete!
That sounds like a great idea - wrapping all the existing low level APIs with a flexible high level interface.
I looked through the code but didn't see enough to answer this. QUESTION: Are you planning to keep the engine in a separate header, or will it be merged into miniaudio.h? P.S. - Thanks for your devotion to a great cross platform audio experience!!!
It'll all be merged into the one file, otherwise it would defeat one of the major features of miniaudio. I've had other people mention the sidechaining stuff in the past, so I'll at least look at it for this phase, but I'm not guaranteeing anything.
Would it be possible to create a basic occlusion system based on this? I'd pull 3D geometry, throw raycasts from/to audio sources, and cut off some frequencies at the mixing stage for a given listener. To give a brief recap, I worked on a game where I had to implement sound occlusion too - dual occlusion specifically, where adjacent walls and/or roofs would make subtle or big differences while listening to random conversations in a house. Game designers would assign occlusion properties to building materials, so during game sessions the engine would raycast between audio listeners (players) and audio sources, and would process audio samples every time a ray hit one of those materials. Basically, at every hit, and depending on the material, the engine would decrease the volume and lower a frequency cutoff; the raycast would end when it reached the target or when the volume hit 0 because of multiple hits. Other basic features like audio bouncing and/or material reflectance were simply omitted for simplicity (see attached pic). It worked pretty well overall, given the simplicity of such a system.
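For illustration, a rough sketch of the per-source pass described above; the types and raycast results here are hypothetical game-side placeholders, not miniaudio APIs - only the resulting volume and cutoff would be handed to the mixer:

```c
#include <stddef.h>

/* Hypothetical game-side types; none of this is miniaudio API. */
typedef struct { float volumeLoss; float cutoffLossHz; } occlusion_material;
typedef struct { const occlusion_material* material;  } occlusion_hit;

/* Collapse the ray hits between listener and source into a volume and a
   low-pass cutoff, as described above. */
static void occlusion_from_hits(const occlusion_hit* hits, size_t hitCount,
                                float* pVolume, float* pCutoffHz)
{
    float volume   = 1.0f;
    float cutoffHz = 20000.0f;  /* Fully open to start with. */

    for (size_t i = 0; i < hitCount && volume > 0.0f; i += 1) {
        volume   -= hits[i].material->volumeLoss;    /* Designer-assigned per material. */
        cutoffHz -= hits[i].material->cutoffLossHz;
    }

    if (volume   < 0.0f)   volume   = 0.0f;
    if (cutoffHz < 100.0f) cutoffHz = 100.0f;

    *pVolume   = volume;    /* Apply as per-sound gain at the mixing stage. */
    *pCutoffHz = cutoffHz;  /* Apply as a low-pass filter in front of the sound. */
}
```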
Yes, a geometric occlusion system is something I want to build, but I'm planning on doing that at a later stage. But yes, that's totally on the table. I should add a section to the bottom of my original post to list the stuff that's planned but won't make it in the first phase.
A new fading API has been implemented: d8aa619. I've also added sound cloning to the list of things to implement for phase 1, which has been requested a few times now.
Maybe a better name for the cloning would be "sound instancing"? The idea being you preload all your ma_sounds at level load time, then anytime you want to play a sound you grab an instance of it:
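Something along these lines, as a sketch of the idea only - ma_sound_init_from_file and ma_sound_init_copy are assumed names and the final API may end up looking different:

```c
/* Assumes an already-initialized ma_engine named "engine". */
ma_sound explosionTemplate;
ma_sound_init_from_file(&engine, "explosion.wav", 0, NULL, NULL, &explosionTemplate);

/* Later, any time the sound needs to be played, grab a cheap instance of it. */
ma_sound instance;
ma_sound_init_copy(&engine, &explosionTemplate, 0, NULL, &instance);
ma_sound_start(&instance);
/* ...and uninitialize the instance manually when it's no longer needed. */
```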
The cloning thing probably needs a bit more explanation. Sound cloning will only be available for sounds that were created from a file (as opposed to a custom data source). I realise that might seem a bit unintuitive, so to go into a bit more detail: there are two ways to initialize a sound - 1) via the resource manager from a file, or 2) from a custom data source - and only the former can be cloned. Auto-recycling will not be happening. Each cloned sound will still need to be uninitialized manually.
That all makes sense. As does cloning only existing for loaded audio files. I wouldn't expect miniaudio to be able to clone a data source it doesn't fully own.
@frink In case you were interested, I'm experimenting with some ideas for supporting sidechaining in effects, or more generally, supporting multiple input streams when processing an effect. Basically all I've done is repurposed the effect processing routine into a variant that takes multiple input streams:

MA_API ma_result ma_effect_process_pcm_frames_ex(ma_effect* pEffect, ma_uint32 inputStreamCount, const void** ppFramesIn, ma_uint64* pFrameCountIn, void* pFramesOut, ma_uint64* pFrameCountOut);

Your thoughts on that design?
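For illustration, a caller-side sketch of how that multi-input signature might be used to feed a main signal plus a sidechain key into a (hypothetical) compressor effect:

```c
/* pMainIn / pKeyIn / pFramesOut point at interleaved PCM frames; "compressor" is a
   hypothetical ma_effect-compatible object, not an existing miniaudio type. */
const void* ppFramesIn[2]    = { pMainIn, pKeyIn };
ma_uint64   pFrameCountIn[2] = { frameCount, frameCount };
ma_uint64   frameCountOut    = frameCount;

ma_effect_process_pcm_frames_ex((ma_effect*)&compressor, 2, ppFramesIn, pFrameCountIn, pFramesOut, &frameCountOut);
```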
I think this will work. But I'm having a little trouble visualizing a full effects chain... Could you post something showing both the single stream and multi-stream (sidechain) scenarios? Maybe like:
Pseudo-code is fine. Just need to understand a little better how the full effects chain might work in practice. Thanks! 😄
I'm still figuring out that part :). Indeed, I actually needed to remove the built-in chaining stuff (might add it back, not sure) because it just didn't work with this multi-input stuff. Will think about this more later and report back.
I see... I think we need to think of the chaining as the main case and the sidechaining as the secondary case. Both need to be easy and expandable. Ideally, all control inputs should also be consumed at the same granularity of time even if their values do not change. This way you can automate any parameter with anything else at the same sample rate. That will produce very impressive synthesis but it's a very hard data structure to model. I know a veteran coder (doing audio synthesis since hardware in the 60s) who might be better equipped to handle these architectural questions. I'll reach out and see if he wants to get involved in the discussion...
You can still chain effects together, but you'll just need to do it manually at the moment. I'm going to sit down and have a proper think about this problem and how to properly handle stream routing - probably a few weekends from now.
I think it's important to consider how other languages will implement bindings that allow for modular addition/subtraction of chained filters/effects. Web Audio has an interesting approach with their node system that I recommend reviewing for consideration. They keep the concept of a node as a separate object, then require connecting node by node, allowing the API user to specify the order. Maybe bitwise operations could be used to allow for concatenation of effects/filters? I have seen a similar concept in the algorithmic trading space to allow for signal concatenation.
If you investigate the WebAudio project, the LabSound project is derived from the WebKit implementation of WebAudio and is boiled down to a lightweight C++ library. We've extensively rewritten most of it at this point for performance, audio quality, or new features. https://github.com/LabSound/LabSound Our current implementation treats node inputs as summing junctions, and uses buffer forwarding for things like pass-through nodes, such as gain nodes with a unity value. The code's much easier to follow than the original sources :) The dev branch uses miniaudio as a backend.
How about position retrieval? (i.e. how many samples / seconds have passed since the first sample was played)
@Gargaj Yep, already implemented.
For anybody who's interested, I've pushed an update to the dev branch which adds some advanced routing infrastructure. This is going to sit as the foundation for the engine API and replaces the old effect and mixing infrastructure. To summarize, it's basically just a node graph system. You create nodes and connect them together. Each node can have a custom effect applied to it by implementing a callback (similar to the old effect API). This (hopefully) solves the sidechaining problem mentioned by @frink a while ago and significantly improves on the flexibility of the old design. Also, here's a usage example for those who are curious: https://github.com/mackron/miniaudio/blob/dev/research/miniaudio_routing.c
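As a rough sketch of what connecting nodes looks like - the names below follow the node graph API that eventually shipped (ma_node_graph_init, ma_lpf_node_init, ma_node_attach_output_bus), so the research-branch spelling may differ:

```c
/* Build a tiny graph: [data source node] -> [low-pass filter node] -> [endpoint]. */
ma_node_graph_config graphConfig = ma_node_graph_config_init(2);           /* 2 channels */
ma_node_graph graph;
ma_node_graph_init(&graphConfig, NULL, &graph);

ma_lpf_node_config lpfConfig = ma_lpf_node_config_init(2, 48000, 1000.0);  /* 1 kHz cutoff */
ma_lpf_node lpfNode;
ma_lpf_node_init(&graph, &lpfConfig, NULL, &lpfNode);

/* Attach output bus 0 of the filter to input bus 0 of the graph's endpoint. A data
   source node (or any custom node) would be attached to the filter the same way. */
ma_node_attach_output_bus(&lpfNode, 0, ma_node_graph_get_endpoint(&graph), 0);
```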
An update for those who are following the development of this high level stuff. The engine has now been ported over to the new routing infrastructure. With this change, sounds and sound groups now run through the node graph. Currently there aren't any effect nodes other than a splitter (1 input bus; 2 output buses, each containing the duplicated signal from the input bus). These will be developed further as we go. There have been a few API changes:
If you're wanting to play around with this, just keep in mind that the integration with the routing system is hot off the press and there might be some regressions. I might also be making some API changes here and there. Documentation for the routing system is near the top of miniaudio_engine.h (might need to scroll down a bit). Feel free to post any questions or issues here if you find anything. As usual, always open to feedback no matter how insignificant you think it might be.
Maybe a small diagram in Markdeep would help in the documentation :D
Yeah, some diagrams to visualise some node setups would help. Will look into that when I get a chance.
A few examples for the routing stuff:
These examples will show how multiple input buses work and how you'd handle sidechaining. When I get the high level stuff out, these will be included as examples. This is the library I used for the vocoder effect: https://github.com/blastbay/voclib
@frink Just re-reading your comment from before about those filters you suggested - the feedback suppression and Paulstretch stuff won't ever be making it into the main library, I'm sorry. These can be implemented as custom nodes if they need to be used in conjunction with the routing system. The notch filter is a definite yes for the first phase (assuming you're talking about the standard notch filter that's already in miniaudio).
Amazing work! I have a question: how are different channels of a sound spatialized? Any plans on implementing a setting for the "size" of a sound, or some equivalent to set the distance/positioning of different channels of a spatialized sound source?
@Jaytheway No plans for that right now, sorry. It would require a completely different (and more complex) approach to the spatialization model, which I'm not super enthusiastic about. Spatialization works with sounds of all channel counts and works by first calculating an overall gain based on the distance of the sound as a whole from the listener, and then applying a pan to each channel based on the direction of the sound relative to the listener.
This is good to know! I guess I could try writing a custom panner effect node that would handle 3D source channel mapping based on channel emitter positions calculated with some vector math. Or, because it would need the endpoint speaker map, it might be better to move this somewhere outside of the individual source pan node effects... 🤔
Hey @mackron, there seems to be a typo in ma_spatializer_set_max_distance(). It sets the min distance instead of the max.
@Jaytheway Thanks! Just had another report from another user at the same time! Fixed in the dev branch.
@mackron I know you're aware of these two issues, but I don't want them to get lost, so I'm adding them here for tracking. Hope you don't mind!

First of all, there's a memory leak in the resource manager when uninitializing data buffers. This is the patch I'm currently using:

index 17c2ef2..58d3004 100644
--- a/research/miniaudio_engine.h
+++ b/research/miniaudio_engine.h
@@ -6429,6 +6429,8 @@ static ma_result ma_resource_manager_data_buffer_uninit_nolock(ma_resource_manag
ma_resource_manager_inline_notification_wait(&notification);
ma_resource_manager_inline_notification_uninit(&notification);
}
+ } else {
+ ma_resource_manager_data_buffer_uninit_connector(pDataBuffer->pResourceManager, pDataBuffer);
}
return MA_SUCCESS;

But it's probable I've missed other parts of the resource manager that have the same problem.

The second problem is that the spatialization calculations aren't working correctly. We've talked about this at length already, but something's very wrong with the matrix calculations it does. I ended up cheating and figuring out how openal-soft does the math, and came up with this patch, even though I don't understand what the math is doing exactly:

index efc033f..7eb2606 100644
--- a/research/miniaudio_engine.h
+++ b/research/miniaudio_engine.h
@@ -9314,6 +9314,7 @@ MA_API ma_result ma_spatializer_process_pcm_frames(ma_spatializer* pSpatializer,
ma_vec3f relativePosNormalized;
ma_vec3f relativePos; /* The position relative to the listener. */
ma_vec3f relativeDir; /* The direction of the sound, relative to the listener. */
+ ma_vec3f relativeVel;
ma_vec3f listenerVel; /* The volocity of the listener. For doppler pitch calculation. */
float speedOfSound;
float distance = 0;
@@ -9354,8 +9355,8 @@ MA_API ma_result ma_spatializer_process_pcm_frames(ma_spatializer* pSpatializer,
a cross product.
*/
axisZ = ma_vec3f_normalize(pListener->direction); /* Normalization required here because we can't trust the caller. */
- axisX = ma_vec3f_normalize(ma_vec3f_cross(axisZ, pListener->config.worldUp)); /* Normalization required here because the world up vector may not be perpendicular with the forward vector. */
- axisY = ma_vec3f_cross(axisX, axisZ); /* No normalization is required here because axisX and axisZ are unit length and perpendicular. */
+ axisX = ma_vec3f_normalize(pListener->config.worldUp); /* Normalization required here because the world up vector may not be perpendicular with the forward vector. */
+ axisY = ma_vec3f_cross(axisZ, axisX); /* No normalization is required here because axisX and axisZ are unit length and perpendicular. */
/*
We need to swap the X axis if we're left handed because otherwise the cross product above
@@ -9366,50 +9367,50 @@ MA_API ma_result ma_spatializer_process_pcm_frames(ma_spatializer* pSpatializer,
axisX = ma_vec3f_neg(axisX);
}
- #if 1
{
- m[0][0] = axisX.x; m[0][1] = axisY.x; m[0][2] = -axisZ.x; m[0][3] = -ma_vec3f_dot(axisX, pListener->position);
- m[1][0] = axisX.y; m[1][1] = axisY.y; m[1][2] = -axisZ.y; m[1][3] = -ma_vec3f_dot(axisY, pListener->position);
- m[2][0] = axisX.z; m[2][1] = axisY.z; m[2][2] = -axisZ.z; m[2][3] = -ma_vec3f_dot(axisZ, pListener->position);
+ m[0][0] = axisY.x; m[0][1] = axisX.x; m[0][2] = -axisZ.x; m[0][3] = 0;
+ m[1][0] = axisY.y; m[1][1] = axisX.y; m[1][2] = -axisZ.y; m[1][3] = 0;
+ m[2][0] = axisY.z; m[2][1] = axisX.z; m[2][2] = -axisZ.z; m[2][3] = 0;
m[3][0] = 0; m[3][1] = 0; m[3][2] = 0; m[3][3] = 1;
}
- #else
+
+ v = pListener->position;
{
- m[0][0] = axisX.x; m[1][0] = axisY.x; m[2][0] = -axisZ.x; m[3][0] = -ma_vec3f_dot(axisX, pListener->position);
- m[0][1] = axisX.y; m[1][1] = axisY.y; m[2][1] = -axisZ.y; m[3][1] = -ma_vec3f_dot(axisY, pListener->position);
- m[0][2] = axisX.z; m[1][2] = axisY.z; m[2][2] = -axisZ.z; m[3][2] = -ma_vec3f_dot(axisZ, pListener->position);
- m[0][3] = 0; m[1][3] = 0; m[2][3] = 0; m[3][3] = 1;
+ m[3][0] = -(m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z + m[3][0] * 1);
+ m[3][1] = -(m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z + m[3][1] * 1);
+ m[3][2] = -(m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z + m[3][2] * 1);
}
- #endif
- /*
- Multiply the lookat matrix by the spatializer position to transform it to listener
- space. This allows calculations to work based on the sound being relative to the
- origin which makes things simpler.
- */
+ v = pListener->velocity;
+ {
+ listenerVel.x = m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z + m[3][0] * 0;
+ listenerVel.y = m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z + m[3][1] * 0;
+ listenerVel.z = m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z + m[3][2] * 0;
+ }
+
+
+ /* Now that we have all the listener parameters calculated, translate the spatializer into listener space. */
v = pSpatializer->position;
- #if 1
{
relativePos.x = m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z + m[3][0] * 1;
relativePos.y = m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z + m[3][1] * 1;
relativePos.z = m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z + m[3][2] * 1;
}
- #else
+
+ v = pSpatializer->velocity;
{
- relativePos.x = m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z + m[0][3] * 1;
- relativePos.y = m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z + m[1][3] * 1;
- relativePos.z = m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z + m[2][3] * 1;
+ relativeVel.x = m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z + m[3][0] * 0;
+ relativeVel.y = m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z + m[3][1] * 0;
+ relativeVel.z = m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z + m[3][2] * 0;
}
- #endif
- /*
- The direction of the sound needs to also be transformed so that it's relative to the
- rotation of the listener.
- */
v = pSpatializer->direction;
- relativeDir.x = m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z;
- relativeDir.y = m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z;
- relativeDir.z = m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z;
+ {
+ relativeDir.x = m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z + m[3][0] * 0;
+ relativeDir.y = m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z + m[3][1] * 0;
+ relativeDir.z = m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z + m[3][2] * 0;
+ }
+ relativeDir = ma_vec3f_normalize(relativeDir);
#if defined(MA_DEBUG_OUTPUT)
{
@@ -9615,7 +9616,7 @@ MA_API ma_result ma_spatializer_process_pcm_frames(ma_spatializer* pSpatializer,
source.
*/
if (pSpatializer->config.dopplerFactor > 0) {
- pSpatializer->dopplerPitch = ma_doppler_pitch(ma_vec3f_neg(relativePos), pSpatializer->velocity, listenerVel, speedOfSound, pSpatializer->config.dopplerFactor);
+ pSpatializer->dopplerPitch = ma_doppler_pitch(ma_vec3f_neg(relativePos), relativeVel, listenerVel, speedOfSound, pSpatializer->config.dopplerFactor);
} else {
pSpatializer->dopplerPitch = 1;
}

This does make the spatialization work perfectly for me (in an OpenGL game), but it's probably doing more work than it needs to.
@tycho I've pushed an experimental fix for the memory leak to the dev branch. For the spatialization stuff, there are a few things in there that aren't making sense to me from a high level, so I want to give that a bit more thought. I'll go through this with you on Discord.
A quick update on this since I haven't posted in a while:
Hi @mackron, I noticed that with distance volume attenuation the volume level just stops attenuating at the max distance. I've set up a desmos graph to check all the curves and parameters, and it illustrates the "leftover" volume level beyond the max distance.
To add to my previous comment, my initial thought was that a linear attenuation model with a falloff of 1 could be used to bring the volume all the way to silence at the max distance.
That's the point of the max distance setting, so no plans to change that. If you want to continue using that attenuation model you'll need to increase the max distance to the point at which it works for you. If you need it to be completely silent (gain = 0) at max distance, you need to use linear with a falloff of 1 like you suggested.
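For anyone following along, a small sketch of the two attenuation models being discussed; the formulas follow the usual OpenAL-style definitions, and miniaudio's exact implementation may differ in the details:

```c
#include <math.h>

static float clamp_distance(float d, float minDistance, float maxDistance)
{
    return fminf(fmaxf(d, minDistance), maxDistance);
}

/* Inverse model: never reaches 0; past maxDistance the gain simply stops changing,
   which is the "leftover" volume being described. */
static float gain_inverse(float distance, float minDistance, float maxDistance, float rolloff)
{
    float d = clamp_distance(distance, minDistance, maxDistance);
    return minDistance / (minDistance + rolloff * (d - minDistance));
}

/* Linear model: with a rolloff (falloff) of 1 this reaches exactly 0 at maxDistance. */
static float gain_linear(float distance, float minDistance, float maxDistance, float rolloff)
{
    float d = clamp_distance(distance, minDistance, maxDistance);
    return 1.0f - rolloff * (d - minDistance) / (maxDistance - minDistance);
}
```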
@mackron I'm trying to get my audio processing to work with a fixed buffer size matching the engine's period size, which I tried by implementing a custom node. Upon some investigation it occurred to me that the splitter node might be involved. In several configurations there are no audible issues with the playback, but it's glitching out when the splitter node is connected on both ends and a custom period size is used. If I don't change the period size there are no issues. Am I missing something, or could there be an issue with the splitter node?
@Jaytheway The period size is really just a hint. Nodes are not guaranteed to process frames in chunks of the engine's period size due to how things are processed and cached internally, and because backends don't always use the period size you request. Your custom nodes need to be built with the assumption that the frame counts passed into their processing callbacks can be different on every call.
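To illustrate, a minimal sketch of a pass-through processing callback written without any chunk-size assumptions (the signature follows the node vtable as it eventually shipped, so it may not match the research branch exactly):

```c
static void my_node_process(ma_node* pNode, const float** ppFramesIn, ma_uint32* pFrameCountIn,
                            float** ppFramesOut, ma_uint32* pFrameCountOut)
{
    /* The frame count can be different on every call - never assume the engine's period size. */
    ma_uint32 channels   = ma_node_get_output_channels(pNode, 0);
    ma_uint32 frameCount = *pFrameCountOut;

    /* Simple 1-in/1-out pass-through. */
    ma_copy_pcm_frames(ppFramesOut[0], ppFramesIn[0], frameCount, ma_format_f32, channels);

    (void)pFrameCountIn;  /* Same as the output count for a straight pass-through. */
}
```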
I'm sceptical about this. The splitter just copies samples and is very simple - this is it:

/* Splitting is just copying the first input bus and copying it over to each output bus. */
for (iOutputBus = 0; iOutputBus < ma_node_get_output_bus_count(pNodeBase); iOutputBus += 1) {
ma_copy_pcm_frames(ppFramesOut[iOutputBus], ppFramesIn[0], *pFrameCountOut, ma_format_f32, channels);
}

I think looking at the splitter is leading you down the wrong path - I'm suspecting you're making an incorrect assumption about how the frame counts work. So a few things to consider:
@mackron Thank you for the explanation. It is strange, though, that the issue occurs only when the splitter node is connected on both ends and the period size is changed. I'm going to do more testing with a fixed buffer size from a custom data callback and a ring buffer, just to make sure. What is the purpose of the engine's period size? Looking through the engine initialization code, it seems that it passes it to the device initialization, but where is it actually in use?
It's passed through to the backend and defines the size of the internal buffer. Higher values mean more time to do processing and fewer wakeups (less resource usage), but at the cost of more latency. Different programs with different requirements will require different values, but typically the default is fine, which I think is 10 milliseconds. If you can write up a simple sample program with no dependencies that demonstrates your splitter issue, I can take a look in case there might be a bug somewhere.
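For reference, a sketch of how that hint is specified when initializing the engine; the field names are from the released ma_engine_config and may differ in the research version, and the backend is free to ignore the request:

```c
ma_engine_config engineConfig = ma_engine_config_init();
engineConfig.periodSizeInMilliseconds = 10;  /* Or periodSizeInFrames; either way it's only a hint. */

ma_engine engine;
if (ma_engine_init(&engineConfig, &engine) != MA_SUCCESS) {
    /* Handle the error. */
}
```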
Which is why I'm trying to use a custom value for the period size. Here's a simple program that demonstrates the issue I'm having. Test file included.
Correct me if I'm wrong, but, as far as I understand, connecting to the graph's endpoint means connecting to the last (or first) node that's being pulled from by the engine's data callback. Or does it mean that anything connected to the graph's "endpoint" is going to be pulled directly by the initialized device, bypassing the engine's data callback? (This would explain why custom nodes are not being pulled in chunks of the engine's period size.)
The backend is in control of what is actually chosen for the period size - it's just a hint to tell the backend what you want, but you won't necessarily get what you ask for. The engine itself may break processing down into smaller chunks for caching purposes. A fixed-size processing chunk is an incorrect assumption. You need to assume that it can be anything, and that each call to the processing callback may pass in a different number of frames. The engine will pull data from the endpoint, which pulls data from its inputs, which pull data from their inputs, etc. There's not really an engine data callback - data is pulled directly from the underlying device's data callback using whatever frame count is specified, which can be anything.
This clarifies it perfectly!
And this is very interesting 🤔 Thank you for the explanation 🙂
For those interested, custom loop points and data source chaining have been implemented. Unfortunately it's not compatible with version 0.10 without some API changes that would break custom data sources. It's in the dev branch, but needs to be enabled with this compile-time option, which must be specified on the command line or before the header and implementation of miniaudio.h:

#define MA_EXPERIMENTAL__DATA_LOOPING_AND_CHAINING

Note that this will cause custom data sources to break unless they're updated as per the rules in the revision history down the bottom of miniaudio.h (copied here for your convenience):
Loop points and chaining are done at the data source level. The following APIs have been added (only available with the aforementioned option):

MA_API ma_result ma_data_source_set_range_in_pcm_frames(ma_data_source* pDataSource, ma_uint64 rangeBeg, ma_uint64 rangeEnd);
MA_API ma_result ma_data_source_set_current(ma_data_source* pDataSource, ma_data_source* pCurrentDataSource);
MA_API ma_result ma_data_source_set_next(ma_data_source* pDataSource, ma_data_source* pNextDataSource);
MA_API ma_result ma_data_source_set_next_callback(ma_data_source* pDataSource, ma_data_source_get_next_proc onGetNext);

See the data_source_chaining example for usage. With this change we're another step closer to getting this high level stuff out. Soon I'll be creating a dev-0.11 branch, at which point I'll be integrating the engine code into miniaudio.h. Once that's done, the dev-0.11 branch will be where all future updates to the high level API will be happening. I'll be posting an update here when this happens.
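As a tiny usage sketch of the chaining APIs above - for example an intro section followed by a looping section. ma_data_source_set_looping is assumed to be available alongside these; see the data_source_chaining example for the exact pattern:

```c
/* Assumes two already-initialized decoders (or any data sources) named intro and loop. */
ma_data_source_set_next(&intro, &loop);      /* After intro finishes, reading continues from loop. */
ma_data_source_set_looping(&loop, MA_TRUE);  /* The second data source repeats indefinitely. */

/* The sound/engine is then pointed at "intro" and the chain takes care of the rest. */
```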
The high level stuff is pretty much done now 🥳. For those following this development, I've created branch dev-0.11 which is where development will continue, but be warned that I'll be making quite a few API changes to the main library on that branch.
This feature set sounds fantastic - it would make miniaudio a great alternative to SoLoud for those preferring more portable source code. On Emscripten, I can see a full audio engine clogging up the main thread pretty quickly though. Hopefully by the time it's ready for release, browser maintainers will have sorted things out on the threading front.
Looking forward to getting my hands on this once it's on the master branch. Thank you so much for your work.
This has been released as part of version 0.11. Thanks to everyone who tested and provided feedback and suggestions!
@mackron Just some really trivial stuff, but could you please consider tagging your releases? For wrapper library authors this would make it a bit easier and more straightforward to manage versions.
This issue is for tracking development and gathering feedback on the new high level API coming to miniaudio. This API sits on top of the existing low level API and will be able to be disabled at compile time for those who don't need or want it, thereby not imposing any significant overhead on build size. This is not a replacement for the low level API - it's just an optional high level layer sitting on top of it.
The high level API is for users who don't want or need to do their own mixing and effect processing via the low level API. The problem with the low level API is that as soon as you want to mix multiple sounds you need to do it all manually. The high level API is intended to alleviate this burden from those who don't need or want to do it themselves. This will be useful for game developers in particular.
There's a lot to the high level API, and for practicality I've decided to break it down into parts. This issue relates to the first phase. This phase will be completed before the next begins. The sections below summarise the main features of the high level API. If you have ideas or feedback on features feel free to leave a comment and I'll consider it.
You can use the high level API by doing something like this:
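A minimal sketch of the pattern, assuming the engine code is still split out as miniaudio_engine.h alongside miniaudio.h (the exact file layout will change once it's merged into miniaudio.h):

```c
/* In exactly one source file: */
#define MINIAUDIO_IMPLEMENTATION
#include "miniaudio.h"
#include "miniaudio_engine.h"  /* The high level engine API, included after miniaudio.h. */
```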
In your header files, just do something like this:
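And a sketch for headers that only need the declarations:

```c
/* In header files, include the declarations only (no implementation macro): */
#include "miniaudio.h"
#include "miniaudio_engine.h"
```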
Everything is being developed and updated in the dev branch. Look at miniaudio_engine.h for the code and preliminary documentation. You can probably figure out a lot of features by looking at the header section.
The checklists below are what I'm planning on adding to the high level API. If you have suggestions, leave a comment.
Resource Management
The resource management system is responsible for loading audio data and delivering it to the audio engine.
Support for Unicode (wchar_t) file paths.
The Web/Emscripten backend does not support multithreading. This therefore requires the periodic calling of a function to process the next pending async job, if any, which will need to be done manually by the application.
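A sketch of what that manual pumping could look like on the application side; ma_resource_manager_process_next_job is the name used in the released API, and the phase-1 name may differ:

```c
/* Call this regularly (e.g. once per frame) on single-threaded builds such as
   Emscripten so that pending asynchronous loading jobs make progress. */
void app_tick(ma_resource_manager* pResourceManager)
{
    ma_resource_manager_process_next_job(pResourceManager);
}
```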
Engine
The engine is responsible for the management and playback of sounds.
Simple 3D Spatialization
The spatialization model will be simple to begin with. Advanced features such as HRTF will come later once we get an initial implementation done.
Standard Effects Suite
Support for custom effects is implemented, but it would be useful to have a suite of standard effects so applications don't need to implement them themselves.
For examples on how to get started, see the _examples folder, and start with engine_hello_world. The general idea is that you have the ma_engine object which you initialize using miniaudio's standard config/init process. You then load sounds, which are called ma_sound. The resource manager can be used independently of ma_engine and is called ma_resource_manager. See the resource_manager example for how to use the resource manager manually.
If you're interested in the high level API and its progress, consider subscribing to this issue. I'll be updating this as we go. Feedback is welcome, and I encourage you to play around with it - everything is on the table for review.
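A minimal sketch in the spirit of engine_hello_world, based on the engine API as it eventually shipped in miniaudio.h; during this phase you'd additionally include miniaudio_engine.h as shown earlier:

```c
#define MINIAUDIO_IMPLEMENTATION
#include "miniaudio.h"

#include <stdio.h>

int main(int argc, char** argv)
{
    ma_engine engine;
    if (ma_engine_init(NULL, &engine) != MA_SUCCESS) {
        return -1;  /* Failed to initialize the engine. */
    }

    /* Fire-and-forget playback through the engine's resource manager. */
    ma_engine_play_sound(&engine, (argc > 1) ? argv[1] : "sound.wav", NULL);

    printf("Press Enter to quit...\n");
    getchar();  /* Keep the program alive while the sound plays. */

    ma_engine_uninit(&engine);
    return 0;
}
```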