Suggest adding `StreamSplitter::pullInto()` to reduce memory allocation when recording samples #395

dpgeorge · 2023-11-14T00:27:20Z

When recording a stream of audio from the microphone using StreamSplitter, there is potentially an unnecessary allocating of intermediate buffer space, and unnecessary copying of data.

In MicroPython, to record audio one first creates a big buffer to hold the entire sample (eg 5 seconds) and then records into that buffer. It would be convenient and more efficient if CODAL could be told "record into this buffer directly".

With the current architecture of streaming and pullRequest() methods and different consumers with different sample rates, it might be difficult to add the functionality for a zero-copy interface to record samples. So instead I suggest adding a method StreamSplitter::pullInto(uint8_t *buf, size_t len). This is the same as StreamSplitter::pull() except that it uses the given buffer instead of allocating the outData buffer and returning that. This will eliminate a memory allocation for each call to this function, and also eliminate a subsequent memcpy().

The text was updated successfully, but these errors were encountered:

JohnVidler · 2023-11-16T16:04:27Z

I think this is a pretty good suggestion for cases where 'userland' runtimes want more direct C-alike access to memory from a stream. To this end, I plan to roll this in with the changes required to support/fix #394 and #321 and squash all three bugs at once.

Unfortunately I'm on leave for the next week, and will be back in the office on Monday 27th, so I'll pick this up then as a priority.

JohnVidler · 2023-11-16T16:09:16Z

Related: #365 and #284

JohnVidler · 2023-11-16T16:12:23Z

Note: Using this issue to track the discussion for implementation to address all the linked issues here.

JohnVidler · 2024-02-07T16:26:23Z

With the changes currently pending merge to address the sample rate change issues I've also now included a uint8_t * SplitterChannel::pullInto( uint8_t * rawBuffer, int length ) method as part of the SplitterChannel.

This lets you supply an arbitrary block of data, and have the benefits of automatic resampling along with keeping the 'contract' between stream classes to keep the microphone auto-wake functions working. The return pointer is set to the next byte in memory if you're using a large buffer with multiple samples stored sequentially, so by tracking the return between calls you can automatically wander the larger buffer.

If the supplied memory is smaller than required, it will be completely filled and the data will be truncated; if the memory is longer than required, the complete upstream buffer will be copied in and the rest of the memory will be untouched.

Thanks for everyone's patience on this issue, its hit quite a few things at once :)

dpgeorge · 2024-02-26T07:31:55Z

if the memory is longer than required, the complete upstream buffer will be copied in and the rest of the memory will be untouched.

Testing out this new pullInto() method, I'm not sure the above behaviour is implemented correctly. Passing in a large buffer, it seems to just keep on sampling the input buffer past its end, and only stops once it has filled the given output buffer. This means that you have to request exactly the number of bytes that are available, and it's not really possible to know this.

In the code you can see that the loop to copy the samples from input to output uses while( outPtr - output < length ), and if a buffer is passed in then length is just taken as the passed-in length. The following patch gets it working for me:

--- a/source/streams/StreamSplitter.cpp
+++ b/source/streams/StreamSplitter.cpp
@@ -74,6 +74,10 @@ ManagedBuffer SplitterChannel::resample( ManagedBuffer _in, uint8_t * buffer, in
     if( output == NULL ) {
         output = (uint8_t *)malloc( samplesPerOut * bytesPerSample );
         length = samplesPerOut * bytesPerSample;
+    } else {
+        if (length > samplesPerOut * bytesPerSample) {
+            length = samplesPerOut * bytesPerSample;
+        }
     }
 
     int oversample_offset = 0;

JohnVidler · 2024-02-26T13:28:33Z

Testing out this new pullInto() method, I'm not sure the above behaviour is implemented correctly. Passing in a large buffer, it seems to just keep on sampling the input buffer past its end, and only stops once it has filled the given output buffer.

Ah! Good catch! The pullInto() method was less completely tested (I don't have a use case for this inside CODAL itself) so I missed that one. I'll merge your proposed patch into master and can you let me know if everything then works with Python when built against it all? If so, we'll call this the next minor rev with a couple of other smaller merges.

JohnVidler · 2024-02-26T16:11:43Z

Aaand patched, pushed, tagged and released. Might as well get this out the door ASAP so you have everything you need for Python @dpgeorge 👍

For ref: https://github.com/lancaster-university/codal-microbit-v2/releases/tag/v0.2.66

dpgeorge · 2024-02-26T23:47:05Z

Thanks for the quick patch and release! I have tested it and it works. I have updated all our code to use the new v0.2.66 release.

microbit-carlos assigned JohnVidler Nov 14, 2023

microbit-carlos added this to the v0.2.64 milestone Nov 14, 2023

microbit-carlos added the micropython label Nov 14, 2023

JohnVidler added the p0 label Nov 16, 2023

dpgeorge mentioned this issue Dec 19, 2023

Add setSampleRange() to StreamSplitter to match Mixer2::setSampleRange() #393

Open

JohnVidler closed this as completed Feb 26, 2024

JohnVidler reopened this Feb 26, 2024

microbit-carlos modified the milestones: v0.2.65, v0.2.66 Feb 26, 2024

JohnVidler closed this as completed Feb 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Suggest adding `StreamSplitter::pullInto()` to reduce memory allocation when recording samples #395

Suggest adding `StreamSplitter::pullInto()` to reduce memory allocation when recording samples #395

dpgeorge commented Nov 14, 2023

JohnVidler commented Nov 16, 2023

JohnVidler commented Nov 16, 2023

JohnVidler commented Nov 16, 2023

JohnVidler commented Feb 7, 2024

dpgeorge commented Feb 26, 2024

JohnVidler commented Feb 26, 2024

JohnVidler commented Feb 26, 2024

dpgeorge commented Feb 26, 2024

Suggest adding StreamSplitter::pullInto() to reduce memory allocation when recording samples #395

Suggest adding StreamSplitter::pullInto() to reduce memory allocation when recording samples #395

Comments

dpgeorge commented Nov 14, 2023

JohnVidler commented Nov 16, 2023

JohnVidler commented Nov 16, 2023

JohnVidler commented Nov 16, 2023

JohnVidler commented Feb 7, 2024

dpgeorge commented Feb 26, 2024

JohnVidler commented Feb 26, 2024

JohnVidler commented Feb 26, 2024

dpgeorge commented Feb 26, 2024

Suggest adding `StreamSplitter::pullInto()` to reduce memory allocation when recording samples #395

Suggest adding `StreamSplitter::pullInto()` to reduce memory allocation when recording samples #395