-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
π΅ Implement deserialization for ogg files #72
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly fine. Left some comments to consider.
src/Bearded.Audio/Core/OggStream.cs
Outdated
|
||
namespace Bearded.Audio; | ||
|
||
sealed class OggStream : IDisposable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stream is a specific .net class that most(all?) streams inherit from and that has a pretty specific interface and semantics.
Provides a generic view of a sequence of bytes.
https://learn.microsoft.com/en-us/dotnet/api/system.io.stream?view=net-7.0
As such I think calling this a Reader makes more sense - that is much more in line semantically with the variety of different readers in .net and libraries like json.net.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, renamed.
src/Bearded.Audio/Core/OggStream.cs
Outdated
reader.Dispose(); | ||
} | ||
|
||
public static OggStream FromFile(Stream file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public static OggStream FromFile(Stream file) | |
public static OggStream FromStream(Stream file) |
Or
public static OggStream FromFile(Stream file) | |
public static OggStream FromFileStream(Stream file) |
Or simply
public static OggStream FromFile(Stream file) | |
public static OggStream From(Stream file) |
?
No strong opinion, perhaps we can look up what classes like TextReader do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They use constructor overloads, which I don't like much. Using FromStream
now.
src/Bearded.Audio/Core/OggStream.cs
Outdated
|
||
public int SampleRate => reader.SampleRate; | ||
|
||
public bool Ended => reader.IsEndOfStream; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the name of this kind of property in .net classes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StreamReader
calls this EndOfStream
, StringReader
and BinaryReader
don't have it, and Stream
itself only exposes the position and length, so you have to do it yourself. The VorbisReader
from NVorbis uses IsEndOfStream
as you can see. So... really no consistency there.
src/Bearded.Audio/Core/OggStream.cs
Outdated
this.reader = reader; | ||
} | ||
|
||
public IList<short[]> ReadAllRemainingBuffers(int maxBufferSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return types should be specific I believe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're completely right.
src/Bearded.Audio/Core/OggStream.cs
Outdated
{ | ||
TryReadSingleBuffer(out var buffer, maxBufferSize); | ||
return buffer!; | ||
}).ToImmutableArray(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the size is known, use an immutable array builder for fewer allocations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good shout, done.
src/Bearded.Audio/Core/OggStream.cs
Outdated
throw new ArgumentException("Max buffer size must be positive.", nameof(maxBufferSize)); | ||
} | ||
|
||
if (reader.IsEndOfStream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use Ended like we do in the method above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/Bearded.Audio/Core/OggStream.cs
Outdated
|
||
public bool TryReadSingleBuffer([NotNullWhen(true)] out short[]? buffer, int maxBufferSize) | ||
{ | ||
if (maxBufferSize <= 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the above method should have a check like this too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough
src/Bearded.Audio/Core/OggStream.cs
Outdated
return false; | ||
} | ||
|
||
var floatBuffer = new float[largestBufferSizeDivisibleByChannelCount(maxBufferSize)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do anything to prevent reallocation of this temporary array? Or can we perhaps allocate it on the stack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have now been introduced to the wonders of stackalloc
. That's pretty cool actually!
src/Bearded.Audio/Core/OggStream.cs
Outdated
|
||
var numSamplesRead = reader.ReadSamples(new Span<float>(floatBuffer)); | ||
|
||
buffer = new short[numSamplesRead]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not asking to change it now, but it would be great if we can later read into a set of provided buffers instead of allocations new ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Since sound buffer data is immutable though, right now you'll always have to allocate an array somewhere, so might as well keep it here for now.
src/Bearded.Audio/Core/OggStream.cs
Outdated
|
||
return Enumerable.Range(0, totalBuffersNeeded).Select(_ => | ||
{ | ||
TryReadSingleBuffer(out var buffer, maxBufferSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this method repeats some of the checks and calculations - perhaps we can extract a common private method for the actual reading from the reader?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored a bit.
var builder = ImmutableArray.CreateBuilder<short[]>((int) totalBuffersNeeded); | ||
for (var i = 0; i < builder.Capacity; i++) | ||
{ | ||
builder.Add(readSamples(bufferSize)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to treat a potentially partial last buffer differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The readSamples
already handles the partial buffer correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see - I did not realise that Reader.ReadSamples did this automatically. I was looking at the array/spam allocation and couldn't find it there. It's fine then (and so was the name of the readSamples parameter) though it is perhaps slightly unfortunate that we may allocate a big float array even if we are only gonna use a part of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly that's the way it is, and we try to limit this problem by not making the buffer size too big. In practice - especially for ogg files - the files will hopefully be so long that most of the time you're filling at least a few buffers fully.
src/Bearded.Audio/Core/OggReader.cs
Outdated
return true; | ||
} | ||
|
||
private short[] readSamples(int maxSampleCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The argument is called 'max', but is it now simply the sampleCount at this point? (or even just 'count' given the context of the method)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may be reading fewer samples than the actual sample count, if we are filling the last partial buffer. In other words, numSamplesRead
(renamed to readSampleCount
now for consistency in naming) can be less than or equal to maxSampleCount
, I have now called it sampleCount
, but it should be noted that the number of samples returned may be fewer than the specified parameter.
β¨ What's this?
This PR adds support for extracting sound buffer data from OGG files, including a snapshot test.
π Relationships
Closes #8
π Why do we want this?
OGG is an open format and is commonly used for compressed audio.
π How is it done?
The deserialization is largely done by the NVorbis library, so logically this PR is pretty straightforward. After looking at NAudio, a different audio library not built on OpenAL, I kinda like the idea of putting the actual file parsing logic in a separate class. The idea is that this class is a stateful object, so it can be used to stream the files rather than loading the entire file in memory at once.
My intent is to have a future PR migrate the wave file loading to a similar class. Hopefully we can extract a common interface, and have some streaming loading logic layer on top of that interface, so that the streaming can be applied to any file format you may use, as long as you have a stream for that file type.
All in all, I am aware that this is very WIP, and it wouldn't surprise me if there will be several changes to the underlying logic, though having the current static methods to load as convenience method is not something that will go away quickly. I have left the
OggStream
class as internal for the time being while we work on this.π₯ Breaking changes
None.
π¬ Why not another way?
The current approach opens up the use cases I want to support (particularly streaming in #17), and contributes to a better reusable framework (see #16). There are probably other solutions, but this one seems the best for the time being. We probably need to start using this to get a better sense of what works and what doesn't, so I'd like to commit to this approach from now, and take this a bit agile.
π¦ Side effects
WAV file loading needs to be migrated. Will create an issue if this PR is approved to validate the approach.
π‘ Review hints
Stream
is a super overloaded term, so open to suggestions for better naming.Reader
is a possibility, though also very vague.Decoder
doesn't really imply that it's holding the lock to a file open.