🎵 Implement deserialization for ogg files #72

Merged

tomrijnbeek merged 8 commits into main from read-ogg

Oct 16, 2022

Member

tomrijnbeek commented Oct 15, 2022

✨ What's this?

This PR adds support for extracting sound buffer data from OGG files, including a snapshot test.

🔗 Relationships

Closes #8

🔍 Why do we want this?

OGG is an open format and is commonly used for compressed audio.

🏗 How is it done?

The deserialization is largely done by the NVorbis library, so logically this PR is pretty straightforward. After looking at NAudio, a different audio library not built on OpenAL, I kinda like the idea of putting the actual file parsing logic in a separate class. The idea is that this class is a stateful object, so it can be used to stream the files rather than loading the entire file in memory at once.

My intent is to have a future PR migrate the wave file loading to a similar class. Hopefully we can extract a common interface, and have some streaming loading logic layer on top of that interface, so that the streaming can be applied to any file format you may use, as long as you have a stream for that file type.
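As a rough sketch of where this could go (not the actual design): the interface name `ISampleReader`, the `ArraySampleReader` test double, and the `StreamBuffers` helper below are all hypothetical. Only the member names `SampleRate`, `Ended`, and `TryReadSingleBuffer` are taken from the current class.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

// Hypothetical common interface that both an OGG reader and a future WAV reader could implement.
interface ISampleReader : IDisposable
{
    int SampleRate { get; }
    bool Ended { get; }
    bool TryReadSingleBuffer([NotNullWhen(true)] out short[]? buffer, int maxBufferSize);
}

// Hypothetical in-memory implementation, only here to make the sketch runnable.
sealed class ArraySampleReader : ISampleReader
{
    private readonly short[] samples;
    private int position;

    public ArraySampleReader(short[] samples, int sampleRate)
    {
        this.samples = samples;
        SampleRate = sampleRate;
    }

    public int SampleRate { get; }
    public bool Ended => position >= samples.Length;

    public bool TryReadSingleBuffer([NotNullWhen(true)] out short[]? buffer, int maxBufferSize)
    {
        if (maxBufferSize <= 0)
            throw new ArgumentException("Max buffer size must be positive.", nameof(maxBufferSize));
        if (Ended)
        {
            buffer = null;
            return false;
        }
        // The last buffer may be shorter than maxBufferSize.
        var count = Math.Min(maxBufferSize, samples.Length - position);
        buffer = samples.AsSpan(position, count).ToArray();
        position += count;
        return true;
    }

    public void Dispose() { }
}

static class StreamingSketch
{
    // Format-agnostic streaming layer: pulls one buffer at a time instead of loading everything up front.
    public static IEnumerable<short[]> StreamBuffers(ISampleReader reader, int maxBufferSize)
    {
        while (reader.TryReadSingleBuffer(out var buffer, maxBufferSize))
        {
            yield return buffer;
        }
    }
}
```

The streaming layer only depends on the interface, so any file format with a reader implementation would work with it.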

All in all, I am aware that this is very WIP, and it wouldn't surprise me if the underlying logic changes several times, though the current static load methods will stay around as convenience methods for a while. I have left the OggStream class as internal for the time being while we work on this.

💥 Breaking changes

None.

🔬 Why not another way?

The current approach opens up the use cases I want to support (particularly streaming in #17), and contributes to a better reusable framework (see #16). There are probably other solutions, but this one seems the best for the time being. We probably need to start using this to get a better sense of what works and what doesn't, so I'd like to commit to this approach from now, and take this a bit agile.

🦋 Side effects

WAV file loading needs to be migrated. Will create an issue if this PR is approved to validate the approach.

💡 Review hints

Stream is a super overloaded term, so open to suggestions for better naming. Reader is a possibility, though also very vague. Decoder doesn't really imply that it's holding the lock to a file open.


          🎵 Implement deserialization for ogg files

91d65d4

tomrijnbeek added the feature label

tomrijnbeek requested a review from paulcscharf

October 15, 2022 12:31

paulcscharf requested changes

Member

paulcscharf left a comment

Looks mostly fine. Left some comments to consider.

src/Bearded.Audio/Core/OggStream.cs Outdated


		namespace Bearded.Audio;

		sealed class OggStream : IDisposable

Member

paulcscharf Oct 15, 2022

Stream is a specific .NET class that most (all?) streams inherit from, and it has a pretty specific interface and semantics.

Provides a generic view of a sequence of bytes.

https://learn.microsoft.com/en-us/dotnet/api/system.io.stream?view=net-7.0

As such I think calling this a Reader makes more sense - that is much more in line semantically with the variety of different readers in .net and libraries like json.net.

Member Author

tomrijnbeek Oct 15, 2022

Fair enough, renamed.

src/Bearded.Audio/Core/OggStream.cs Outdated

+                      reader.Dispose();
+                  }
+                  public static OggStream FromFile(Stream file)

Member

paulcscharf Oct 15, 2022

Suggested change

-                 public static OggStream FromFile(Stream file)
+                 public static OggStream FromStream(Stream file)

Or

Suggested change

-                 public static OggStream FromFile(Stream file)
+                 public static OggStream FromFileStream(Stream file)

Or simply

Suggested change

-                 public static OggStream FromFile(Stream file)
+                 public static OggStream From(Stream file)

?

No strong opinion, perhaps we can look up what classes like TextReader do.

Member Author

tomrijnbeek Oct 15, 2022

They use constructor overloads, which I don't like much. Using FromStream now.

src/Bearded.Audio/Core/OggStream.cs Outdated


		public int SampleRate => reader.SampleRate;

		public bool Ended => reader.IsEndOfStream;

Member

paulcscharf Oct 15, 2022

What is the name of this kind of property in .net classes?

Member Author

tomrijnbeek Oct 15, 2022

StreamReader calls this EndOfStream, StringReader and BinaryReader don't have it, and Stream itself only exposes the position and length, so you have to do it yourself. The VorbisReader from NVorbis uses IsEndOfStream as you can see. So... really no consistency there.
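For reference, a minimal sketch of what "doing it yourself" on a plain `Stream` might look like. The helper name `IsAtEnd` is made up for illustration, and the check only makes sense for seekable streams.

```csharp
using System;
using System.IO;

static class StreamEndSketch
{
    // Hypothetical helper: compares position and length, which only seekable streams expose.
    public static bool IsAtEnd(Stream stream)
    {
        if (!stream.CanSeek)
            throw new NotSupportedException("Position and Length require a seekable stream.");
        return stream.Position >= stream.Length;
    }
}
```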

src/Bearded.Audio/Core/OggStream.cs Outdated

+                      this.reader = reader;
+                  }
+                  public IList<short[]> ReadAllRemainingBuffers(int maxBufferSize)

Member

paulcscharf Oct 15, 2022

Return types should be specific I believe?

Member Author

tomrijnbeek Oct 15, 2022

You're completely right.

src/Bearded.Audio/Core/OggStream.cs Outdated

+                      {
+                          TryReadSingleBuffer(out var buffer, maxBufferSize);
+                          return buffer!;
+                      }).ToImmutableArray();

Member

paulcscharf Oct 15, 2022

Since the size is known, use an immutable array builder for fewer allocations?

Member Author

tomrijnbeek Oct 15, 2022

Good shout, done.
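A minimal sketch of the builder approach, assuming the number of buffers is known up front. The `readNextBuffer` callback is a stand-in for whatever actually reads from the file and is not part of this PR.

```csharp
using System;
using System.Collections.Immutable;

static class BuilderSketch
{
    public static ImmutableArray<short[]> ReadAll(int totalBuffersNeeded, Func<short[]> readNextBuffer)
    {
        // Pre-sizing the builder means the backing array is allocated exactly once.
        var builder = ImmutableArray.CreateBuilder<short[]>(totalBuffersNeeded);
        for (var i = 0; i < totalBuffersNeeded; i++)
        {
            builder.Add(readNextBuffer());
        }
        // MoveToImmutable hands over the backing array without copying;
        // it throws if Count differs from Capacity.
        return builder.MoveToImmutable();
    }
}
```

Compared to `Select(...).ToImmutableArray()`, this avoids the intermediate enumerable and any resizing while collecting the buffers.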

src/Bearded.Audio/Core/OggStream.cs Outdated

+                          throw new ArgumentException("Max buffer size must be positive.", nameof(maxBufferSize));
+                      }
+                      if (reader.IsEndOfStream)

Member

paulcscharf Oct 15, 2022

Why not use Ended like we do in the method above?

Member Author

tomrijnbeek Oct 15, 2022

Done

src/Bearded.Audio/Core/OggStream.cs Outdated

+                  public bool TryReadSingleBuffer([NotNullWhen(true)] out short[]? buffer, int maxBufferSize)
+                  {
+                      if (maxBufferSize <= 0)

Member

paulcscharf Oct 15, 2022

Perhaps the above method should have a check like this too?

Member Author

tomrijnbeek Oct 15, 2022

Fair enough

src/Bearded.Audio/Core/OggStream.cs Outdated

+                          return false;
+                      }
+                      var floatBuffer = new float[largestBufferSizeDivisibleByChannelCount(maxBufferSize)];

Member

paulcscharf Oct 15, 2022

Can we do anything to prevent reallocation of this temporary array? Or can we perhaps allocate it on the stack?

Member Author

tomrijnbeek Oct 15, 2022

I have now been introduced to the wonders of stackalloc. That's pretty cool actually!
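For anyone reading along, a sketch of the idea under a few assumptions: the `SampleSource` delegate is a hypothetical stand-in for the decoder's read call, the 1024-float threshold is arbitrary, and the float-to-16-bit conversion shown here is one common approach rather than necessarily what this PR does.

```csharp
using System;

// Hypothetical stand-in for a decoder call that fills a span and returns the number of samples written.
// A custom delegate is used because older C# versions do not allow Span<T> as a generic type argument.
delegate int SampleSource(Span<float> destination);

static class StackallocSketch
{
    // Assumed threshold: larger scratch buffers fall back to the heap to avoid overflowing the stack.
    private const int maxStackAllocatedFloats = 1024;

    public static short[] ReadAsPcm16(SampleSource readSamples, int sampleCount)
    {
        if (sampleCount <= 0)
            throw new ArgumentException("Sample count must be positive.", nameof(sampleCount));

        // The temporary float buffer lives on the stack when small enough, so it never reaches the GC.
        Span<float> floatBuffer = sampleCount <= maxStackAllocatedFloats
            ? stackalloc float[sampleCount]
            : new float[sampleCount];

        var readSampleCount = readSamples(floatBuffer);

        // Only the result array is heap-allocated, sized to what was actually read.
        var buffer = new short[readSampleCount];
        for (var i = 0; i < readSampleCount; i++)
        {
            buffer[i] = (short) (Math.Clamp(floatBuffer[i], -1f, 1f) * short.MaxValue);
        }
        return buffer;
    }
}
```

The heap fallback matters because stack space is limited, and a large `stackalloc` fails with a stack overflow that cannot be caught.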

src/Bearded.Audio/Core/OggStream.cs Outdated


		var numSamplesRead = reader.ReadSamples(new Span<float>(floatBuffer));

		buffer = new short[numSamplesRead];

Member

paulcscharf Oct 15, 2022

I'm not asking to change it now, but it would be great if we can later read into a set of provided buffers instead of allocating new ones.

Member Author

tomrijnbeek Oct 15, 2022

Agreed. Since sound buffer data is immutable though, right now you'll always have to allocate an array somewhere, so might as well keep it here for now.

src/Bearded.Audio/Core/OggStream.cs Outdated

+                      return Enumerable.Range(0, totalBuffersNeeded).Select(_ =>
+                      {
+                          TryReadSingleBuffer(out var buffer, maxBufferSize);

Member

paulcscharf Oct 15, 2022

I think this method repeats some of the checks and calculations - perhaps we can extract a common private method for the actual reading from the reader?

Member Author

tomrijnbeek Oct 15, 2022

Refactored a bit.

tomrijnbeek added 5 commits

October 15, 2022 17:25


          📝 Add missing summaries

b417ce6


          📝 Address review comments

ae07523


          💄 Do some more refactoring

4abb6ec


          🔥 Remove unnecessary unsafe property

ac1eaa7


          💄

bc523a5

paulcscharf requested changes

src/Bearded.Audio/Core/OggReader.cs

+                      var builder = ImmutableArray.CreateBuilder<short[]>((int) totalBuffersNeeded);
+                      for (var i = 0; i < builder.Capacity; i++)
+                      {
+                          builder.Add(readSamples(bufferSize));

Member

paulcscharf Oct 15, 2022

Don't we need to treat a potentially partial last buffer differently?

Member Author

tomrijnbeek Oct 16, 2022

The readSamples already handles the partial buffer correctly.

Member

paulcscharf Oct 16, 2022

I see - I did not realise that Reader.ReadSamples did this automatically. I was looking at the array/span allocation and couldn't find it there. It's fine then (and so was the name of the readSamples parameter), though it is perhaps slightly unfortunate that we may allocate a big float array even if we are only gonna use a part of it.

Member Author

tomrijnbeek Oct 16, 2022

Sadly that's the way it is, and we try to limit this problem by not making the buffer size too big. In practice - especially for ogg files - the files will hopefully be so long that most of the time you're filling at least a few buffers fully.
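To make the partial-buffer behaviour in this thread concrete, here is a small sketch. `FakeSampleSource` is a made-up in-memory stand-in whose `ReadSamples` returns the number of samples actually read, which is the behaviour the thread relies on. The buffer count uses ceiling division, which is an assumption about how `totalBuffersNeeded` is computed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory source; ReadSamples returns fewer samples than requested near the end.
sealed class FakeSampleSource
{
    private readonly float[] samples;
    private int position;

    public FakeSampleSource(float[] samples) => this.samples = samples;

    public long TotalSamples => samples.Length;

    public int ReadSamples(Span<float> destination)
    {
        var count = Math.Min(destination.Length, samples.Length - position);
        samples.AsSpan(position, count).CopyTo(destination);
        position += count;
        return count;
    }
}

static class PartialBufferSketch
{
    // Returns how many samples ended up in each buffer.
    public static List<int> BufferLengths(FakeSampleSource source, int bufferSize)
    {
        // Ceiling division: a trailing partial buffer still counts as one buffer.
        var totalBuffersNeeded = (source.TotalSamples + bufferSize - 1) / bufferSize;
        var lengths = new List<int>((int) totalBuffersNeeded);
        var scratch = new float[bufferSize];
        for (var i = 0; i < totalBuffersNeeded; i++)
        {
            // The scratch buffer is always full-sized; only the returned count shrinks.
            lengths.Add(source.ReadSamples(scratch));
        }
        return lengths;
    }
}
```

With ten samples and a buffer size of four, this gives two full buffers followed by one buffer of two samples, while the scratch array stays at four floats throughout.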

src/Bearded.Audio/Core/OggReader.cs Outdated

+                      return true;
+                  }
+                  private short[] readSamples(int maxSampleCount)

Member

paulcscharf Oct 15, 2022

The argument is called 'max', but is it now simply the sampleCount at this point? (or even just 'count' given the context of the method)

Member Author

tomrijnbeek Oct 16, 2022

We may read fewer samples than requested when filling the last, partial buffer. In other words, numSamplesRead (now renamed to readSampleCount for naming consistency) can be less than or equal to the requested count. I have now called the parameter sampleCount, but it should be noted that the number of samples returned may be lower than the specified value.

tomrijnbeek added 2 commits

October 16, 2022 14:23


          📝 Another round of review comments

a0e507a


          Merge branch 'main' into read-ogg

52c1494

paulcscharf approved these changes

tomrijnbeek merged commit 1aa491e into main

tomrijnbeek deleted the read-ogg branch

October 16, 2022 14:22
