Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix capacity checks in BufferExtensions #504

Merged
merged 18 commits into from
Oct 19, 2020
Merged

Conversation

j0shuams
Copy link
Contributor

@j0shuams j0shuams commented Oct 15, 2020

This PR adds support for 0 length slices of buffers and arrays for buffer extension methods. There was another issue found that showed an overflow vulnerability that has been fixed by adding a check. Unit tests have been added that verify taking 0 length slices of buffers and arrays works the same when done from the middle, beginning and end.

I also did these same tests with length 1 slices to ensure that errors would be thrown properly in the case that the selection would be out of bounds.

Right now most of these unit tests are have to be skipped until the fix for #497 gets merged.
I grabbed the change from that work (a new version of IBufferByteAccess.cs), and all tests with Skip attribute pass when the new version is used. I left a note on the 497 page to remove the attribute when before merging.

Closes #334

Copy link

@weltkante weltkante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

besides the one I commented GetWindowsRuntimeBuffer(this MemoryStream underlyingStream, int positionInStream, int length) also has a <= operator which is supicious candidate for throwing when retrieving a zero sized buffer at end of memory stream capacity

The other fixes look like I would expect, but its late for me and I'll do a more thorough review tomorrow.

Of course test cases also would be useful, to ensure that doing zero sized operations to/from the end of a buffer works the same as in the middle of a buffer.

@j0shuams
Copy link
Contributor Author

@weltkante

After some thought, I don't think we should support indices that are equal to the length of the buffer or array, as this is technically out of memory bounds. You mentioned the invariant was offest <= buffer.Length but I believe the invariant is actually offset < buffer.Length since buffers are 0-based. The special case for the indices being equal to length are when length is 0 and count is 0, which the original PR supports.

Please provide an example scenario / code that you believe should work that doesn't

@weltkante
Copy link

weltkante commented Oct 16, 2020

After some thought, I don't think we should support indices that are equal to the length of the buffer or array, as this is technically out of memory bounds.

You are mistaken and mixing up the concepts of element access and span access. A span of length zero is valid at the end of a buffer, for example Insert methods will happily allow an index == buffer.Length because Insert methods are not about element access, they take a span of (implied) length zero as argument, which is the position after the last element.

If you do not like the Insert example, consider a general list.Replace(index, count, items) method (which replaces the given span by the given items, possibly a different amount). This API would be taking a span as input and a zero-size span at the end is perfectly valid, and in fact common, as it represents the "append" operation. If the replace operation is providing undo-capability it will want to take a snapshot of the to-be-replaced span. I hope this can serve as a reasonable example where "read zero sized span at end of buffer" happens.

You mentioned the invariant was offest <= buffer.Length but I believe the invariant is actually offset < buffer.Length since buffers are 0-based.

This is incorrect, offset < buffer.Length is the invariant for slot access, which implies count = 1. Being 0-based does matter, but only to make buffer.Length an exclusive upper bound of the buffer, for the actual upper bounds check its then irrelevant.

The full mathematical invariant for spans is 0 <= offset && offset <= ending && ending <= buffer.Length which needs <= in all positions, I hope you can agree with that. (Note that offset is inclusive while ending is exclusive when specifying it this way.)

From this invariant the individual checks are derived, needing four checks in total instead of three since you have offset + count instead of ending and computers do overflow arithmetic. The four checks usually are

  • offset >= 0 (throw if offset < 0)
  • count >= 0 (throw if count < 0) - this is offset <= ending being offset <= offset + count and then simplified
  • offset + count <= buffer.Length can't easily be checked in one comparison due to overflow being possible, so its split, which also conveniently lets you assign blame to single variables instead of the sum
    • offset <= buffer.Length (throw if offset > buffer.Length)
    • count <= buffer.Length - offset (throw if count > buffer.Length - offset)
      the split can be done in other ways, with offset and count swapped, or other variations

The invariant for indexed slot access you have in mind is 0 <= index && index < buffer.Length but actually is a special-case of a span access check with count one (since you are accessing a single slot). It is equivalent to 0 <= index && index + 1 <= buffer.Length when you substitute count = 1 in the more general span invariant. Of course nobody uses the + 1 version of the invariant, everyone writes < instead of <=. The inverse throwing condition becomes index >= buffer.Length, but it is for indexed access.

The special case for the indices being equal to length are when length is 0 and count is 0, which the original PR supports.

You can of course solve the currently present bugs by special casing zero, but its a bit inelegant. Its not my codebase so I'm not arguing over it, if you find it easier to reason about special casing zero instead of reasoning about the invariants (which I agree can be hard if you aren't used to it) then feel free to do so. Maintainability of the codebase is more important than elegance, and the more people can easily understand it, the better.

Note though, that in the previous PR the early-exit on zero count did appear in a spot where some argument checks were skipped, i.e. when special casing zero without reordering the other checks you may accidently change the semantics and no longer throw on some invalid arguments. Not critical to me personally, since the arguments are invalid anyways, just mentioning it because it may prevent detecting bugs in callers.

Please provide an example scenario / code that you believe should work that doesn't

I'll make the list whith scenarios that don't work, will take some time and should be ready later today.

@j0shuams
Copy link
Contributor Author

j0shuams commented Oct 16, 2020

Thank you for your reply!

You are mistaken and mixing up the concepts of element access and span access. A span of length zero is valid at the end of a buffer, for example Insert methods will happily allow an index == buffer.Length because Insert methods are not about element access, they take a span of (implied) length zero as argument, which is the position after the last element.

If you do not like the Insert example, consider a general list.Replace(index, count, items) method (which replaces the given span by the given items, possibly a different amount). This API would be taking a span as input and a zero-size span at the end is perfectly valid, and in fact common, as it represents the "append" operation. If the replace operation is providing undo-capability it will want to take a snapshot of the to-be-replaced span. I hope this can serve as a reasonable example where "read zero sized span at end of buffer" happens.

This example doesn't quite make sense to me because you are using List as the example data-structure, which can grow to arbitrary size; arrays and buffers have fixed lengths. I am concerned that if we change the <= to < we now allow the possibility for someone to pass in indices beyond the the bounds when they also pass a non-zero count. Which is leading me to think we should special case for count == 0 instead.

The full mathematical invariant for spans is 0 <= offset && offset <= ending && ending <= buffer.Length which needs <= in all positions, I hope you can agree with that. (Note that offset is inclusive while ending is exclusive when specifying it this way.)

If ending is exclusive shouldn't it be ending < buffer.Length ?

I am writing some test cases now to exemplify my concerns.

==
edit
I have pushed some tests that I think go through scenarios all the scenarios we have discussed. Am having some local build troubles to see the results -- but thought it would be good to share in the meantime. This has been an enlightening discussion, thank you by the way. Also, I know the names of the tests are bad, I will fix that later.

@weltkante
Copy link

weltkante commented Oct 16, 2020

This example doesn't quite make sense to me because you are using List as the example data-structure, which can grow to arbitrary size

I think you misunderstood the example, the buffers would be an implementation detail of implementing such a list. Immutable lists and strings do not grow at all, yet they have such an API. It is common to implement file editors by having a tree of buffers, and range based APIs are the fundamental building block there. Having to code around the defect of the framework not allowing zero length ranges at end of a buffer is just annoying. Its possible to easily work around, I have done so in the past, but you have to litter the callers with checks that wouldn't be necessary otherwise.

Or did you ever see anyone doing length checks when calling string.substring or string.replace ? These "just work" because zero length ranges are correctly implemented. In fact there are many .NET APIs which correctly implement zero length ranges at the end of their collection, e.g. spans, arrays, strings, lists, etc. all have correct range checks (as far as I'm aware) for APIs which allow to take ranges.

If ending is exclusive shouldn't it be ending < buffer.Length ?

No, that would be if ending were inclusive .

If you need to make yourself familiar with range indexing first, take the string API as an example, string.substring and string.replace both operate on range parameters and should have proper validation. Strings are immutable and its perfectly reasonable for someone to implement something similar on top of arrays or buffers, wanting to use these extension methods provided here, yet they are broken and you cannot use them without adding additional checks on every single call site. Alternatively look at the System.Memory APIs, they are literally all about buffers, so should be functionally equivalent except its .NET and not WinRT buffers.

I've created a bunch of test cases, not covering the previous PR, but instead what is still looking wrong in the current source. I've also included examples for other APIs which have correct range checking. Its really just these extensions that are broken because their range checks are wrong. Everyone else seems to have gotten it right.

test cases failing
using System;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices.WindowsRuntime;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// add 'System.Memory' nuget package for span support

namespace UnitTestProject1
{
    [TestClass]
    public class UnitTest1
    {
        [TestMethod]
        public void CopyArrayToBufferWithZeroCountAtEnd()
        {
            byte[] array = { 0xA1, 0xA2, 0xA3 };
            var buffer = WindowsRuntimeBuffer.Create(10);
            array.CopyTo(3, buffer, 10, 0);
        }

        [TestMethod]
        public void CopyArrayToBufferWithZeroCountAtEnd_WorksWithArrays()
        {
            byte[] array = { 0xA1, 0xA2, 0xA3 };
            byte[] buffer = new byte[10];
            Array.Copy(array, 3, buffer, 10, 0);
        }

        [TestMethod]
        public void CopyBufferToArrayWithZeroCountAtEnd()
        {
            byte[] array = { 0xA1, 0xA2, 0xA3 };
            var buffer = array.AsBuffer();
            var target = new byte[10];
            buffer.CopyTo(3, target, 10, 0);
        }

        [TestMethod]
        public void CopyBufferToArrayWithZeroCountAtEnd_WorksWithSpans()
        {
            byte[] array = { 0xA1, 0xA2, 0xA3 };
            byte[] buffer = new byte[10];
            array.AsSpan(3).CopyTo(buffer.AsSpan(10, 0));
        }

        [TestMethod]
        public void CopyBufferToBufferWithZeroCountAtEnd()
        {
            byte[] array = { 0xA1, 0xA2, 0xA3 };
            var sourceBuffer = array.AsBuffer();
            var targetBuffer = WindowsRuntimeBuffer.Create(10);
            sourceBuffer.CopyTo(3, targetBuffer, 10, 0);
        }

        [TestMethod]
        public void CopyBufferToBufferWithZeroCountAtEnd_WorksWithSpans()
        {
            byte[] array = { 0xA1, 0xA2, 0xA3 };
            var sourceBuffer = array.AsSpan();
            var targetBuffer = new byte[10].AsSpan();
            sourceBuffer.Slice(3).CopyTo(targetBuffer.Slice(10, 0));
        }

        [TestMethod]
        public void BufferToArrayWithZeroCountAtEnd()
        {
            byte[] array = { 0xA1, 0xA2, 0xA3 };
            var result = array.AsBuffer().ToArray(3, 0);
            Assert.IsNotNull(result);
            Assert.AreEqual(0, result.Length);
        }

        [TestMethod]
        public void BufferToArrayWithZeroCountAtEnd_WorksWithSpans()
        {
            byte[] array = { 0xA1, 0xA2, 0xA3 };
            var result = array.AsSpan().Slice(3, 0).ToArray();
            Assert.IsNotNull(result);
            Assert.AreEqual(0, result.Length);
        }

        [TestMethod]
        public void BufferToArrayWithZeroCountAtEnd_WorksInLinq()
        {
            byte[] array = { 0xA1, 0xA2, 0xA3 };
            var result = array.ToList().Skip(3).Take(0).ToArray();
            Assert.IsNotNull(result);
            Assert.AreEqual(0, result.Length);
        }

        [TestMethod]
        public void GetWindowsRuntimeBufferWithZeroCountAtEnd()
        {
            byte[] array = { 0xA1, 0xA2, 0xA3 };
            var stream = new MemoryStream(array);
            var result = stream.GetWindowsRuntimeBuffer(3, 0);
            Assert.IsNotNull(result);
            Assert.AreEqual(0, result.Length);
        }

        [TestMethod]
        public void GetWindowsRuntimeBufferWithZeroCountAtEnd_WorksWithSpan()
        {
            byte[] array = { 0xA1, 0xA2, 0xA3 };
            var stream = new MemoryStream(array);
            var result = stream.ToArray().AsSpan(3, 0).ToArray();
            Assert.IsNotNull(result);
            Assert.AreEqual(0, result.Length);
        }
    }
}

@weltkante
Copy link

There is an unrelated bug I discovered while reviewing, which is actually security relevant because it can corrupt memory, should this get a separate issue? Its not covered by #334 (which wants to reduce checks, not add new checks). Obviously it can just be fixed in this PR.

@j0shuams
Copy link
Contributor Author

There is an unrelated bug I discovered while reviewing, which is actually security relevant because it can corrupt memory, should this get a separate issue? Its not covered by #334 (which wants to reduce checks, not add new checks). Obviously it can just be fixed in this PR.

I made the change, the check was already present in the CopyTo for IBuffer to IBuffer, strangely enough. Nice catch

@weltkante
Copy link

@j0shuams did you forget to push? I'm not seeing the changes.

@j0shuams
Copy link
Contributor Author

@weltkante I am about to, I was comparing the unit tests I have been developing with the ones you provided to make sure I was handling proper cases.

@Scottj1s
Copy link
Member

@weltkante thank you for your attention on this PR - much appreciated

@Scottj1s
Copy link
Member

@j0shuams is this PR ready for review?

@Scottj1s
Copy link
Member

        if (source.Capacity <= byteOffset) throw new ArgumentException(global::Windows.Storage.Streams.SR.Argument_BufferIndexExceedsCapacity, nameof(byteOffset));

source.Length here too


Refers to: cswinrt/strings/additions/Windows.Storage.Streams/WindowsRuntimeBufferExtensions.cs:424 in d721226. [](commit_id = d721226, deletion_comment = False)

Copy link
Member

@Scottj1s Scottj1s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally, shouldn't all bounds checks look at:

  • Length for source, to ensure valid bytes are read
  • Capacity for destination, to ensure sufficient space

@microsoft microsoft deleted a comment from j0shuams Oct 19, 2020
@weltkante
Copy link

weltkante commented Oct 19, 2020

@Scottj1s I'm not sure you can make a breaking change from Capacity to Length as far as IBuffer is concerned. When implementing data transformations using IBuffer the Length property may not always be up to date in intermediate steps when you take copies/snapshots, so since this API always allowed access to the full buffer it should continue to do so for compatibility reasons.

@Scottj1s
Copy link
Member

@Scottj1s I'm not sure you can make a breaking change from Capacity to Length as far as IBuffer is concerned. When implementing data transformations using IBuffer the Length property may not always be up to date in intermediate steps when you take copies/snapshots, so since this API always allowed access to the full buffer it should continue to do so for compatibility reasons.

We're already committed to breaking changes here, with the new conditions on params, beyond the the Length check. This code (and any client code) has to be able to rely on IBuffer.Length being stable.

@Scottj1s
Copy link
Member

@weltkante thanks again for the invaluable feedback. we're going to run with this and will be on the lookout for any regressions.

@Scottj1s Scottj1s merged commit cca14ad into master Oct 19, 2020
@Scottj1s Scottj1s deleted the jlarkin/fix-buffer-checks branch October 19, 2020 09:16
@weltkante
Copy link

weltkante commented Oct 19, 2020

Making previously throwing cases non-throwing and handled correctly is IMHO less breaking than starting to throw exceptions in previously working cases, which is why I was pointing this out, but its your call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WindowsRuntimeBufferExtensions.ToArray incorrectly checks Capacity
3 participants