Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exceptions when deserializing from NetworkStream .. Hyperion seems to try to deserialize incompletely received messages #40

Closed
Jivko1212 opened this issue Mar 22, 2017 · 19 comments

Comments

@Jivko1212
Copy link

Jivko1212 commented Mar 22, 2017

I have been getting random exceptions when deserializing objects from NetworkStream. It seems that Hyperion is not waiting for the whole message but trying to deserialize a message that is half received

It works fine if the client and the server are running on the same machine but when they are running on different machines I am getting exceptions like null reference exceptions or it is trying to look for types with half written assemblies for example if I have JMM.Util.Msg it is throwing exceptions saying somthing like cannot find type "JMM.Ut". I was able to fix the problem by using BinaryWriter/BinaryReader to write an int indicating the msg length followed by a byte[] obtained from Hyperion serializer of the message.

This is my deserialization code:
var msg = wireSerRead.Deserialize(ns);

This is my serialization code:
wireSerWrite.Serialize(msg, ms);
var msArr = ms.ToArray();
await ns.WriteAsync(msArr, 0, msArr.Length);

ns is of type NetworkStream
ms is of type MemoryStream

my serializers are instantiated like so:
Serializer wireSerRead = new Serializer(new SerializerOptions(false, false));
Serializer wireSerWrite = new Serializer(new SerializerOptions(false, false));

@Ralf1108
Copy link
Contributor

Ralf1108 commented Aug 23, 2017

Hi, I also got into this issue and found the cause.
I use Hyperion in a ServiceStack plugin to exchange REST data.
For bigger request I got unexpected NullReferenceException. I was lucky that the error was reproducible.
First solution was to use a BufferedStream, but as soon as the buffer size was smaller than the payload size the error occured again. So I only covered the error instead of fixing it.

Digging into Hyperion code I found the following code which is the same in all ValueSerializers:
e.g. in class DoubleSerializer.ReadValueImpl()

stream.Read(bytes, 0, Size);

This is the position where bytes where taken from stream to be consumed so here must be the problem.
No return value is checked!
Checking the type of stream in the ServiceStack plugin showed me that it provides me a "System.Net.ConnectStream". Seems to be something like a NetworkStream. Because network is unreliable but handled via TCP my theory was:

If you execute a "Read()" on a stream can you expect that it returns your amount of bytes or provide some EOF.
What if the stream returns you only some of your requested bytes even when there are more bytes available in future?
Was this allowed?

Searching the inet made it clear, yes it is:
https://stackoverflow.com/questions/33972270/networkstream-reads-fewer-bytes-than-expected
https://stackoverflow.com/questions/39892520/stream-based-on-streamsocket-returns-less-data-than-requested

That explains:

  • Why the provided solution via BinaryReader worked because I think BinaryReader seems to handle that case internally
  • Why Hyperion fails deserializing with an unexplainable exception. Hyperion assumes that if it asks for n bytes from the stream that n bytes will be read. But with "unreliable" streams there are situations where this assumption doesn't work, hence the exception.

Question is now who is responsible to ensures that there are only "full" reads?

  1. Hyperion could do this by itself, using only ReadFull extension method below
  2. Hyperion only supports streams which ensures this.
  3. Hyperion uses FullReadStream by default to prevent this

@Ralf1108
Copy link
Contributor

Ralf1108 commented Aug 23, 2017

For completion I provide a an implementation of a stream extension method which ensures "full reads":

see updated version in comment:
#40 (comment)

@Ralf1108
Copy link
Contributor

Ralf1108 commented Aug 23, 2017

Here is also an implementation for a stream which only allows FullReads by using the extension method above. If you encapsulate your stream with it you can safely use the hyperion deserializer with "unreliable" streams,
Does it make sense that Hyperion wraps deserialize streams into FullReadStream by default?

public class FullReadStream : Stream
    {
        private readonly Stream _stream;

        public FullReadStream(Stream stream)
        {
            _stream = stream;
        }

        public override bool CanRead => _stream.CanRead;
        public override bool CanSeek => _stream.CanSeek;
        public override bool CanWrite => _stream.CanWrite;
        public override long Length => _stream.Length;

        public override long Position
        {
            get { return _stream.Position; }
            set { _stream.Position = value; }
        }

        public override void Flush()
        {
            _stream.Flush();
        }

        public override long Seek(long offset, SeekOrigin origin)
        {
            return _stream.Seek(offset, origin);
        }

        public override void SetLength(long value)
        {
            _stream.SetLength(value);
        }

        public override int Read(byte[] buffer, int offset, int count)
        {
            return _stream.ReadFull(buffer, offset, count);
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            throw new NotSupportedException();
        }
    }

@Horusiath
Copy link
Contributor

@Ralf1108 I don't think, that there's a need to use that stream by default, as it solves problems only in case of subset of streams allowing partial reads, while bringing extra weight to those which doesn't need them.

Personally I think, we should stick with requiring full-read streams for now. Maybe it will change in the future.

@Ralf1108
Copy link
Contributor

Ralf1108 commented Aug 23, 2017

@Horusiath I am not sure about that.

As MSDN states for streams Read() methods:
https://msdn.microsoft.com/en-us/library/system.io.stream.read(v=vs.110).aspx

An implementation is free to return fewer bytes than requested even if the end of the stream has not been reached

The assumption in stream usage in Hyperion is wrong according to the documentation. This is maybe a theoretical bug but it is still an incomplete usage of the stream.

What about using the extension method which provides a fast path for the practical cases and a fallback for the theoretical ones?

@Danthar
Copy link
Member

Danthar commented Aug 23, 2017

Hyperion is a serializer. Its job is to translate bytes into types and vice versa, in a correct way, as fast as possible.

Dealing with incomplete byte arrays because its being used in a networking context. Is not its responsibility. Its the responsibility of the user to make sure its being fed a complete byte array.

IMHO This request is like asking json.net to accept an incomplete json string and not blow up.

Garbage in, garbage out.

@Ralf1108
Copy link
Contributor

Ralf1108 commented Aug 23, 2017

Can't agree with that.

Hyperion reads bytes from a stream and assumes all n bytes were read.
The documentation of the used "Read()" method points out that this is not enforced by the stream class contract.
So for me hyperion is using the stream wrong and its using violates the stream class contract.

Instead of using the stream.Read() method in a wrong way it should use

  • the stream in a correct way w/o extension method "ReadFull()"
  • outsource the correct usage by using a binary reader or specific stream implementation (e.g. using FullReadStream)

which fulfilles the assumption from Hyperion of streams .

Not fixing that means that Hyperion serializer will have to state out in its documentation that it wont work with all streams which fulfilles the stream contract provided by MSDN.

So all in all Hyperion seems to violate the Liskov principle here as it only works correctly with a subset of all possible stream derivatives.

Btw. you pointed out that Hyperion would deal with incomplete byte arrays which is incorrect. Hyperion deals with a stream of bytes which requires to follow the stream contract.

@Ralf1108
Copy link
Contributor

Proposal for "ReadFull()" extension method with fast path.
If compiler would inline this there shouldn't be a noticeable performance penalty.

public static class StreamExtensions
{
    /// <summary>
    /// Repeats reading from stream until requested bytes were read.
    /// Returns if stream can't provide enough bytes
    /// </summary>
    public static int ReadFull(this Stream stream, byte[] buffer, int offset, int count)
    {
        // fast path for streams which doesn't deliver partial results
        var totalReadBytes = stream.Read(buffer, offset, count);
        if (totalReadBytes == count)
            return totalReadBytes;
            
        // support streams with partial results
        do
        {
            var readBytes = stream.Read(buffer, offset + totalReadBytes, count - totalReadBytes);
            if (readBytes == 0)
                break;

            totalReadBytes += readBytes;
        }
        while (totalReadBytes < count);

        return totalReadBytes;
    }
}

@Ralf1108
Copy link
Contributor

Ralf1108 commented Feb 2, 2018

any progress if this will be merged?

@Ralf1108
Copy link
Contributor

Maybe somebody can spin up BenchmarkDotNet to check if there is a performance penalty?

@Ralf1108
Copy link
Contributor

So maybe this issue can be closed if there is no intention to fix it?

Issue here was just a misunderstanding of the differences between a byte array and a stream of bytes.

@marius-klimantavicius
Copy link

@Ralf1108 Do you have a fork with your suggested stream changes?

Streams in .net are always allowed to return partially read data, this is different from garbage-in-garbage-out, because the stream has not finished and it will (eventually) return full data. Json.net (and other stream based serializers) do correctly handle such cases.

Ralf1108 pushed a commit to Ralf1108/Hyperion that referenced this issue Oct 7, 2020
@Ralf1108
Copy link
Contributor

Ralf1108 commented Oct 7, 2020

@Ralf1108
Copy link
Contributor

Ralf1108 commented Oct 7, 2020

to simulate the original issue just replace the body of method ReadFull with this code

return stream.Read(buffer, offset, count)

Then you see the failing serialize/deserialize tests which requires more than 1 byte of the stream.

@Ralf1108
Copy link
Contributor

Ralf1108 commented Oct 8, 2020

added PR #185

Arkatufus added a commit that referenced this issue Oct 29, 2020
fixed issue #40 regarding partial streams
@Ralf1108
Copy link
Contributor

PR was merged so this is fixed in next Hyperion release

@Ralf1108
Copy link
Contributor

Issue can be closed because PR was already merged

@Ralf1108
Copy link
Contributor

FYI: Just read that ".NET 7 Preview 5" will provide a convenience method to avoid this common error via ReadExactly and ReadAtLeast

@Aaronontheweb
Copy link
Member

That's great to know @Ralf1108

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants