This repository has been archived by the owner on May 8, 2019. It is now read-only.

Type Mapping with MessagePack

dchenk edited this page May 5, 2018 · 5 revisions

Because MessagePack uses a schema-less, polymorphic type system and Go is a strongly typed language, any Go implementation of MessagePack serialization has to make choices about how Go types map onto MessagePack types and vice versa. This page explains the rules that msgp uses and the justification behind them.

Numerical Precision and Overflow

Rule 1: No Overflow

msgp always attempts to encode Go values in the smallest wire representation possible without any loss in numerical precision, as is suggested by the MessagePack standard. For example, even though a Go int is 64 bits on 64-bit hardware, the encoding of int(5) is one byte on the wire.
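To make the size rule concrete, here is a minimal sketch that computes how many bytes an integer needs under the MessagePack signed-integer formats. The function name `wireSizeInt` is hypothetical and the sketch assumes only the signed formats (fixint, int 8, int 16, int 32, int 64) are used; it illustrates the format sizes from the MessagePack specification and is not msgp's own encoder.

```go
package main

import (
	"fmt"
	"math"
)

// wireSizeInt (hypothetical helper) returns the number of bytes needed to
// store i using the MessagePack signed-integer formats: a 1-byte fixint,
// or a 1-byte type tag followed by a 1-, 2-, 4-, or 8-byte payload.
func wireSizeInt(i int64) int {
	switch {
	case i >= -32 && i <= 127:
		return 1 // positive or negative fixint
	case i >= math.MinInt8 && i < -32:
		return 2 // int 8
	case i >= math.MinInt16 && i <= math.MaxInt16:
		return 3 // int 16
	case i >= math.MinInt32 && i <= math.MaxInt32:
		return 5 // int 32
	default:
		return 9 // int 64
	}
}

func main() {
	for _, v := range []int64{5, -33, 300, 70000, math.MaxInt64} {
		fmt.Printf("%d needs %d byte(s)\n", v, wireSizeInt(v))
	}
}
```

The Go type of the value plays no role here: only its magnitude decides the wire size, which is why int(5) fits in a single byte.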

As a consequence of this rule, msgp will never let you decode a value that would overflow the object you are decoding into. For instance, if you use msgp.ReadInt16Bytes() or (*Reader).ReadInt16() to read an integer value, the reading will succeed only if the integer is between math.MinInt16 and math.MaxInt16. For clarity's sake, here is the actual code for (*Reader).ReadInt16():

func (m *Reader) ReadInt16() (int16, error) {
	in, err := m.ReadInt64()
	if in > math.MaxInt16 || in < math.MinInt16 {
		return 0, IntOverflow{Value: in, FailedBitsize: 16}
	}
	return int16(in), err // err may be nil
}

Tip: Use int64 or uint64 when you cannot be sure of the magnitude of the encoded value.

Rule 2: No Loss of Precision

msgp will always encode a Go float32 as a 32-bit IEEE-754 float, and a float64 as a 64-bit IEEE-754 float.

When decoding, it is legal to decode a 32-bit float on the wire as a Go float64, but the opposite is illegal. This is to avoid the possibility of losing numerical precision.
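The asymmetry can be checked with plain Go conversions. The sketch below uses only the standard library; `fitsFloat32` is a hypothetical helper introduced for illustration and is not part of msgp.

```go
package main

import "fmt"

// fitsFloat32 (hypothetical helper) reports whether f survives a round
// trip through float32 unchanged, i.e. whether narrowing loses nothing.
// NaN never compares equal to itself, so it is reported as not fitting.
func fitsFloat32(f float64) bool {
	return float64(float32(f)) == f
}

func main() {
	// Widening: every float32 value is exactly representable as a float64.
	var wire32 float32 = 0.1
	widened := float64(wire32)
	fmt.Println(float32(widened) == wire32)

	// Narrowing: 0.5 is exact in both widths, 0.1 is not.
	fmt.Println(fitsFloat32(0.5))
	fmt.Println(fitsFloat32(0.1))
}
```

Widening always round-trips, while narrowing only does so for values that happen to be representable in 32 bits, which is the loss the rule guards against.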

Tip: Don't mix-and-match; pick either float32 or float64 and use it everywhere.

Rule 3: Sign Matters

msgp will not decode a signed integer on the wire (int8, int16, int32, or int64) into an unsigned Go type such as uint32. The reverse direction is allowed: because most current implementations of MessagePack encode non-negative integers as unsigned integers, the msgp runtime library decodes unsigned wire integers into signed Go integers as long as the value does not overflow.
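The unsigned-to-signed case can be sketched as a simple range check. This is a standard-library illustration of the rule described above; `uintToInt64` and `errOverflow` are hypothetical names, not msgp identifiers.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// errOverflow is a hypothetical error value used only in this sketch.
var errOverflow = errors.New("unsigned value overflows int64")

// uintToInt64 (hypothetical helper) accepts an unsigned wire value for a
// signed destination only when it fits without overflow.
func uintToInt64(u uint64) (int64, error) {
	if u > math.MaxInt64 {
		return 0, errOverflow
	}
	return int64(u), nil
}

func main() {
	fmt.Println(uintToInt64(5))
	fmt.Println(uintToInt64(math.MaxUint64))
}
```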

There is a built-in type called msgp.Number that can represent a MessagePack int, uint, float32, or float64. If you are using msgp to decode objects that were serialized by other libraries, you may want to use msgp.Number instead of a specific numeric type.

Structs, Maps, and Arrays

Like JSON, MessagePack has no notion of strongly-typed data structures. msgp encodes Go struct objects as MessagePack maps by default, but it can also encode them as tuples (ordered arrays). Decoding maps into Go structs can present some peculiar edge cases.

Rule 1: Keys are string-able

msgp does not support decoding maps with keys that are not "string-able" (either str or bin type).

You can still manually decode arbitrary maps with the helpers built into the library.

Rule 2: Map-to-Struct decoding is an Intersect

The generated implementations of msgp.Unmarshaler and msgp.Decoder decode the intersection of the map being decoded and the map represented by the struct. One of the most important consequences of this is that it is perfectly valid for a decode operation to not mutate the method receiver and return no error.

For example, let's assume we have the following type:

type Thing struct {
    Name  string  `msgp:"name"`
    Value float64 `msgp:"value"`
}

The following objects would all be legal to decode for the Thing type:

{} // the object is not mutated
{"name":"bob"} // only "name" is mutated
{"name":"bob","value":0.0} // both "name" and "value" are mutated
{"name":"bob","uncle":"joe"} // "name" is mutated; "uncle" is ignored

You may want to take care to reset the values of objects that are repeatedly decoded in order to avoid conflating a previously decoded value with a new one.

The advantage of such a forgiving decoding algorithm is that you can change struct definitions in production and still maintain some level of backward compatibility with previously encoded values.

Rule 3: Tuple-to-Struct

If you use the tuple encoding directive for a struct, it will be encoded as a list of its fields rather than a map. Structs encoded/decoded this way are only compatible with lists of the same size and constituent types.

Tuple encoding is faster and stricter (and therefore "safer") than map encoding but comes at the cost of backwards-compatibility.

Null

In Go, maps and slices can both be nil. However, msgp will never encode a map or a slice as a nil value; a nil map is encoded as a zero-length map and a nil slice as a zero-length array.

The only types that can be encoded as MessagePack nil are pointers and interface{}.

Decoding a nil object into anything else yields a msgp.TypeError.
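On the wire, the difference between nil and an empty collection is a single byte. The sketch below hand-encodes a []bool using byte values from the MessagePack specification; `encodeBools` is a hypothetical helper limited to short slices and is not how msgp itself is implemented.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	nilByte      = 0xc0 // MessagePack nil (shown for comparison only)
	fixArrayBase = 0x90 // fixarray header; the low 4 bits hold the length
	falseByte    = 0xc2
	trueByte     = 0xc3
)

// encodeBools (hypothetical) encodes s as a MessagePack fixarray.
// A nil slice has length 0, so it becomes an empty array, never nil.
func encodeBools(s []bool) ([]byte, error) {
	if len(s) > 15 {
		return nil, errors.New("sketch only handles fixarray (at most 15 elements)")
	}
	out := []byte{fixArrayBase | byte(len(s))}
	for _, v := range s {
		if v {
			out = append(out, trueByte)
		} else {
			out = append(out, falseByte)
		}
	}
	return out, nil
}

func main() {
	var none []bool // nil slice
	b, _ := encodeBools(none)
	fmt.Printf("% x\n", b) // an empty fixarray header, not nilByte
}
```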

interface{}

The following concrete types are legal to encode with methods that take interface{}:

  • int{8,16,32,64}, uint{8,16,32,64}, complex{64,128}, time.Time, string, []byte, float{32,64}, map[string]interface{}, map[string]string, nil
  • A pointer to one of the above
  • A type that satisfies the msgp.Encoder or msgp.Marshaler interface, depending on method (defer to the documentation of the method in question)
  • A type that satisfies the msgp.Extension interface

When using decoding methods that return interface{}, the following types will be returned depending on the MessagePack encoding (MessagePack type -> Go type):

  • uint -> uint64
  • int -> int64
  • bin -> []byte
  • str -> string
  • map -> map[string]interface{}
  • array -> []interface{}
  • float32 -> float32
  • float64 -> float64
  • ext -> time.Time, complex64, complex128, msgp.RawExtension, or a registered extension type
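Code that consumes these interface{} values usually dispatches with a type switch. The following standard-library sketch covers the Go types listed above; `describe` is a hypothetical helper, and registered extension types or msgp.RawExtension would fall into the default branch here.

```go
package main

import (
	"fmt"
	"time"
)

// describe (hypothetical) names the MessagePack type family that a
// decoded interface{} value came from, based on its concrete Go type.
func describe(v interface{}) string {
	switch v.(type) {
	case uint64:
		return "uint"
	case int64:
		return "int"
	case []byte:
		return "bin"
	case string:
		return "str"
	case map[string]interface{}:
		return "map"
	case []interface{}:
		return "array"
	case float32:
		return "float32"
	case float64:
		return "float64"
	case time.Time, complex64, complex128:
		return "ext"
	default:
		return "unlisted"
	}
}

func main() {
	values := []interface{}{int64(-3), uint64(3), "hi", []byte("hi"), float32(1.5), time.Now()}
	for _, v := range values {
		fmt.Printf("%T -> %s\n", v, describe(v))
	}
}
```

Note that integers arrive as int64 or uint64 regardless of how few bytes they occupied on the wire, so a switch on int or int32 would never match.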