Describe how decimals are encoded in the binary protocol #992

Closed
Totktonada opened this issue Nov 17, 2019 · 5 comments
Labels
feature A new functionality reference [location] Tarantool manual, Reference part server [area] Task relates to Tarantool's server (core) functionality

Comments

@Totktonada
Member

The encoding was introduced in this commit. Information on how decimals are encoded in the binary protocol is important for connector developers.

Cited from the commit message:

The decimal MsgPack representation looks like this:
+--------+-------------------+------------+===============+
| MP_EXT | length (optional) | MP_DECIMAL | PackedDecimal |
+--------+-------------------+------------+===============+

The MsgPack spec defines the fixext 1/2/4/8/16 and ext 8/16/32 types. fixext types have a fixed length, so it is not encoded explicitly, while ext types require the data length to be encoded. MP_EXT + optional length means that one of those types is used.

MP_DECIMAL is 1.
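
The header layout above can be sketched in Python as follows. This is only an illustration, not Tarantool's code: `wrap_ext` is a hypothetical helper name, and the marker bytes and big-endian length fields are taken from the MsgPack specification. It picks a fixext type when the payload length is 1, 2, 4, 8 or 16 bytes and falls back to ext 8/16/32 otherwise.

```python
import struct

MP_DECIMAL = 1  # extension type id stated in this issue

# MsgPack marker bytes for fixext 1/2/4/8/16, keyed by payload length.
FIXEXT_MARKERS = {1: 0xD4, 2: 0xD5, 4: 0xD6, 8: 0xD7, 16: 0xD8}


def wrap_ext(ext_type: int, payload: bytes) -> bytes:
    """Prefix payload with a MsgPack ext header (hypothetical helper)."""
    n = len(payload)
    if n in FIXEXT_MARKERS:
        # fixext N: marker, type, then exactly N data bytes (no length field)
        return struct.pack(">Bb", FIXEXT_MARKERS[n], ext_type) + payload
    if n <= 0xFF:
        # ext 8: marker, 1-byte length, type
        return struct.pack(">BBb", 0xC7, n, ext_type) + payload
    if n <= 0xFFFF:
        # ext 16: marker, 2-byte big-endian length, type
        return struct.pack(">BHb", 0xC8, n, ext_type) + payload
    if n <= 0xFFFFFFFF:
        # ext 32: marker, 4-byte big-endian length, type
        return struct.pack(">BIb", 0xC9, n, ext_type) + payload
    raise ValueError("payload too large for a MsgPack ext")
```

A 4-byte payload gets the fixext 4 marker 0xd6, while a 3-byte payload needs ext 8 (0xc7) with an explicit length byte.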

I don't know how exactly PackedDecimal is encoded. @sergepetrenko, can you share more info?

@sergepetrenko
Contributor

sergepetrenko commented Nov 18, 2019

length is of type MP_UINT, if it is present at all (i.e. when type is ext 8/16/32)
length is the length of PackedDecimal
PackedDecimal has the following structure:

 <--- length bytes -->
+-------+=============+
| scale |     BCD     |
+-------+=============+

scale is either MP_INT or MP_UINT
scale = -exponent (exponent negated(!))
BCD is a sequence of bytes representing decimal digits of the encoded number
(each byte represents two decimal digits each encoded using 4 bits), so byte >> 4 is the first digit and byte & 0x0f is the second digit.
The leftmost digit in the array is the most significant.
The rightmost digit in the array is the least significant.

The first byte in the BCD array may contain only the second digit; its high nibble is then zero padding.
The last byte in the BCD array contains one digit in its high nibble and the sign nibble in its low nibble.

A nibble represents the number sign. 0x0a, 0x0c, 0x0e, 0x0f stand for plus,
0x0b, 0x0d stand for minus.
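
A minimal decoding sketch for the structure described above, assuming Python's standard `decimal` module as the target type. The function names are hypothetical, and `read_scale` handles only the one- and two-byte MsgPack integer forms to keep the example short. All six sign nibbles are accepted on input.

```python
from decimal import Decimal
from typing import Tuple

PLUS_NIBBLES = {0x0A, 0x0C, 0x0E, 0x0F}
MINUS_NIBBLES = {0x0B, 0x0D}


def read_scale(data: bytes) -> Tuple[int, int]:
    """Decode a MsgPack integer at the start of data; return (value, bytes used)."""
    first = data[0]
    if first <= 0x7F:   # positive fixint
        return first, 1
    if first >= 0xE0:   # negative fixint
        return first - 0x100, 1
    if first == 0xCC:   # uint 8
        return data[1], 2
    if first == 0xD0:   # int 8
        return int.from_bytes(data[1:2], "big", signed=True), 2
    raise ValueError(f"unsupported scale marker 0x{first:02x}")


def decode_packed_decimal(data: bytes) -> Decimal:
    """Decode a PackedDecimal (scale followed by BCD) into a Decimal."""
    scale, used = read_scale(data)
    bcd = data[used:]
    if not bcd:
        raise ValueError("BCD part is empty")
    nibbles = []
    for byte in bcd:
        nibbles.append(byte >> 4)    # first digit of the byte
        nibbles.append(byte & 0x0F)  # second digit, or the sign in the last byte
    sign_nibble = nibbles.pop()
    if sign_nibble in MINUS_NIBBLES:
        sign = 1
    elif sign_nibble in PLUS_NIBBLES:
        sign = 0
    else:
        raise ValueError(f"invalid sign nibble 0x{sign_nibble:02x}")
    if any(digit > 9 for digit in nibbles):
        raise ValueError("invalid BCD digit")
    # scale = -exponent, so the Decimal exponent is the negated scale
    return Decimal((sign, tuple(nibbles), -scale))
```

The input is the PackedDecimal part only, that is, the bytes that follow the MP_EXT header and the MP_DECIMAL type byte.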

An example: decimal -12.34 will be encoded as 0xd6,0x01,0x02,0x01,0x23,0x4d

|MP_EXT (fixext 4) | MP_DECIMAL | scale |  1   |  2,3 |  4 (minus) |
|       0xd6       |    0x01    | 0x02  | 0x01 | 0x23 | 0x4d       |

Another example: decimal 0.000000000000000000000000000000000010 will be encoded as 0xc7,0x03,0x01,0x24,0x01,0x0c

| MP_EXT (ext 8) | length | MP_DECIMAL | scale |  1   | 0 (plus) |
|      0xc7      |  0x03  |    0x01    | 0x24  | 0x01 | 0x0c     |
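
Both examples can be reproduced with a short encoding sketch. It assumes Python's `decimal.Decimal` as the input type, uses the preferred sign nibbles (0x0c and 0x0d), and pads with a leading zero nibble when needed. The function names are hypothetical, `encode_scale` covers only scales that fit in one or two bytes, and the output is the PackedDecimal part without the MP_EXT header.

```python
from decimal import Decimal


def encode_scale(scale: int) -> bytes:
    """Encode scale as a MsgPack integer (one- and two-byte forms only)."""
    if 0 <= scale <= 0x7F:
        return bytes([scale])                # positive fixint
    if -32 <= scale < 0:
        return bytes([scale & 0xFF])         # negative fixint
    if 0x80 <= scale <= 0xFF:
        return bytes([0xCC, scale])          # uint 8
    if -128 <= scale < -32:
        return bytes([0xD0, scale & 0xFF])   # int 8
    raise ValueError("scale out of range for this sketch")


def encode_packed_decimal(value: Decimal) -> bytes:
    """Encode a finite Decimal as scale followed by BCD."""
    sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):
        raise ValueError("NaN and Infinity are not handled")
    # Digits first, then the sign nibble: 0x0d for minus, 0x0c for plus.
    nibbles = list(digits) + [0x0D if sign else 0x0C]
    if len(nibbles) % 2:
        nibbles.insert(0, 0)  # pad so that the sign lands in the last low nibble
    bcd = bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
    return encode_scale(-exponent) + bcd  # scale = -exponent


print(encode_packed_decimal(Decimal("-12.34")).hex())  # 0201234d
print(encode_packed_decimal(Decimal("10E-36")).hex())  # 24010c
```

`Decimal("10E-36")` is the second example written in exponent form: two digits (1, 0) with a scale of 36.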

@lenkis lenkis added 2.2 feature A new functionality reference [location] Tarantool manual, Reference part server [area] Task relates to Tarantool's server (core) functionality labels Nov 18, 2019
@rybakit
Contributor

rybakit commented Dec 16, 2019

@sergepetrenko Could you clarify the difference between nibble codes? For example, when should a negative number be packed with 0x0b and when with 0x0d?

@sergepetrenko
Contributor

@rybakit
Here's an extract from the decNumber documentation.

The sign nibble may be any of the six possible values:
1010 (0x0a) plus
1011 (0x0b) minus
1100 (0x0c) plus (preferred)
1101 (0x0d) minus (preferred)
1110 (0x0e) plus
1111 (0x0f) plus (conventionally, this sign code can also be used to indicate that a number was originally unsigned)

The decNumber package itself uses the preferred values: 0x0c for plus and 0x0d for minus.
I couldn't find any more info on when one should use the alternative nibbles instead of the preferred ones.

@rybakit
Contributor

rybakit commented Dec 17, 2019

DecNumber package itself uses preferred values: 0x0c for plus and 0x0d for minus.

From a connector's perspective, does this mean that only 0x0c and 0x0d can be checked while encoding/decoding decimals? Or in other words, could it be a case when the connector can retrieve a negative number packed with 0x0b?

@sergepetrenko
Contributor

From a connector's perspective, does this mean that only 0x0c and 0x0d can be checked while encoding/decoding decimals? Or in other words, could it be a case when the connector can retrieve a negative number packed with 0x0b?

decNumber uses only 0x0c and 0x0d when encoding decimals.
I still think it is better to check for all the cases, including the alternative ones, while decoding (decNumber does this too).
