From b93c837627c87aa5af02e298d868691e2bb5df39 Mon Sep 17 00:00:00 2001
From: Natalia Ogoreltseva
Date: Fri, 10 Jan 2020 17:59:30 +0300
Subject: [PATCH 1/4] gh-992 add submodule about encoding of non default types such as decimals

---
 doc/dev_guide/internals/box_protocol.rst | 74 ++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/doc/dev_guide/internals/box_protocol.rst b/doc/dev_guide/internals/box_protocol.rst
index e5620bfe7b..de4ffbfa1d 100644
--- a/doc/dev_guide/internals/box_protocol.rst
+++ b/doc/dev_guide/internals/box_protocol.rst
@@ -43,6 +43,80 @@ MsgPack data types:
 * **MP_OBJECT** - Any MsgPack object
 * **MP_BIN** - MsgPack binary format
 
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Encoding of Tarantool-specific data types
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Some of the data types used in Tarantool are application-specific in terms of the MsgPack standard.
+For these data types, we use the following representation.
+
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+Decimals
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+MsgPack EXT type ``MP_EXT`` together with a new extension type
+``MP_DECIMAL`` is used as a record header.
+
+MP_DECIMAL is 1.
+
+`MsgPack spec `_
+defines ``fixext 1/2/4/8/16`` and ``ext 8/16/32`` types. ``fixext``
+types have fixed length so it is not encoded explicitly, while ``ext`` types require
+the data length to be encoded. ``MP_EXP`` + optional ``length`` meant usage of one of those types.
+
+The decimal MsgPack representation looks like this:
+
+.. code-block:: none
+
+    +--------+-------------------+------------+===============+
+    | MP_EXT | length (optional) | MP_DECIMAL | PackedDecimal |
+    +--------+-------------------+------------+===============+
+
+Here ``length`` is the length of PackedDecimal field, and it is of type
+``MP_UINT``, when encoded explicitly (i.e. when type is ``ext 8/16/32``).
+
+PackedDecimal has the following structure:
+
+.. code-block:: none
+
+     <--- length bytes -->
+    +-------+=============+
+    | scale |     BCD     |
+    +-------+=============+
+
+Here ``scale`` is either ``MP_INT`` or ``MP_UINT``
+``scale`` = -exponent (exponent negated(!))
+
+``BCD`` is a sequence of bytes representing decimal digits of the encoded number
+(each byte represents two decimal digits each encoded using 4 bits),
+so ``byte >> 4`` is the first digit and ``byte & 0x0f`` is the second digit.
+The leftmost digit in the array is the most significant.
+The rightmost digit in the array is the least significant.
+
+The first byte in the BCD array may have only second digit.
+The last byte in the BCD array has only first digit and a ``nibble``.
+
+The ``nibble`` represents the number's sign. ``0x0a``, ``0x0c``, ``0x0e``, ``0x0f``
+stand for plus, ``0x0b`` and ``0x0d`` stand for minus.
+
+**Example**
+
+For example, decimal ``-12.34`` will be encoded as ``0xd6,0x01,0x02,0x01,0x23,0x4d``
+
+.. code-block:: none
+
+    |MP_EXT (fixext 4) | MP_DECIMAL | scale |  1   |  2,3 | 4 (minus) |
+    |       0xd6       |    0x01    | 0x02  | 0x01 | 0x23 |   0x4d    |
+
+Another example: decimal 0.000000000000000000000000000000000010 will be encoded
+as ``0xc7,0x03,0x01,0x24,0x01,0x0c``
+
+.. code-block:: none
+
+    | MP_EXT (ext 8) | length | MP_DECIMAL | scale |  1   | 0 (plus) |
+    |      0xc7      |  0x03  |    0x01    | 0x24  | 0x01 |   0x0c   |
+
+
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Greeting packet
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From b983da9893709e1e63b67176746db859de7e819e Mon Sep 17 00:00:00 2001
From: lenkis
Date: Thu, 23 Jan 2020 21:23:19 +0300
Subject: [PATCH 2/4] Minor markup fixes and rephrases

---
 doc/dev_guide/internals/box_protocol.rst | 34 +++++++++++++-----------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/doc/dev_guide/internals/box_protocol.rst b/doc/dev_guide/internals/box_protocol.rst
index de4ffbfa1d..8dd135fea6 100644
--- a/doc/dev_guide/internals/box_protocol.rst
+++ b/doc/dev_guide/internals/box_protocol.rst
@@ -47,22 +47,26 @@ MsgPack data types:
 Encoding of Tarantool-specific data types
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Some of the data types used in Tarantool are application-specific in terms of the MsgPack standard.
+Some of the data types used in Tarantool are application-specific in terms of
+the MsgPack standard.
 For these data types, we use the following representation.
 
 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 Decimals
 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 
-MsgPack EXT type ``MP_EXT`` together with a new extension type
+MsgPack EXT type ``MP_EXT`` together with the extension type
 ``MP_DECIMAL`` is used as a record header.
 
 MP_DECIMAL is 1.
 
 `MsgPack spec `_
-defines ``fixext 1/2/4/8/16`` and ``ext 8/16/32`` types. ``fixext``
-types have fixed length so it is not encoded explicitly, while ``ext`` types require
-the data length to be encoded. ``MP_EXP`` + optional ``length`` meant usage of one of those types.
+defines two kinds of types:
+
+* ``fixext 1/2/4/8/16`` types have fixed length so the length is not encoded explicitly;
+* ``ext 8/16/32`` types require the data length to be encoded.
+
+``MP_EXP`` + optional ``length`` imply using one of these types.
 
 The decimal MsgPack representation looks like this:
 
@@ -72,10 +76,10 @@ The decimal MsgPack representation looks like this:
     | MP_EXT | length (optional) | MP_DECIMAL | PackedDecimal |
     +--------+-------------------+------------+===============+
 
-Here ``length`` is the length of PackedDecimal field, and it is of type
-``MP_UINT``, when encoded explicitly (i.e. when type is ``ext 8/16/32``).
+Here ``length`` is the length of ``PackedDecimal`` field, and it is of type
+``MP_UINT``, when encoded explicitly (i.e. when the type is ``ext 8/16/32``).
 
-PackedDecimal has the following structure:
+``PackedDecimal`` has the following structure:
 
 .. code-block:: none
 
@@ -84,8 +88,8 @@ PackedDecimal has the following structure:
     | scale |     BCD     |
     +-------+=============+
 
-Here ``scale`` is either ``MP_INT`` or ``MP_UINT``
-``scale`` = -exponent (exponent negated(!))
+Here ``scale`` is either ``MP_INT`` or ``MP_UINT``. |br|
+``scale`` = -exponent (exponent negated!)
 
 ``BCD`` is a sequence of bytes representing decimal digits of the encoded number
 (each byte represents two decimal digits each encoded using 4 bits),
@@ -93,11 +97,12 @@ so ``byte >> 4`` is the first digit and ``byte & 0x0f`` is the second digit.
 The leftmost digit in the array is the most significant.
 The rightmost digit in the array is the least significant.
 
-The first byte in the BCD array may have only second digit.
-The last byte in the BCD array has only first digit and a ``nibble``.
+The first byte in the ``BCD`` array may have only the second digit.
+The last byte in the BCD array has only the first digit and a ``nibble``.
 
-The ``nibble`` represents the number's sign. ``0x0a``, ``0x0c``, ``0x0e``, ``0x0f``
-stand for plus, ``0x0b`` and ``0x0d`` stand for minus.
+The ``nibble`` represents the number's sign:
+``0x0a``, ``0x0c``, ``0x0e``, ``0x0f`` stand for plus,
+``0x0b`` and ``0x0d`` stand for minus.
 
 **Example**
 
@@ -116,7 +121,6 @@ as ``0xc7,0x03,0x01,0x24,0x01,0x0c``
     | MP_EXT (ext 8) | length | MP_DECIMAL | scale |  1   | 0 (plus) |
     |      0xc7      |  0x03  |    0x01    | 0x24  | 0x01 |   0x0c   |
 
-
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Greeting packet
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From b74932709bd5944f5eb4ff1a9795d39c6e0d3102 Mon Sep 17 00:00:00 2001
From: Natalia Ogoreltseva
Date: Fri, 24 Jan 2020 12:58:04 +0300
Subject: [PATCH 3/4] gh-992 clearify bcd description

---
 doc/dev_guide/internals/box_protocol.rst | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/doc/dev_guide/internals/box_protocol.rst b/doc/dev_guide/internals/box_protocol.rst
index 8dd135fea6..a65c233ba1 100644
--- a/doc/dev_guide/internals/box_protocol.rst
+++ b/doc/dev_guide/internals/box_protocol.rst
@@ -97,8 +97,21 @@ so ``byte >> 4`` is the first digit and ``byte & 0x0f`` is the second digit.
 The leftmost digit in the array is the most significant.
 The rightmost digit in the array is the least significant.
 
-The first byte in the ``BCD`` array may have only the second digit.
-The last byte in the BCD array has only the first digit and a ``nibble``.
+The first byte of the ``BCD`` array contains the first digit of the number
+represented as follows:
+
+.. code-block:: none
+
+    |  4 bits           |  4 bits           |
+       = 0x                = the 1st digit
+
+The last byte of the ``BCD`` array contains the last digit of the number and the
+``nibble`` represented as follows:
+
+.. code-block:: none
+
+    |  4 bits           |  4 bits           |
+       = the last digit    = nibble
 
 The ``nibble`` represents the number's sign:
 ``0x0a``, ``0x0c``, ``0x0e``, ``0x0f`` stand for plus,

From c25159f7fe7ee48e09def26de0fc8dcb96fac620 Mon Sep 17 00:00:00 2001
From: lenkis
Date: Mon, 27 Jan 2020 15:23:25 +0300
Subject: [PATCH 4/4] Minor fixes

---
 doc/dev_guide/internals/box_protocol.rst | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/doc/dev_guide/internals/box_protocol.rst b/doc/dev_guide/internals/box_protocol.rst
index a65c233ba1..75d1cdca1a 100644
--- a/doc/dev_guide/internals/box_protocol.rst
+++ b/doc/dev_guide/internals/box_protocol.rst
@@ -97,7 +97,7 @@ so ``byte >> 4`` is the first digit and ``byte & 0x0f`` is the second digit.
 The leftmost digit in the array is the most significant.
 The rightmost digit in the array is the least significant.
 
-The first byte of the ``BCD`` array contains the first digit of the number
+The first byte of the ``BCD`` array contains the first digit of the number,
 represented as follows:
 
 .. code-block:: none
@@ -105,8 +105,8 @@ represented as follows:
     |  4 bits           |  4 bits           |
        = 0x                = the 1st digit
 
-The last byte of the ``BCD`` array contains the last digit of the number and the
-``nibble`` represented as follows:
+The last byte of the ``BCD`` array contains the last digit of the number and the
+``nibble``, represented as follows:
 
 .. code-block:: none
 
@@ -114,20 +114,21 @@ The last byte of the ``BCD`` array contains the last digit of the number and the
        = the last digit    = nibble
 
 The ``nibble`` represents the number's sign:
-``0x0a``, ``0x0c``, ``0x0e``, ``0x0f`` stand for plus,
-``0x0b`` and ``0x0d`` stand for minus.
 
-**Example**
+* ``0x0a``, ``0x0c``, ``0x0e``, ``0x0f`` stand for plus,
+* ``0x0b`` and ``0x0d`` stand for minus.
 
-For example, decimal ``-12.34`` will be encoded as ``0xd6,0x01,0x02,0x01,0x23,0x4d``
+**Examples**
+
+The decimal ``-12.34`` will be encoded as ``0xd6,0x01,0x02,0x01,0x23,0x4d``:
 
 .. code-block:: none
 
     |MP_EXT (fixext 4) | MP_DECIMAL | scale |  1   |  2,3 | 4 (minus) |
     |       0xd6       |    0x01    | 0x02  | 0x01 | 0x23 |   0x4d    |
 
-Another example: decimal 0.000000000000000000000000000000000010
-will be encoded as ``0xc7,0x03,0x01,0x24,0x01,0x0c``:
+The decimal 0.000000000000000000000000000000000010
+will be encoded as ``0xc7,0x03,0x01,0x24,0x01,0x0c``:
 
 .. code-block:: none
 
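
Review note: the layout described in this series (``MP_EXT`` header, optional ``length``, ``MP_DECIMAL``, then ``scale`` and ``BCD`` with a trailing sign nibble) can be checked against both documented examples with a short decoder. The following is a minimal Python sketch, not Tarantool's implementation. The function names are hypothetical, the header bytes and big-endian length fields are taken from the MsgPack ext format family, and the sketch assumes ``scale`` is a one-byte positive fixint (0..127); wider ``MP_INT``/``MP_UINT`` scales are rejected rather than decoded.

```python
from decimal import Decimal

MP_DECIMAL = 1  # extension type id given in the patch text

# Sign nibbles as listed in the patch text.
PLUS_NIBBLES = {0x0A, 0x0C, 0x0E, 0x0F}
MINUS_NIBBLES = {0x0B, 0x0D}

# MsgPack fixext header bytes mapped to their fixed payload lengths.
FIXEXT_LENGTHS = {0xD4: 1, 0xD5: 2, 0xD6: 4, 0xD7: 8, 0xD8: 16}
# MsgPack ext 8/16/32 header bytes mapped to the size of the length field.
EXT_LENGTH_SIZES = {0xC7: 1, 0xC8: 2, 0xC9: 4}


def decode_packed_decimal(payload: bytes) -> Decimal:
    """Decode a PackedDecimal field: scale, then BCD digits (hypothetical helper)."""
    if len(payload) < 2:
        raise ValueError("PackedDecimal needs a scale and at least one BCD byte")
    scale = payload[0]
    if scale > 0x7F:
        # Assumption of this sketch: only one-byte positive fixint scales.
        raise ValueError("scale encoding not handled by this sketch")
    nibbles = []
    for byte in payload[1:]:
        nibbles.append(byte >> 4)    # first digit of the byte
        nibbles.append(byte & 0x0F)  # second digit of the byte
    sign_nibble = nibbles.pop()      # low half of the last byte is the sign
    if sign_nibble in MINUS_NIBBLES:
        sign = 1
    elif sign_nibble in PLUS_NIBBLES:
        sign = 0
    else:
        raise ValueError("invalid sign nibble")
    if any(digit > 9 for digit in nibbles):
        raise ValueError("invalid BCD digit")
    # scale = -exponent, so the Decimal exponent is -scale.
    return Decimal((sign, tuple(nibbles), -scale))


def decode_decimal_ext(data: bytes) -> Decimal:
    """Decode a whole MP_EXT record carrying MP_DECIMAL (hypothetical helper)."""
    head = data[0]
    if head in FIXEXT_LENGTHS:
        length, pos = FIXEXT_LENGTHS[head], 1          # length is implied
    elif head in EXT_LENGTH_SIZES:
        size = EXT_LENGTH_SIZES[head]
        length = int.from_bytes(data[1:1 + size], "big")  # explicit length
        pos = 1 + size
    else:
        raise ValueError("not a MsgPack ext header")
    if data[pos] != MP_DECIMAL:
        raise ValueError("extension type is not MP_DECIMAL")
    payload = data[pos + 1:pos + 1 + length]
    if len(payload) != length:
        raise ValueError("truncated PackedDecimal")
    return decode_packed_decimal(payload)


# The two examples from the patch text.
assert decode_decimal_ext(bytes([0xD6, 0x01, 0x02, 0x01, 0x23, 0x4D])) == Decimal("-12.34")
assert decode_decimal_ext(bytes([0xC7, 0x03, 0x01, 0x24, 0x01, 0x0C])) == Decimal("1E-35")
```

In the second example the BCD digits are ``0, 1, 0`` with scale 36, so the decoded value is 10 x 10^-36, which compares equal to ``Decimal("1E-35")``. The leading zero nibble in both examples is the padding the text describes for the first byte.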