
High level Protobuf Features

dogtopus edited this page Jan 31, 2020 · 5 revisions

(Highly WIP)

MiniPB focuses on serialization/deserialization of primitive Protobuf types. As a result, most of the high-level Protobuf features are intentionally not implemented in MiniPB. This page documents how to simulate some of these features in user code.

Enums

In Protobuf, enums are encoded as 32-bit signed vints at the wire level. This means you can use Python's enum.IntEnum together with field type t to simulate a Protobuf enum.
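To make the "vint" part concrete, here is a minimal sketch of how an enum value ends up on the wire. It does not use MiniPB at all; the helper names encode_varint and encode_enum_field are made up for this illustration.

```python
def encode_varint(value: int) -> bytes:
    """Encode an integer as a Protobuf base-128 varint.

    Negative values are mapped to their 64-bit two's complement form first,
    which is how Protobuf encodes int32/int64/enum fields.
    """
    value &= 0xFFFFFFFFFFFFFFFF
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            # More bytes follow: set the continuation bit
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_enum_field(field_number: int, value: int) -> bytes:
    """Encode one enum field: a tag with wire type 0 (varint), then the value."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)


# Field number 4 holding the value 1
web_field = encode_enum_field(4, 1)
```

Here web_field is the two bytes 0x20 0x01: the tag for field 4 with wire type 0, followed by the value 1. A negative enum value would take ten bytes, because it is sign-extended to 64 bits before encoding.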

To illustrate the IntEnum approach, take the following message definition copied from the Protobuf documentation:

message SearchRequest {
  string query = 1;
  int32 page_number = 2;
  int32 result_per_page = 3;
  enum Corpus {
    UNIVERSAL = 0;
    WEB = 1;
    IMAGES = 2;
    LOCAL = 3;
    NEWS = 4;
    PRODUCTS = 5;
    VIDEO = 6;
  }
  Corpus corpus = 4;
}

First you write the schema like this:

import minipb

search_req_schema = minipb.Wire((
    ('query', 'U'),
    ('page_number', 't'),
    ('result_per_page', 't'),
    ('corpus', 't'), # Note that we are using the regular t type
))

Then you construct the Corpus enum class by inheriting the IntEnum base class.

import enum

class Corpus(enum.IntEnum):
    UNIVERSAL = 0
    WEB = 1
    IMAGES = 2
    LOCAL = 3
    NEWS = 4
    PRODUCTS = 5
    VIDEO = 6

Now, you can use the following code to deserialize the incoming msg and read the integer field corpus as an enum item:

request = search_req_schema.decode(msg)
corpus = Corpus(request['corpus'])
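One thing to watch out for: calling Corpus(...) on a number that is not a member raises ValueError, and a sender using a newer schema may well send such a number. Below is a minimal sketch of a more forgiving conversion. The helper to_corpus is hypothetical, and it assumes that a field missing from the message decodes to None.

```python
import enum


class Corpus(enum.IntEnum):
    UNIVERSAL = 0
    WEB = 1
    IMAGES = 2
    LOCAL = 3
    NEWS = 4
    PRODUCTS = 5
    VIDEO = 6


def to_corpus(raw, default=Corpus.UNIVERSAL):
    """Convert a decoded integer to a Corpus member when possible.

    Hypothetical helper, not part of MiniPB.
    """
    if raw is None:
        # Assumption: a missing field shows up as None after decoding
        return default
    try:
        return Corpus(raw)
    except ValueError:
        # Unknown number: keep the raw integer instead of failing
        return raw
```

With this helper, to_corpus(request['corpus']) gives you an enum member for known values and the plain integer for unknown ones, so you can decide later how strict you want to be.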

Similarly, during serialization, you can use this to serialize a message with vint field(s) filled with enum item(s):

request = {
    # ...
    'corpus': Corpus.WEB,
    # ...
}
msg = search_req_schema.encode(request)

Note that this only works with IntEnum/IntFlag because they will be implicitly converted to a regular int during message encoding.
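The difference is easy to check in plain Python. The sketch below uses only the standard library; PlainCorpus and Permission are made-up classes for comparison.

```python
import enum


class Corpus(enum.IntEnum):
    UNIVERSAL = 0
    WEB = 1


class PlainCorpus(enum.Enum):
    UNIVERSAL = 0
    WEB = 1


class Permission(enum.IntFlag):
    READ = 1
    WRITE = 2


# IntEnum and IntFlag members are real int instances
int_enum_is_int = isinstance(Corpus.WEB, int)
int_flag_is_int = isinstance(Permission.READ | Permission.WRITE, int)

# Plain Enum members are not, so integer arithmetic on them fails
plain_enum_is_int = isinstance(PlainCorpus.WEB, int)

# Integer operations work directly on IntEnum members
shifted = Corpus.WEB << 3

# For a plain Enum you have to unwrap the number yourself
plain_value = PlainCorpus.WEB.value
```

If you are stuck with a plain Enum, passing member.value into the message dictionary instead of the member itself should sidestep the problem.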

TODO: MicroPython

Any type

The Any message type was introduced in proto3 and lets you assign a single protobuf message of any type to a field. It works by creating a 2-tuple that contains the message type (a URL that describes the type, usually in the format of type.googleapis.com/<message-name-including-package>) and the encoded bytes of the message. Simply put, the Any type is equivalent to the following proto snippet:

message Any {
    string T = 1;
    bytes msg = 2;
}

Therefore, it is easy to represent the Any container in MiniPB by using the following code:

PBAny = (
    ('@type', 'U'),
    ('msg', 'a'),
)

Then use it in your message like this:

import minipb

ErrorStatus = minipb.Wire((
    ('message', 'U'),
    ('details', '+[', PBAny),
))

This creates the Wire object ErrorStatus, which can decode messages serialized according to the following official example:

import "google/protobuf/any.proto";

message ErrorStatus {
  string message = 1;
  repeated google.protobuf.Any details = 2;
}
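For reference, the wire layout of a single Any entry can be sketched without MiniPB. This is only an illustration of the two-field container described above; the helpers encode_varint, encode_len_delimited and pack_any are made up for this example, and the inner payload is a hand-written single varint field.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_len_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode a length-delimited field (wire type 2): tag, length, payload."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def pack_any(type_url: str, message_bytes: bytes) -> bytes:
    """Field 1 carries the type URL, field 2 the already encoded message."""
    return (encode_len_delimited(1, type_url.encode('utf-8'))
            + encode_len_delimited(2, message_bytes))


# Inner message: field 1 as a varint with the value 104
inner = b'\x08\x68'
packed = pack_any('type.googleapis.com/NetworkErrorDetails', inner)
```

The result starts with the tag byte 0x0A (field 1, wire type 2), and the inner message follows the tag byte 0x12 (field 2, wire type 2). Since the inner message is just opaque bytes at this level, it has to be decoded in a second step once the type is known.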

To fully decode the message, you need to write some basic logic for mapping the type URL to a Wire object, then invoke the corresponding Wire decoder manually. For example:

PB_TYPES = {
    'type.googleapis.com/NetworkErrorDetails': minipb.Wire((
        ('errno', 't'),
        ('dest_ip', 'U'),
        ('port', 'V'),
    )),
}

def decode_any_field(item):
    # Since the type_ is a URL, more sophisticated and proper parsing can be done via urllib.parse when needed
    # For simplicity simple dictionary lookup is used here
    type_ = item['@type']
    msg = item['msg']
    if type_ not in PB_TYPES:
        raise TypeError('Unrecognized message type {}'.format(type_))
    return PB_TYPES[type_].decode(msg)

# ...

msg_decoded = ErrorStatus.decode(some_data_you_received)
if msg_decoded['details'] is not None:
    # Actually decode the error messages
    msg_decoded['details'] = tuple(decode_any_field(item) for item in msg_decoded['details'])
else:
    # Optional. Create an empty tuple for easy parsing later
    msg_decoded['details'] = tuple()

# Do something else with msg_decoded
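If the type URLs you receive do not always share the same prefix, the dictionary lookup can be keyed on the message name instead. Here is a minimal sketch using urllib.parse; type_name_from_url is a hypothetical helper, and it assumes the message name is whatever follows the last slash of the URL path.

```python
from urllib.parse import urlsplit


def type_name_from_url(type_url: str) -> str:
    """Return the message name at the end of an Any type URL.

    Hypothetical helper. Works for URLs with or without a scheme.
    """
    if '/' not in type_url:
        raise ValueError('Type URL must contain a slash: {!r}'.format(type_url))
    # Without a scheme, urlsplit() puts the whole string into .path
    path = urlsplit(type_url).path
    name = path.rsplit('/', 1)[-1]
    if not name:
        raise ValueError('Type URL has no message name: {!r}'.format(type_url))
    return name


short_name = type_name_from_url('type.googleapis.com/NetworkErrorDetails')
packaged_name = type_name_from_url('https://example.com/types/my.pkg.Msg')
```

A lookup table keyed on these names would then accept the same message type regardless of which host prefix the sender used. Whether that is desirable depends on how much you trust the sender.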

To encode a message, simply reverse the process:

def encode_any_field(type_, msg):
    if type_ not in PB_TYPES:
        raise TypeError('Unrecognized message type {}'.format(type_))
    return {'@type': type_, 'msg': PB_TYPES[type_].encode(msg)}

msg = {'message': 'A network error has occurred.', 'details': []}
msg['details'].append(encode_any_field(
    'type.googleapis.com/NetworkErrorDetails',
    {'errno': 104, 'dest_ip': '192.168.1.1', 'port': 12345}
))
encoded = ErrorStatus.encode(msg)