How does the messagepack encoding/decoding compare to msgpack-cpp in terms of performance? #585

Closed
sztomi opened this issue May 15, 2017 · 21 comments
Labels
aspect: binary formats BSON, CBOR, MessagePack, UBJSON kind: question

Comments

@sztomi

sztomi commented May 15, 2017

No description provided.

@nlohmann
Owner

I have no idea. Is there a benchmark suite?

@nlohmann nlohmann added the aspect: binary formats BSON, CBOR, MessagePack, UBJSON label May 15, 2017
@sztomi
Author

sztomi commented May 15, 2017

They used to have a benchmark suite written in Perl, but apparently it's gone now. This is their main repo: https://github.com/msgpack/msgpack-c

@nlohmann
Owner

I have little knowledge about that library. Would this be a fair benchmark example:

#include <msgpack.hpp>
#include <fstream>
#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>

int main()
{
    for (auto filename : {"sample.json.msgpack", "all_unicode.json.msgpack"})
    {
        std::ifstream f_msgpack(filename, std::ios::binary);
        std::vector<char> packed((std::istreambuf_iterator<char>(f_msgpack)),
                                    std::istreambuf_iterator<char>());

        auto x = msgpack::unpack(packed.data(), packed.size());
        std::stringstream ss;
        ss << x.get();
        std::cout << ss.str().size() << std::endl;
    }
}

vs.

#include <json.hpp>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>

using json = nlohmann::json;

int main()
{
    for (auto filename : {"sample.json.msgpack", "all_unicode.json.msgpack"})
    {
        std::ifstream f_msgpack(filename, std::ios::binary);
        std::vector<uint8_t> packed((std::istreambuf_iterator<char>(f_msgpack)),
                                    std::istreambuf_iterator<char>());
        std::stringstream ss;
        ss << json::from_msgpack(packed);
        std::cout << ss.str().size() << std::endl;
    }
}

input files

@sztomi
Author

sztomi commented May 15, 2017

Probably, but without the I/O (I mean that only the deserialization should be timed). I/O is unpredictable and will skew the results. Are you using I/O to make sure that the results are not optimized away?

@nlohmann
Owner

Of course, the time reading the files into memory should not be measured. I was rather asking about the semantics, because I was searching for a size() equivalent in msgpack-c to get the number of elements in an object/array.

@sztomi
Author

sztomi commented May 15, 2017

Ah, OK. You can use x.get().via.array.size, but this might not actually make msgpack-c deserialize the object. In fact, I think it's best to deserialize into a vector or something in both cases.

@nlohmann
Owner

This whole benchmark heavily depends on what you want to do... In my case, I create a JSON value from the MessagePack input, so I would like to compare against something similar.

@sztomi
Author

sztomi commented May 15, 2017

You are right, that's a valid comparison. However, why not benchmark both? I think each shows an interesting facet of the performance.

@nlohmann
Owner

If you can provide examples with a description for msgpack-c, I can try to add examples for my library.

@sztomi
Author

sztomi commented May 15, 2017

Sure, I will put together some examples of my use cases and add them as a comment here this weekend.

@nlohmann
Owner

That's great! There are also some large JSON files in the benchmarks folder that could be used.

@nlohmann
Owner

nlohmann commented Jun 7, 2017

Any news on this, @sztomi?

@sztomi
Author

sztomi commented Jun 9, 2017

Sorry @nlohmann, I was on vacation and forgot to update this thread. Will get back to it soon, thanks for your patience!

@sztomi
Author

sztomi commented Jun 14, 2017

Thanks again for the patience, here are my use cases:

I'm using msgpack to implement msgpack-rpc in rpclib. Each request is encoded as an object like this:

[type, msgid, method, params]

where type and msgid are integers, method is a string and params is an arbitrary object. The library reads from a stream and whenever a complete object is read, it dispatches the call. To do this,

  • The request has to be decoded from binary into msgpack::object
msgpack::unpacked result;
while (pac_.next(&result) && !exit_) {   // <-- decoding to msgpack::object
    auto msg = result.get();
    auto response = dispatch(msg);
}
  • If the method is found by name, the params object is deserialized into a tuple that is used to invoke the selected function.

dispatch ultimately calls a type-erased function wrapper that performs this deserialization and passes the arguments to the function.

template <typename F>
void dispatcher::bind(std::string const &name, F func,
                      detail::tags::void_result const &,
                      detail::tags::nonzero_arg const &) {
    using detail::func_traits;
    using args_type = typename func_traits<F>::args_type; // this is a tuple type of function arguments

    funcs_.insert(
        std::make_pair(name, [func, name](msgpack::object const &args) {
            constexpr int args_count = std::tuple_size<args_type>::value;
            args_type args_real;
            args.convert(&args_real);  // <-- deserialization 
        }));
}

These are my primary use-cases.

@nlohmann
Owner

@sztomi Could you provide a complete example (self-contained file with a main function) please?

@sztomi
Author

sztomi commented Jun 20, 2017

Sure, I'll put together a self-contained git repo and link it here. Forgive my slow response times, I'm having a very busy week.

@nlohmann
Owner

nlohmann commented Jul 8, 2017

Any news on this?

@sztomi
Author

sztomi commented Jul 10, 2017

Will update soon.

@sztomi
Author

sztomi commented Jul 17, 2017

@nlohmann
Owner

Thanks for checking back!

It's hard to make a comparison, because that msgpack library does not process arbitrary JSON. I could measure the time it takes to parse a list of, say, 10000 integers, but I'm not sure this is helpful.

It would be handy if I could measure reading an arbitrary MessagePack serialization. Do you know whether this is possible?

@nlohmann
Copy link
Owner

I cannot make this comparison myself. If anyone can perform such a benchmark, I would be happy to hear about the outcome.
