
Alternative data format for matrix shards storage #5

Open
5kg opened this issue Jun 11, 2015 · 26 comments

Comments

@5kg
Contributor

5kg commented Jun 11, 2015

Quote 1, 2, 3:

Protocol Buffers are not designed to handle large messages. As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy.

For example, you have to parse a message all at once, and serialize it all at once. So if you have a message containing a 100MB blob, you can't read any part of the message unless you read in the entire 100MB and block the calling thread while it parses. Also problematic is the fact that the 100MB blob will be allocated as one gigantic flat byte array. On 64-bit systems this may be fine but on 32-bit you may have address space fragmentation issues. Finally, there is a hard message size limit at 2GB.

Options:

  • csv
  • hdf5
  • leveldb
  • lmdb

We can probably use snappy for data compression.
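
A rough sketch of what that could look like, assuming the github.com/golang/snappy package (the shard bytes below are just a placeholder):

    package main

    import (
        "fmt"

        "github.com/golang/snappy"
    )

    func main() {
        // Placeholder for a serialized matrix shard.
        shard := make([]byte, 1<<20)

        // Compress before writing to storage or sending over the wire.
        compressed := snappy.Encode(nil, shard)

        // Decompress on the read path; Decode returns an error on
        // corrupt input.
        restored, err := snappy.Decode(nil, compressed)
        if err != nil {
            panic(err)
        }
        fmt.Printf("original=%d compressed=%d restored=%d\n",
            len(shard), len(compressed), len(restored))
    }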

@hongchaodeng

Thanks @5kg for sharing. This is interesting.
Do you suggest that we shouldn't use protobuf to transfer large datasets of models? If so, can you give more details w.r.t. the other options?

@xiang90
Contributor

xiang90 commented Jun 11, 2015

@5kg

Well, I was also thinking about this when I wanted to use pb to send a 1GB snapshot.

Then I asked the designer of protobuf and the person who implemented go-protobuf.

They told me "never mind, you can do it actually". Then I benchmarked it, and everything just worked fine.

I suggest you benchmark it first.

I am not against using a compression mechanism. However, I am not quite sure why lmdb and leveldb (kv stores), csv and hdf5 (file formats), and snappy (compression) are grouped together. A little confused.

@5kg 5kg changed the title Alternative data format for matrix shards Alternative data format for matrix shards storage Jun 15, 2015
@5kg
Contributor Author

5kg commented Jun 15, 2015

Hi,

The original description is indeed confusing. Sorry for that.

I think the data format for matrix shard storage and for transfer between taskgraph nodes are two separate issues. Of course, if we find a data format that fits both, that would be great.

For data transfer, we may take a look at flatbuffers or Cap’n Proto. They both feature "access to serialized data without parsing/unpacking", which might be favorable for our usage (transferring huge float arrays). I've not used these two libraries before. We need to do some additional research and benchmarking before adopting either library. Or maybe just wait until protobuf becomes the bottleneck.

For data storage, there is a 2G hard limit on the size of a protobuf message, since they encode sizes with a 32-bit integer, I think. This limit has been biting people who ignore the "limit your message size to 1 MB" rule of thumb -- BVLC/caffe#2006 😄 .

The most commonly used data formats for storage are csv & hdf5. Some machine learning libraries like caffe also use lmdb/leveldb for data storage. Their use case for a kv store is mostly fetching independent data blobs such as images, which I think is irrelevant for bwmf.

@5kg
Contributor Author

5kg commented Jun 15, 2015

In terms of snappy, could we integrate it into taskgraph?

Not a priority for now.

@xiang90
Contributor

xiang90 commented Jun 15, 2015

For data transfer, we may take a look at flatbuffers or Cap’n Proto. They both feature "access to serialized data without parsing/unpacking"

Several things to think about:
These encoding mechanisms simply amortize the overall encoding/decoding cost across each access.

If you only access the data once, it should be slower than encoding/decoding in bulk, both intuitively and in theory.

It really depends on your data access pattern. If you can provide the common access pattern of bwmf, we can make a better decision.

For data storage, there is a 2G hard limit on the size of a protobuf message

Reference? On a 64-bit machine, I do not think there is actually a 2GB limit.

@xiang90
Contributor

xiang90 commented Jun 15, 2015

The most commonly used data formats for storage are csv & hdf5. Some machine learning libraries like caffe also use lmdb/leveldb for data storage. Their use case for a kv store is mostly fetching independent data blobs such as images, which I think is irrelevant for bwmf.

If you use flatbuffers or Cap’n Proto, one nice thing you get for free is that the data is mmap-able. Thus you do not need to save it into any kv store.
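
Roughly, on the storage side (a Unix-only sketch using syscall.Mmap; GetRootAsMatrixShard is a hypothetical accessor standing in for whatever flatbuffers would generate from a real schema):

    package main

    import (
        "log"
        "os"
        "syscall"
    )

    func main() {
        f, err := os.Open("shard.dat")
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()

        fi, err := f.Stat()
        if err != nil {
            log.Fatal(err)
        }

        // Map the file read-only: no copy is made, and pages are
        // faulted in lazily by the OS as they are touched.
        data, err := syscall.Mmap(int(f.Fd()), 0, int(fi.Size()),
            syscall.PROT_READ, syscall.MAP_SHARED)
        if err != nil {
            log.Fatal(err)
        }
        defer syscall.Munmap(data)

        // A generated flatbuffers accessor would then read fields
        // directly out of the mapped bytes, e.g. (hypothetical):
        //   shard := GetRootAsMatrixShard(data, 0)
        //   v := shard.Values(i)
    }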

@xiang90
Contributor

xiang90 commented Jun 15, 2015

In terms of snappy, could we integrate it into taskgraph?

I think grpc will support snappy eventually. So you do not need to worry about it for communication.

@5kg
Contributor Author

5kg commented Jun 15, 2015

Reference? On a 64-bit machine, I do not think there is actually a 2GB limit.

They use int explicitly: https://github.com/google/protobuf/blob/4644f99d1af4250dec95339be6a13e149787ab33/src/google/protobuf/message_lite.cc#L243

@hongchaodeng

Could we write a test case to try the 2GB limit?

@xiang90
Contributor

xiang90 commented Jun 15, 2015

@5kg int on 32bit = 32 bit; int on 64bit = 64bit. int32=32; int64=64.

I may be wrong though... I've been using Go too much.

@hongchaodeng

That's C++.

@5kg
Contributor Author

5kg commented Jun 15, 2015

@5kg int on 32bit = 32 bit; int on 64bit = 64bit. int32=32; int64=64

For C++, sizeof(int) = 4 even on a 64-bit machine. Maybe it's only a limit for C++?

Let me write a test for it.

@hongchaodeng

Great. Thanks @5kg !

@5kg
Contributor Author

5kg commented Jun 18, 2015

@fengjingchao @xiang90
I tried to generate protobuf files larger than 2G -> https://gist.github.com/5kg/f27a32f238c376635024

Marshaling/unmarshaling seems to be OK. But I found that memory usage blew up during the process: it uses as much as 20G of memory to unmarshal a 2.8G protobuf file.
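
The core of the test is roughly the following (a sketch; the Test message is a stand-in with a single repeated double field, and the real gist may use a different schema):

    package main

    import (
        "io/ioutil"
        "log"

        "github.com/golang/protobuf/proto"
    )

    // Assumes a schema like `message Test { repeated double data = 1; }`
    // compiled with protoc; the Test type below comes from that
    // (hypothetical) generated code.
    func main() {
        // ~2.8G of payload: 350M float64 values at 8 bytes each.
        msg := &Test{Data: make([]float64, 350*1000*1000)}

        buf, err := proto.Marshal(msg)
        if err != nil {
            log.Fatal(err)
        }
        if err := ioutil.WriteFile("test.pb", buf, 0644); err != nil {
            log.Fatal(err)
        }

        // Round trip: read the file back and unmarshal it, which is
        // where the memory blow-up shows up.
        raw, err := ioutil.ReadFile("test.pb")
        if err != nil {
            log.Fatal(err)
        }
        var out Test
        if err := proto.Unmarshal(raw, &out); err != nil {
            log.Fatal(err)
        }
        log.Printf("unmarshaled %d values", len(out.Data))
    }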

@xiang90
Contributor

xiang90 commented Jun 18, 2015

@5kg Have you tried gogoproto? I think the problem is the inefficient goproto library...

@5kg
Contributor Author

5kg commented Jun 18, 2015

@xiang90 Never heard of gogoproto. Good to know.

All I need to do is regenerate test.pb.go with protoc --gofast_out=. test.proto, right? I don't see much difference.

Before:

$ /usr/bin/time -v go run unmarshal.go
    Command being timed: "go run unmarshal.go"
    User time (seconds): 28.09
    System time (seconds): 25.28
    Percent of CPU this job got: 89%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:59.73
......
    Maximum resident set size (kbytes): 20805872 # (20.8059 G)
......
    Exit status: 0

After:

zifei@wuxiaoyun-desktop:/tmp/f27a32f238c376635024$ /usr/bin/time -v go run unmarshal.go
    Command being timed: "go run unmarshal.go"
    User time (seconds): 17.43
    System time (seconds): 34.02
    Percent of CPU this job got: 90%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:56.63
......
    Maximum resident set size (kbytes): 20800400
......
    Exit status: 0

@5kg
Contributor Author

5kg commented Jun 18, 2015

I also changed import "github.com/golang/protobuf/proto" to import "github.com/gogo/protobuf/proto"

@hongchaodeng

Can you upload the code to the gist with the commands too?
Then we can investigate it along with you.


@5kg
Contributor Author

5kg commented Jun 18, 2015

@fengjingchao Sure. Please see https://gist.github.com/5kg/f27a32f238c376635024/revisions

You can clone the gist and then check out an older version.

@xiang90
Contributor

xiang90 commented Jun 18, 2015

@5kg gogoproto will generate marshal code, which is significantly better than reflection-based marshaling.

@5kg
Contributor Author

5kg commented Jun 18, 2015

No luck. It's faster, but memory usage is still the same.

zifei@wuxiaoyun-desktop:/tmp/f27a32f238c376635024$ /usr/bin/time -v go run unmarshal.go
    Command being timed: "go run unmarshal.go"
    User time (seconds): 13.53
    System time (seconds): 12.72
    Percent of CPU this job got: 75%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:34.91
    Maximum resident set size (kbytes): 20803320
    Exit status: 0

Updated gist: https://gist.github.com/5kg/f27a32f238c376635024

@xiang90
Contributor

xiang90 commented Jun 18, 2015

@5kg Weird... Can you try using Go's runtime pkg to print out the memstats?
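
Something like this (a minimal sketch; printMemStats is just a helper I made up for this thread, not part of the gist):

    package main

    import (
        "log"
        "runtime"
    )

    // printMemStats dumps the allocator counters most relevant here.
    func printMemStats(label string) {
        var m runtime.MemStats
        runtime.ReadMemStats(&m)
        log.Printf("%s: HeapAlloc=%dMB TotalAlloc=%dMB Sys=%dMB NumGC=%d",
            label, m.HeapAlloc>>20, m.TotalAlloc>>20, m.Sys>>20, m.NumGC)
    }

    func main() {
        printMemStats("before unmarshal")
        // ... proto.Unmarshal(...) as in the gist ...
        printMemStats("after unmarshal")
    }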

@xiang90
Contributor

xiang90 commented Jun 18, 2015

@5kg I can help you debug this when I have time, too.

@xiang90
Contributor

xiang90 commented Jun 18, 2015

@5kg Also you can quickly scan the generated code. It should be very easy to understand.

@xiaoyunwu
Contributor

Is it because there is a map in the proto message?

