Binary size is too big! #169

Open

ncw opened this issue Aug 9, 2024 · 5 comments

Comments

@ncw

ncw commented Aug 9, 2024

We use storj.io/uplink and storj.io/uplink/edge in the Storj backend for rclone.

I've just discovered in rclone/rclone#7998 that the storj backend is adding over 6MiB to the rclone binary! That is nearly 10% of the binary size. Most backends weigh in at about 200k - it is only the ones with big external SDKs that exceed 1MB, and the storj backend is by far the largest.

I suspect that this package is pulling in more than it needs, and those extra dependencies are making their way into the rclone binary.

I used https://github.com/Zxilly/go-size-analyzer to examine the rclone binary and it seems storj.io/common/pb is the biggest culprit. Maybe this could be split up into subpackages per task?
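For reference, the analysis was roughly this (the exact install path and flags may differ between go-size-analyzer versions):

```sh
# install the analyzer (the binary is named gsa); path/flags may vary by version
go install github.com/Zxilly/go-size-analyzer/cmd/gsa@latest

# build rclone and get a per-package size breakdown
go build -o rclone .
gsa ./rclone        # gsa --web ./rclone opens an interactive treemap instead
```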

I note that AWS split their SDK up into submodules to help with this problem. The code all still lives in the same single GitHub repo, but the subpackages have their own go.mod files, and the new workspaces feature makes that manageable to work on.
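For illustration only, the workspace side of such a split might look roughly like this (the submodule directories below are hypothetical, not the actual storj.io/common layout):

```
// go.work at the repository root; each listed directory has its own go.mod
go 1.22

use (
    .               // storj.io/common
    ./pb/metainfo   // hypothetical storj.io/common/pb/metainfo submodule
    ./pb/piecestore // hypothetical storj.io/common/pb/piecestore submodule
)
```

The idea being that a consumer like uplink would then only import, and therefore link, the pieces it actually needs.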

Any thoughts on how to improve this?

Thanks

@egonelbre
Member

I definitely feel the pain as well. Without delving too deep into the backstory of things:

We've been working on https://github.com/storj/picobuf in the background to minimize the binary sizes. There are still some features missing that would allow it to replace gogo/protobuf (and protobuf) in general, but progress got stuck due to shifting priorities.

Why a full replacement? Because one of the major problems with protobuf is that it relies on reflection and Method.Call, which forces the compiler into conservative mode and prevents a lot of code from being elided.
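To make that concrete, here is a minimal standalone sketch (nothing to do with protobuf itself) of the pattern that triggers the conservative mode:

```go
package main

import (
	"fmt"
	"reflect"
)

type Message struct{}

// Nothing calls Reset directly, but the MethodByName/Call below means the
// linker generally cannot prove which exported methods are looked up at
// runtime, so it keeps them all (and everything they reference) alive.
func (Message) Reset() { fmt.Println("reset") }

func main() {
	m := reflect.ValueOf(Message{})
	m.MethodByName("Reset").Call(nil)
}
```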

Splitting up ended up feeling like a temporary bandaid for the protobuf situation, so we didn't go that route.

The main reason for the bloat is the proto.RegisterType and fileDescriptor machinery, which means the compiler isn't able to remove the unused types. I'll first review whether I can remove them, because AFAIK we don't need that part of protobuf.
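For context, the generated code in question looks roughly like this (a trimmed-down sketch of gogo-style output, not the exact contents of storj.io/common/pb):

```go
package examplepb

import proto "github.com/gogo/protobuf/proto"

type ExampleMessage struct{ Value int64 }

func (*ExampleMessage) Reset()         {}
func (*ExampleMessage) String() string { return "ExampleMessage" }
func (*ExampleMessage) ProtoMessage()  {}

// In real generated code this holds the gzipped FileDescriptorProto.
var fileDescriptorExample = []byte{ /* ... */ }

// init runs for every imported pb package, so every registered message type
// and its descriptor stay reachable even when nothing else uses them.
func init() {
	proto.RegisterType((*ExampleMessage)(nil), "example.ExampleMessage")
	proto.RegisterFile("example.proto", fileDescriptorExample)
}
```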

@storj-gerrit

storj-gerrit bot commented Aug 12, 2024

Change pb: remove descriptors and type registration mentions this issue.

@ncw
Author

ncw commented Aug 12, 2024

@egonelbre thanks for taking a look at this - much appreciated.

> Because one of the major problems with protobuf is that it relies on reflection and Method.Call, which forces the compiler into conservative mode and prevents a lot of code from being elided.

I did not realise this, and it probably explains some of the other bloat in the rclone binary from the HDFS backend, which also uses protobufs.

storjBuildBot pushed a commit to storj/common that referenced this issue Aug 12, 2024
Descriptors are used to dynamically read protobuf information that has
not been encoded into the types.

Similarly, the type registration ends up forcing every protobuf type to
be reachable. This ends up significantly bloating the binary sizes.
The type registrations are used to deserialize arbitrary messages, a feature
Storj does not use.

Updates storj/uplink#169

Change-Id: I705e3a481b2623d164e9f5c237bf88ef855a386c
@egonelbre
Member

So, removing the descriptors only ended up reducing things by 300kb.

I also experimented with using picobuf, which reduced things by 2.5MB compared to the baseline.

binary              size (bytes)
current.exe         83315712
no-descriptors.exe  82867712
picobuf.exe         80880128
no-storj.exe        76346368

But I'm still slightly confused why it ends up that large - even with picobuf. I kind of understand why it ends up larger than other backends (mostly because many others use net/http and hence can share that code), but not to that extent. Some of our monitoring could be the culprit, due to inlining.

@egonelbre
Member

As far as I can tell, we might be hitting an inliner bug that causes some code to be inlined which probably shouldn't be. I'll need to dig into that more.
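For anyone following along, the compiler's inlining decisions can be dumped with something along these lines:

```sh
# -m prints inlining (and escape analysis) decisions; repeating it adds detail
go build -gcflags='all=-m -m' ./... 2>&1 | grep 'inline'
```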
