-
Notifications
You must be signed in to change notification settings - Fork 3
Breaking Change: reduce bundle size by shipping no codecs or hashes by default #4
Comments
Digging into this some more. Another large source of bundle bloat is just the multicodec and multihash tables themselves. If we create an API for configuration that requires a string identifier and an integer, we can create the parts of the table we support and leave out the rest and not actually need to include the full tables. |
While it's convenient to have the codecs and algorithms bundled here, it's not necessary at this level and I'd anticipate higher layers of an application stack bundling codecs and algorithms in ways that are specific to that application. Just needs to be well documented I suppose.
What sort of sizes are we talking about here? You're not being too obsessive about size are you (in a premature optimization kind of way)? |
It’s 58K after minification and gzip, which is pretty big for being such a low level dependency. https://bundlephobia.com/result?p=@ipld/block@2.0.6 As you can see from the visualization, the big contributors are the codec tables, borc for cbor encoding, and bignumber which is used by one of the hash algorithms. |
I agree that keeping the bundle size in mind is a good idea. Though I'm with @rvagg here and think this might be a premature optimization. E.g. if you're using IPLD, you are likely to use CBOR. So in any application, you will have borc as a dependency somewhere. So if you create your bundle it doesn't really matter the js-block also contains it. I would rather look into an API to be able to remove unnecessary codec, rather then loading them dynamically somehow. I prefer having it easy to use by default and having it possible to reduce the size for people that really need it. |
So, I want to disagree on two counts ;) I’m actually working on a browser thing that uses unixsfv2 but only Second,
This would be nice but unfortunately it doesn’t work well with the way that bundlers work. Once you import something it’s going to be in the bundle. If we hand someone a dependency that imports other modules they are going to get them in their bundle even if they disable them somehow. This is one of the many reasons I’ve grown quite weary of the entire practice of bundling, but it’s the dominant pattern right now so we need to figure out how to work well with it. |
Another option we could consider is a secondary endpoint that excludes any codecs. Something like We could then leave the main block API as-is. |
I would still consider that a special case where you want to be able to trim it down :)
I have to admit I know too little about bundling and how to make things practically usable. |
I'm seeing this pop up a bit these days, @alanshaw gifted it to bl which happens to have streaming functionality but isn't its core value-add, so to use it without readable-stream (which is not getting any smaller) you can I don't mind this approach and we should document it (same goes for bl which I'm noticing doesn't document it). But back to my earlier point, I view this as an intermediate piece of a larger puzzle and I'd rather the glue go in higher levels of abstraction. The further you go up the abstraction chain toward a concrete application, the more firm your opinions can be because your needs become clearer. I'm pretty confident this piece will be very low in the abstraction hierarchy and therefore its opinions about what pieces it's going to be coupled with should be very lose. |
This deserves some discussion before implementation.
The largest contributor to our bundle size is codecs and hash algorithms. When unused these are completely unnecessary bundle bloat, and rarely are all of the codecs and hashes we bundle actually used. The
get-codec
library ships with only 4 codecs (raw, dag-json, dag-cbor, and dag-pb) and is already an unnacceptable bundle size for a dependency. Cutting down to justraw
or mayberaw
anddag-json
should be an acceptable size.But it’s worth discussing what the optimal API is for configuring which codecs and hash functions should be bundled. The
Block
API is in a weird place in the application stack, instances are created at a fairly low level and then passed on to higher level code, so it’s a bit hard to figure out where users would pass in these implementations.We already have an API for configuring/adding codecs to the block interface but I’m not sure how I feel about it. It just exposes the
get-codec
interface.There’s no equivalent API in
multihashing-async
so we’ll probably have to drop it and write our own.Thoughts?
PS. When we removed async from
encode()
anddecode()
we also removed the ability to dynamically load a codec over the network. That may have been a mistake, but it’s not something I want to re-litigate until we have a better understanding of the performance tradeoffs. However,block.cid()
is still async, and will remain so because browser hashing algorithms are async, and that means we could implement dynamic loading for hashing algorithms.The text was updated successfully, but these errors were encountered: