(Continuing a discussion started here.)
The `cld2` library is a natural-language detection library from Google, and it does some pretty cool stuff. I've packaged it as two Rust libraries, `cld2` and `cld2-sys`. But because the upstream `cld2` library is packaged by very few Linux distributions, I've chosen to distribute the source code with the `cld2-sys` package and build it using the Rust `gcc` library. So far, so good: all this works quite nicely.
But I can't upload the package to crates.io because it contains statistical language models, and those models are just too big:
$ du -sh target/package/cld2-sys-0.0.1.crate
35M target/package/cld2-sys-0.0.1.crate
I can shrink this down somewhat (by omitting everything I don't need for the build), but I almost certainly can't get it under the 10MB limit. I can think of a couple of ways to address this issue:
- Accept that certain `*-sys` packages will be larger than 10MB, and provide some way to override the limit selectively.
- Store compressed source code in an S3 bucket, and ask `build.rs` to download it. But this introduces a dependency on an outside data source that may go away.
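To make the second option concrete, here is a rough sketch of the helpers a `build.rs` might use. It assumes `curl` is installed on the build machine and shells out to it, since the standard library has no HTTP client. The function names and the example URL are hypothetical, not part of `cld2-sys`.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Pick a local file name for the archive inside `out_dir`
/// (in a real build script this would be Cargo's OUT_DIR).
fn archive_path(out_dir: &Path, url: &str) -> PathBuf {
    let name = url
        .rsplit('/')
        .next()
        .filter(|s| !s.is_empty())
        .unwrap_or("sources.tar.gz"); // fallback name, chosen arbitrarily
    out_dir.join(name)
}

/// Build (but do not run) the `curl` invocation.
/// Assumption: `curl` is available on the build machine.
fn download_command(url: &str, dest: &Path) -> Command {
    let mut cmd = Command::new("curl");
    cmd.args(&["--fail", "--location", "--silent", "--show-error", "--output"])
        .arg(dest)
        .arg(url);
    cmd
}

/// Download the archive unless a copy is already present in `out_dir`.
/// Returns the local path, or an error message if the download fails.
fn fetch_sources(url: &str, out_dir: &Path) -> Result<PathBuf, String> {
    let dest = archive_path(out_dir, url);
    if dest.exists() {
        return Ok(dest); // reuse the copy from an earlier build
    }
    let status = download_command(url, &dest)
        .status()
        .map_err(|e| format!("could not run curl: {}", e))?;
    if status.success() {
        Ok(dest)
    } else {
        Err(format!("download of {} failed: {}", url, status))
    }
}
```

A `build.rs` `main` would read `OUT_DIR` from the environment, call something like `fetch_sources("https://example.com/cld2-sources.tar.gz", &out_dir)` (placeholder URL), and then unpack and compile the result. This does nothing about the underlying concern: if the bucket goes away, the build simply fails with the error above. A real setup would probably also want to verify a checksum of the download, which needs an extra crate.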
Any thoughts on the best way to handle this? Thank you for your advice, and for a great package-management system!