Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warning when large binary files are included into the bundle #9058

Open
Tracked by #84
mexus opened this issue Jan 8, 2021 · 4 comments
Open
Tracked by #84

Warning when large binary files are included into the bundle #9058

mexus opened this issue Jan 8, 2021 · 4 comments
Labels
A-diagnostics Area: Error and warning messages generated by Cargo itself. C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-publish S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.

Comments

@mexus
Copy link

mexus commented Jan 8, 2021

Describe the problem you are trying to solve

It seems like some crate authors do not ask cargo to exclude pictures/pdfs/other-blobs by providing an appropriate exclude = [ .. ] directive at their Cargo.tomls.

And it also seems like most of the time it is not intentional, it's more like developers might simply forget that anything that is not ignored by default gets packages into the published .crate file.

I'm not quite sure how common the situation is, though it might be happening quite often, since a deliberate action is required to exclude blobs from the package, and I wasn't able to find any advices to exclude "extra" files in the official guides.

Describe the solution you'd like
I suppose a warning when including binary large objects should do the trick, something like

[WARN] It might happen that you are unintentionally packaging a large binary object 'exapmles/drawing.bmp' (25 MB).
If it is intentional, you can turn off this warning by providing the `--allow-blobs` flag to the cargo.

Notes
It would be great if somebody's got an idea how to actually check the assumption that developers are unintentionally including extra blobs to their packages on crates.io. Currently I can only think of an article in the TWIR which explains the possible issue, but I'm looking for better ideas :)

@mexus mexus added the C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` label Jan 8, 2021
@Eh2406
Copy link
Contributor

Eh2406 commented Jan 9, 2021

There are many people that have downloaded all the crates and can get us a size by file type histogram.
Adding a comment about this to the instructions for publishing a crate would be great.
Adding a output line when running cargo publish that lists oddly large files, may help people notice this without changing our API and without making more work for those that are including them intentionally.

@mexus
Copy link
Author

mexus commented Jan 10, 2021

I know it's not that representative, but just for the reference, here's a distribution of binary files in my local /.cargo/registry/src/github.com-.. in log scale:

distribution

So in my case, most of the binary files are less than 100 KiB (but there's a significant amount of them though), but at I've also got at least a couple of files that are larger than 1 MiB. I also wonder how many of them have been included deliberately though 🤔

@ehuss ehuss added A-diagnostics Area: Error and warning messages generated by Cargo itself. Command-publish labels Feb 13, 2021
bors added a commit that referenced this issue Oct 29, 2022
Report crate size on package and publish

### Motivation

Fixes #11251.

This adds a line like `Packaged 42 files, 727.0KiB (143.8KiB compressed)` to the output of `cargo package` and `cargo publish`. See the associated issue for more details.

### Test info

I've updated associated tests to account for the new line in the output, including the file count where relevant. I've also added a few additional tests specifically to address the uncompressed and compressed file sizes.

If you'd like to test this manually, simply run `cargo package` or `cargo publish` within a project of your choice. The new `Packaged` line will appear at the end of the stderr output, or directly before the `Uploaded` line, respectively.

### Review info

This PR touches many test files to account for the additional line of stderr output. For convenience, I've split the PR into 4 commits.
1. contains the actual implementation
2. updates existing tests unrelated to `cargo package` and `cargo publish`
3. updates existing tests related to `cargo package` and `cargo publish`, including file counts where relevant
4. adds new tests specifically for the file sizes, which are generally not covered by existing tests

### Additional information

The feature was discussed [on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/Report.20crate.20size.20when.20packaging) prior to implementation.

Potential future extensions to explore include:
- Report size per file when using `--verbose`
- Compare to the size of the previously uploaded version to warn if the increase is sufficiently large
- Consider designs that can draw more attention to the most important information
- Warn if large binary files are included ([#9058](#9058))
@weihanglo weihanglo added the S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted. label Jun 3, 2023
@weihanglo
Copy link
Member

Note that with -Zlints we could have a configurable lint for Cargo to emit this warning.

@kornelski
Copy link
Contributor

This should also complain about directories with CACHEDIR.TAG.

Cargo is smart enough not to package target/ dir in the default configuration, but inconsistent use of CARGO_TARGET_DIR can create copies of the target dirs that do get packaged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Error and warning messages generated by Cargo itself. C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-publish S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.
Projects
None yet
Development

No branches or pull requests

5 participants