Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why are dependencies vendored? #95

Closed
LPGhatguy opened this issue Jan 17, 2019 · 15 comments
Closed

Why are dependencies vendored? #95

LPGhatguy opened this issue Jan 17, 2019 · 15 comments

Comments

@LPGhatguy
Copy link

Hey! I pulled down Azul today to give it a shot with hello world, it's pretty great!

I did notice that there was a gigantic Git checkout at the part of my first build, so I dug around and found that all the dependencies are in one giant repo, azul-dependencies.

Why are dependencies vendored like this?

It's super strange and makes it so that there's two versions of every crate that I have in common with Azul in my executable! It also blocks the entire download/compilation step of my application with a single giant Git clone.

@fschutt
Copy link
Owner

fschutt commented Jan 17, 2019

The fundamental problem why this decision was made is because cargo ignores Cargo.lock files on libraries and relies purely on semantic versioning to be correct. Which, in the past year has been shown to be very prone to human error. Meaning, if someone just mis-versions their crate, cargo will "auto-upgrade" your crate version in the dependencies of this library and also break your build in the process, meaning you can't compile your code anymore. But then I get the blame for "why does azul not build anymore" and I have to run around and fix everything - and the more dependencies azul has, the more attack points. cargo is currently not in a state where it can reliably produce builds. I reported this to the cargo team, back in March 2018 but it simply got ignored.

The fundamental problem is that cargo completely ignores Cargo.lock files on dependencies, which means that if dependencies are yanked (which usually does not happen for security reasons, but for petty reasons like making a mistake in the readme), then it breaks all libraries that depend on that library (since cargo ignores lock files for libraries).

I am not alone with this problem either, it's just that other repositories like winit for example, who experienced the same problem simply say "we don't guarantee that the code deployed on crates.io actually works". I can at least guarantee that if you write code and you lock the version of azul that you're using, that you will at least not have problems with azul (because the code is self-contained). You might still have problems with controlling the versions of other crates (and I encourage you to vendor your dependencies for any long-term project), but at least you won't have problems with azul.

cargo also does not respect compiler versions with regards to semantic versioning - azul supports rustc 1.30 and that will probably stay this way - there is currently no reason to upgrade to anything newer, except if the dependencies force you to upgrade. However, with auto-upgrading cargo you may get a 2018-edition crate for a 2015-edition compiler, which, again, will break your build.

Lastly, cargo also breaks code with this auto-upgrade-the-version BS. For example, if code relies on global variables or lazy-static and cargo merges two versions of a crate behind your back, very weird things can happen (such as #57), because the code expected the globals to be used "in isolation", but cargo merged the code together, so now you have funny bugs, depending on what mood cargo is currently in.

Yes, that means that you'll have to compile dependencies twice, but choosing between that and fixing dependency problems every five days it is the lesser evil from my perspective. cargos version selection algorithm simply does not work, cargo would need automatic versioning to cope with the human error aspect. See #80 and #31 for further discussion on this topic and why this problem is so annoying.

So overall - yes, you'll have to compile things twice. But at least your code compiles at all. That's just the way it is for now, it's not unusual for other repositories to do this. While I've written a seven-page rant on this topic and what needs to be done to fix this situation, I haven't contacted the cargo team at large about it (beside that one issue), because I simply don't have the time to deal with this or the inevitable drama that will ensure.

@fschutt fschutt closed this as completed Jan 17, 2019
@mominul
Copy link

mominul commented Jan 17, 2019

Cargo uses the maximal version resolution, so cargo will select the most new patch version of the dependency automatically. After reading the links you have posted, I think the first culprit is the stb-truetype crate. It had migrated to the 2018 edition but with only a patch release (that crate only supports the new rust releases, so it's logical for it). And the breaking drama ensures. Azul currently provides minimum rust version support guarantee though it's dependencies doesn't. So I think Azul should also take the path of it's dependencies and upgrade to the 2018 edition(I can help with it). As it seems logical to me. But the current hack seems really really ugly. Thanks 😊

@fschutt
Copy link
Owner

fschutt commented Jan 17, 2019

@mominul "winit can't guarantee a stable compiler version, so azul should neither" - sorry, I don't get that argument. Reproducability and stability are more important than compile time.

I know what resolution algorithm cargo is using, the problem is that it relies on human versioning, ignores Cargo.lock files, doesn't concern itself at all with compiler breakage, and basically operates a la "move fast and break everything". This may work for small projects but not for larger ones.

What's the reason for changing it - "it looks ugly" - that's not really an argument, that's just subjective taste. There is no reason for upgrading to Rust 2018. The breaking changes are mostly of small, syntactic nature. In the past I've only upgraded the Rust compiler when there was a reason to do so, when the new version had something that made old hacks easier. But there is no reason to upgrade now.

@Lymia
Copy link

Lymia commented Jan 17, 2019

Why completely jump the shark to dependency vendoring instead trying to find more middle ground solution to begin with?

This approach is not without nonfrivolous downsides. For now it's one UI toolkit, but imagine if more libraries start doing this. This is actively harmful to the greater Rust ecosystem.

Binary size is affected in addition to compile time. If you use three dependencies that do this, this will easily triple your binary size, even if they could have shared dependencies safely otherwise. Plus it can interfere with some features that depend on crates sharing global state (correctly, unlike the winit strangeness) such as parking_lot_core's deadlock_detection feature.

And, well, you have the time to set up and update vendored dependencies, but not to work with the cargo team to properly solve this issue?

I understand your previous issue didn't go well, but I also can understand why the cargo team "ignored" it. The issue you submitted (rust-lang/cargo#5263) just says an issue exists, it doesn't provide an actionable solution, and it seems the cargo team misunderstood what you meant. I have submitted an issue to try and solve the same issue in a more actionable way: rust-lang/cargo#6558.

We'll see how that goes.

@fschutt
Copy link
Owner

fschutt commented Jan 17, 2019

And, well, you have the time to set up and update vendored dependencies, but not to work with the cargo team to properly solve this issue?

Yeah, because fixing this took a few days and implementing it in cargo, doing RFCs, etc. will take months if not years. This is a structural change to cargo, not a simple bug fix. Thanks for making the issue on cargo - I know that more people than just me have this issue, but it's hard convincing others that the change is necessary. I know that I have to contact the cargo team "properly", but I wanted to wait a few months because I have a lot of stuff to do atm.

As for binary size and compile time... I've upgraded my production app to the newest azul version, it bumped the dependency count by 20, increasing compile time (fresh) by one minute and increasing the rebuild time for debug builds by a few seconds. It's not unmanagable, especially since I already cut down unnecessary dependencies and features from azul-dependencies.

Why I jumped the shark... because it already happened five times to me and I am sick of it. Today, stb_truetype breaks, tomorrow dwrote breaks, next week it's leftpad-rs. If I don't put an end to this, I'll spend more time fixing dependencies than actually coding. Yeah I'm sorry for the inconvenience, but at the moment it's the only way I can get azul to even build predictably and not break over night.

This is actively harmful to the greater Rust ecosystem.

Cargos version selection algorithm is harmful to the greater Rust ecosystem, yes, because it does not provide any guarantees towards stability and already-deployed crates can break at a whim. The problem is that this was completely predictable and the cargo team knew that issues like this will happen, but they actively decided against it (basically forcing you to always upgrade your crates, always upgrade your compiler to the latest one, not allowing two versions of the same crate, etc.).

@est31
Copy link

est31 commented Jan 23, 2019

I think that most of your problems @fschutt can be solved by external, unofficial, tools. It should be possible to create a cargo fork that has an --ignore-yanked option so that it can generate lockfiles with crates that were yanked. It should be possible to obtain the MSRV of all crates by a crater-like tool that tries various compilers and then upload the MSRV info to some place. that info can then be read by that cargo fork and used during resolval. It's a bit of work I guess, needs a bit of cloud resources to do the crater-like thing, but it's all doable without having to waste time with trying to convince people in power by writing RFCs or whatever. Such a tool would dampen the harm to the community that cargo's version selection is causing.

@est31
Copy link

est31 commented Jan 23, 2019

--ignore-yanked was simple: est31/cargo@1c087ee

$ git clone https://github.com/est31/cargo
$ cd cargo
$ cargo build --release
$ cd ..
$ cargo init some_test_lib
$ cd some_test_lib
$ echo 'dwrote = "0.5"' >> Cargo.toml
$ ../cargo/target/debug/cargo generate-lockfile
$ cat Cargo.lock | grep '"dwrote '
"dwrote 0.5.1 (registry+https://github.com/rust-lang/crates.io-index)",
$ ../cargo/target/debug/cargo generate-lockfile --ignore-yanked
"dwrote 0.5.2 (registry+https://github.com/rust-lang/crates.io-index)",

Also works for cargo build, cargo check, cargo update, etc.

As for MSRV... that's a bit tougher.

@est31
Copy link

est31 commented Jan 23, 2019

As for MSRV... that's a bit tougher.

There you go: est31/cargo@6ce9b61

$ git clone https://github.com/est31/cargo
$ cd cargo
$ cargo build --release
$ cd ..
$ cargo init some_test_lib
$ cd some_test_lib
$ echo 'stb_truetype = "0.2"' >> Cargo.toml
$ ../cargo/target/debug/cargo generate-lockfile
$ cat Cargo.lock | grep '"stb_truetype '
 "stb_truetype 0.2.5 (registry+https://github.com/rust-lang/crates.io-index)",
$ mkdir .cargo
$ echo 'msrv="1.30.0"' >> .cargo/config
$ echo 'msrv_infos=".cargo/msrv.json"' >> .cargo/config
$ echo '[{"name":"stb_truetype", "vers":"0.2.5", "msrv":"1.32.0"}]' >> .cargo/msrv.json
$ ../cargo/target/debug/cargo generate-lockfile
$ cat Cargo.lock | grep '"stb_truetype '
 "stb_truetype 0.2.4 (registry+https://github.com/rust-lang/crates.io-index)",

Like the ignore yanked option, this should also work for the other cargo commands, and it should also work recursively.

The implementation can be improved, e.g. it could sync the msrv json from github or somewhere, just like how its done for the registry.

@fschutt
Copy link
Owner

fschutt commented Jan 23, 2019

As for MSRV... that's a bit tougher.

Yeah, and that's exactly the core problem that needs to be solved. Fine, let's say that yanking crates breaking all their dependents is not a problem (it is, but to a lesser extent). Even then, how do I prevent cargo from breaking dependencies due to semver mistakes - since cargo applies maximum-version-selection and ignores lock files on libraries (which are supposed to take care of issues with maximum-version-selection), it's not possible for me to ensure that azul is always built with the same (or at least compatible) dependencies, i.e. that the library always builds.

It should be possible to obtain the MSRV of all crates by a crater-like tool that tries various compilers and then upload the MSRV info to some place. that info can then be read by that cargo fork and used during resolval.

Well, cargo needs to either drop maximum-version-selection and switch to minimum-version-selection or (preferrably) you'd need automatic semantic versioning built into cargo (for example, making cargo aware that the edition = "2018" flag breaks semver no matter what the version number says, or doing that by diffing public API changes, etc.). Why should it be my job to invent another tool (that doesn't exist yet, I'd have to write it) and spin up servers for crater and compiler testing, just to fix cargos shitty version selection and cargo ignoring the edition = "2018" field? And this would still not stop cargo from semver mistakes. This is neither a maintainable solution in the long run, nor does it fix the underlying issue.

$ echo 'dwrote = "0.5"' >> Cargo.toml

Well yeah, that's the problem, this doesn't lock dwrote at all! Some users might get 0.5.1, some might get 0.5.2 and a compile error because 0.5.2 may have a semver error. Cargo ignores .lock files on libraries completely, making a tool to mess with lock files is therefore a complete waste of time.

that info can then be read by that cargo fork and used during resolval.

So you'd agree that I'd need to fork cargo anyway to make this change happen - but then I'd have to force users of azul to use a completely different build tool, just so that my library builds? Basically "in order to compile azul, you'd first need to download my-cargo-fork" - should every library in existence now make their own cargo fork? What's easier, forking the dependencies or forking the entire build tool?

Such a tool would dampen the harm to the community that cargo's version selection is causing.

What "harm"? That people may have to compile crates twice? Oh no, the horror. With the current solution, nobody has to switch build tools and azul at least builds reliably. What good is having code that doesn't compile? The thing is, I am not developing this crate "for the community" or "for the greater Rust ecosystem", I am building it mainly for myself, and as such I need it to build the same today, tomorrow and in a year. I do not regard fixing dependencies and upgrading compilers every 5 days to be a worthy sacrifice just so I can be at peace with "the community". It's not worth it, I'm not getting paid for this, so I want to waste as little time as possible on it. Yes of course I'd love to develop it in the "normal" way, but the "normal" way of cargo is simply broken.

I'm repeating myself here, but: Yanking crates breaking dependents is a symptom, not the cause. The cause is cargo ignoring lock files (and therefore hashes), relying completely on humans to version their crates correctly (no semver checks) and further not being able to compile two versions of the same crate.

As for my involvement around this topic, I had planned to delay this topic for a few months before getting involved in any kind of way. I've stated my case clearly enough now, I'd just repeat myself. I am not sure how the future of cargo will look and azul is also still in heavy development (so it's not really important right now whether the dependencies are vendored or not, that will only come into play if azul needs to be deployed to crates.io). So at least for now, it's a fairly low-priority issue.

@est31
Copy link

est31 commented Jan 23, 2019

So you'd agree that I'd need to fork cargo anyway to make this change happen - but then I'd have to force users of azul to use a completely different build tool,

My cargo fork is only needed to create/update/otherwise manage Cargo.lock files of applications.
You can still build the application with traditional cargo, as long as the lockfile is present.
Just use my fork in CI or idk tell your users who are filing bugs about yanked crates or MSRV issues to use it.

What "harm"? That people may have to compile crates twice?

Note that with harm I've meant the harm that crates other than yours are causing, by pushing MSRV upgrades in minor semver updates: right now, you can't defend yourself from that as a downstream library user nor as an application user, and the winit approach, just dropping MSRV altogether isn't a valid solution because then you will end up having to need the latest compiler in order to compile Rust stuff.

I think I've just provided solutions for the two issues you have encountered, no? So I don't think that vendoring dependencies is needed any more, is it?

@LPGhatguy
Copy link
Author

Vendoring all your dependencies breaks -sys crates, since there can only one crate can link to a given external library.

It looks like you have an open bug report for Azul because of dependency vendoring breaking stuff on MacOS, see #96.

@fschutt
Copy link
Owner

fschutt commented Jan 23, 2019

My cargo fork is only needed to create/update/otherwise manage Cargo.lock files of applications.
You can still build the application with traditional cargo, as long as the lockfile is present.

And I'm telling you again that this is useless because cargo happily ignores the lockfile of azul. Cargo only respects the lockfile of binary applications, meaning that I'd have to somehow mess with the lockfile of my users. Which, even with your modified cargo, is still impossible.

@fschutt
Copy link
Owner

fschutt commented Jan 23, 2019

@LPGhatguy

Vendoring all your dependencies breaks -sys crates, since there can only one crate can link to a given external library.

That is simply false, cargo can link to an external library multiple times. Vendoring does not affect -sys crates. The only situation where you can't do it is when the external crate is marked as links = "blahblah" in the Cargo.toml file (which in the azul-dependencies case, doesn't happen).

It looks like you have an open bug report for Azul because of dependency vendoring breaking stuff on MacOS, see #96.

Yes, but that's just a version mismatch (because azul-dependencies isn't completely vendored on Mac systems, because I could only verify it to work on Windows and Linux platforms), that's not because of multiple -sys crates are present.

@knappador
Copy link

IRC says:
rust-lang/cargo#6589

Looks like they are at least firing at the issue? Since I have little context, I recommend reviewing the change to see if it's on the path to being out of the way for Azul.

@fschutt
Copy link
Owner

fschutt commented Feb 12, 2019

@knappador The issue you mentioned is for alternate registries. That doesn't have anything to do with the problem of cargo ignoring lock files and semver problems breaking all dependent crates. Alternate registries have been around for a while - that issue has nothing to do with this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants