Add support to Cargo for alternative registries #2141
text/0000-alternative-registries.md
Outdated
# Rationale and Alternatives
[alternatives]: #alternatives

A [previous RFC](https://github.com/rust-lang/rfcs/pull/2006) proposed having the registry
The Java ecosystem has gone the other direction. Gradle requires that you specify all of your upstream repositories in your build.gradle, and Maven supports both configuration in the project itself and at the user level.
It seems kind of messy for the dev setup instructions to go from "clone the repo" to "clone the repo, add these registries to your ~/.cargo/config, and make sure the names agree across all of the projects you're working on".
When Cargo searches for a .cargo/config, does it stop at the first one it finds or continue looking and union all of them? One nice option could be to go the union route so you could check a .cargo/config into the repo with the right registry configurations.
One of the points for having the .cargo/config outside of the repository is to avoid checking authentication information into the code-base. From my view this would be a way to support private registries for closed source projects and the common use case is most likely that you will have one internal registry and use crates.io for all publicly available code.
Maybe there could be a cargo add-registry command for the future that can be used to setup any third party registry that is to be used.
Registry authentication information is already stored in a separate file than Cargo.toml and .cargo/config - I don't know why anything would be different here.
When Cargo searches for a .cargo/config, does it stop at the first one it finds or continue looking and union all of them? One nice option could be to go the union route so you could check a .cargo/config into the repo with the right registry configurations.
It continues looking and unifies all of them. I just made a PR to cargo's docs to make this more readily apparent.
Maybe there could be a cargo add-registry command for the future that can be used to setup any third party registry that is to be used.
That sounds like a great idea! I'll add a note about that :)
Registry authentication information is already stored in a separate file than Cargo.toml and .cargo/config - I don't know why anything would be different here.
You're right that usernames and passwords should probably go in .cargo/credentials instead of .cargo/config; I'll make that change. Right now, only the token to authenticate to a registry's API is stored in .cargo/credentials, so this RFC will be adding the ability to specify a username and password to enable access to either a registry index or an API.
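To make the lookup behaviour described above concrete, here is a minimal sketch in plain Rust of "keep looking and unify". It is not Cargo's implementation: `config_candidates` and `unify` are hypothetical helpers, configs are reduced to flat string maps, the user-level config in the Cargo home directory is left out, and the "nearer file wins" rule for scalar keys is an assumption of the sketch.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Lists the `.cargo/config` paths that would be consulted for `start`,
/// nearest directory first, walking up to the filesystem root.
/// (Hypothetical helper; it does not check that the files exist.)
fn config_candidates(start: &Path) -> Vec<PathBuf> {
    start
        .ancestors()
        .map(|dir| dir.join(".cargo").join("config"))
        .collect()
}

/// Unifies already-parsed configs given nearest-first: a key set in a
/// nearer file wins over the same key in a file further up the tree.
fn unify(nearest_first: &[HashMap<String, String>]) -> HashMap<String, String> {
    let mut merged = HashMap::new();
    for config in nearest_first {
        for (key, value) in config {
            // Only insert if no nearer config has already set this key.
            merged.entry(key.clone()).or_insert_with(|| value.clone());
        }
    }
    merged
}
```

With this shape, a registry mapping checked into the repository and a user-level one further up the tree would both contribute keys, which is the "union route" asked about earlier in the thread.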
Cool, as long as it's something that can be checked into the repo and doesn't totally suppress user-level configuration I'm on board.
Thanks for proposing this support. This is a blocker for any kind of Rust adoption at my employer (sadly it does not guarantee that we will adopt rust). Has there been a discussion about supporting organizations and private repositories in |
text/0000-alternative-registries.md
Outdated
```toml
[dependencies]
secret-crate = { version = "1.0", registry = "my-registry" }
```
It'd be nice to support a short form of this for convenience:
"my-registry/secret-crate" = "1.0"
@sfackler could we have that syntax reserved for crate namespacing?
Instead I'd propose:
[dependencies.my-registry]
secret-crate = "1.0"
What would this mean in that setup?
[dependencies.foobar]
version = "1.0"
Is it a crate called "version" at 1.0 in the "foobar" registry or a crate called "foobar" in the default registry at version 1.0?
Oh dumb me, that syntax already has a meaning... What about this then:
[registry.my-registry.dependencies]
secret-crate="1.0"
Another alternative (loosely based on how URLs work):
"//my-registry/secret-crate" = "1.0"
text/0000-alternative-registries.md
Outdated
```toml
[registry.$choose-a-name]
index = "https://username:password@my-intranet:8080/index"
```
It seems like credentials should live separately. We've recently moved the crates.io publish token out of .cargo/config.
Awesome RFC @carols10cents (et al!). I have a branch of cargo which I believe implements this, though I haven't tested it thoroughly. The only pertinent difference I'm aware of is in the format for declaring a new registry. What I went with was, e.g.:
[registries]
foobar = "https://github.com/foobar-co/foobar-index"
[registries.bazquux]
index = "https://github.com/bazquux-org/bazquux-index"
Another possible format choice would be to support a syntax like this in the toml, instead of having a registry key in the dependency object itself:
[registry.foobar-co.dependencies]
# all the dependencies in this table come from foobar-co
My branch doesn't implement that, but it's worth considering, since it makes it easier to add more dependencies from that alternate registry.
text/0000-alternative-registries.md
Outdated
it is possible to have a local crates.io server which crates can be pushed to, while still making
use of the public crates.io server.

We would also like to support the use of crates.io mirrors. These differ from alternative
How would mirrors work in this setup? We'd need some way to say that a registry "acts as" https://github.com/rust-lang/crates.io-index, right?
Oops. I forgot to put in details about mirrors. I'm starting to think that could be separate from this RFC-- we already support source replacement but I want to extend it so you can list multiple mirrors and cargo will automatically fall back if one is inaccessible. That's starting to feel separate, so I'm going to take this paragraph about mirrors out.
This is a feature that's important for corporate use, so I'll chime in with my experience.
Storing the passwords in the working directory is generally a bad idea, because in corporate environments the working directory is often on a share drive with fairly open permissions. (Even allowing them to be stored there isn't ideal, because someone will make a mistake.) The best solution is to put username and password into the OS-specific keystore (GNOME Keyring / Windows Credentials Management / Apple Keychain).
If I read #3978 correctly, Cargo access tokens are already stored in ~/.cargo/credentials. Putting passwords there wouldn't be ideal, but would be much better than the working directory or main Cargo config file.
Storing the username and password as part of the URL is very inflexible. In the future, we may want to support Kerberos/SAML/LDAP/etc logon, so storing the USER/PASSWORD/AUTH_TYPE as separate fields is a good idea.
Ideally the user name would not be in the same file as the registry-name to URL mapping, so the mapping file can be checked in and it will "just work" within a company LAN.
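As a rough illustration of keeping credentials as separate fields rather than inside the index URL, the sketch below splits a `user:password@` prefix out of a URL using only the standard library. `Credentials` and `split_userinfo` are hypothetical names, the parsing is deliberately naive (no percent-decoding, no validation), and nothing here reflects how Cargo actually stores credentials.

```rust
/// Credentials pulled out of a URL; field names are illustrative only.
#[derive(Debug, PartialEq)]
struct Credentials {
    user: String,
    password: String,
}

/// Returns the URL without its userinfo part, plus the credentials if any
/// were embedded. Naive on purpose: no percent-decoding or validation.
fn split_userinfo(url: &str) -> (String, Option<Credentials>) {
    let (scheme, rest) = match url.split_once("://") {
        Some(parts) => parts,
        None => return (url.to_string(), None),
    };
    // The authority ends at the first '/' after the scheme separator.
    let (authority, path) = match rest.find('/') {
        Some(i) => rest.split_at(i),
        None => (rest, ""),
    };
    match authority.rsplit_once('@') {
        Some((userinfo, host)) => {
            let (user, password) = userinfo.split_once(':').unwrap_or((userinfo, ""));
            let clean = format!("{}://{}{}", scheme, host, path);
            let creds = Credentials {
                user: user.to_string(),
                password: password.to_string(),
            };
            (clean, Some(creds))
        }
        None => (url.to_string(), None),
    }
}
```

The cleaned URL is the part that could live in a checked-in name-to-URL mapping, while the `Credentials` value would go wherever secrets are kept (a credentials file or an OS keystore).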
I mentioned this on #2006 and rust-lang/cargo#4208 but I want to make sure that it doesn't get lost in the shuffle. I want to make sure that when new registries are specified that it is possible to specify the full hostname and root path for the registry and not just the host name. For multiple repository hosting solutions like nexus and artifactory it needs to be possible to specify a path as well. For instance, artifactory hosts npm repos at https://host.company.com/api/npm/private-repo so you can host multiple repo types and multiple repos for the same language. Specifying just the host should have a good default but it should be overridable. |
@bbatha You specify the url of the registry index, which is required to contain the url of the backing store. It is not possible to specify just the hostname. For example, crates.io would be declared:
[registry.crates-io]
index = "https://github.com/rust-lang/crates.io-index"
text/0000-alternative-registries.md
Outdated
- `name`: the name of the crate
- `vers`: the version of the crate this row is describing
- `deps`: a list of all dependencies of this crate
- `cksum`: a checksum of this version's files
s/this version's files/the tarball downloaded/
text/0000-alternative-registries.md
Outdated
{
"name": "serde",
"req": "^1.0",
"registry": "https://crates.io",
Should this, like allowed-registries above, specify the index rather than this URL?
Yeah, probably.
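For readers following the index-format threads above, here is a small sketch of one index row using only the fields named in the excerpts (`name`, `vers`, `deps`, `cksum`, and `name`/`req`/`registry` per dependency). The struct and function names are made up for illustration, the JSON is assembled by hand with no string escaping, and treating a missing `registry` as "same registry as the crate itself" is an assumption of the sketch.

```rust
/// One dependency entry inside an index row.
struct IndexDep {
    name: String,
    req: String,
    /// Index URL of the registry the dependency lives in; `None` is read
    /// here as "same registry as the crate itself" (an assumption).
    registry: Option<String>,
}

/// One row of the index, limited to the fields listed in the excerpt.
struct IndexRow {
    name: String,
    vers: String,
    deps: Vec<IndexDep>,
    /// Checksum of the downloaded tarball.
    cksum: String,
}

/// Renders a dependency as a JSON object (no escaping; sketch only).
fn dep_to_json(dep: &IndexDep) -> String {
    let registry = match &dep.registry {
        Some(url) => format!("\"{}\"", url),
        None => "null".to_string(),
    };
    format!(
        "{{\"name\":\"{}\",\"req\":\"{}\",\"registry\":{}}}",
        dep.name, dep.req, registry
    )
}

/// Renders a whole row as a single JSON line (no escaping; sketch only).
fn row_to_json(row: &IndexRow) -> String {
    let deps: Vec<String> = row.deps.iter().map(dep_to_json).collect();
    format!(
        "{{\"name\":\"{}\",\"vers\":\"{}\",\"deps\":[{}],\"cksum\":\"{}\"}}",
        row.name,
        row.vers,
        deps.join(","),
        row.cksum
    )
}
```

Following the review suggestion, the `registry` value in this sketch is meant to hold the index URL (for example `https://github.com/rust-lang/crates.io-index`) rather than the website URL.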
text/0000-alternative-registries.md
Outdated
specifying the list of registries that are allowed with `cargo publish`.

```
publish-registries = ["my-registry"]
```
Cargo currently has a publish = false key for totally disallowing publishing, I wonder if we could perhaps overload it?
publish = true # default, publish to crates.io
publish = false # don't publish this anywhere
publish = [] # don't publish this anywhere
publish = ["https://some-other-registry.com"] # publish somewhere other than crates.io
I thought about that, do TOML/serde support different types like that???
Yeah - being able to do that kind of thing was one of the main advantages of serde over rustc-serialize. A simple way of doing it is via the "untagged" enum representation: https://serde.rs/enum-representations.html
You can even find this in Cargo today!
TIL!
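A sketch of what the overloaded `publish` key from this thread could mean once parsed, written without serde so it stays dependency-free. With serde, an untagged enum of this shape is what would let the same key accept either a boolean or a list. The enum, the `may_publish_to` helper, and the `"crates-io"` default name are all assumptions for illustration, not Cargo's actual types.

```rust
/// The overloaded `publish` key: either a boolean or a list of registries.
#[derive(Debug, PartialEq)]
enum Publish {
    Flag(bool),
    Registries(Vec<String>),
}

/// Assumed name for the default registry in this sketch.
const DEFAULT_REGISTRY: &str = "crates-io";

/// Decides whether publishing to `registry` is allowed under `publish`.
fn may_publish_to(publish: &Publish, registry: &str) -> bool {
    match publish {
        // `publish = true`: the default, publish to crates.io only.
        Publish::Flag(true) => registry == DEFAULT_REGISTRY,
        // `publish = false`: don't publish anywhere.
        Publish::Flag(false) => false,
        // `publish = [...]`: only the listed registries; `[]` means nowhere.
        Publish::Registries(allowed) => allowed.iter().any(|r| r.as_str() == registry),
    }
}
```

Under this reading, `publish = []` and `publish = false` behave identically, matching the table in the proposal above.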
In the detailed design section there's a note of related issues:
Just to be clear, though, this RFC isn't specifically proposing solutions to these? Are they possible future extensions? (I'd be fine adding solutions for them to this RFC, I think they may be all relatively trivially fixable) |
crates.io is likely to remain open source only, but stay tuned :) |
Awww I was close!!! I like what you've implemented though, I'm going to update this to go with yours :) |
text/0000-alternative-registries.md
Outdated
Currently, the knowledge of how to create a file in the registry index format is spread between
Cargo and crates.io. This RFC proposes the addition of a Cargo command that would generate this
file locally for the current crate so that it can be added to the git repository using a mechanism
other than a server running crates.io's codebase.
Hm, for this use-case, we'll also need a way to make a .crate file manually. This is already handled by cargo package. Then perhaps cargo package could create both a .crate tarball, and a .json index metadata?
I could see us rolling the metadata into package eventually, yeah. I think we should try having them separate at first, cargo already has enough things tangled up with each other that could be independent ;)
@rfcbot reviewed |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
This seems pretty cool. I really like the idea of putting the registry names in the checked-in project, and the name-to-address mapping in the environment!
However, if i work for a paranoid company that wants all crate downloads to come from an internal registry, and i want to build some random project i've cloned off Github, can i do that? That is, if i have a project whose Cargo.toml contains this:
[dependencies]
byteorder = "1.0.0"
Can i force Cargo to go to repo.initech.com rather than crates.io to get it?
I got the impression on reading the RFC that i wouldn't be able to do that. AIUI, the only way to get a crate to come from a specific registry is to say so in the dependency declaration. I would have to say:
[dependencies]
byteorder = { version = "1.0.0", registry = "initech-internal" }
Happily, i don't work for such a paranoid company, so i can get public crates from crates.io and internal crates from some internal registry. But in the past, i have worked for companies where this would not have flown.
So, if this isn't currently possible, could we have it? Perhaps we could define a name for crates.io ("default", "crates-io", "pub", whatever), and say that will be used by default. Then i could get those crates from my internal registry by redefining the address that name maps to.
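The suggestion above (give crates.io a well-known name and let the environment redefine what it maps to) can be sketched as a plain name-to-index lookup. This models the commenter's proposal only, not what Cargo does; `resolve_index`, the `"crates-io"` default name, and the `https://repo.initech.com/index` address used in the usage note are hypothetical.

```rust
use std::collections::HashMap;

/// Assumed well-known name for the default registry.
const DEFAULT_NAME: &str = "crates-io";
/// Index used for the default name when the environment does not override it.
const DEFAULT_INDEX: &str = "https://github.com/rust-lang/crates.io-index";

/// Resolves the index URL for a dependency. `dep_registry` is the optional
/// `registry = "..."` value from the manifest; `configured` is the
/// name-to-index mapping taken from the environment's configuration.
fn resolve_index<'a>(
    dep_registry: Option<&str>,
    configured: &'a HashMap<String, String>,
) -> Option<&'a str> {
    // A dependency without an explicit registry uses the default name.
    let name = dep_registry.unwrap_or(DEFAULT_NAME);
    match configured.get(name) {
        // The environment's mapping wins, even for the default name.
        Some(url) => Some(url.as_str()),
        // The default name falls back to the public index.
        None if name == DEFAULT_NAME => Some(DEFAULT_INDEX),
        // Any other unknown name cannot be resolved.
        None => None,
    }
}
```

With an empty mapping, `byteorder = "1.0.0"` resolves to the public index; after mapping `"crates-io"` to something like `https://repo.initech.com/index`, the same unmodified manifest resolves to the internal registry.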
text/0000-alternative-registries.md
Outdated
A valid registry index meets the following criteria:

- The registry index is stored in a git repository so that Cargo can efficiently fetch incremental
(1) Can the address of the index be a file: URL, or a plain file path? That could be really useful for setting up a local repository. I've done this a few times in the Java world. It could also be useful in reptilian corporate environments where it's easy to put something on a shared drive, but much harder to stand up a server.
(2) Could we allow plain HTTP as well as Git? I could imagine writing a little registry server (20-30 lines of Java!) to serve up my team's internal crates. We only have a few, and don't update them often, so downloading the whole index wouldn't take long. Whereas writing or setting up a Git server would be quite a headache.
(1) Can the address of the index be a file: URL, or a plain file path?
Yes indeed, this is how I publish to a local instance of crates.io when developing, actually.
(2) Could we allow plain HTTP as well as Git? I could imagine writing a little registry server (20-30 lines of Java!) to serve up my team's internal crates. We only have a few, and don't update them often, so downloading the whole index wouldn't take long. Whereas writing or setting up a Git server would be quite a headache.
For now, we're going to stay with git; being able to send only the delta of changes rather than the whole change is a huge win. While you might only have a few crates to start with, you might have more later, or just more versions of those few crates.
Git includes straightforward ways to run a server, if it's within your firewall and unauthenticated, it's not bad at all.
Please accept my belated thanks for your response! The point of writing a server would be to proxy to our existing infrastructure, so being able to run a Git server doesn't really help. But being able to use local indices addresses most of the internal use cases i can imagine, so that shouldn't matter.
@tomwhoiscontrary repository mirrors were originally discussed a bit in this RFC but have since been pulled out. There'll presumably be a follow-up RFC to deal with that use case. |
Cargo already supports source replacement, so you are able to do this today! 🎉 What isn't supported yet is being able to list multiple mirrors and automatically falling back to whichever is available. Running a mirror, whether pre-emptively caching everything on crates.io or only caching what's requested, is also not simple right now. As @sfackler noted, neither of these concerns is especially related to the changes in this RFC.
The final comment period is now complete. |
This RFC has been merged! Tracking issue. Thanks @carols10cents, @natboehm and @shepmaster! |
Doing some work my observations:
|
Writing code & struggling to see what the schema would look like if multiple registries are involved, most importantly:
Suggestion: add a registry-id to config.json, something like a UUID, so that mirrors of the same registry, or a registry that has moved, can be recognized ...
I searched this thread, but could not find relevant information about how the crates that are published to alternative registry should depend on crates that are on crates.io registry. It seems that leaving it empty means that dependency is in the same registry thus not crates.io. Should all those dependencies use a proxy-name such as "cratesio", and then everyone define locally the index, or is there another way? |
@WiSaGaN I'm not sure which part you mean by "leaving it empty". In For dependencies cargo downloads, the crate file tracks the URL (not the registry name), so there's no need to have local definitions. Documentation can be found at https://doc.rust-lang.org/cargo/reference/registries.html. |
@ehuss, say I want to publish a crate called
[dependencies]
serde = "1.0"
bar-dep = { version = "1.0", registry = "foo" }
Or
[dependencies]
serde = { version = "1.0", registry = "cratesio" }
bar-dep = "1.0"
And define registry entries in
The first one. |
EDIT: I've realised we're talking about different things here. I'm thinking of the registry's representation in the index, while @WiSaGaN is asking about Cargo.toml, so my request for clarification doesn't really make sense. Will open a separate issue elsewhere, sorry for the confusion.
so right now interesting problem I see, when I declare in private
and then make another package "dummy-2"
when resolving dummy-2 with update, cargo tries to find itertools in private1 instead of private2! Looking at the dependency index, the registry is not stored, so no wonder. Or am I missing something?
Rendered
Tracking issue
This RFC built on previous work done in RFC 2006. The biggest difference is that this RFC includes a specification for the index format that any registry will need to conform to. Another difference is that this RFC proposes configuring registry locations once, in a .cargo/config, rather than multiple times in each project, both to avoid duplication and to discourage including credentials in each project.
@natboehm and @shepmaster also worked on this RFC :)