Add a way to get the URL to download a pipeline to the CLI #11175
I think there's too much danger for this to conflict with the pip flags. I noticed that this was just merged, maybe we can use this somehow instead? pypa/pip#10771 Or alternatively this should be in a separate command (not |
There's no way for pip to resolve things without downloading so |
OK, we definitely shouldn't use this if pip is adding a flag with the same name. Since pip has a lot of options it seems like the only safe option is adding another command. I'm not sure what it should be called, maybe We could also provide more information and call it something like |
I think a separate command sounds easier, but the naming is hard. In general I think people would like to be able to do something like: spacy whatever en_core_web_sm >> requirements.txt |
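The requirements workflow described above could be sketched in Python roughly as follows. This is only an illustration: `requirement_line` and `append_requirement` are hypothetical helpers, not part of spaCy, and the `name @ url` form is the PEP 508 direct reference syntax that pip accepts in requirements files.

```python
def requirement_line(name: str, url: str) -> str:
    """Format a PEP 508 direct reference such as 'pkg @ https://...'."""
    return f"{name} @ {url}"


def append_requirement(path: str, name: str, url: str) -> None:
    """Append one requirement to a file, mirroring `>>` in a shell."""
    with open(path, "a", encoding="utf8") as f:
        f.write(requirement_line(name, url) + "\n")
```

A bare URL on its own line also works in a requirements file, so a command that prints only the URL can be redirected into one directly.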
What if we had commands like |
Let's see, I'd never noticed this but we already have |
I like this idea - could it be
that would basically work regardless of whether the model is already installed or not? |
I was imagining something that could show a bit more about how to specify it as a dependency, so at least some option with more details like:
Also a plain |
OK, I have updated this to support this syntax:
Initially I had this work like other info commands and print a table by default, but considering the comment about piping the output upthread, and that there's only one relevant field, I changed it to print just the URL of the most recent compatible version of the pipeline when successful. As far as failure modes:
There are some unresolved questions. There are lots of things we could add but I think it's better to keep this a simple, minimal feature that can be integrated in other things, so I've kept things simple when in doubt about the below. (In particular, because this is overloading the existing
For the |
I think providing just the wheel URL is fine and just providing the same URL as you would get from The no-internet error is pretty ugly here, especially since it's not that clear to users that it's going to need internet access to provide the URL. |
Actually if the model is installed locally, it might make more sense to provide that version, or possibly to provide both? |
I don't think it's that bad? It clearly states what the failure was (couldn't look up the compatibility table). It would be nice if we could check compatibility offline, but since we can't... The ideal solution would probably be a URL that could be generated locally that would redirect to the latest compatible version. (There is also the issue that looking up the compatibility table offline doesn't throw an exception, it exits the program, so it would need to be modified to handle it elegantly.)
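To illustrate the point about exiting versus raising: a function that calls `sys.exit` actually raises `SystemExit`, which a caller can catch and convert into an ordinary return value. A minimal sketch, where `lookup_compatibility` is a hypothetical stand-in for the real lookup and the returned data is a placeholder:

```python
import sys
from typing import Optional


def lookup_compatibility(online: bool) -> dict:
    """Hypothetical stand-in: exits the program when the lookup fails."""
    if not online:
        sys.exit("Couldn't fetch compatibility table")
    return {"en_core_web_sm": ["0.0.0"]}  # placeholder data


def safe_lookup(online: bool) -> Optional[dict]:
    """Turn the hard exit into a None result the caller can handle."""
    try:
        return lookup_compatibility(online)
    except SystemExit:
        return None
```

Catching `SystemExit` at the call site is a workaround; changing the lookup itself to raise a regular exception would likely be cleaner.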
I'm not sure I understand the use case? These are the uses I was imagining:
I guess for locally installed models, you might want the URL if you want to declare dependencies for a project you've been working on - is that what you had in mind? For that use case, would it be better to add the download URL to the |
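For locally installed packages, the URL can often be recovered offline from pip's own metadata. The PR description mentions `pkg_resources`; the sketch below uses the standard library's `importlib.metadata` instead, and assumes pip recorded a `direct_url.json` file (PEP 610) when the package was installed from a URL. `installed_package_url` is a hypothetical name, not spaCy's function.

```python
import json
from importlib import metadata
from typing import Optional


def installed_package_url(name: str) -> Optional[str]:
    """Return the URL a package was installed from, if pip recorded one."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None  # package is not installed
    raw = dist.read_text("direct_url.json")  # None if the file is absent
    if raw is None:
        return None
    try:
        return json.loads(raw).get("url")
    except (json.JSONDecodeError, AttributeError):
        return None  # malformed or unexpected metadata
```

Packages installed by name from an index typically have no `direct_url.json`, so the function returns `None` for them.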
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
I just pushed a change to add output for This is the one I prefer:
This is a little hacky though - it's using the same This is output with one key per entry. This is the same way other commands work but I think it's too hard to read, mainly because of the newline for
I also feel like at this point For this PR I think we should leave |
This should make mypy happy
Just to clarify the state of this a bit, what I think we should do now is:
About generating different URLs based on installed packages, I'm still not clear what the use case for that is for downloading (it makes sense for I have left the |
I think this is OK to merge now? |
I'll take another look. Can you update the PR description? |
Updated the PR description to reflect the current behavior. |
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
I think the functionality is looking good. Do we want to include the changes to the docs in this PR or a follow-up PR? There are at least these sections: https://spacy.io/usage/models#download-pip And maybe easier instructions on the models pages themselves? (I think having to dig through releases for the asset URL is the too-difficult part.) |
I'd prefer having the docs changes with the PR! |
Add a sidebar about finding download URLs, with some examples of the new command.
OK, I updated the docs. In addition to adding mention of the new command to the places linked above, I put download links directly in the tables on the model pages. How's that? |
I think the summary of this is that I don't like "Download URL" as the name for this. It's just a URL? Possibly a "(pipeline) (package) URL"?
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
…#11175)
* Add a dry run flag to download
* Remove --dry-run, add --url option to `spacy info` instead
* Make mypy happy
* Print only the URL, so it's easier to use in scripts
* Don't add the egg hash unless downloading an sdist
* Update spacy/cli/info.py
  Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Add two implementations of requirements
* Clean up requirements sample slightly
  This should make mypy happy
* Update URL help string
* Remove requirements option
* Add url option to docs
* Add URL to spacy info model output, when available
* Add types-setuptools to testing reqs
* Add types-setuptools to requirements
* Add "compatible", expand docstring
* Update spacy/cli/info.py
  Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Run prettier on CLI docs
* Update docs
  Add a sidebar about finding download URLs, with some examples of the new command.
* Add download URLs to table on model page
* Apply suggestions from code review
  Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
* Updates from review
* download url -> download link
* Update docs
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
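The "egg hash" commit can be pictured with a small sketch. Everything here is an assumption for illustration: `download_url` is a hypothetical helper, the base URL and file suffixes reflect the layout of the spacy-models GitHub releases as I understand it, and the version used in any call is a placeholder.

```python
# Assumed release location for spaCy pipeline packages.
BASE_URL = "https://github.com/explosion/spacy-models/releases/download"


def download_url(name: str, version: str, sdist: bool = False) -> str:
    """Build a package URL; only sdists get the '#egg=' fragment."""
    suffix = ".tar.gz" if sdist else "-py3-none-any.whl"
    url = f"{BASE_URL}/{name}-{version}/{name}-{version}{suffix}"
    if sdist:
        # pip can read the package name and version from this fragment
        url += f"#egg={name}=={version}"
    return url
```

Leaving the fragment off wheel URLs keeps the printed URL clean, since a wheel filename already encodes the name and version.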
Description
This adds a `--url` flag to the `spacy info [model]` command that prints the URL of the most recent compatible version of the named pipeline. This URL is printed without any formatting for use in scripts, like:

Because this requires a compatibility check, it requires an Internet connection.
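Since the URL is printed bare, a script can capture it directly. The following is a rough sketch rather than an official recipe: `pipeline_url` and `_run` are hypothetical helpers, and the `run` parameter exists only so the command can be swapped out for testing.

```python
import subprocess
import sys
from typing import Callable, List


def _run(cmd: List[str]) -> str:
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def pipeline_url(model: str, run: Callable[[List[str]], str] = _run) -> str:
    """Call `spacy info MODEL --url` and strip the trailing newline."""
    return run([sys.executable, "-m", "spacy", "info", model, "--url"]).strip()
```

The returned string could then be handed to pip, for example with `subprocess.run([sys.executable, "-m", "pip", "install", url])`. Because the lookup needs the compatibility table, the call fails without network access.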
This also adds the download URL for installed packages to the output of `spacy info [model]` if available. The information is typically available for pipelines installed with pip, and is accessed through `pkg_resources`. This does not require an Internet connection.

Original description:
This is a simple implementation of a `--dry-run` flag for `spacy download` that makes it print the URL of the package to be installed without downloading anything.

I don't have strong opinions about the way this is done, but there are some questions:
I'm not sure how universal this is, but I used `-n` for the short flag, which I understand to come from `noop`.

Types of change
minor enhancement
Checklist