Parallel download: How should we display the progress? #8698

Open
McSinyx opened this issue Aug 4, 2020 · 13 comments
Labels
C: cli Command line interface related things (optparse, option grouping etc) C: download About fetching data from PyPI and other sources state: needs discussion This needs some more discussion type: question User question UX User experience related

Comments

@McSinyx
Contributor

McSinyx commented Aug 4, 2020

This issue is opened to discuss the UI to be implemented for parallel download, with accessibility as well as platform compatibility in mind (relevant ticket: GH-8518). Since support for moving the cursor varies between terminals, the safest approach is to have a meta progress bar at the bottom of the output, i.e. something like apt's but not pinned to the bottom of the viewport (since that would require curses-or-similar support and defeat the purpose):

[Image: apt]
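
A minimal sketch of such a bar, assuming only that the terminal honours a carriage return. The names `render_bar` and `draw` are hypothetical and are not taken from pip or apt; the point is that redrawing one line with `\r` needs no curses-style cursor control.

```python
import sys


def render_bar(done, total, width=40):
    """Return a one-line text bar such as '[#####-----]  50%'."""
    # Treat an empty job as complete and clamp to the 0..1 range.
    fraction = done / total if total else 1.0
    fraction = max(0.0, min(1.0, fraction))
    filled = int(width * fraction)
    return "[{}{}] {:3.0f}%".format(
        "#" * filled, "-" * (width - filled), fraction * 100
    )


def draw(done, total, stream=sys.stdout):
    """Redraw the bar on the current line (hypothetical helper)."""
    # '\r' moves back to the start of the line, so the bar is overwritten
    # in place instead of scrolling or sticking to the viewport bottom.
    stream.write("\r" + render_bar(done, total))
    stream.flush()
```

Regular log lines can still be printed above the bar by writing them first and then calling `draw` again.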

@pradyunsg suggested having multiple progress bars, similar to what is shown in python-poetry/poetry#2595 and what rich provides (it unfortunately requires Python >=3.6, but a similar TUI can be implemented/found elsewhere):

[Image: rich]

I believe that there are other designs as well and would love to hear more suggestions and opinions!

@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Aug 4, 2020
@willmcgugan

Hello. Author of Rich here. Text-based progress bars are one of my favourite subjects. Mention me if I can contribute in any way.

@McSinyx
Contributor Author

McSinyx commented Aug 5, 2020

Thanks for your attention @willmcgugan! I have yet to look into rich's codebase: is there any barrier to implementing cross-platform multiple progress bars on Python 2 (since pip is going to support it at least until the end of the year)? In addition, is there any magic involved in making it work in Jupyter Notebook? I notice that ncurses apps don't seem to work well there (e.g. top) but I don't know any more than that 😄

@willmcgugan

It wouldn't be much of a problem to get progress bars running on both Python 2 and 3. The principle is the same.

There is a bit of magic for Jupyter support. I don't think anything that moves the cursor around would work in Jupyter.

Jupyter supports HTML widgets so you could either implement it in HTML, which gives you a lot of freedom, or convert your text-based bars into HTML (which is what Rich does).

I'm surprised to hear people run pip in Jupyter tbh. But I'm not a big Jupyter user.

@pfmoore
Member

pfmoore commented Aug 6, 2020

I'm surprised to hear people run pip in Jupyter tbh. But I'm not a big Jupyter user.

I'm pretty sure it's a bad idea. I dabble in Jupyter, but wouldn't say I'm a heavy user. But I'd have thought that you'd need to restart the kernel after installing to pick up the changes correctly. I'm -1 on having any special handling in pip for it being run in Jupyter, it's not a scenario I think we should encourage.

@uranusjr
Member

uranusjr commented Aug 6, 2020

Every data science workshop/talk I’ve ever attended (granted I seldom do) that involves live coding uses %pip install in a Jupyter Notebook. It sounds like a bad idea, but is extremely prevalent. But I agree pip doesn’t need to do too many special things for it.

@willmcgugan

It might be worth at least detecting Jupyter and disabling progress bars. Otherwise I suspect you will get a stream of progress bars in the output.

As much as I like multiple progress bars, does pip need them? I wouldn't care too much about the progress of individual downloads, but I would be interested in a single bar for the progress as a whole.

@pfmoore
Member

pfmoore commented Aug 6, 2020

Every data science workshop/talk I’ve ever attended (granted I seldom do) that involves live coding uses %pip install

I note that uses %pip. I don't know if there's special support in the pip magic command to make it work better. Equally, installing stuff in-process doesn't often fail, it's just risky. But the whole thing is a long-standing (core Python) issue, and affects more than just Jupyter - the discussion has come up for Idle, and I assume that other IDEs like VS Code and PyCharm need to deal with this as well. I don't think it's something pip should concern itself with directly (other than to point out that it's not our issue 🙂).

It might be worth at least detecting Jupyter and disabling progress bars.

Hopefully, our code for detecting when we aren't running in a terminal can cope with that (or can be made to, without needing special "detect Jupyter" code). Or the Jupyter people can make the %pip magic command add --no-progress automatically.

@willmcgugan

Hopefully, our code for detecting when we aren't running in a terminal can cope with that (or can be made to, without needing special "detect Jupyter" code)

Good point. sys.stdout.isatty() reports False in Jupyter. That should be enough.
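
A minimal sketch of that check, assuming a hypothetical helper `should_show_progress` and a hypothetical `no_progress` parameter standing in for the opt-out flag mentioned above. This is not pip's actual detection code.

```python
import io
import sys


def should_show_progress(stream=None, no_progress=False):
    """Decide whether to draw an in-place progress bar (hypothetical helper)."""
    if no_progress:
        # The user (or a wrapper such as a notebook magic) opted out.
        return False
    stream = sys.stdout if stream is None else stream
    # Some stream replacements lack isatty(); treat those as non-terminals.
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


# An in-memory buffer is not a terminal, so no bar would be drawn for it.
print(should_show_progress(io.StringIO()))
```

With this approach no Jupyter-specific detection is needed: any output stream that is not a terminal simply gets no in-place bar.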

@pradyunsg pradyunsg added C: cli Command line interface related things (optparse, option grouping etc) C: download About fetching data from PyPI and other sources UX User experience related state: needs discussion This needs some more discussion type: question User question labels Aug 6, 2020
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label Aug 6, 2020
@pradyunsg
Member

pradyunsg commented Aug 6, 2020

Now that the Jupyter-related side-discussion has concluded...

The situation that this issue is for: after dependency resolution, pip has a list of links that it needs to download, before proceeding to installation. It needs to perform these downloads, while showing progress to the end user.

I basically see 2 options for how to show progress:

  1. Show progress for each download separately, kinda how python-poetry/poetry#2595 (New and faster installer implementation) does things. Rich would not be what we use, but there's almost certainly another library that implements this.
  2. Have a single progress bar, that shows the combined progress, with some indication when a file is fully downloaded.
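
Option 2 could be sketched as follows. `CombinedProgress` is a hypothetical class, not pip's implementation, and it assumes the size of every file is known up front (for example from a `Content-Length` header), which may not always hold.

```python
class CombinedProgress:
    """Track several downloads as one overall value (hypothetical helper)."""

    def __init__(self, sizes):
        # sizes: mapping of file name -> expected size in bytes (assumed known)
        self.sizes = dict(sizes)
        self.received = {name: 0 for name in self.sizes}
        self.finished = []  # names in the order they completed

    def update(self, name, nbytes):
        """Record nbytes more for name; return True if it just completed."""
        before = self.received[name]
        after = min(before + nbytes, self.sizes[name])
        self.received[name] = after
        just_done = before < self.sizes[name] and after == self.sizes[name]
        if just_done:
            self.finished.append(name)
        return just_done

    @property
    def fraction(self):
        """Overall progress between 0.0 and 1.0 across all files."""
        total = sum(self.sizes.values())
        return sum(self.received.values()) / total if total else 1.0
```

The return value of `update` gives the caller a hook for the "file fully downloaded" indication, for instance printing one log line above the bar.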

@pfmoore
Member

pfmoore commented Aug 6, 2020

How many downloads are we talking about here? I can imagine having two or three independent progress bars being OK, but if we have 40 I doubt anyone would be impressed... Conversely, why do we need to indicate when a file is fully downloaded if we have a single progress bar? Why not just show progress of the overall task? Is it because we can't accurately determine "progress" at that level?

This is probably something that the UX studies are already looking at, for pip's existing progress reporting. Do they have any insights?

As a user, what I mostly want to know is:

  • Up front, how long will I have to wait for this task to finish (in terms of time, not some arbitrary measure like bytes)?
  • As the job progresses, does that initial estimate change?
  • Some visual indication that pip's still doing something - or if pip isn't, then what it's waiting for.
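
One way to turn byte counts into the time-based estimate asked for above, as a rough sketch. Both function names are hypothetical, and the estimate uses the plain average speed so far, which is only one of several possible strategies.

```python
def estimate_remaining_seconds(done_bytes, total_bytes, elapsed_seconds):
    """Estimate seconds left from the average speed so far; None if unknown."""
    if done_bytes <= 0 or elapsed_seconds <= 0:
        return None  # nothing measured yet, so no honest estimate
    rate = done_bytes / elapsed_seconds  # average bytes per second
    remaining = max(total_bytes - done_bytes, 0)
    return remaining / rate


def format_eta(seconds):
    """Format seconds as 'M:SS', or '?' when no estimate is available."""
    if seconds is None:
        return "?"
    minutes, secs = divmod(int(round(seconds)), 60)
    return "{}:{:02d}".format(minutes, secs)
```

Recomputing the estimate on every update also answers the second bullet: the displayed value changes as the measured speed changes.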

I'm not really interested in details like "see how clever pip is, it's worked out that you need FORTY files when you thought you only needed one!!!" 🙂

Given that the only options I have are to sit it out or kill the process, I'd probably also like some postmortem information (but only if I kill the job) that would help me work out what options I have to speed up the process. Honestly, I don't know what options there are, though. Maybe:

  • Download some particularly costly files and host them locally
  • Adjust my install command to omit optional stuff that costs time I'm not willing to spend
  • Split the install into parts, so I can do the most essential bit now and the less important bits later
  • Fix my network speed
  • Even just "tough, do the install some time when you can afford that long of a wait"

@McSinyx
Contributor Author

McSinyx commented Aug 6, 2020

Good point. sys.stdout.isatty() reports False in Jupyter. That should be enough.

It indeed is. I was worrying that we'd have to dump thousands of lines to the output 😅

As a user, what I mostly want to know is:

  • Up front, how long will I have to wait for this task to finish (in terms of time, not some arbitrary measure like bytes)?
  • As the job progresses, does that initial estimate change?
  • Some visual indication that pip's still doing something - or if pip isn't, then what it's waiting for.

[...] Given that the only options I have are to sit it out or kill the process

This sums up my UX as well and IMHO e.g. apt is doing a good job with it. Unless someone is strongly against it, I'll go with the single progress bar first (since it's also easier to prototype). If the UX research team finds a better alternative (e.g. multiple bars), I'll be happy to iterate on the UI.

Regarding the postmortem information, it seems to be something super nice to have, given that in parallel download the choice of the number of connections may significantly affect the speed. The content of such a message should be taken care of once we've got the implementation working, and it might deserve a separate tracking ticket.
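
To illustrate where the number of connections comes in, here is a sketch that uses a thread pool. `download_all`, its `connections` parameter, and the `fetch` callable are all hypothetical and are not the code in GH-8771; `concurrent.futures` is also Python 3 only.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def download_all(links, fetch, connections=4, on_progress=None):
    """Fetch every link using at most `connections` worker threads.

    `fetch(link)` is a hypothetical callable that yields chunks of bytes.
    Returns a mapping of link -> number of bytes received.
    """
    lock = threading.Lock()
    totals = {link: 0 for link in links}

    def worker(link):
        for chunk in fetch(link):
            with lock:  # the counters are shared between threads
                totals[link] += len(chunk)
                if on_progress is not None:
                    # Report the combined byte count for a single bar.
                    on_progress(sum(totals.values()))

    with ThreadPoolExecutor(max_workers=connections) as pool:
        # list() drains the iterator so worker exceptions are re-raised here.
        list(pool.map(worker, links))
    return totals
```

Timing the same set of links with different `connections` values is one way to gather the kind of data a postmortem message could report.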

@McSinyx
Contributor Author

McSinyx commented Aug 20, 2020

The implementation using a single progress bar is now available at GH-8771 for review.

@pradyunsg
Member

This should become more feasible once we migrate all our progress bars and spinners to rich: #10461

5 participants