Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

common : fix parallel shard download interleaving output #6831

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

TevinWang
Copy link

@TevinWang TevinWang commented Apr 22, 2024

Motivation

Currently when downloading model shards, the progress output shown is interleaved, which is difficult to interpret. To fix this, we case on whether the download is a shard, and create a global table that stores the progress information for each model shard. We continuously update a progress bar for each of the shards in llama_load_model_from_url until all downloads are complete.
Closes #6537

Changes

    • Add boolean value to llama_download_file to indicate whether a download is a shard
    • Add curl property to use a custom progress callback
    • Create a progressTable global to store progress information
    • Print information to console every second as progress bar output while shards are downloading
    • Buffer the download finished/file saved output and print to stderr once ALL downloads complete to prevent interleaving output

Before and after:

Screen Shot 2024-04-21 at 6 53 58 PM Screen Shot 2024-04-28 at 8 17 52 PM

Testing:

Tested using the following command on CPU:

./main --hf-repo ggml-org/models \
       --hf-file tinyllamas/split/stories15M-00001-of-00003.gguf \
       --model models/test-model-00001-of-00003.gguf \
       -ngl 0 \
       --prompt "I believe the meaning of life is"

@TevinWang TevinWang changed the title Fix parallel shard download interleaving output common : fix parallel shard download interleaving output Apr 22, 2024
@phymbert
Copy link
Collaborator

Well, I would prefer to properly implement a global progress callback rather than completely remove progression.

@TevinWang
Copy link
Author

@phymbert implemented a global progress callback and updated the PR description!

double received_bytes;
};

std::map<std::string, shard_file_progress> progress_table;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must be done without global variables.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it something easily possible you think @TevinWang ?

@mofosyne mofosyne added bugfix fixes an issue or bug Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix labels May 9, 2024
@phymbert
Copy link
Collaborator

@TevinWang do you need help on the global variable issue ?

@ggerganov do you confirm the issue with the proposed approach? IMHO this contribution is valuable as the console output is currently hard to read

@ggerganov
Copy link
Owner

Yes, the globals should be avoided

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix fixes an issue or bug Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

common: download from URL, improve parallel download progress status
5 participants