-
Notifications
You must be signed in to change notification settings - Fork 364
Use a spinner for model loading information #161
Conversation
llama-cli/Cargo.toml
Outdated
clap = {version = "4.1.8", features = ["derive"]} | ||
color-eyre = {version = "0.6.2", default-features = false} | ||
env_logger = "0.10.0" | ||
log = "0.4" | ||
num_cpus = "1.15.0" | ||
once_cell = "1.17.1" | ||
rustyline = "11.0.0" | ||
spinners = "4.1.0" | ||
spinoff = { version = "0.7.0", default-features = false, features = ["dots2"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are pretty similar, but spinoff
is much more versatile as it allows updating the message
Switch from annoyingly long `log` messages to a spinner during model loading. This has the downside of always printing, rather than being controlled by `RUST_LOG`, but also means we only get a single line of output that's updated as loading progresses.
llama-cli/src/cli_args.rs
Outdated
.success(&format!( | ||
"Loaded {tensor_count} tensors from '{}' ({}) after {}ms", | ||
file.to_string_lossy(), | ||
bytesize::to_string(byte_size, false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bytesize
will handle differently sized models more aesthetically than using a fixed 1MB divisor.
llama-rs/src/loader_common.rs
Outdated
@@ -105,7 +105,7 @@ pub enum LoadProgress<'a> { | |||
/// The path to the model part. | |||
file: &'a Path, | |||
/// The number of bytes in the part. | |||
byte_size: usize, | |||
byte_size: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to reverting this, but it seems reasonable to match the type of https://doc.rust-lang.org/std/fs/struct.Metadata.html#method.len.
llama-rs/src/loader2.rs
Outdated
@@ -190,7 +196,7 @@ pub(crate) fn load( | |||
|
|||
(load_progress_callback)(LoadProgress::PartLoaded { | |||
file: &path, | |||
byte_size: 0, | |||
byte_size: filesize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the argument for leaving this 0 is that we don't know the exact size of the tensors since there's other stuff in the file too. It seems reasonable to fall back to the whole file in that case, but the other decent option would be to change the type to Option<u64>
and not print tensor size if we get None
.
Looks good to me! We're going to merge #162 first, but I'll fix up this PR and merge it in afterwards :) |
Thanks for the PR! Definitely nice to see the spinner, especially for non-mmappable models :) |
Switch from annoyingly long
log
messages to a spinner during model loading.This has the downside of always printing, rather than being controlled by
RUST_LOG
, but also means we only get a single line of output that's updated as loading progresses.An example of the end result: