llama : auto download HF models if URL provided #4735

Closed
wants to merge 1 commit into from

ggerganov
Owner

QoL improvement: pass a HF URL as the model path to auto-download the model if it does not exist locally:

./main \
  -m https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/resolve/main/mixtral-8x7b-instruct-v0.1.Q2_K.gguf \
  -p "Hello world"

It will attempt the download via wget or curl, if either is available in your PATH.

@slaren
Collaborator

slaren commented Jan 2, 2024

I could see this in a shell script, but this seems too hacky to include in llama.cpp.

Additionally, this may cause security issues. Consider this:

./main \
  -m "https://huggingface.co/.../mixtral-8x7b-instruct-v0.1.Q2_K.gguf;rm -fr /;" \
  -p "Hello world"

Applications that want to load a model based on a user request would need to be careful to sanitize the input.
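As a rough illustration of what "sanitize the input" could mean here, the sketch below accepts only HTTPS Hugging Face URLs built from a conservative character set. The function name `is_safe_model_url` and the exact allow-list are assumptions made for this example, not part of the PR. An allow-list like this reduces the risk but does not make passing the string to a shell safe by itself.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not from the PR): returns true only for URLs that
// start with https://huggingface.co/ and whose remainder consists solely of
// letters, digits, '/', '-', '_' and '.'. Shell metacharacters such as
// ';', '&', '|', '$', spaces and quotes are therefore rejected.
bool is_safe_model_url(const std::string & url) {
    const std::string prefix = "https://huggingface.co/";
    if (url.size() <= prefix.size() || url.compare(0, prefix.size(), prefix) != 0) {
        return false;
    }
    for (size_t i = prefix.size(); i < url.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(url[i]);
        const bool allowed = std::isalnum(c) || c == '/' || c == '-' || c == '_' || c == '.';
        if (!allowed) {
            return false;
        }
    }
    return true;
}
```

With this check, the URL from the PR description passes, while a string with `;rm -fr /;` appended is rejected because of the `;` and the spaces.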

@ggerganov added the "demo" label (Demonstrate some concept or idea, not intended to be merged) on Jan 2, 2024
@ggerganov
Owner Author

Yes, it is probably not a good idea to merge it like this.

@cebtenzzre marked this pull request as draft on January 3, 2024 20:30

const std::string cmd = "curl -C - -f -o " + basename + " -L " + url;
LLAMA_LOG_INFO("%s: %s\n", __func__, cmd.c_str());

const int ret = system(cmd.c_str());
Collaborator

@cebtenzzre cebtenzzre Jan 3, 2024

Yeah, system and popen should not be used with arbitrary user input. Better to use fork/execlp or posix_spawn (Unix) and _spawnlp or CreateProcess (Windows). This is somewhat tedious to do safely and portably in C or C++ without third-party dependencies such as glib.
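For the Unix side of this suggestion, a minimal sketch might look like the following. It uses `fork` with `execvp` (the vector form from the same exec family as `execlp`), so each argument reaches the child process as-is and no shell ever parses the URL. The names `run_argv` and `download_with_curl` are made up for this example, the curl flags simply mirror the command string in the diff above, and the Windows path (`_spawnlp` or `CreateProcess`) is not covered.

```cpp
#include <cerrno>
#include <string>
#include <vector>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run args[0] (looked up in PATH) with the given
// arguments, without a shell. Returns the child's exit code, 127 if the
// program could not be executed, or -1 on other failures.
int run_argv(const std::vector<std::string> & args) {
    if (args.empty()) {
        return -1;
    }

    // Build the NULL-terminated argv array expected by execvp.
    std::vector<char *> argv;
    for (const auto & a : args) {
        argv.push_back(const_cast<char *>(a.c_str()));
    }
    argv.push_back(nullptr);

    const pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127); // only reached if execvp failed
    }

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Same curl flags as the command string in the diff, passed as separate
// arguments. A URL such as "...gguf;rm -fr /;" is handed to curl as one
// literal argument instead of being interpreted by a shell.
int download_with_curl(const std::string & url, const std::string & path) {
    return run_argv({"curl", "-C", "-", "-f", "-o", path, "-L", url});
}
```

A URL that begins with `-` could still be read by curl as an option, so some input validation remains useful even without a shell.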

Owner Author

Yes, I dropped the idea. I will probably add a bash script that can be used like this:

./main -m $(./examples/hf.sh --model-name TheBloke/Mixtral-8x7B-v0.1-GGUF --quant Q4_K_M) ...

and it will do all the curl and wget calls.
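The script is only proposed at this point, so the following is not its implementation. It merely sketches, in C++, how a repository name and a quantization tag could be turned into a download URL of the form seen in the PR description. Both function names are hypothetical, and the file-naming rule in `guess_gguf_filename` is an assumption inferred from that single example URL; other repositories may name their files differently.

```cpp
#include <cctype>
#include <string>

// URL pattern taken from the example in the PR description:
//   https://huggingface.co/<repo>/resolve/main/<file>
std::string hf_resolve_url(const std::string & repo, const std::string & file) {
    return "https://huggingface.co/" + repo + "/resolve/main/" + file;
}

// ASSUMPTION: the file name is the lowercased repository name without its
// "-GGUF" suffix, followed by ".<quant>.gguf". This matches the one example
// in the PR description and is not a documented rule.
std::string guess_gguf_filename(const std::string & repo, const std::string & quant) {
    const size_t slash = repo.rfind('/');
    std::string name = slash == std::string::npos ? repo : repo.substr(slash + 1);

    const std::string suffix = "-GGUF";
    if (name.size() > suffix.size() &&
        name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0) {
        name.erase(name.size() - suffix.size());
    }
    for (char & c : name) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return name + "." + quant + ".gguf";
}
```

For the repository `TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF` and the tag `Q2_K`, combining the two functions reproduces the URL used in the PR description.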

@staviq
Contributor

staviq commented Jan 9, 2024

The server already includes httplib.h, which also supports client mode, so this should be doable without invoking any system commands.

Usage seems easy enough (from httplib project): https://github.com/yhirose/cpp-httplib#client
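A rough sketch of what that could look like is given below. cpp-httplib's `Client` takes the scheme and host separately from the request path, so the model URL is split first. `split_url` and `download_model` are names invented for this example. The httplib part is guarded with `__has_include` so the block still compiles where `httplib.h` is not on the include path, and it assumes a build with `CPPHTTPLIB_OPENSSL_SUPPORT`, since HTTPS is not available in cpp-httplib without it.

```cpp
#include <fstream>
#include <string>
#include <utility>

// Hypothetical helper: splits "https://host/path" into
// {"https://host", "/path"}. Returns two empty strings if the URL has no
// scheme or no path component.
std::pair<std::string, std::string> split_url(const std::string & url) {
    const size_t scheme_end = url.find("://");
    if (scheme_end == std::string::npos) {
        return {};
    }
    const size_t path_start = url.find('/', scheme_end + 3);
    if (path_start == std::string::npos) {
        return {};
    }
    return { url.substr(0, path_start), url.substr(path_start) };
}

#if defined(__has_include)
#if __has_include("httplib.h")
#include "httplib.h"

// Sketch only: streams the response body to out_path in chunks.
// ASSUMPTION: compiled with CPPHTTPLIB_OPENSSL_SUPPORT, otherwise the
// client cannot handle an https:// host.
bool download_model(const std::string & url, const std::string & out_path) {
    const auto parts = split_url(url);
    if (parts.first.empty()) {
        return false;
    }

    httplib::Client cli(parts.first);
    cli.set_follow_location(true); // follow HTTP redirects

    std::ofstream out(out_path, std::ios::binary);
    if (!out) {
        return false;
    }

    auto res = cli.Get(parts.second, [&](const char * data, size_t len) {
        out.write(data, static_cast<std::streamsize>(len));
        return out.good(); // returning false aborts the transfer
    });
    return res && res->status == 200;
}
#endif
#endif
```

Unlike the `curl -C -` call in the diff, this sketch does not resume partial downloads.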

@ggerganov
Owner Author

The hf.sh script idea should be the better option.
