Getting started documentation #1198
base: master
Conversation
Hi, great work on this project. As I went through the documentation, I noticed a few things that slowed me down, so I merged my notes into the existing readme in case it helps any. Thanks again for sharing this!

It appears this file (models/ggml-vocab.bin) is only used during tests as of now. Removing it from the models folder makes it more flexible for how users load their model data into the project (e.g. are they using Docker bind mounts, are they using symlinks, are they downloading models directly into this folder?). By moving it, the getting-started instructions can be safely simplified to:

$ rm models/.gitkeep
$ rm -r models
$ ln -s /mnt/c/ai/models/LLaMA $(pwd)/models

I think it's a good idea because the model files are quite large and can be useful across multiple projects, so symlinks shine in this use case without creating too much confusion for the onboardee.
This minor (though time consuming) change:

1. Moves the models/ggml-vocab.bin file into the tests folder.
2. Changes the order in which information is presented to the user.
3. Recommends using symlinks to link model data into the right place in the repo.
4. Adds some clarification around the importance of the model weights.

1 is handy because it enables 'automation' towards 3: `rm -r models/` can be run safely and a symlink created in its place, and the commands to do so are clearly listed and described in the README.md. According to my research (please correct me if I'm wrong), this bin file is currently only used in the tests, so it's non-disruptive in its new home.

2 is ultimately the only important aspect of this change. The readme currently must be read in full by the user, cached, and then returned to in order to follow along with all the steps in the documentation.

3 is (I think) handy because these files are pretty huge and not exclusive to this repo. Symlinks shine here in that many symlinks can be created across multiple projects, all pointing to the same source location. If researchers were copying/pasting these files into each project, it would get out of hand fast.

4 seems valuable because the AI world looks really opaque to people just getting started. I did my best to be accurate with my statements in the hope that it makes it more possible for humans to become more aware of this technology and what's happening to the internet and the world.
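The full proposed sequence, annotated (a sketch; the source path is just the WSL2 example discussed below, not a requirement):

```sh
rm models/.gitkeep                             # remove the placeholder so the directory is empty
rm -r models                                   # remove the now-empty directory
ln -s /mnt/c/ai/models/LLaMA "$(pwd)/models"   # point models/ at your existing weights folder
ls -l models                                   # sanity check: the link should resolve to your folder
```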
README.md
Outdated
In order to build llama.cpp you have three different options.
These commands are specific to Ubuntu linux but OS specific varients are just a google away given this handy dependency list. Also, if you're using your windows gaming machine, some users have reported great success in using [WSL2](https://github.com/ggerganov/llama.cpp/issues/103#issuecomment-1470440202) to install Ubuntu within Windows and following the linux build instructions to run this project.
I can't speak for which OS the author had in mind, but these commands work on Debian and any Debian-based distro, including Ubuntu, the infamous Kali, and even Raspberry Pi OS (previously Raspbian).
Also, while WSL2 works great for building this project in a Windows environment (I didn't even have to edit any files like the guy in that link), it's not as easy to get BLAS support working and the speed difference alone is probably worth building a native Windows app (at least for anyone with a modern video card).
Thanks, I was interested in the higher-parameter models and found that I needed the snippet below in my WSL2 config to let it use more of my system's resources.
[wsl2]
memory=51GB
processors=10
swap=4GB
I was struggling to keep the content concise and as minimal as possible, so I struck a balance here. I'm open to changing things, of course.
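A note for anyone reproducing this: a `[wsl2]` section like the one above belongs in the `.wslconfig` file in your Windows user profile, and WSL needs a restart to pick it up. A minimal sketch, assuming a standard WSL2 setup:

```sh
# On the Windows side (PowerShell or cmd), after editing %UserProfile%\.wslconfig:
wsl --shutdown    # stop all WSL2 instances so the new limits apply
# Reopen your Ubuntu terminal, then verify from inside WSL:
free -h           # should reflect the configured memory limit
nproc             # should report the configured processor count
```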
@@ -198,6 +206,8 @@ In order to build llama.cpp you have three different options.
zig build -Drelease-fast
```
Don't forget to install the Python dependencies (e.g. `python -m pip install -r requirements.txt`)
Not saying this shouldn't be here, but it's already mentioned above and these "requirements" aren't for llama.cpp, but rather for converting between formats (and not necessary to use llama.cpp as a lib, for quantization, or inference). If it really needs to be in here twice, it might be better to put it with the convert.py code below.
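For context, the workflow these requirements actually support is the conversion step; a rough sketch, with the `models/7B/` path as an assumed example and the exact `quantize` arguments having varied between versions:

```sh
# only needed for format conversion, not for inference itself
python -m pip install -r requirements.txt
# convert the original PyTorch weights (assumed to live in models/7B/) to ggml format
python convert.py models/7B/
# optionally quantize the converted model; argument forms have changed across versions
./quantize models/7B/ggml-model-f16.bin models/7B/ggml-model-q4_0.bin q4_0
```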
Agreed, I threw it in at the last minute because it was hard to move the note out of the "prep the data" section without including it in each of the installation workflows.
It seems as though 3-4 install workflows are emerging and should be supported in the near future: one each for Linux, Mac, Windows, and Docker. I have a sense that each section should be "complete" and stand on its own, and also include the latest "best approach", so yes, BLAS as you mentioned above. I hadn't updated to the new BLAS workflow so it's a bit out of date, but I think it should be included in the install workflows that support it.
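For reference, a sketch of the BLAS-enabled build being alluded to; the flag names below are from that era's build system, so double-check the current README before relying on them:

```sh
# make-based build with OpenBLAS acceleration
make LLAMA_OPENBLAS=1
# or the CMake route
mkdir build && cd build
cmake .. -DLLAMA_OPENBLAS=ON
cmake --build . --config Release
```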
README.md
Outdated
### ~~Aquiring~~ Setting up the 7b model weights

You can use this system to conduct research on an AI chatbot vaguely comparable to ChatGPT-3 and it will even run on your local machine without needing massive amounts of hardware. But to do so you **must** install the Alpaca 7b model weights into the models folder.

Because these resources belong to Facebook, their official path to obtaining the data should be followed. While it's true that most researchers using the Alpaca weights obtained them from a magnet link to a torrent file, linking or sharing that magnet link should not be done in this repo due to the questionability of violating FaceBook's IP rights and also (not to be an alarmist here) the potential for the popularization of these weights to cause harm.
Up until your edit, the 7B parameter models mentioned were the official LLaMA models from FaceBook. Suddenly this section starts talking about those 7B parameter models as if they were the Alpaca weights. This makes it sound like llama.cpp was designed for Alpaca. Perhaps you aren't familiar with the differences, but they are not the same thing.
Quick aside, the Alpaca models do offer a somewhat ChatGPT-like experience (with `-ins`), but it's weird to single them out at this point. There are others that work as well or better than the Alpaca models.
Also, by merging this "acquiring" section into the 7B example, it makes it sound like people should specifically acquire 7B-sized models when that was just the usage example. This works with all size LLaMA models.
And then why are we saying "most researchers" are using models they got from Torrents. I guess it's possible, but do we even know if that's true? I feel like if they were going to write a paper about it, they'd go through the proper channels to acquire the models.
I also don't think we need to warn people about harm. There are lots of free, open source models being released lately, this seems to unfairly point fingers at LLaMA. (Or Alpaca?)
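For readers unfamiliar with the flag: `-ins` puts the `main` example binary into instruction mode, which is what Alpaca-style models are tuned for. A hypothetical invocation (the model filename is made up for illustration):

```sh
# run in instruction mode; the model path here is illustrative
./main -m models/ggml-alpaca-7b-q4_0.bin -ins
```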
It is also not a requirement to install any models. 7B is not a requirement; why do you think so, @TheNotary?
Thanks @DannyDaemonic, @SlyEcho, you caught me being a confused newcomer here. I'll update it to correct the information, my mistakes!
I was hoping to include "some model" to get the project to do something useful and exciting out of the box. I think the draw to this project is that people (and AI scrapers, of course) are writing articles, creating YouTube videos, etc., all beaming with excitement about how you can "run your own ChatGPT locally."

I think the question then is: should instructions around the Alpaca weights be kept outside of the Getting Started section? I have a slight fear that this could cause confusion for people finding this repo given all the hype, but I won't disagree if that's the consensus. Respectfully, please understand that I'd want to hear from the original author who thought to include the LLaMA docs in the first place before making a departure in the README's content.
Regardless of which model users install first, I feel as though one model should be included, just as an example to get going with. Do you agree, and if so, which model install should be described?
Oh, just one minor comment. Regarding most researchers using a torrent: I don't think it's clear to what extent Meta is making their content available publicly. There are many reports of people filling out the form and then never hearing back, so between that and the massive popularity of the magnet link, I think it's a safe assumption to make even if it can't be verified with 100% certainty.
README.md
Outdated
#### Putting the Model Weights in the Right Spot

This guide will assume that you've downloaded the files to an arbitrary folder, `/mnt/c/ai/models/LLaMA`, using some responsible means described above.
This path is oddly specific. If we do need a specific path, it might be best to use something a little more platform agnostic and, at the very least, drop the `/mnt/c` part.
Agreed, it's specific to using WSL2. I had to do a little research to find the path to my Windows C drive, but I'll correct this with `/path/to/LLaMA` (assuming documenting the LLaMA folder's name is of value per above).
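So the corrected instruction would presumably read something like this, with `/path/to/LLaMA` as a placeholder rather than a real location:

```sh
# replace /path/to/LLaMA with wherever you actually stored the weights
ln -s /path/to/LLaMA "$(pwd)/models"
```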
rm models/.gitkeep

# Now that the folder is empty, this command can safely remove the models/ folder or errors if something is still there
rm -r models/
This directory contains `ggml-vocab.bin`. If we're going to recommend people delete that directory as part of "getting started", this file needs to be moved. Although maybe that's your intention.
Why is it necessary to delete the directory, and why is it even committed to git in that case? `ggml-vocab.bin` is a model file; it contains the tokenization data.

Also, loading the models over the 9P file sharing that WSL2 uses is super slow (it basically doesn't work for mmap at all).
It's not deleted, it was (I hope) carefully moved to `tests/`.

I think the `ggml-vocab.bin` file was once used more generally in the data-preparation phase, but is now only used by the tests? I didn't want to get too involved in figuring out a long-term solution for managing project-internal bin files (such as downloading required bin files during testing), so I just moved it into `tests/`, but am open to that discussion.

Also, thanks and noted @SlyEcho, I'll do some benchmarks and see if they're still giving bad performance, but in most cases symlinks are transparent on Linux/Mac.
@SlyEcho Holy cow, thanks for this catch about performance, I'm very grateful. Between running WSL2 over 9P vs the native build, native is almost 10x faster. As a Linux/Mac developer who also has a gaming machine, I liked the comfort that WSL2 provided me when onboarding, but yeah, maybe there should be a remark on 9P file sharing.
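A workaround sketch for WSL2 users, assuming the weights live on the Windows drive under `/mnt/c`: copying them into WSL2's native filesystem sidesteps the 9P overhead at the cost of disk space.

```sh
# anything under /mnt/c is served over 9P and loads slowly in WSL2;
# copy the weights into the ext4 filesystem once, then symlink as usual
mkdir -p ~/models
cp -r /mnt/c/ai/models/LLaMA ~/models/
ln -s ~/models/LLaMA "$(pwd)/models"
```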
In my opinion the changes proposed to the readme are very welcome. I do believe that these changes belong in the wiki more than in the readme, though. We can have a wiki with per-platform build instructions, so the readme stays focused and additional details for less experienced users are available on the wiki. We can link to the wiki pages from the readme.
In order to build llama.cpp you have three different options.
These commands are specific to Ubuntu linux but OS specific varients are just a google away given this handy dependency list. Also, if you're using your windows gaming machine, some users have reported great success in using [WSL2](https://github.com/ggerganov/llama.cpp/issues/103#issuecomment-1470440202) to install Ubuntu within Windows and following the linux build instructions to run this project, but the CMAKE path is really easy.
Suggested change:
- These commands are specific to Ubuntu linux but OS specific varients are just a google away given this handy dependency list. Also, if you're using your windows gaming machine, some users have reported great success in using [WSL2](https://github.com/ggerganov/llama.cpp/issues/103#issuecomment-1470440202) to install Ubuntu within Windows and following the linux build instructions to run this project, but the CMAKE path is really easy.
+ These commands are specific to Ubuntu linux but OS-specific variants are just a google away given this handy dependency list. Also, if you're using your windows gaming machine, some users have reported great success in using [WSL2](https://github.com/ggerganov/llama.cpp/issues/103#issuecomment-1470440202) to install Ubuntu within Windows and following the linux build instructions to run this project, but the CMAKE path is really easy.