ollama_en
Ollama is a cross-platform (macOS, Windows, Linux) chat application for large language models that can load GGUF-format models (as produced by llama.cpp). Below is a brief guide to getting started; for additional features, please explore and consult the official manual.
Go to the official page to download the software for your platform: Ollama Download
- macOS: after downloading, drag Ollama into the "Applications" folder.
- Windows (preview): download and run the .exe installer.
- Linux: run the following command:
curl -fsSL https://ollama.com/install.sh | sh
For other platforms, refer to: Ollama GitHub
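As a quick sanity check after installing (not part of the original guide), you can print the CLI version from a terminal:
ollama --version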
Write a Modelfile in a text editor with the following content:
FROM /your-path-to-ggml/ggml-model-q8_0.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>"""
SYSTEM """You are a helpful assistant. 你是一个乐于助人的助手。"""
PARAMETER temperature 0.2
PARAMETER num_keep 24
PARAMETER stop <|start_header_id|>
PARAMETER stop <|end_header_id|>
PARAMETER stop <|eot_id|>
Here:
- FROM specifies the path to the GGUF file. Since this setup is for interactive chat, use the Instruct model.
- TEMPLATE defines the Llama-3-Instruct instruction template format (see the rendered example after this list).
- SYSTEM defines the system prompt; here it is set to a bilingual "helpful assistant" prompt.
- PARAMETER sets several sampling and generation hyperparameters; for the full list, see: Ollama Modelfile Documentation
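As an illustration (this rendering is not part of the original guide), here is roughly what the template above expands to for a single turn, with the SYSTEM prompt substituted for .System and a sample user message for .Prompt; the model then generates the assistant portion, and the stop parameters cut it off at <|eot_id|>:
<|start_header_id|>system<|end_header_id|>
You are a helpful assistant. 你是一个乐于助人的助手。<|eot_id|><|start_header_id|>user<|end_header_id|>
Hello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>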
Run the following command in the terminal to create a model instance named llama3-zh-inst, loading the Modelfile configuration:
ollama create llama3-zh-inst -f Modelfile
You should see progress output similar to the following:
transferring model data
creating model layer
creating template layer
creating system layer
creating parameters layer
creating config layer
using already created layer sha256:f2a44c6358e8e0a60337f8a1b31f55f457558eeefd4f344272e44b0e73a86a32
using already created layer sha256:8ab4849b038cf0abc5b1c9b8ee1443dca6b93a045c2272180d985126eb40bf6f
writing layer sha256:b821abf159071cfc90f0941b5ca7ef721f229cfcfadcf95b5c58d0ceb3e773c7
writing layer sha256:dc4ec177268acc3382fc6c3a395e577bf13e9e0340dd313a75f62df95c48bc1d
writing manifest
success
When it prints success, the model has been created.
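As a quick check (also not in the original guide), the standard ollama list subcommand shows the models available locally; the new llama3-zh-inst entry should appear in its output:
ollama list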
Enter the following command to start the chat program:
ollama run llama3-zh-inst
Type your prompts after >>>; to end the chat, type /bye.
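You can also pass a prompt directly as an argument to the same run subcommand for a one-shot, non-interactive reply (the prompt below is just an example):
ollama run llama3-zh-inst "Why is the sky blue?"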
For other uses of Ollama, please refer to the official documentation: Ollama CLI Reference
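Ollama also runs a local HTTP server (on port 11434 by default), so the model can be queried programmatically. A minimal sketch of a one-shot request to the documented /api/generate endpoint, assuming the default port and the model name created above:
curl http://localhost:11434/api/generate -d '{"model": "llama3-zh-inst", "prompt": "Hello!", "stream": false}'
With "stream": false, the server returns a single JSON object containing the full response instead of a stream of partial chunks.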