ollama_en
Ollama is a cross-platform (macOS, Windows, Linux) chat application for large language models that can load GGUF-format models (as produced by llama.cpp). Below is a brief guide to getting started; for additional features, please explore and consult the official manual.
Go to the official page to download the software for your platform: Ollama Download
- macOS: after downloading, drag Ollama into the "Applications" folder.
- Windows (preview): download and run the .exe installer.
- Linux: run the following command:
curl -fsSL https://ollama.com/install.sh | sh
For other platforms, refer to: Ollama GitHub
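As a quick sanity check after installing (not part of the original guide), you can print the CLI version from a terminal:
ollama --version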
Write a Modelfile in a text editor with the following content:
FROM /your-path-to-ggml/ggml-model-q8_0.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>"""
SYSTEM """You are a helpful assistant. 你是一个乐于助人的助手。"""
PARAMETER temperature 0.2
PARAMETER num_keep 24
PARAMETER stop <|start_header_id|>
PARAMETER stop <|end_header_id|>
PARAMETER stop <|eot_id|>
Here:
- FROM specifies the path to the GGUF file. Since this setup is for interactive chat, use the Instruct model.
- TEMPLATE defines the Llama-3-Instruct instruction template format (see the rendered example after this list).
- SYSTEM defines the system prompt; here it is set to a bilingual "helpful assistant" prompt.
- PARAMETER sets several sampling and generation hyperparameters; for the full list, see: Ollama Modelfile Documentation
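As an illustration (this rendering is not part of the original guide), here is roughly what the template above expands to for a single turn, with the SYSTEM prompt substituted for .System and a sample user message for .Prompt; the model then generates the assistant portion, and the stop parameters cut it off at <|eot_id|>:
<|start_header_id|>system<|end_header_id|>
You are a helpful assistant. 你是一个乐于助人的助手。<|eot_id|><|start_header_id|>user<|end_header_id|>
Hello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>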
Run the following command in the terminal to create a model instance named llama3-zh-inst, loading the Modelfile configuration:
ollama create llama3-zh-inst -f Modelfile
You should see progress output similar to the following:
transferring model data
creating model layer
creating template layer
creating system layer
creating parameters layer
creating config layer
using already created layer sha256:f2a44c6358e8e0a60337f8a1b31f55f457558eeefd4f344272e44b0e73a86a32
using already created layer sha256:8ab4849b038cf0abc5b1c9b8ee1443dca6b93a045c2272180d985126eb40bf6f
writing layer sha256:b821abf159071cfc90f0941b5ca7ef721f229cfcfadcf95b5c58d0ceb3e773c7
writing layer sha256:dc4ec177268acc3382fc6c3a395e577bf13e9e0340dd313a75f62df95c48bc1d
writing manifest
success
When it prints success, the model has been created.
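As a quick check (also not in the original guide), the standard ollama list subcommand shows the models available locally; the new llama3-zh-inst entry should appear in its output:
ollama list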
Enter the following command to start the chat program:
ollama run llama3-zh-inst
Type your prompts after >>>; to end the chat, type /bye.
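You can also pass a prompt directly as an argument to the same run subcommand for a one-shot, non-interactive reply (the prompt below is just an example):
ollama run llama3-zh-inst "Why is the sky blue?"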
For other uses of Ollama, please refer to the official documentation: Ollama CLI Reference
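Ollama also runs a local HTTP server (on port 11434 by default), so the model can be queried programmatically. A minimal sketch of a one-shot request to the documented /api/generate endpoint, assuming the default port and the model name created above:
curl http://localhost:11434/api/generate -d '{"model": "llama3-zh-inst", "prompt": "Hello!", "stream": false}'
With "stream": false, the server returns a single JSON object containing the full response instead of a stream of partial chunks.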