
Using Ollama for Chatting

Ollama is a cross-platform (macOS, Windows, Linux) chat application for large language models that can load models in GGUF format (as produced by llama.cpp). Here is a brief guide on how to use it. For additional functionality, please explore it yourself and consult the official manual.

Step 1: Download the Application for Your Platform

Go to the official page to download the software for your platform: Ollama Download

⚠️ You must use v0.1.33 or later to fully support the Llama-3 series.


Step 2: Install Ollama

  • macOS: After downloading, drag it into the "Applications" folder.

  • Windows (preview): Download and run the .exe file.

  • Linux: Execute the following command:

    curl -fsSL https://ollama.com/install.sh | sh
    

For other platforms, refer to: Ollama GitHub
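
After installation, you can verify that the ollama command is available and check that its version meets the v0.1.33 requirement mentioned above:

ollama --version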

Step 3: Create a Modelfile

Write a Modelfile in a text editor with the following content:

FROM /your-path-to-ggml/ggml-model-q8_0.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
SYSTEM """You are a helpful assistant. 你是一个乐于助人的助手。"""
PARAMETER temperature 0.2
PARAMETER num_keep 24
PARAMETER stop <|start_header_id|>
PARAMETER stop <|end_header_id|>
PARAMETER stop <|eot_id|>

Here:

  • FROM specifies the path to the GGUF file. Since this is for interactive chatting, use the Instruct model.
  • TEMPLATE defines the format for the Llama-3-Instruct instruction template.
  • SYSTEM defines the system prompt (here, a bilingual English/Chinese helpful-assistant prompt; edit it as needed).
  • PARAMETER sets several hyperparameters; for a complete list, see: Ollama Modelfile Documentation
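
To make the template concrete: with the SYSTEM prompt above and a hypothetical user message "Hello", the rendered prompt sent to the model would look like the following (the model's reply fills the {{ .Response }} slot and is terminated by <|eot_id|>):

<|start_header_id|>system<|end_header_id|>

You are a helpful assistant. 你是一个乐于助人的助手。<|eot_id|><|start_header_id|>user<|end_header_id|>

Hello<|eot_id|><|start_header_id|>assistant<|end_header_id|>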

Step 4: Create a Model Instance

Run the following command in the terminal to create a model instance named llama3-zh-inst, loading the Modelfile configuration:

ollama create llama3-zh-inst -f Modelfile

You should see log output similar to the following:

transferring model data
creating model layer
creating template layer
creating system layer
creating parameters layer
creating config layer
using already created layer sha256:f2a44c6358e8e0a60337f8a1b31f55f457558eeefd4f344272e44b0e73a86a32
using already created layer sha256:8ab4849b038cf0abc5b1c9b8ee1443dca6b93a045c2272180d985126eb40bf6f
writing layer sha256:b821abf159071cfc90f0941b5ca7ef721f229cfcfadcf95b5c58d0ceb3e773c7
writing layer sha256:dc4ec177268acc3382fc6c3a395e577bf13e9e0340dd313a75f62df95c48bc1d
writing manifest
success

When success is printed, the model instance has been created.
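
Optionally, you can confirm that the instance was registered by listing the models available locally:

ollama list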

Step 5: Start Chatting

Enter the following command to start the chat program:

ollama run llama3-zh-inst

Type your prompts after >>>; to end the chat, type /bye.
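
Besides the interactive CLI, Ollama also serves a local REST API (on port 11434 by default). Assuming the Ollama server is running, a one-shot, non-streaming request to the model instance created above can be issued with curl, for example:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3-zh-inst",
  "prompt": "Hello",
  "stream": false
}'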

For other uses of Ollama, please refer to the official documentation: Ollama CLI Reference
