We expect that you have configured the environment variables required for the LLM you are attempting to use. For example:
- OpenAI service requires: `OPENAI_API_KEY=my-secret-api-key-value`
- IBM BAM service requires: `GENAI_KEY=my-secret-api-key-value`
The development team has been using the IBM BAM service to aid development and testing:
> IBM Big AI Model (BAM) laboratory is where IBM Research designs, builds, and iterates on what’s next in foundation models. Our goal is to help accelerate the transition from research to product. Come experiment with us.
**Warning:** In order to use this service, an individual needs to obtain a w3id from IBM. The Kai development team is unable to help with obtaining this access.
- Log in to https://bam.res.ibm.com/.
- To access the service via its API, open the 'Documentation' section after logging in; it contains a field where you can generate or obtain an API key.
- Ensure you have exported the key via `export GENAI_KEY=my-secret-api-key-value`
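If you want a quick sanity check that the key is visible in your current shell before running Kai, a simple shell test (purely illustrative, not part of Kai itself) is:

```sh
# Print an error and exit non-zero if GENAI_KEY is unset or empty.
: "${GENAI_KEY:?GENAI_KEY is not set; export it before running Kai}"
```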
Related client tooling for the BAM service includes the IBM Generative AI Python SDK.
If you have a valid API key for OpenAI, you may use it with Kai.

- Follow the directions from OpenAI to create an account and generate an API key.
- Ensure you have exported the key via `export OPENAI_API_KEY=my-secret-api-key-value`
We offer configuration choices for several models via `config.toml`, which line up with the providers defined in `kai/model_provider.py`. To change which LLM you are targeting, open `config.toml` and change the `[models]` section to one of the following:
```toml
[models]
provider = "ChatIBMGenAI"

[models.args]
model_id = "ibm/granite-13b-chat-v2"
```

```toml
[models]
provider = "ChatIBMGenAI"

[models.args]
model_id = "mistralai/mixtral-8x7b-instruct-v01"
```

```toml
[models]
provider = "ChatIBMGenAI"

[models.args]
model_id = "meta-llama/llama-2-13b-chat"
```

```toml
# Note: llama3 complains if we use more than 2048 tokens
# See: https://github.com/konveyor-ecosystem/kai/issues/172
[models]
provider = "ChatIBMGenAI"

[models.args]
model_id = "meta-llama/llama-3-70b-instruct"
parameters.max_new_tokens = 2048
```

```toml
[models]
provider = "ChatOllama"

[models.args]
model = "mistral"
```

```toml
[models]
provider = "ChatOpenAI"

[models.args]
model = "gpt-4"
```

```toml
[models]
provider = "ChatOpenAI"

[models.args]
model = "gpt-3.5-turbo"
```
In general, Kai will work with OpenAI-compatible API alternatives. Two examples are Podman Desktop and the Oobabooga Text generation web UI. Once your alternative is installed, all that is necessary is to export `OPENAI_API_BASE` in addition to `OPENAI_API_KEY`.
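For example, assuming a locally hosted alternative listening on port 8000 (the host, port, and key value below are placeholders; substitute the URL your service reports):

```sh
# Point the OpenAI client at the alternative endpoint. The key value is
# arbitrary when the endpoint does not enforce authentication, but it must
# still be set because the OpenAI library expects it.
export OPENAI_API_BASE="http://localhost:8000/v1"
export OPENAI_API_KEY="unused-placeholder-value"
```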
Installation will vary depending on your operating system and distribution and is documented on the Podman Desktop website: https://podman-desktop.io/docs/installation
- Start Podman Desktop
- Navigate to the Extensions
- Select the Catalog
- Search for `Podman AI Lab`
- Install the `Podman AI Lab` extension
- Navigate to the AI Lab
- Under Models, select Catalog
- Download one or more models
- Navigate to Services
- Click `New Model Service`
- Select a model to serve and click `Create Service`
- On the Service details page, note the server URL to use with Kai
- Export the URL, for example: `export OPENAI_API_BASE="http://localhost:35841/v1"`
- Note that the Podman Desktop service endpoint is not password-protected, but the OpenAI library expects `OPENAI_API_KEY` to be set; in this case the value does not matter.
- Adjust your `config.toml` settings if necessary:
```toml
[models]
provider = "ChatOpenAI"

[models.args]
model = "mistral-7b-instruct-v0-2"
```
- OpenShift AI also provides an OpenAI-compatible API via vLLM.
- The vLLM runtime can be added to your cluster, if not already available, by following the OpenShift AI documentation on adding serving runtimes.
- Export the URL, for example: `export OPENAI_API_BASE="https://mistralaimistral-7b-instruct-v02-kyma-workshop.apps.cluster.example.com/v1"`
- When vLLM serves models it does so from the `/mnt/models/` directory in the container, and this is where the model name is taken from, so in all cases use `/mnt/models/` for the model name.
- Adjust your `config.toml`:
```toml
[models]
provider = "ChatOpenAI"

[models.args]
model = "/mnt/models/"
```
We have experienced problems with some models due to the model context being too short for our inputs. It is currently possible, though somewhat difficult, to work around this issue.