
Support Gemma 2 Model Family for Offline Chat #855

Merged · 4 commits merged into master on Jul 23, 2024

Conversation

debanjum (Member) commented Jul 17, 2024

Overview

  • Gemma 2 is a new open model family by Google. They've released 9B and 27B parameter models. A 2B model is also expected.
  • It performs well on the Chatbot Arena leaderboard and shows good performance in testing within Khoj as well.
  • Llama.cpp support for the Gemma 2 architecture seems to have stabilized.
  • If Gemma 2 performs well in further testing, it can be made the default offline chat model for Khoj.
    • Once the 2B param model is released, the model size to download can be chosen automatically based on available (V)RAM.
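The proposed (V)RAM-based auto-selection could be sketched roughly as below. The function name, thresholds, and model identifiers are illustrative assumptions, not Khoj's actual values:

```python
def select_gemma2_size(available_gb: float) -> str:
    """Pick the largest Gemma 2 variant that fits in available (V)RAM.

    Hypothetical sketch: thresholds and model ids are assumed, not
    taken from Khoj's codebase.
    """
    if available_gb >= 12:
        return "gemma-2-27b"
    if available_gb >= 7:
        return "gemma-2-9b"
    return "gemma-2-2b"
```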

Major

  • Support Gemma 2 for Offline Chat
  • Improve and fix chat model prompts for better, consistent context

Minor

  • Fix and improve offline chat actor, director tests
  • Improve offline chat truncation to consider chat message delimiter tokens
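The truncation improvement above could look something like this minimal sketch: keep the most recent messages, charging each one a fixed per-message overhead for its chat delimiter tokens. The function, signature, and overhead value are assumptions for illustration, not Khoj's implementation:

```python
def truncate_messages(messages, max_tokens, count_tokens, delimiter_tokens=4):
    """Keep the most recent messages whose token counts, plus the
    per-message chat delimiter overhead, fit within max_tokens.

    Illustrative sketch only; `delimiter_tokens` approximates the cost
    of the model's chat turn markers around each message.
    """
    kept, used = [], 0
    for message in reversed(messages):
        cost = count_tokens(message) + delimiter_tokens
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))
```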

- Pass the system message as the first user chat message, since Gemma 2
  doesn't support system messages
- Use the gemma-2 chat format
- Pass the chat model name to the generic and extract-questions chat actors.
  It is used to figure out which chat template to use for the model.
  For the generic chat actor, the argument was already available but not
  being passed, which was confusing
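The first two points above can be sketched together: render messages with Gemma's `<start_of_turn>`/`<end_of_turn>` turn markers, folding the system prompt into the first user message since the template has no system role. This is a simplified illustration of the approach described in the PR, not Khoj's actual formatting code:

```python
def format_gemma2_chat(system_prompt, messages):
    """Render (role, content) pairs in the gemma-2 chat template.

    Gemma 2 has no system role, so the system prompt is prepended to
    the first user message. Sketch only; Khoj's real code path differs.
    """
    prompt = ""
    first_user = True
    for role, content in messages:
        if role == "user" and first_user and system_prompt:
            content = f"{system_prompt}\n\n{content}"
            first_user = False
        turn = "user" if role == "user" else "model"
        prompt += f"<start_of_turn>{turn}\n{content}<end_of_turn>\n"
    # Trailing open model turn cues the model to generate its reply
    return prompt + "<start_of_turn>model\n"
```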
@debanjum debanjum requested a review from sabaimran July 17, 2024 21:53
@debanjum debanjum force-pushed the support-gemma-2-for-offline-chat branch from 0346016 to 08956a4 on July 17, 2024 22:01
debanjum added 3 commits July 18, 2024 03:43
- Add day of week to the system prompt of openai, anthropic, and offline chat models
- Pass more context to the offline chat system prompt to:
  - ask follow-up questions
  - know where to find information about Khoj (itself)
- Fix the output mode selection prompt. Log an error if the model does not
  select a valid option from the list of valid output modes provided
- Use consistent names for the questions and answers passed to the
  extract_questions_offline prompt
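The day-of-week addition in the first point is simple to illustrate. The helper name and phrasing here are assumptions, not Khoj's actual prompt text:

```python
from datetime import datetime

def date_context(now: datetime) -> str:
    """Return a system-prompt snippet with the current day of week and
    date. Illustrative sketch; the real prompt wording is an assumption."""
    return f"Today is {now.strftime('%A')}, {now.strftime('%Y-%m-%d')}."
```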

- Log which model extracts questions and what the offline chat model sees
  as context, similar to the debug log shown for openai models
- Use the updated references schema with the compiled key
- Enable director tests that are now expected to pass and that do pass
  (with Gemma 2, at least)
@debanjum debanjum force-pushed the support-gemma-2-for-offline-chat branch from 08956a4 to e9f86e3 on July 17, 2024 22:28
@debanjum debanjum merged commit 498fe24 into master Jul 23, 2024
6 checks passed
@debanjum debanjum deleted the support-gemma-2-for-offline-chat branch July 30, 2024 09:58