
Support Gemma 2 Model Family for Offline Chat #855

Merged · 4 commits merged into master on Jul 23, 2024

Conversation

debanjum (Member) commented Jul 17, 2024

Overview

  • Gemma 2 is a new open model family by Google. They've released 9B and 27B parameter models. A 2B model is also expected.
  • It performs well on the Chatbot Arena leaderboard and shows good performance in testing within Khoj as well.
  • Llama.cpp support for the Gemma 2 architecture seems to have stabilized.
  • If Gemma 2 performs well in further testing, it can be made the default offline chat model for Khoj.
    • Once the 2B param model is released, the model size to download can be chosen automatically based on available (V)RAM.
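The proposed (V)RAM-based auto-selection could be sketched roughly as below. The function name, thresholds, and model identifiers are illustrative assumptions, not Khoj's actual values:

```python
def select_gemma2_size(available_gb: float) -> str:
    """Pick the largest Gemma 2 variant that fits in available (V)RAM.

    Hypothetical sketch: thresholds and model ids are assumed, not
    taken from Khoj's codebase.
    """
    if available_gb >= 12:
        return "gemma-2-27b"
    if available_gb >= 7:
        return "gemma-2-9b"
    return "gemma-2-2b"
```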

Major

  • Support Gemma 2 for Offline Chat
  • Improve and fix chat model prompts for better, consistent context

Minor

  • Fix and improve offline chat actor, director tests
  • Improve offline chat truncation to consider chat message delimiter tokens
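The truncation improvement above could look something like this minimal sketch: keep the most recent messages, charging each one a fixed per-message overhead for its chat delimiter tokens. The function, signature, and overhead value are assumptions for illustration, not Khoj's implementation:

```python
def truncate_messages(messages, max_tokens, count_tokens, delimiter_tokens=4):
    """Keep the most recent messages whose token counts, plus the
    per-message chat delimiter overhead, fit within max_tokens.

    Illustrative sketch only; `delimiter_tokens` approximates the cost
    of the model's chat turn markers around each message.
    """
    kept, used = [], 0
    for message in reversed(messages):
        cost = count_tokens(message) + delimiter_tokens
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))
```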

- Pass the system message as the first user chat message, since Gemma 2
  doesn't support system messages
- Use the gemma-2 chat format
- Pass the chat model name to the generic and extract-questions chat actors.
  It is used to figure out which chat template to use for the model.
  For the generic chat actor, the argument was already available but not
  being passed, which was confusing
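The first two points above can be sketched together: render messages with Gemma's `<start_of_turn>`/`<end_of_turn>` turn markers, folding the system prompt into the first user message since the template has no system role. This is a simplified illustration of the approach described in the PR, not Khoj's actual formatting code:

```python
def format_gemma2_chat(system_prompt, messages):
    """Render (role, content) pairs in the gemma-2 chat template.

    Gemma 2 has no system role, so the system prompt is prepended to
    the first user message. Sketch only; Khoj's real code path differs.
    """
    prompt = ""
    first_user = True
    for role, content in messages:
        if role == "user" and first_user and system_prompt:
            content = f"{system_prompt}\n\n{content}"
            first_user = False
        turn = "user" if role == "user" else "model"
        prompt += f"<start_of_turn>{turn}\n{content}<end_of_turn>\n"
    # Trailing open model turn cues the model to generate its reply
    return prompt + "<start_of_turn>model\n"
```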
@debanjum debanjum requested a review from sabaimran July 17, 2024 21:53
@debanjum debanjum force-pushed the support-gemma-2-for-offline-chat branch from 0346016 to 08956a4 on July 17, 2024 22:01
debanjum added 3 commits July 18, 2024 03:43
- Add day of week to the system prompt of openai, anthropic, and offline chat models
- Pass more context to the offline chat system prompt to:
  - ask follow-up questions
  - know where to find information about Khoj (itself)
- Fix the output mode selection prompt. Log an error if the model does not
  select a valid option from the list of valid output modes provided
- Use consistent names for the questions and answers passed to the
  extract_questions_offline prompt
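The day-of-week addition in the first point is simple to illustrate. The helper name and phrasing here are assumptions, not Khoj's actual prompt text:

```python
from datetime import datetime

def date_context(now: datetime) -> str:
    """Return a system-prompt snippet with the current day of week and
    date. Illustrative sketch; the real prompt wording is an assumption."""
    return f"Today is {now.strftime('%A')}, {now.strftime('%Y-%m-%d')}."
```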

- Log which model extracts questions and what the offline chat model sees
  as context, similar to the debug log shown for openai models
- Use the updated references schema with the compiled key
- Enable director tests that are now expected to pass and that do pass
  (with Gemma 2, at least)
@debanjum debanjum force-pushed the support-gemma-2-for-offline-chat branch from 08956a4 to e9f86e3 on July 17, 2024 22:28
@debanjum debanjum merged commit 498fe24 into master Jul 23, 2024
6 checks passed
@debanjum debanjum deleted the support-gemma-2-for-offline-chat branch July 30, 2024 09:58