add missing files
rjmacarthy committed Apr 30, 2024
1 parent f525b8e commit 88afbaf
Showing 10 changed files with 274 additions and 0 deletions.
Binary file added src/assets/twinny-chat.gif
Binary file added src/assets/twinny-code-completion.gif
Binary file added src/assets/twinny-menu.png
12 changes: 12 additions & 0 deletions src/content/docs/general/chat.md
@@ -0,0 +1,12 @@
---
title: Chat
description: Chat with Twinny
---

## Open side panel

To use Twinny chat, open the side panel with the extension icon or the keyboard shortcut `CTRL+SHIFT+Z CTRL+SHIFT+T`, then start typing. Twinny retains chat history between sessions; you can view it by clicking the history icon in the side panel.

## Context and code selection

When you highlight/select code in your editor, Twinny uses it as the context for the chat message. If no code is selected, it uses only your message and any previous messages. You can also right-click on selected code and choose a Twinny option to refactor, explain, or perform other actions.
10 changes: 10 additions & 0 deletions src/content/docs/general/fill-in-middle.md
@@ -0,0 +1,10 @@
---
title: Fill in the middle
description: Fill in the middle
---

To use Twinny to fill in the middle of a code snippet, just start typing in the editor and Twinny will autocomplete for you, very similar to how GitHub Copilot works.

If you prefer to trigger code completion manually, turn off automatic inline code completion in the settings menu at the top of the Twinny side panel, then use the keyboard shortcut `ALT+\` to trigger a completion.

GitHub Copilot and Twinny share the same keyboard shortcuts, so they might interfere with each other; enable and disable them as needed.
13 changes: 13 additions & 0 deletions src/content/docs/general/keyboard-shortcuts.md
@@ -0,0 +1,13 @@
---
title: Keyboard shortcuts
description: Keyboard shortcuts for Twinny.
---

| Shortcut | Description |
| ----------------------------| -------------------------------------------------|
| `ALT+\` | Trigger inline code completion |
| `CTRL+SHIFT+/` | Stop the inline code generation |
| `Tab` | Accept the inline code generated |
| `CTRL+SHIFT+Z CTRL+SHIFT+T` | Open Twinny sidebar |
| `CTRL+SHIFT+Z CTRL+SHIFT+G` | Generate commit messages from staged changes |
137 changes: 137 additions & 0 deletions src/content/docs/general/providers.md
@@ -0,0 +1,137 @@
---
title: Inference providers
description: Inference providers are a way to connect Twinny with external models and services.
---

These example configurations serve as a starting point. Individual adjustments may be required depending on your specific hardware and software environments.


> Note: Twinny chat (not auto-complete) should be compatible with any API which adheres to the OpenAI API specification.

### Ollama (recommended and configured by default)

#### Auto-complete

- **Hostname:** `localhost`
- **Port:** `11434`
- **Path:** `/api/generate`
- **Model Name:** `codellama:7b-code`
- **FIM Template:** `codellama`
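
As a quick sanity check, you can hit the same endpoint Twinny uses for auto-complete; this is a sketch, assuming Ollama is running locally and `codellama:7b-code` has been pulled:

```bash
# Verify the Ollama generate endpoint used for auto-complete.
# Assumes Ollama is serving on localhost:11434 with codellama:7b-code pulled.
curl http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codellama:7b-code",
    "prompt": "def add(a, b):",
    "stream": false
  }'
```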

#### Chat

- **Hostname:** `localhost`
- **Port:** `11434`
- **Path:** `/v1/chat/completions`
- **Model Name:** `codellama:7b-instruct`
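
The chat path follows the OpenAI chat-completions format; a minimal check against the same local Ollama instance looks like this:

```bash
# Verify the OpenAI-compatible chat endpoint used for Twinny chat.
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codellama:7b-instruct",
    "messages": [{"role": "user", "content": "Write a hello world in Python"}]
  }'
```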

### Open WebUI using Ollama

Open WebUI can be used as a proxy for Ollama; simply configure the endpoint to match what is served by Open WebUI.

#### Auto-complete

- **Hostname:** `localhost`
- **Port:** The port Open WebUI is serving on, typically `8080` or `3000`.
- **Path:** `/ollama/api/generate`
- **Model Name:** `codellama:7b-code`
- **FIM Template:** Select a template that matches the model, such as `codellama` for `codellama:7b-code` or `deepseek` for `deepseek-coder`.

#### Chat

- **Hostname:** `localhost`
- **Port:** The port Open WebUI is serving on, typically `8080` or `3000`.
- **Path:** `/ollama/v1/chat/completions`
- **Model Name:** `codellama:7b-instruct` or any effective instruct model.
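
A minimal sketch of a request through the Open WebUI proxy, assuming it is serving on port `8080` and you have generated an API key in its settings (Open WebUI normally requires one); `YOUR_API_KEY` is a placeholder:

```bash
# Hypothetical check of the Open WebUI proxy path to Ollama.
# Replace YOUR_API_KEY with a key generated in Open WebUI's settings.
curl http://localhost:8080/ollama/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codellama:7b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```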

### LM Studio

#### Auto-complete

- **Hostname:** `localhost`
- **Port:** `1234`
- **Path:** `/v1/completions`
- **Model Name:** Base model such as `codellama-7b.Q5_K_M.gguf`
- **LM Studio Preset:** CodeLlama Completion
- **FIM Template:** Select a template that matches the model, such as `codellama` for `CodeLlama-7B-GGUF` or `deepseek` for `deepseek-coder:6.7b-base-q5_K_M`.

#### Chat

- **Hostname:** `localhost`
- **Port:** `1234`
- **Path:** `/v1/chat/completions`
- **Model Name:** `codellama:7b-instruct` or your preferred instruct model.
- **LM Studio Preset:** Default or `CodeLlama Instruct`

### LiteLLM

#### Auto-complete

LiteLLM technically supports auto-complete using the `custom-template` FIM template and by editing the `fim.hbs` file; however, results will vary depending on your model and setup.

#### Chat

- **Hostname:** `localhost`
- **Port:** `4000`
- **Path:** `/v1/chat/completions`

Start LiteLLM with the following command:

```bash
litellm --model gpt-4-turbo
```
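
Note that LiteLLM proxies to an upstream provider, so the relevant API key for that provider (for example `OPENAI_API_KEY` for the command above) must be set in your environment. A quick check against the proxy, assuming the defaults above and no master key configured:

```bash
# Check the LiteLLM proxy on its default port once it is running.
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```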

### Llama.cpp

#### Auto-complete

Start Llama.cpp in the terminal, for example using Docker with `codellama-7b.Q5_K_M.gguf`:

```bash
docker run -p 8080:8080 --gpus all --network bridge -v /path/to/your/models:/models local/llama.cpp:full-cuda --server -m /models/codellama-7b.Q5_K_M.gguf -c 2048 -ngl 43 -mg 1 --port 8080 --host 0.0.0.0
```

Configure your provider settings as follows:

- **Hostname:** `localhost`
- **Port:** `8080`
- **Path:** `/completion`
- **FIM Template:** Select a template that matches the model, such as `codellama` for `CodeLlama-7B-GGUF` or `deepseek` for `deepseek-coder:6.7b-base-q5_K_M`.
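
To confirm the server is reachable before pointing Twinny at it, a minimal request to the llama.cpp `/completion` endpoint (assuming the Docker command above) looks like this:

```bash
# llama.cpp server completion endpoint; n_predict limits the generated tokens.
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "def fibonacci(n):",
    "n_predict": 64
  }'
```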

#### Chat

The performance of chat functionalities with Llama.cpp has been mixed. If you obtain favorable results, please share them by opening an issue or a pull request.

- **Hostname:** `localhost`
- **Port:** `8080`
- **Path:** `/completion`
- **Model Name:** `CodeLlama-7B-GGUF` or any other strong instruct model.


### Oobabooga

Start Oobabooga (text-generation-webui) with the API enabled:

```bash
bash start_linux.sh --api --listen
```

#### Auto-complete

Navigate to `http://0.0.0.0:7860/` and load your model:

- **Hostname:** `localhost`
- **Port:** `5000`
- **Path:** `/v1/completions`
- **Model Name:** `CodeLlama-7B-GGUF` or another effective base model.
- **FIM Template:** Select a template that matches the model, such as `codellama` for `CodeLlama-7B-GGUF` or `deepseek` for `deepseek-coder:6.7b-base-q5_K_M`.
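
As a sketch, you can check the OpenAI-compatible completions endpoint that text-generation-webui exposes with `--api`, assuming the defaults above and a model already loaded:

```bash
# Oobabooga (text-generation-webui) OpenAI-compatible completions endpoint.
curl http://localhost:5000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "def hello():",
    "max_tokens": 64
  }'
```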

#### Chat

Chat functionality has not been successful on Linux with Oobabooga:

- **Hostname:** `localhost`
- **Port:** `5000`
- **Path:** `/v1/chat/completions`
- **Model Name:** `CodeLlama-7B-GGUF`
30 changes: 30 additions & 0 deletions src/content/docs/general/quick-start.md
@@ -0,0 +1,30 @@
---
title: Quick start
description: A quick start guide for using Twinny.
---

## Prerequisites

Before you start using Twinny, you need access to an inference provider. An inference provider is a local or cloud-hosted server that runs the AI models.

The recommended way to do this is to use [Ollama](https://ollama.com/). Ollama makes it easy to run models locally and exposes them through an OpenAI-compatible API. Performance will depend on your hardware and chosen model; see Ollama's documentation for more information.

## Installing the extension

1. Install the Visual Studio Code extension [here](https://marketplace.visualstudio.com/items?itemName=rjmacarthy.Twinny) or for VSCodium [here](https://open-vsx.org/extension/rjmacarthy/Twinny).

## Installing Ollama as an inference provider

1. Visit [Install Ollama](https://ollama.com/) and follow the instructions to install Ollama on your machine.
2. Choose a model from the list of models available on Ollama. The recommended models are [codellama:7b-instruct](https://ollama.com/library/codellama:instruct) for chat and [codellama:7b-code](https://ollama.com/library/codellama:code) for fill-in-middle.

```sh
ollama run codellama:7b-instruct
ollama run codellama:7b-code
```
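
To confirm Ollama is serving and the models are available, you can query it directly; a quick check, assuming the default port:

```sh
# Ollama responds with "Ollama is running" when the server is up.
curl http://localhost:11434
# List the models you have pulled locally.
ollama list
```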

Once both the extension and Ollama are installed, you can start using Twinny.

1. Open VS Code (a restart might be needed if it is already open) and press `CTRL+SHIFT+Z CTRL+SHIFT+T` to open the side panel.

You should see the 🤖 icon indicating that Twinny is ready to use. The icon will change to a spinner when Twinny is making a call to the inference provider.
12 changes: 12 additions & 0 deletions src/content/docs/general/support-twinny.md
@@ -0,0 +1,12 @@
---
title: Support Twinny
description: Support Twinny by donating to the project.
---

Thanks for using Twinny!

This project is and will always be free and open source. If you find it helpful, please consider showing your appreciation with a small donation <3

Please send Bitcoin to:

`1PVavNkMmBmUz8nRYdnVXiTgXrAyaxfehj`
60 changes: 60 additions & 0 deletions src/content/docs/general/supported-models.md
@@ -0,0 +1,60 @@
---
title: Supported models
description: A list of supported models for Twinny.
---

Twinny is a configurable extension/interface, which means many models are technically supported; however, not all of them work well in every scenario. The following models have been tested and found to work well with Twinny. If you find a model that works but is not listed here, please let us know so we can add it, or open a pull request to add it yourself.

### Chat

In theory, any chat model trained for instruction following will work with Twinny. The following are some examples of models recommended for chat.


- [`llama3`](https://ollama.com/library/llama3)
- [`codellama:7b-instruct`](https://ollama.com/library/codellama:instruct)
- [`phind-codellama`](https://ollama.com/library/phind-codellama)
- [`mistral`](https://ollama.com/library/mistral)

### Fill-in-middle

Only certain models support fill in the middle due to their training data. The following are some examples of models recommended for fill in the middle. If you find a model that works but is not listed here, please let us know so we can add it, or open a pull request.

#### Codellama models

`code` versions of codellama models.

- [`codellama:code`](https://ollama.com/library/codellama:code)
- [`codellama:13b-code`](https://ollama.com/library/codellama:13b-code)

Note: The _34b_ version of codellama does not work well with fill in the middle.

#### Deepseek Coder models

`base` versions of deepseek-coder models.

- [`deepseek-coder:base`](https://ollama.com/library/deepseek-coder:base)

Note: Models which are not base versions do not work well with fill in the middle.

#### Starcoder models

`base` versions of starcoder models. The default and base models are the same.

- [`starcoder`](https://ollama.com/library/starcoder)
- [`starcoder2`](https://ollama.com/library/starcoder2)

Note: Starcoder2 doesn't always stop when it is finished. Lowering the temperature and increasing the repeat penalty helps with this issue (see the example below).
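
One way to apply those settings with Ollama is to create a small model variant via a Modelfile; this is a sketch, where the name `starcoder2-fim` and the exact parameter values are purely illustrative and should be tuned for your setup:

```sh
# Hypothetical Modelfile lowering the temperature and raising the repeat
# penalty to help Starcoder2 stop cleanly; adjust the values as needed.
cat > Modelfile <<'EOF'
FROM starcoder2
PARAMETER temperature 0.2
PARAMETER repeat_penalty 1.3
EOF
ollama create starcoder2-fim -f Modelfile
```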

#### Stablecode models

`code` versions of stablecode models.

- [`stable-code:3b-code`](https://ollama.com/library/stable-code:3b-code)

#### Codegemma models

`code` versions of codegemma models.

- [`codegemma`](https://ollama.com/library/codegemma:7b-code)

Note: CodeGemma doesn't always stop when it is finished. Lowering the temperature and increasing the repeat penalty helps with this issue, in the same way as for Starcoder2 above.
