add missing files
rjmacarthy committed Apr 30, 2024
1 parent f525b8e commit 88afbaf
Showing 10 changed files with 274 additions and 0 deletions.
Binary file added src/assets/twinny-chat.gif
Binary file added src/assets/twinny-code-completion.gif
Binary file added src/assets/twinny-menu.png
12 changes: 12 additions & 0 deletions src/content/docs/general/chat.md
@@ -0,0 +1,12 @@
---
title: Chat
description: Chat with Twinny
---

## Open side panel

To use Twinny chat, open the side panel with the extension icon or the keyboard shortcut `CTRL+SHIFT+Z CTRL+SHIFT+T`, then start typing. Twinny retains chat history between sessions; you can view it by clicking the history icon in the side panel.

## Context and code selection

When you highlight/select code in your editor, Twinny uses it as the context for the chat message. If no code is selected, it uses only your message and any previous messages. You can also right-click on selected code and choose a Twinny option to refactor, explain, or perform other actions.
10 changes: 10 additions & 0 deletions src/content/docs/general/fill-in-middle.md
@@ -0,0 +1,10 @@
---
title: Fill in the middle
description: Fill in the middle
---

To use Twinny to fill in the middle of a code snippet, just start typing in the editor and Twinny will autocomplete for you, very similar to how GitHub Copilot works.

If you prefer to trigger code completion manually, turn off automatic inline code completion in the settings menu at the top of the Twinny side panel, then use the keyboard shortcut `ALT+\` to trigger a completion.

GitHub Copilot and Twinny share the same keyboard shortcuts, so they might interfere with each other; enable and disable them as needed.
13 changes: 13 additions & 0 deletions src/content/docs/general/keyboard-shortcuts.md
@@ -0,0 +1,13 @@
---
title: Keyboard shortcuts
description: Keyboard shortcuts for Twinny.
---

| Shortcut | Description |
| ----------------------------| -------------------------------------------------|
| `ALT+\` | Trigger inline code completion |
| `CTRL+SHIFT+/` | Stop the inline code generation |
| `Tab` | Accept the inline code generated |
| `CTRL+SHIFT+Z CTRL+SHIFT+T` | Open Twinny sidebar |
| `CTRL+SHIFT+Z CTRL+SHIFT+G` | Generate commit messages from staged changes |
137 changes: 137 additions & 0 deletions src/content/docs/general/providers.md
@@ -0,0 +1,137 @@
---
title: Inference providers
description: Inference providers are a way to connect Twinny with external models and services.
---

These example configurations serve as a starting point. Individual adjustments may be required depending on your specific hardware and software environments.


> Note: Twinny chat (not auto-complete) should be compatible with any API which adheres to the OpenAI API specification.

### Ollama (recommended and configured by default)

#### Auto-complete

- **Hostname:** `localhost`
- **Port:** `11434`
- **Path:** `/api/generate`
- **Model Name:** `codellama:7b-code`
- **FIM Template:** `codellama`
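
As a quick sanity check, you can hit the same endpoint Twinny uses for auto-complete; this is a sketch, assuming Ollama is running locally and `codellama:7b-code` has been pulled:

```bash
# Verify the Ollama generate endpoint used for auto-complete.
# Assumes Ollama is serving on localhost:11434 with codellama:7b-code pulled.
curl http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codellama:7b-code",
    "prompt": "def add(a, b):",
    "stream": false
  }'
```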

#### Chat

- **Hostname:** `localhost`
- **Port:** `11434`
- **Path:** `/v1/chat/completions`
- **Model Name:** `codellama:7b-instruct`
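
The chat path follows the OpenAI chat-completions format; a minimal check against the same local Ollama instance looks like this:

```bash
# Verify the OpenAI-compatible chat endpoint used for Twinny chat.
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codellama:7b-instruct",
    "messages": [{"role": "user", "content": "Write a hello world in Python"}]
  }'
```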

### Open WebUI using Ollama

Open WebUI can be used as a proxy for Ollama; simply configure the endpoint to match what is served by Open WebUI.

#### Auto-complete

- **Hostname:** `localhost`
- **Port:** The port Open WebUI is serving on, typically `8080` or `3000`.
- **Path:** `/ollama/api/generate`
- **Model Name:** `codellama:7b-code`
- **FIM Template:** Select a template that matches the model, such as `codellama` for `codellama:7b-code` or `deepseek` for `deepseek-coder`.

#### Chat

- **Hostname:** `localhost`
- **Port:** The port Open WebUI is serving on, typically `8080` or `3000`.
- **Path:** `/ollama/v1/chat/completions`
- **Model Name:** `codellama:7b-instruct` or any effective instruct model.
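
A minimal sketch of a request through the Open WebUI proxy, assuming it is serving on port `8080` and you have generated an API key in its settings (Open WebUI normally requires one); `YOUR_API_KEY` is a placeholder:

```bash
# Hypothetical check of the Open WebUI proxy path to Ollama.
# Replace YOUR_API_KEY with a key generated in Open WebUI's settings.
curl http://localhost:8080/ollama/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codellama:7b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```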

### LM Studio

#### Auto-complete

- **Hostname:** `localhost`
- **Port:** `1234`
- **Path:** `/v1/completions`
- **Model Name:** Base model such as `codellama-7b.Q5_K_M.gguf`
- **LM Studio Preset:** CodeLlama Completion
- **FIM Template:** Select a template that matches the model, such as `codellama` for `CodeLlama-7B-GGUF` or `deepseek` for `deepseek-coder:6.7b-base-q5_K_M`.

#### Chat

- **Hostname:** `localhost`
- **Port:** `1234`
- **Path:** `/v1/chat/completions`
- **Model Name:** `codellama:7b-instruct` or your preferred instruct model.
- **LM Studio Preset:** Default or `CodeLlama Instruct`

### LiteLLM

#### Auto-complete

LiteLLM technically supports auto-complete using the `custom-template` FIM template and by editing the `fim.hbs` file; however, results will vary depending on your model and setup.

#### Chat

- **Hostname:** `localhost`
- **Port:** `4000`
- **Path:** `/v1/chat/completions`

Start LiteLLM with the following command:

```bash
litellm --model gpt-4-turbo
```
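
Note that LiteLLM proxies to an upstream provider, so the relevant API key for that provider (for example `OPENAI_API_KEY` for the command above) must be set in your environment. A quick check against the proxy, assuming the defaults above and no master key configured:

```bash
# Check the LiteLLM proxy on its default port once it is running.
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```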

### Llama.cpp

#### Auto-complete

Start Llama.cpp in the terminal, for example using Docker with `codellama-7b.Q5_K_M.gguf`:

```bash
docker run -p 8080:8080 --gpus all --network bridge -v /path/to/your/models:/models local/llama.cpp:full-cuda --server -m /models/codellama-7b.Q5_K_M.gguf -c 2048 -ngl 43 -mg 1 --port 8080 --host 0.0.0.0
```

Configure your provider settings as follows:

- **Hostname:** `localhost`
- **Port:** `8080`
- **Path:** `/completion`
- **FIM Template:** Select a template that matches the model, such as `codellama` for `CodeLlama-7B-GGUF` or `deepseek` for `deepseek-coder:6.7b-base-q5_K_M`.
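
To confirm the server is reachable before pointing Twinny at it, a minimal request to the llama.cpp `/completion` endpoint (assuming the Docker command above) looks like this:

```bash
# llama.cpp server completion endpoint; n_predict limits the generated tokens.
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "def fibonacci(n):",
    "n_predict": 64
  }'
```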

#### Chat

The performance of chat functionalities with Llama.cpp has been mixed. If you obtain favorable results, please share them by opening an issue or a pull request.

- **Hostname:** `localhost`
- **Port:** `8080`
- **Path:** `/completion`
- **Model Name:** `CodeLlama-7B-GGUF` or any other strong instruct model.


### Oobabooga

Start Oobabooga (text-generation-webui) with the API enabled:

```bash
bash start_linux.sh --api --listen
```

#### Auto-complete

Navigate to `http://0.0.0.0:7860/` and load your model:

- **Hostname:** `localhost`
- **Port:** `5000`
- **Path:** `/v1/completions`
- **Model Name:** `CodeLlama-7B-GGUF` or another effective base model.
- **FIM Template:** Select a template that matches the model, such as `codellama` for `CodeLlama-7B-GGUF` or `deepseek` for `deepseek-coder:6.7b-base-q5_K_M`.
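
As a sketch, you can check the OpenAI-compatible completions endpoint that text-generation-webui exposes with `--api`, assuming the defaults above and a model already loaded:

```bash
# Oobabooga (text-generation-webui) OpenAI-compatible completions endpoint.
curl http://localhost:5000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "def hello():",
    "max_tokens": 64
  }'
```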

#### Chat

Chat functionality has not been successful on Linux with Oobabooga:

- **Hostname:** `localhost`
- **Port:** `5000`
- **Path:** `/v1/chat/completions`
- **Model Name:** `CodeLlama-7B-GGUF`
30 changes: 30 additions & 0 deletions src/content/docs/general/quick-start.md
@@ -0,0 +1,30 @@
---
title: Quick start
description: A quick start guide for using Twinny.
---

## Prerequisites

Before you start using Twinny, you need access to an inference provider. An inference provider is a local or cloud-hosted server that runs the AI models.

The recommended way to do this is to use [Ollama](https://ollama.com/). Ollama makes it easy to run models locally and exposes them through an OpenAI-compatible API. Performance will depend on your hardware and chosen model; see Ollama's documentation for more information.

## Installing the extension

1. Install the Visual Studio Code extension [here](https://marketplace.visualstudio.com/items?itemName=rjmacarthy.Twinny) or for VSCodium [here](https://open-vsx.org/extension/rjmacarthy/Twinny).

## Installing Ollama as an inference provider

1. Visit [Install Ollama](https://ollama.com/) and follow the instructions to install Ollama on your machine.
2. Choose a model from the list of models available on Ollama. The recommended models are [codellama:7b-instruct](https://ollama.com/library/codellama:instruct) for chat and [codellama:7b-code](https://ollama.com/library/codellama:code) for fill-in-middle.

```sh
ollama run codellama:7b-instruct
ollama run codellama:7b-code
```
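
To confirm Ollama is serving and the models are available, you can query it directly; a quick check, assuming the default port:

```sh
# Ollama responds with "Ollama is running" when the server is up.
curl http://localhost:11434
# List the models you have pulled locally.
ollama list
```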

Once both the extension and Ollama are installed, you can start using Twinny.

1. Open VS Code (a restart might be needed if it is already open) and press `CTRL+SHIFT+Z CTRL+SHIFT+T` to open the side panel.

You should see the 🤖 icon indicating that Twinny is ready to use. The icon will change to a spinner when Twinny is making a call to the inference provider.
12 changes: 12 additions & 0 deletions src/content/docs/general/support-twinny.md
@@ -0,0 +1,12 @@
---
title: Support Twinny
description: Support Twinny by donating to the project.
---

Thanks for using Twinny!

This project is and will always be free and open source. If you find it helpful, please consider showing your appreciation with a small donation <3

Please send Bitcoin to:

`1PVavNkMmBmUz8nRYdnVXiTgXrAyaxfehj`
60 changes: 60 additions & 0 deletions src/content/docs/general/supported-models.md
@@ -0,0 +1,60 @@
---
title: Supported models
description: A list of supported models for Twinny.
---

Twinny is a configurable extension/interface, which means many models are technically supported; however, not all of them work well in every scenario. The following models have been tested and found to work well with Twinny. If you find a model that works but is not listed here, please let us know so we can add it, or open a pull request to add it yourself.

### Chat

In theory, any chat model trained for instruction following will work with Twinny. The following are some examples of models recommended for chat.


- [`llama3`](https://ollama.com/library/llama3)
- [`codellama:7b-instruct`](https://ollama.com/library/codellama:instruct)
- [`phind-codellama`](https://ollama.com/library/phind-codellama)
- [`mistral`](https://ollama.com/library/mistral)

### Fill-in-middle

Only certain models support fill in the middle due to their training data. The following are some examples of models recommended for fill in the middle. If you find a model that works but is not listed here, please let us know so we can add it, or open a pull request.

#### Codellama models

`code` versions of codellama models.

- [`codellama:code`](https://ollama.com/library/codellama:code)
- [`codellama:13b-code`](https://ollama.com/library/codellama:13b-code)

Note: The _34b_ version of codellama does not work well with fill in the middle.

#### Deepseek Coder models

`base` versions of deepseek-coder models.

- [`deepseek-coder:base`](https://ollama.com/library/deepseek-coder:base)

Note: Models which are not base versions do not work well with fill in the middle.

#### Starcoder models

`base` versions of starcoder models. The default and base models are the same.

- [`starcoder`](https://ollama.com/library/starcoder)
- [`starcoder2`](https://ollama.com/library/starcoder2)

Note: Starcoder2 doesn't always stop when it is finished. Lowering the temperature and increasing the repeat penalty helps with this issue (see the example below).
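
One way to apply those settings with Ollama is to create a small model variant via a Modelfile; this is a sketch, where the name `starcoder2-fim` and the exact parameter values are purely illustrative and should be tuned for your setup:

```sh
# Hypothetical Modelfile lowering the temperature and raising the repeat
# penalty to help Starcoder2 stop cleanly; adjust the values as needed.
cat > Modelfile <<'EOF'
FROM starcoder2
PARAMETER temperature 0.2
PARAMETER repeat_penalty 1.3
EOF
ollama create starcoder2-fim -f Modelfile
```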

#### Stablecode models

`code` versions of stablecode models.

- [`stable-code:3b-code`](https://ollama.com/library/stable-code:3b-code)

#### Codegemma models

`code` versions of codegemma models.

- [`codegemma`](https://ollama.com/library/codegemma:7b-code)

Note: CodeGemma doesn't always stop when it is finished. Lowering the temperature and increasing the repeat penalty helps with this issue, in the same way as for Starcoder2 above.
