feat: adds initial support for gpt-4-vison (#184)
* feat: adds support for gpt-4-vison

* chore: update gitleaksignore

* chore: mark gpt-4-vision as beta
tbckr authored Dec 4, 2023
1 parent 29f969c commit 09b809a
Showing 16 changed files with 632 additions and 83 deletions.
3 changes: 3 additions & 0 deletions .bash_aliases
@@ -21,3 +21,6 @@ gsum() {
echo "Commit cancelled."
fi
}

# Create an alias for access to the GPT-4 Vision API
alias vision='sgpt -m "gpt-4-vision-preview"'
1 change: 1 addition & 0 deletions .gitleaksignore
@@ -0,0 +1 @@
f1cf67714d70ef1087887abd008d80cfd2483a9b:pkg/fs/fs_test.go:aws-access-token:92
32 changes: 32 additions & 0 deletions README.md
@@ -171,6 +171,8 @@ To use the OpenAI API, you must first obtain an API key.
After completing these steps, you'll have an OpenAI API key that can be used to interact with the OpenAI models through
the SGPT tool.

**Note:** Your API key is sensitive information. Do not share it with anyone.

### Querying OpenAI Models

SGPT allows you to ask simple questions and receive informative answers. For example:
@@ -190,6 +192,36 @@ The mass of the sun is approximately 1.989 x 10^30 kilograms.
If you want to stream the completion to the command line, you can add the `--stream` flag. This will stream the output
to the command line as it is generated.

## GPT-4 Vision API

SGPT also supports the GPT-4 Vision API. Pass input images with the `-i` or `--input` flag, which accepts both URLs
and local image files.

```shell
$ sgpt -m "gpt-4-vision-preview" -i "https://upload.wikimedia.org/wikipedia/en/c/cb/Marvin_%28HHGG%29.jpg" "what can you see on the picture?"
The image shows a figure resembling a robot with a humanoid form. It has a
$ sgpt -m "gpt-4-vision-preview" -i pkg/fs/testdata/marvin.jpg "what can you see on the picture?"
The image shows a figure resembling a robot with a sleek, metallic surface. It
```

It is also possible to combine URLs and local images:

```shell
$ sgpt -m "gpt-4-vision-preview" -i "https://upload.wikimedia.org/wikipedia/en/c/cb/Marvin_%28HHGG%29.jpg" -i pkg/fs/testdata/marvin.jpg "what is the difference between those two pictures"
The two images provided appear to be identical. Both show the same depiction of a
```

To avoid passing `-m "gpt-4-vision-preview"` with every request, you can create a bash alias:

```shell
alias vision='sgpt -m "gpt-4-vision-preview"'
```

For more bash examples, see [.bash_aliases](https://github.com/tbckr/sgpt/blob/main/.bash_aliases).

**Important:** The GPT-4-vision API integration is currently in beta and may change in the future.

### Chat Capabilities

SGPT provides chat functionality that enables interactive conversations with OpenAI models. You can use the `--chat`
32 changes: 32 additions & 0 deletions docs/getting-started.md
@@ -20,6 +20,8 @@ export OPENAI_API_KEY="sk-..."
After completing these steps, you'll have an OpenAI API key that can be used to interact with the OpenAI models through
the SGPT tool.

**Note:** Your API key is sensitive information. Do not share it with anyone.

## Querying OpenAI Models

SGPT allows you to ask simple questions and receive informative answers. For example:
@@ -39,6 +41,36 @@ The mass of the sun is approximately 1.989 x 10^30 kilograms.
If you want to stream the completion to the command line, you can add the `--stream` flag. This will stream the output
to the command line as it is generated.

## GPT-4 Vision API

SGPT also supports the GPT-4 Vision API. Pass input images with the `-i` or `--input` flag, which accepts both URLs
and local image files.

```shell
$ sgpt -m "gpt-4-vision-preview" -i "https://upload.wikimedia.org/wikipedia/en/c/cb/Marvin_%28HHGG%29.jpg" "what can you see on the picture?"
The image shows a figure resembling a robot with a humanoid form. It has a
$ sgpt -m "gpt-4-vision-preview" -i pkg/fs/testdata/marvin.jpg "what can you see on the picture?"
The image shows a figure resembling a robot with a sleek, metallic surface. It
```

It is also possible to combine URLs and local images:

```shell
$ sgpt -m "gpt-4-vision-preview" -i "https://upload.wikimedia.org/wikipedia/en/c/cb/Marvin_%28HHGG%29.jpg" -i pkg/fs/testdata/marvin.jpg "what is the difference between those two pictures"
The two images provided appear to be identical. Both show the same depiction of a
```

To avoid passing `-m "gpt-4-vision-preview"` with every request, you can create a bash alias:

```shell
alias vision='sgpt -m "gpt-4-vision-preview"'
```

For more bash examples, see [.bash_aliases](https://github.com/tbckr/sgpt/blob/main/.bash_aliases).

**Important:** The GPT-4-vision API integration is currently in beta and may change in the future.

## Code Generation Capabilities

By adding the `code` command to your prompt, you can generate code based on given instructions by using the
29 changes: 29 additions & 0 deletions docs/usage/gpt4-vision-api.md
@@ -0,0 +1,29 @@
# GPT-4 Vision API

SGPT also supports the GPT-4 Vision API. Pass input images with the `-i` or `--input` flag, which accepts both URLs
and local image files.

```shell
$ sgpt -m "gpt-4-vision-preview" -i "https://upload.wikimedia.org/wikipedia/en/c/cb/Marvin_%28HHGG%29.jpg" "what can you see on the picture?"
The image shows a figure resembling a robot with a humanoid form. It has a
$ sgpt -m "gpt-4-vision-preview" -i pkg/fs/testdata/marvin.jpg "what can you see on the picture?"
The image shows a figure resembling a robot with a sleek, metallic surface. It
```
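
Because `-i` takes either a URL or a local path, a wrapper script may want to check an input before sending a request. The following is a minimal sketch of such a pre-flight check, not a description of how SGPT itself tells the two apart; `classify_image_input` is a hypothetical helper name, and treating only `http://` and `https://` prefixes as URLs is an assumption.

```shell
# classify_image_input VALUE (hypothetical helper, not part of SGPT)
# Prints "url" for http(s) URLs, "file" for an existing regular file,
# and "missing" (with exit status 1) for anything else.
classify_image_input() {
    case $1 in
        http://*|https://*)
            echo "url"
            ;;
        *)
            if [ -f "$1" ]; then
                echo "file"
            else
                echo "missing"
                return 1
            fi
            ;;
    esac
}
```

A script could call `classify_image_input "$img" >/dev/null || exit 1` before invoking `sgpt`, so that a mistyped local path fails early with a clear status.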

It is also possible to combine URLs and local images:

```shell
$ sgpt -m "gpt-4-vision-preview" -i "https://upload.wikimedia.org/wikipedia/en/c/cb/Marvin_%28HHGG%29.jpg" -i pkg/fs/testdata/marvin.jpg "what is the difference between those two pictures"
The two images provided appear to be identical. Both show the same depiction of a
```

To avoid passing `-m "gpt-4-vision-preview"` with every request, you can create a bash alias:

```shell
alias vision='sgpt -m "gpt-4-vision-preview"'
```

For more bash examples, see [.bash_aliases](https://github.com/tbckr/sgpt/blob/main/.bash_aliases).

**Important:** The GPT-4-vision API integration is currently in beta and may change in the future.
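
An alias only prepends fixed text to the command. A shell function can go one step further and expand a list of images into repeated `-i` flags. The sketch below assumes `-i` may be given several times, as in the combined example above; `vision_compare` is a hypothetical name and is not shipped with SGPT.

```shell
# vision_compare PROMPT IMAGE... (hypothetical helper, not part of SGPT)
# Runs sgpt with the vision model and passes every IMAGE as its own -i flag.
# Note: POSIX sh has no `local`, so `prompt` and `img` are global variables.
vision_compare() {
    if [ "$#" -lt 2 ]; then
        echo "usage: vision_compare PROMPT IMAGE..." >&2
        return 2
    fi
    prompt=$1
    shift
    # Rewrite the argument list in place: append "-i IMAGE" for each image
    # and drop the original entry from the front.
    for img in "$@"; do
        set -- "$@" -i "$img"
        shift
    done
    sgpt -m "gpt-4-vision-preview" "$@" "$prompt"
}
```

With the function defined, `vision_compare "what is the difference between those two pictures" first.jpg second.jpg` expands to the same kind of command as the combined example above, with one `-i` per image.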
2 changes: 1 addition & 1 deletion go.mod
@@ -7,7 +7,7 @@ require (
github.com/jarcoal/httpmock v1.3.1
github.com/muesli/mango-cobra v1.2.0
github.com/muesli/roff v0.1.0
-github.com/sashabaranov/go-openai v1.17.9
+github.com/sashabaranov/go-openai v1.17.10-0.20231126084528-a09cb0c528c1
github.com/spf13/cobra v1.8.0
github.com/spf13/viper v1.17.0
github.com/stretchr/testify v1.8.4
4 changes: 2 additions & 2 deletions go.sum
@@ -173,8 +173,8 @@ github.com/sagikazarmark/locafero v0.3.0 h1:zT7VEGWC2DTflmccN/5T1etyKvxSxpHsjb9c
github.com/sagikazarmark/locafero v0.3.0/go.mod h1:w+v7UsPNFwzF1cHuOajOOzoq4U7v/ig1mpRjqV+Bu1U=
github.com/sagikazarmark/slog-shim v0.1.0 h1:diDBnUNK9N/354PgrxMywXnAwEr1QZcOr6gto+ugjYE=
github.com/sagikazarmark/slog-shim v0.1.0/go.mod h1:SrcSrq8aKtyuqEI1uvTDTK1arOWRIczQRv+GVI1AkeQ=
-github.com/sashabaranov/go-openai v1.17.9 h1:QEoBiGKWW68W79YIfXWEFZ7l5cEgZBV4/Ow3uy+5hNY=
-github.com/sashabaranov/go-openai v1.17.9/go.mod h1:lj5b/K+zjTSFxVLijLSTDZuP7adOgerWeFyZLUhAKRg=
+github.com/sashabaranov/go-openai v1.17.10-0.20231126084528-a09cb0c528c1 h1:F07vUreAjYQtO6Ny9czwt08RlQyOOoJdYpwxSkloDZI=
+github.com/sashabaranov/go-openai v1.17.10-0.20231126084528-a09cb0c528c1/go.mod h1:lj5b/K+zjTSFxVLijLSTDZuP7adOgerWeFyZLUhAKRg=
github.com/sourcegraph/conc v0.3.0 h1:OQTbbt6P72L20UqAkXXuLOj79LfEanQ+YQFNpLA9ySo=
github.com/sourcegraph/conc v0.3.0/go.mod h1:Sdozi7LEKbFPqYX2/J+iBAM6HpqSLTASQIKqDmF7Mt0=
github.com/spf13/afero v1.10.0 h1:EaGW2JJh15aKOejeuJ+wpFSHnbd7GE6Wvp3TsNhb6LY=
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -69,6 +69,7 @@ nav:
- Getting Started: 'getting-started.md'
- Usage Guide:
- Query Models: 'usage/query-models.md'
- GPT-4 Vision API: 'usage/gpt4-vision-api.md'
- Chat: 'usage/chat.md'
- Docker: 'usage/docker.md'
- Personas: 'usage/personas.md'