[test][example] GPT-4V examples #1540

Closed
wants to merge 36 commits into from
36 commits
691f7f7
Add bugbash sample flows
guming-learning Nov 15, 2023
175466f
rename eval flow
guming-learning Nov 15, 2023
18ef49d
Add image passthrough flow
guming-learning Nov 16, 2023
0773c72
remove evaluation sample
Dec 2, 2023
43107cc
Remove on-disk image to avoid copyright issue
Dec 2, 2023
02d670c
Add default input
Dec 2, 2023
709b712
Add image type to flow input
Dec 2, 2023
11fdc7d
Add workflow for chat_with_image
Dec 3, 2023
2056946
Remove image_pass_through flow
Dec 3, 2023
b86b49c
Add doc for describe image sample flow
Dec 6, 2023
8708ddc
Add doc for image
Dec 6, 2023
3919a07
enrich guide
Dec 7, 2023
10a4d7b
Rename to "process-image-in-flow"
guming-learning Dec 7, 2023
f1e198f
Update flow readme
guming-learning Dec 7, 2023
d803c21
Polish the doc
guming-learning Dec 7, 2023
6482967
Add workflow for describe-image flow
guming-learning Dec 7, 2023
3347d93
Fix some comments
guming-learning Dec 8, 2023
e321013
fix typo
guming-learning Dec 18, 2023
a1b0e55
Add process-image-in-flow to index
guming-learning Dec 18, 2023
de7c892
Fix typo
guming-learning Dec 18, 2023
b060922
Update connection to openai_connection
guming-learning Dec 18, 2023
36deae7
Flip image horizontal instead of pass through
guming-learning Dec 18, 2023
37a7fe7
Update connection name
guming-learning Dec 18, 2023
7936b44
Fix warning in CI
guming-learning Dec 18, 2023
31ffe23
Add missing requirements.txt file
guming-learning Dec 19, 2023
d9cd6df
gpt_four connections
crazygao Dec 19, 2023
57287e2
Fix connection
crazygao Dec 19, 2023
8f84c7e
flake8 fix
crazygao Dec 19, 2023
7a1bf21
Fix chat with image
crazygao Dec 19, 2023
9339ddd
Fix Connection base yaml
crazygao Dec 19, 2023
0984a3e
Update chat with image
guming-learning Dec 20, 2023
4ed09fc
Add tool requirement
guming-learning Dec 20, 2023
a5c48bd
Merge branch 'main' into zhengfei/image-bug-bash
zhengfeiwang Dec 20, 2023
8ee6203
fix: wrong deployment name
zhengfeiwang Dec 20, 2023
f02f535
fix: make .env as a YAML
zhengfeiwang Dec 20, 2023
fa0f995
use YAML suffix
zhengfeiwang Dec 20, 2023
115 changes: 115 additions & 0 deletions .github/workflows/samples_flows_chat_chat_with_image.yml
@@ -0,0 +1,115 @@
# This code is autogenerated.
# Code is generated by running custom script: python3 readme.py
# Any manual changes to this file may cause incorrect behavior.
# Any manual changes will be overwritten if the code is regenerated.

name: samples_flows_chat_chat_with_image
on:
  schedule:
    - cron: "16 19 * * *" # Every day starting at 3:16 BJT
  pull_request:
    branches: [ main ]
    paths: [ examples/flows/chat/chat-with-image/**, examples/*requirements.txt, .github/workflows/samples_flows_chat_chat_with_image.yml ]
  workflow_dispatch:

env:
  IS_IN_CI_PIPELINE: "true"

jobs:
  samples_readme_ci:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout repository
      uses: actions/checkout@v4
    - name: Setup Python 3.9 environment
      uses: actions/setup-python@v4
      with:
        python-version: "3.9"
    - name: Generate config.json for canary workspace (scheduled runs only)
      if: github.event_name == 'schedule'
      run: echo '${{ secrets.TEST_WORKSPACE_CONFIG_JSON_CANARY }}' > ${{ github.workspace }}/examples/config.json
    - name: Generate config.json for production workspace
      if: github.event_name != 'schedule'
      run: echo '${{ secrets.EXAMPLE_WORKSPACE_CONFIG_JSON_PROD }}' > ${{ github.workspace }}/examples/config.json
    - name: Prepare requirements
      working-directory: examples
      run: |
        if [[ -e requirements.txt ]]; then
          python -m pip install --upgrade pip
          pip install -r requirements.txt
        fi
    - name: Prepare dev requirements
      working-directory: examples
      run: |
        python -m pip install --upgrade pip
        pip install -r dev_requirements.txt
    - name: Refine .env file
      working-directory: examples/flows/chat/chat-with-image
      run: |
        AOAI_API_KEY=${{ secrets.AOAI_GPT_4V_KEY }}
        AOAI_API_ENDPOINT=${{ secrets.AOAI_GPT_4V_ENDPOINT }}
        AOAI_API_ENDPOINT=$(echo ${AOAI_API_ENDPOINT//\//\\/})
        if [[ -e .env.example ]]; then
          echo "env replacement"
          sed -i -e "s/<your_AOAI_key>/$AOAI_API_KEY/g" -e "s/<your_AOAI_endpoint>/$AOAI_API_ENDPOINT/g" .env.example
          mv .env.example .env
        fi
    - name: Create AOAI Connection from ENV file
      working-directory: examples/flows/chat/chat-with-image
      run: |
        if [[ -e .env ]]; then
          pf connection create --file .env --name aoai_gpt4v_connection
          pf connection list
        fi

    - name: Create run.yml
      working-directory: examples/flows/chat/chat-with-image
      run: |
        gpt_base=${{ secrets.AOAI_API_ENDPOINT_TEST }}
        gpt_base=$(echo ${gpt_base//\//\\/})
        if [[ -e run.yml ]]; then
          sed -i -e "s/\${azure_open_ai_connection.api_key}/${{ secrets.AOAI_API_KEY_TEST }}/g" -e "s/\${azure_open_ai_connection.api_base}/$gpt_base/g" run.yml
        fi
    - name: Azure Login
      uses: azure/login@v1
      with:
        creds: ${{ secrets.AZURE_CREDENTIALS }}
    - name: Extract Steps examples/flows/chat/chat-with-image/README.md
      working-directory: ${{ github.workspace }}
      run: |
        python scripts/readme/extract_steps_from_readme.py -f examples/flows/chat/chat-with-image/README.md -o examples/flows/chat/chat-with-image
    - name: Cat script
      working-directory: examples/flows/chat/chat-with-image
      run: |
        cat bash_script.sh
    - name: Run scripts against canary workspace (scheduled runs only)
      if: github.event_name == 'schedule'
      working-directory: examples/flows/chat/chat-with-image
      run: |
        export aoai_api_key=${{secrets.AOAI_GPT_4V_KEY }}
        export aoai_api_endpoint=${{ secrets.AOAI_GPT_4V_ENDPOINT }}
        export test_workspace_sub_id=${{ secrets.TEST_WORKSPACE_SUB_ID }}
        export test_workspace_rg=${{ secrets.TEST_WORKSPACE_RG }}
        export test_workspace_name=${{ secrets.TEST_WORKSPACE_NAME_CANARY }}
        bash bash_script.sh
    - name: Run scripts against production workspace
      if: github.event_name != 'schedule'
      working-directory: examples/flows/chat/chat-with-image
      run: |
        export aoai_api_key=${{secrets.AOAI_GPT_4V_KEY }}
        export aoai_api_endpoint=${{ secrets.AOAI_GPT_4V_ENDPOINT }}
        export test_workspace_sub_id=${{ secrets.TEST_WORKSPACE_SUB_ID }}
        export test_workspace_rg=${{ secrets.TEST_WORKSPACE_RG }}
        export test_workspace_name=${{ secrets.TEST_WORKSPACE_NAME_PROD }}
        bash bash_script.sh
    - name: Pip List for Debug
      if : ${{ always() }}
      working-directory: examples/flows/chat/chat-with-image
      run: |
        pip list
    - name: Upload artifact
      if: ${{ always() }}
      uses: actions/upload-artifact@v3
      with:
        name: artifact
        path: examples/flows/chat/chat-with-image/bash_script.sh
107 changes: 107 additions & 0 deletions .github/workflows/samples_flows_standard_describe_image.yml
@@ -0,0 +1,107 @@
# This code is autogenerated.
# Code is generated by running custom script: python3 readme.py
# Any manual changes to this file may cause incorrect behavior.
# Any manual changes will be overwritten if the code is regenerated.

name: samples_flows_standard_describe_image
on:
  schedule:
    - cron: "9 19 * * *" # Every day starting at 3:9 BJT
  pull_request:
    branches: [ main ]
    paths: [ examples/flows/standard/describe-image/**, examples/*requirements.txt, .github/workflows/samples_flows_standard_describe_image.yml ]
  workflow_dispatch:

env:
  IS_IN_CI_PIPELINE: "true"

jobs:
  samples_readme_ci:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout repository
      uses: actions/checkout@v4
    - name: Setup Python 3.9 environment
      uses: actions/setup-python@v4
      with:
        python-version: "3.9"
    - name: Prepare requirements
      working-directory: examples
      run: |
        if [[ -e requirements.txt ]]; then
          python -m pip install --upgrade pip
          pip install -r requirements.txt
        fi
    - name: Prepare dev requirements
      working-directory: examples
      run: |
        python -m pip install --upgrade pip
        pip install -r dev_requirements.txt
    - name: Refine .env file
      working-directory: examples/flows/standard/describe-image
      run: |
        AOAI_API_KEY=${{ secrets.AOAI_GPT_4V_KEY }}
        AOAI_API_ENDPOINT=${{ secrets.AOAI_GPT_4V_ENDPOINT }}
        AOAI_API_ENDPOINT=$(echo ${AOAI_API_ENDPOINT//\//\\/})
        if [[ -e .env.example ]]; then
          echo "env replacement"
          sed -i -e "s/<your_AOAI_key>/$AOAI_API_KEY/g" -e "s/<your_AOAI_endpoint>/$AOAI_API_ENDPOINT/g" .env.example
          mv .env.example gpt4v.yaml
        fi
    - name: Create AOAI Connection from ENV file
      working-directory: examples/flows/standard/describe-image
      run: |
        pf connection create --file gpt4v.yaml --name aoai_gpt4v_connection
        pf connection list

    - name: Create run.yml
      working-directory: examples/flows/standard/describe-image
      run: |
        gpt_base=${{ secrets.AOAI_API_ENDPOINT_TEST }}
        gpt_base=$(echo ${gpt_base//\//\\/})
        if [[ -e run.yml ]]; then
          sed -i -e "s/\${azure_open_ai_connection.api_key}/${{ secrets.AOAI_API_KEY_TEST }}/g" -e "s/\${azure_open_ai_connection.api_base}/$gpt_base/g" run.yml
        fi
    - name: Azure Login
      uses: azure/login@v1
      with:
        creds: ${{ secrets.AZURE_CREDENTIALS }}
    - name: Extract Steps examples/flows/standard/describe-image/README.md
      working-directory: ${{ github.workspace }}
      run: |
        python scripts/readme/extract_steps_from_readme.py -f examples/flows/standard/describe-image/README.md -o examples/flows/standard/describe-image
    - name: Cat script
      working-directory: examples/flows/standard/describe-image
      run: |
        cat bash_script.sh
    - name: Run scripts against canary workspace (scheduled runs only)
      if: github.event_name == 'schedule'
      working-directory: examples/flows/standard/describe-image
      run: |
        export aoai_api_key=${{secrets.AOAI_GPT_4V_KEY }}
        export aoai_api_endpoint=${{ secrets.AOAI_GPT_4V_ENDPOINT }}
        export test_workspace_sub_id=${{ secrets.TEST_WORKSPACE_SUB_ID }}
        export test_workspace_rg=${{ secrets.TEST_WORKSPACE_RG }}
        export test_workspace_name=${{ secrets.TEST_WORKSPACE_NAME_CANARY }}
        bash bash_script.sh
    - name: Run scripts against production workspace
      if: github.event_name != 'schedule'
      working-directory: examples/flows/standard/describe-image
      run: |
        export aoai_api_key=${{secrets.AOAI_GPT_4V_KEY }}
        export aoai_api_endpoint=${{ secrets.AOAI_GPT_4V_ENDPOINT }}
        export test_workspace_sub_id=${{ secrets.TEST_WORKSPACE_SUB_ID }}
        export test_workspace_rg=${{ secrets.TEST_WORKSPACE_RG }}
        export test_workspace_name=${{ secrets.TEST_WORKSPACE_NAME_PROD }}
        bash bash_script.sh
    - name: Pip List for Debug
      if : ${{ always() }}
      working-directory: examples/flows/standard/describe-image
      run: |
        pip list
    - name: Upload artifact
      if: ${{ always() }}
      uses: actions/upload-artifact@v3
      with:
        name: artifact
        path: examples/flows/standard/describe-image/bash_script.sh
1 change: 1 addition & 0 deletions docs/how-to-guides/index.md
@@ -17,5 +17,6 @@ manage-connections
manage-runs
set-global-configs
develop-a-tool/index
process-image-in-flow
faq
```
58 changes: 58 additions & 0 deletions docs/how-to-guides/process-image-in-flow.md
@@ -0,0 +1,58 @@
# Process image in flow
PromptFlow defines a contract to represent image data.

## Data class
`promptflow.contracts.multimedia.Image`
`Image` is a subclass of `bytes`, so you can access the binary data by using the object directly. It has an extra attribute, `source_url`, that stores the original URL of the image, which is useful if you want to pass the URL instead of the image content to APIs such as the GPT-4V model.
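
The `bytes`-subclass pattern described above can be sketched without promptflow installed. The class below, `SketchImage`, is a hypothetical stand-in written for illustration only; it is not the real `promptflow.contracts.multimedia.Image`, whose constructor may differ, and the URL is a placeholder.

```python
from typing import Optional


class SketchImage(bytes):
    """Hypothetical stand-in for an image type that subclasses bytes."""

    source_url: Optional[str]

    def __new__(cls, value: bytes, source_url: Optional[str] = None):
        # bytes is immutable, so the payload has to be set in __new__.
        obj = super().__new__(cls, value)
        obj.source_url = source_url
        return obj


PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file
image = SketchImage(PNG_SIGNATURE + b"...", source_url="https://example.com/logo.png")

# The object itself is the binary data, so slicing and len() work directly.
header = image[:8]
print(len(image), header == PNG_SIGNATURE, image.source_url)
```

Because the object is its own payload, it can be handed to anything that accepts `bytes`, while code that prefers a link can check `source_url` first.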

## Data type in flow input
Set the type of flow input to `image` and promptflow will treat it as an image.

## Reference image in prompt template
In prompt templates that support images (e.g. in the OpenAI GPT-4V tool), use markdown syntax to denote that a template input is an image: `![image]({{test_image}})`. In this case, `test_image` is substituted with its base64 encoding or its `source_url` (if set) before the prompt is sent to the LLM.
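
The substitution rule can be illustrated with a small, self-contained sketch. The function `render_image_refs` and the data-URI output format are assumptions made for this example; the guide does not show the exact payload that promptflow sends to the model.

```python
import base64
import re

# Matches markdown image references such as ![image]({{test_image}}).
IMAGE_REF = re.compile(r"!\[image\]\(\{\{\s*(\w+)\s*\}\}\)")


def render_image_refs(template: str, images: dict) -> str:
    """Replace each image reference with a URL or a base64 data URI.

    `images` maps an input name to a (data, mime_type, source_url) tuple.
    Hypothetical helper: prefers source_url when it is set.
    """

    def substitute(match):
        data, mime_type, source_url = images[match.group(1)]
        if source_url:
            target = source_url
        else:
            encoded = base64.b64encode(data).decode("ascii")
            target = f"data:{mime_type};base64,{encoded}"
        return f"![image]({target})"

    return IMAGE_REF.sub(substitute, template)


template = "Describe this picture: ![image]({{test_image}})"
as_base64 = render_image_refs(template, {"test_image": (b"abc", "image/png", None)})
as_url = render_image_refs(
    template, {"test_image": (b"abc", "image/png", "https://example.com/a.png")}
)
print(as_base64)
print(as_url)
```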

## Serialization/Deserialization
Promptflow uses a special dict to represent an image:
`{"data:image/<mime-type>;<representation>": "<value>"}`

- `<mime-type>` can be any standard [MIME](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types) image type. Setting it to a specific type helps the image preview correctly; use `*` if the type is unknown.
- `<representation>` is the serialized representation of the image. Three representations are supported:

- url

It can point to a publicly accessible web URL. E.g.

{"data:image/png;url": "https://developer.microsoft.com/_devcom/images/logo-ms-social.png"}
- base64

It can be the base64 encoding of the image. E.g.

{"data:image/png;base64": "iVBORw0KGgoAAAANSUhEUgAAAGQAAABLAQMAAAC81rD0AAAABGdBTUEAALGPC/xhBQAAACBjSFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAABlBMVEUAAP7////DYP5JAAAAAWJLR0QB/wIt3gAAAAlwSFlzAAALEgAACxIB0t1+/AAAAAd0SU1FB+QIGBcKN7/nP/UAAAASSURBVDjLY2AYBaNgFIwCdAAABBoAAaNglfsAAAAZdEVYdGNvbW1lbnQAQ3JlYXRlZCB3aXRoIEdJTVDnr0DLAAAAJXRFWHRkYXRlOmNyZWF0ZQAyMDIwLTA4LTI0VDIzOjEwOjU1KzAzOjAwkHdeuQAAACV0RVh0ZGF0ZTptb2RpZnkAMjAyMC0wOC0yNFQyMzoxMDo1NSswMzowMOEq5gUAAAAASUVORK5CYII="}

- path

It can reference an image file on local disk. Both absolute and relative paths are supported. When the serialized image representation is itself stored in a file, as with flow IO data, a path relative to the folder containing that file is recommended. E.g.

{"data:image/png;path": "./my-image.png"}

Note that the `path` representation is not supported in deployment scenarios.
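
As a rough illustration of the dict format above, the following sketch builds a `base64` image dict from raw bytes and parses any of the three key forms. The helper names `image_dict_from_bytes` and `parse_image_dict` are hypothetical and are not part of promptflow.

```python
import base64
import re

# Key layout taken from the guide: data:image/<mime-type>;<representation>
KEY_PATTERN = re.compile(r"^data:image/(?P<mime>[^;]+);(?P<rep>url|base64|path)$")


def image_dict_from_bytes(data: bytes, mime_type: str = "*") -> dict:
    """Serialize raw image bytes using the base64 representation."""
    encoded = base64.b64encode(data).decode("ascii")
    return {f"data:image/{mime_type};base64": encoded}


def parse_image_dict(value: dict) -> tuple:
    """Return (mime_type, representation, payload) for a serialized image."""
    if len(value) != 1:
        raise ValueError("an image dict must contain exactly one key")
    key, payload = next(iter(value.items()))
    match = KEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"not a serialized image key: {key!r}")
    return match["mime"], match["rep"], payload


serialized = image_dict_from_bytes(b"abc", "png")
parsed = parse_image_dict({"data:image/png;path": "./my-image.png"})
print(serialized)
print(parsed)
```

Rejecting anything other than a single well-formed key keeps ordinary dicts in flow data from being mistaken for images.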

## Batch Input data
Batch input data containing images can be in one of two formats:
1. The same jsonl format as regular batch input, except that some columns may be serialized image data or a composite data type (dict/list) containing images. The serialized images can only use the `url` or `base64` representation. E.g.
```json
{"question": "How many colors are there in the image?", "input_image": {"data:image/png;url": "https://developer.microsoft.com/_devcom/images/logo-ms-social.png"}}
{"question": "What's this image about?", "input_image": {"data:image/png;url": "https://developer.microsoft.com/_devcom/images/404.png"}}
```
2. A folder containing a jsonl file under its root path, in which images are serialized in the `path` (file reference) format. The referenced files are stored in the folder, and their paths relative to the root path are used as the paths in the file references. Here is a sample batch input; note that the name `input.jsonl` is arbitrary as long as it is a jsonl file:
```
BatchInputFolder
|----input.jsonl
|----image1.png
|----image2.png
```
Content of `input.jsonl`
```json
{"question": "How many colors are there in the image?", "input_image": {"data:image/png;path": "image1.png"}}
{"question": "What's this image about?", "input_image": {"data:image/png;path": "image2.png"}}
```
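
A folder in the second format could be generated with a script along these lines. This is a minimal sketch: the helper `write_batch_folder` is hypothetical, the image bytes are placeholders rather than real PNG data, and the column names simply mirror the sample above.

```python
import json
import tempfile
from pathlib import Path


def write_batch_folder(root: Path, rows: list, mime_type: str = "png") -> Path:
    """Write images plus an input.jsonl that references them by relative path.

    Each row is a (question, image_file_name, image_bytes) tuple.
    """
    root.mkdir(parents=True, exist_ok=True)
    jsonl_path = root / "input.jsonl"
    with jsonl_path.open("w", encoding="utf-8") as handle:
        for question, image_name, image_bytes in rows:
            # Images live next to the jsonl file, so the bare name is the relative path.
            (root / image_name).write_bytes(image_bytes)
            record = {
                "question": question,
                "input_image": {f"data:image/{mime_type};path": image_name},
            }
            handle.write(json.dumps(record) + "\n")
    return jsonl_path


rows = [
    ("How many colors are there in the image?", "image1.png", b"placeholder 1"),
    ("What's this image about?", "image2.png", b"placeholder 2"),
]
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "BatchInputFolder"
    jsonl_file = write_batch_folder(folder, rows)
    lines = jsonl_file.read_text(encoding="utf-8").splitlines()
    file_names = sorted(p.name for p in folder.iterdir())
print(file_names)
```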
1 change: 1 addition & 0 deletions docs/index.md
Expand Up @@ -45,6 +45,7 @@ This documentation site contains guides for prompt flow [sdk, cli](https://pypi.
- [Tune prompts using variants](how-to-guides/tune-prompts-with-variants.md)<br/>
- [Develop custom tool](how-to-guides/develop-a-tool/create-and-use-tool-package.md)<br/>
- [Deploy a flow](how-to-guides/deploy-a-flow/index.md)<br/>
- [Process image in flow](how-to-guides/process-image-in-flow.md)
"
```

4 changes: 2 additions & 2 deletions docs/reference/flow-yaml-schema-reference.md
Expand Up @@ -25,9 +25,9 @@ The source JSON schema can be found at [Flow.schema.json](https://azuremlschemas

| Key | Type | Description | Allowed values |
|-------------------|-------------------------------------------|------------------------------------------------------|-----------------------------------------------------|
| `type` | string | The type of flow input. | `int`, `double`, `bool`, `string`, `list`, `object` |
| `type` | string | The type of flow input. | `int`, `double`, `bool`, `string`, `list`, `object`, `image` |
| `description` | string | Description of the input. | |
| `default` | int, double, bool, string, list or object | The default value for the input. | |
| `default` | int, double, bool, string, list, object, image | The default value for the input. | |
| `is_chat_input` | boolean | Whether the input is the chat flow input. | |
| `is_chat_history` | boolean | Whether the input is the chat history for chat flow. | |
