Move flow folder, fix tool warning and fix progress bar #2520

Merged: 8 commits, Mar 28, 2024
12 changes: 6 additions & 6 deletions docs/cloud/azureai/generate-test-data-cloud.md
@@ -4,10 +4,10 @@ This guide will help you learn how to generate test data on Azure AI, so that yo

## Prerequisites

1. Go through [local test data generation guide](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/docs/how-to-guides/generate-test-data.md) and prepare your [test data generation flow](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data/gen_test_data/generate_test_data_flow/).
2. Go to the [example_gen_test_data](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data) folder and run command `pip install -r requirements_cloud.txt` to prepare local environment.
1. Go through [local test data generation guide](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/docs/how-to-guides/generate-test-data.md) and prepare your [test data generation flow](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data/example_flow/).
2. Go to the [example_gen_test_data](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data) folder and run command `pip install -r requirements_cloud.txt` to prepare local environment.
3. Prepare cloud environment.
- Navigate to file [conda.yml](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data/conda.yml).
- Navigate to file [conda.yml](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data/conda.yml).
- For specific document file types, you may need to install extra packages:
- .docx - `pip install docx2txt`
- .pdf - `pip install pypdf`
@@ -20,8 +20,8 @@ This guide will help you learn how to generate test data on Azure AI, so that yo
5. [Create cloud connection](https://microsoft.github.io/promptflow/cloud/azureai/quick-start/index.html#create-necessary-connections)

6. Prepare config.yml
- Navigate to [example_gen_test_data](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data) folder.
- Run command to copy [`config.yml.example`](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data/config.yml.example).
- Navigate to [example_gen_test_data](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data) folder.
- Run command to copy [`config.yml.example`](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data/config.yml.example).
```
cp config.yml.example config.yml
```
@@ -30,7 +30,7 @@ This guide will help you learn how to generate test data on Azure AI, so that yo

## Generate test data at cloud
To handle larger volumes of test data, you can leverage the PRS component to run the flow in the cloud.
- Navigate to [example_gen_test_data](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data) folder.
- Navigate to [example_gen_test_data](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data) folder.
- After configuration, run the following command to generate the test data set:
```bash
python -m gen_test_data.run --cloud
22 changes: 11 additions & 11 deletions docs/how-to-guides/generate-test-data.md
@@ -18,7 +18,7 @@ By leveraging the capabilities of llm, this guide streamlines the test data gene
- The test data generator may not function effectively for non-Latin characters, such as Chinese, in certain document types. The limitation is caused by dependent text loader capabilities, such as `pypdf`.
- The test data generator may not generate meaningful questions if the document is not well-organized or contains massive code snippets/links, such as API introduction documents or reference documents.

2. Prepare local environment. Go to [example_gen_test_data](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data) folder and install required packages.
2. Prepare local environment. Go to [example_gen_test_data](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data) folder and install required packages.

```bash
pip install -r requirements.txt
@@ -35,8 +35,8 @@ By leveraging the capabilities of llm, this guide streamlines the test data gene
4. Create your AzureOpenAI or OpenAI connection by following [this doc](https://microsoft.github.io/promptflow/how-to-guides/manage-connections.html#create-a-connection).

5. Prepare test data generation setting.
- Navigate to [example_gen_test_data](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data) folder.
- Run command to copy [`config.yml.example`](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data/config.yml.example).
- Navigate to [example_gen_test_data](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data) folder.
- Run command to copy [`config.yml.example`](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data/config.yml.example).
```
cp config.yml.example config.yml
```
@@ -47,7 +47,7 @@ By leveraging the capabilities of llm, this guide streamlines the test data gene


## Create a test data generation flow
- Open the [sample test data generation flow](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data/gen_test_data/generate_test_data_flow/) in "Prompt flow" VSCode Extension. This flow is designed to generate a pair of question and suggested answer based on the given text chunk. The flow also includes validation prompts to ensure the quality of the generated test data.
- Open the [sample test data generation flow](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data/example_flow/) in "Prompt flow" VSCode Extension. This flow is designed to generate a pair of question and suggested answer based on the given text chunk. The flow also includes validation prompts to ensure the quality of the generated test data.
- Fill in node inputs including `connection`, `model` or `deployment_name`, `response_format`, `score_threshold` or other parameters. Click run button to test the flow in VSCode Extension by referring to [Test flow with VS Code Extension](https://microsoft.github.io/promptflow/how-to-guides/init-and-test-a-flow.html#visual-editor-on-the-vs-code-for-prompt-flow).

> !Note: We recommend using `gpt-4` series models rather than `gpt-3.5` for better performance.
@@ -60,17 +60,17 @@ By leveraging the capabilities of llm, this guide streamlines the test data gene

The test data generation flow contains 5 prompts, classified into two categories based on their roles: generation prompts and validation prompts. Generation prompts are used to create questions, suggested answers, etc., while validation prompts are used to verify the validity of the text chunk, generated question or answer.
- Generation prompts
- [*generate question prompt*](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data/gen_test_data/generate_test_data_flow/generate_question_prompt.jinja2): frame a question based on the given text chunk.
- [*generate suggested answer prompt*](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data/gen_test_data/generate_test_data_flow/generate_suggested_answer_prompt.jinja2): generate suggested answer for the question based on the given text chunk.
- [*generate question prompt*](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data/example_flow/generate_question_prompt.jinja2): frame a question based on the given text chunk.
- [*generate suggested answer prompt*](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data/example_flow/generate_suggested_answer_prompt.jinja2): generate suggested answer for the question based on the given text chunk.
- Validation prompts
- [*score text chunk prompt*](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data/gen_test_data/generate_test_data_flow/score_text_chunk_prompt.jinja2): score 0-10 to validate if the given text chunk is worthy of framing a question. If the score is lower than `score_threshold` (a node input that is adjustable), validation fails.
- [*validate question prompt*](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data/gen_test_data/generate_test_data_flow/validate_question_prompt.jinja2): validate if the generated question is good.
- [*validate suggested answer*](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data/gen_test_data/generate_test_data_flow/validate_suggested_answer_prompt.jinja2): validate if the generated suggested answer is good.
- [*score text chunk prompt*](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data/example_flow/score_text_chunk_prompt.jinja2): score 0-10 to validate if the given text chunk is worthy of framing a question. If the score is lower than `score_threshold` (a node input that is adjustable), validation fails.
- [*validate question prompt*](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data/example_flow/validate_question_prompt.jinja2): validate if the generated question is good.
- [*validate suggested answer*](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data/example_flow/validate_suggested_answer_prompt.jinja2): validate if the generated suggested answer is good.

If validation fails, the corresponding `question`/`suggested_answer` is set to an empty string, and such entries are removed from the final output test data set.
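
The validation rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the flow's actual implementation: the helper names `passes_chunk_score` and `drop_invalid_records`, the threshold value used in the demo, and the sample records are all hypothetical. Only the field names `question` and `suggested_answer` and the `score_threshold` input come from the guide.

```python
from typing import Dict, List


def passes_chunk_score(score: float, score_threshold: float) -> bool:
    """Hypothetical helper: a chunk fails validation when its score is
    lower than score_threshold, so an equal score still passes."""
    return score >= score_threshold


def drop_invalid_records(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Hypothetical helper: keep only records whose question and
    suggested answer are both non-empty strings."""
    return [
        record
        for record in records
        if record.get("question", "").strip()
        and record.get("suggested_answer", "").strip()
    ]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        {"question": "What does the flow generate?", "suggested_answer": "Question and answer pairs."},
        {"question": "", "suggested_answer": ""},  # failed validation
        {"question": "Is this kept?", "suggested_answer": ""},  # answer failed validation
    ]
    print(passes_chunk_score(score=3, score_threshold=4))
    print(drop_invalid_records(sample))
```

Treating an empty string as the failure signal keeps the flow output schema uniform, so the final filtering step only has to check for emptiness rather than track a separate status field.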

## Generate test data
- Navigate to [example_gen_test_data](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/examples/gen_test_data) folder.
- Navigate to [example_gen_test_data](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/examples/gen_test_data) folder.

- After configuration, run the following command to generate the test data set:
```bash
@@ -79,4 +79,4 @@ By leveraging the capabilities of llm, this guide streamlines the test data gene

- The generated test data is saved as a JSONL file. Look for the console log line "Saved ... valid test data to ..." to find its location.
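
Once the file is located, it can be inspected with a few lines of Python. The sketch below assumes each line of the JSONL file is one JSON object containing at least `question` and `suggested_answer`. The function name `load_test_data`, the file name `test-data.jsonl`, and the sample rows are hypothetical, and the demo writes its own temporary file rather than reading real generator output.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def load_test_data(path: Path) -> List[Dict]:
    """Hypothetical helper: parse a JSONL file into a list of dicts,
    one per non-empty line."""
    records = []
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Made-up sample file standing in for the generated output.
        sample_path = Path(tmp) / "test-data.jsonl"
        sample_path.write_text(
            '{"question": "sample question 1", "suggested_answer": "sample answer 1"}\n'
            '{"question": "sample question 2", "suggested_answer": "sample answer 2"}\n',
            encoding="utf-8",
        )
        for record in load_test_data(sample_path):
            print(record["question"], "->", record["suggested_answer"])
```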

If you expect to generate a large amount of test data beyond your local compute capability, you may try generating test data in cloud, please see this [guide](https://github.com/microsoft/promptflow/blob/c304f6fddb2ac64c3d7889f56fa79efa364c8f3b/docs/cloud/azureai/generate-test-data-cloud.md) for more detailed steps.
If you expect to generate a large amount of test data beyond your local compute capability, you may try generating test data in cloud, please see this [guide](https://github.com/microsoft/promptflow/blob/53a685dbff920e891ef61cacb5f2f19e761ee809/docs/cloud/azureai/generate-test-data-cloud.md) for more detailed steps.
9 changes: 4 additions & 5 deletions examples/gen_test_data/conda.yml
@@ -5,8 +5,7 @@ dependencies:
- python=3.10.12
- pip=23.2.1
- pip:
- mldesigner
- configargparse
- llama_index
- docx2txt
- promptflow
- mldesigner==0.1.0b18
- llama_index==0.9.48
- docx2txt==0.8
- promptflow>=1.7.0