Squashed commit of the following:
commit 3bf2d19
Author: Pamela Fox <pamelafox@microsoft.com>
Date:   Thu Nov 2 09:10:15 2023 -0700

    Fix list file (Azure-Samples#897)

commit b3c55b0
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Wed Nov 1 06:42:42 2023 -0700

    Bump pypdf from 3.16.3 to 3.17.0 in /scripts (Azure-Samples#890)

    Bumps [pypdf](https://github.com/py-pdf/pypdf) from 3.16.3 to 3.17.0.
    - [Release notes](https://github.com/py-pdf/pypdf/releases)
    - [Changelog](https://github.com/py-pdf/pypdf/blob/main/CHANGELOG.md)
    - [Commits](py-pdf/pypdf@3.16.3...3.17.0)

    ---
    updated-dependencies:
    - dependency-name: pypdf
      dependency-type: direct:production
    ...

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Pamela Fox <pamelafox@microsoft.com>

commit cfbb90b
Author: Pamela Fox <pamelafox@microsoft.com>
Date:   Tue Oct 31 21:37:55 2023 -0700

    Add more readmes/guides (Azure-Samples#889)

    * Add more readmes/guides

    * Add image

    * Diagram added

commit 4479a2c
Author: Pamela Fox <pamelafox@microsoft.com>
Date:   Mon Oct 30 21:38:26 2023 -0700

    Handle errors better, especially for streaming (Azure-Samples#884)

commit 0d54f84
Author: Pamela Fox <pamelafox@microsoft.com>
Date:   Mon Oct 30 17:58:09 2023 -0700

    Add exclude files (Azure-Samples#876)

commit 3647826
Author: Roderic Bos <github@rooc.nl>
Date:   Mon Oct 30 17:35:50 2023 +0100

    When using the option --storagekey for the prepdocs script the key might (Azure-Samples#866)

    contain `==` base64 padding at the end. Login then fails because the
    script strips the `=` signs when it splits the argument. Copied the
    version from app/start.ps1, which is better suited here.

    Co-authored-by: Pamela Fox <pamelafox@microsoft.com>
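
A minimal Python illustration of the failure mode described above (the affected scripts are PowerShell/Bash, and the key value here is made up):

```
# A key=value argument whose value ends in base64 "=" padding:
arg = "storagekey=dGVzdGtleQ=="

# Buggy: splitting on every "=" discards the padding characters.
broken = arg.split("=")[1]      # "dGVzdGtleQ" -- padding lost, login fails

# Fixed: split only on the first "=" so the value keeps its padding.
key, value = arg.split("=", 1)  # value == "dGVzdGtleQ==" -- intact
```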

commit 5daa934
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Oct 30 09:17:55 2023 -0700

    Bump the github-actions group with 1 update (Azure-Samples#880)

    Bumps the github-actions group with 1 update: [actions/setup-node](https://github.com/actions/setup-node).

    - [Release notes](https://github.com/actions/setup-node/releases)
    - [Commits](actions/setup-node@v3...v4)

    ---
    updated-dependencies:
    - dependency-name: actions/setup-node
      dependency-type: direct:production
      update-type: version-update:semver-major
      dependency-group: github-actions
    ...

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit bfb3ee5
Author: Pamela Fox <pamelafox@microsoft.com>
Date:   Fri Oct 27 11:04:37 2023 -0700

    Render entire PDFs instead of single pages (Azure-Samples#840)

    * Adding anchors

    * Show whole file

    * Show whole file

    * Page number support

    * More experiments with whole file

    * Revert unintentional changes

    * Add tests

    * Remove random num

    * Add retry_total=0 to avoid unnecessary network requests (see the sketch after this list)

    * Add comment to explain retry_total

    * Bring back deleted file

    * Blob manager refactor after merge

    * Update coverage amount

    * Make mypy happy with explicit check of path

    * Add debug for 3.9

    * Skip in 3.9 since it's silly

    * Reduce fail under percentage due to 3.9
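
The retry_total change above follows this pattern (a sketch, assuming the blob client is constructed roughly as in the sample; the account URL is a placeholder):

```
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

# retry_total=0 disables the SDK's automatic retry policy, so a lookup
# for a blob that does not exist fails immediately instead of being
# retried with several redundant network requests.
service_client = BlobServiceClient(
    account_url="https://<account>.blob.core.windows.net",  # placeholder
    credential=DefaultAzureCredential(),
    retry_total=0,
)
```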

commit c989048
Author: Pamela Fox <pamelafox@microsoft.com>
Date:   Thu Oct 26 20:01:18 2023 -0700

    add screenshot (Azure-Samples#870)

commit a64a12e
Author: Matt <57731498+mattmsft@users.noreply.github.com>
Date:   Thu Oct 26 12:49:02 2023 -0700

    Refactor prepdocs (Azure-Samples#862)

    * setting up types

    * setting up more types...

    * working on it...

    * prepdocs refactor

    * typing fixes; updating tests

    * more fixes; updating tests

    * fixing retry for embeddings

    * fixing adls gen2 list

    * more test fixes

    * fixes from manual runs

    * more fixes

    * more fixes

    * type fixes

    * more type fixes and test fixes

    * break into modules

    * openai embedding fix

    * novectors fix

    * fix id generation

    * doc strings

    * feedback from pr

    * rename feedback

    * trying to get imports to work

    * update test workflow with pamela's suggestion

    * fix ci again

    * delete __init__

    * mypy configuration

    ---------

    Co-authored-by: Matt Gotteiner <magottei@microsoft.com>

commit 94be632
Author: MaciejLitwiniec <MaciejLitwiniec@users.noreply.github.com>
Date:   Thu Oct 26 15:23:00 2023 +0200

    Updated FAQ so that it reflects PR 835 (Azure-Samples#868)

    * Updated FAQ so that it reflects PR 835

    * Update README.md

    * Update README.md

    ---------

    Co-authored-by: Pamela Fox <pamela.fox@gmail.com>

commit d7bbf9f
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Wed Oct 25 14:23:05 2023 -0700

    Bump werkzeug from 3.0.0 to 3.0.1 in /app/backend (Azure-Samples#863)

    Bumps [werkzeug](https://github.com/pallets/werkzeug) from 3.0.0 to 3.0.1.
    - [Release notes](https://github.com/pallets/werkzeug/releases)
    - [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
    - [Commits](pallets/werkzeug@3.0.0...3.0.1)

    ---
    updated-dependencies:
    - dependency-name: werkzeug
      dependency-type: indirect
    ...

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Pamela Fox <pamelafox@microsoft.com>

commit d02aa14
Author: Pamela Fox <pamelafox@microsoft.com>
Date:   Wed Oct 25 13:48:45 2023 -0700

    Message builder improvements (Azure-Samples#852)

commit 8f55988
Author: Pamela Fox <pamelafox@microsoft.com>
Date:   Tue Oct 24 19:53:07 2023 -0700

    Reorder tags to optimize for sample browser (Azure-Samples#853)

commit 16a61bf
Author: Pamela Fox <pamelafox@microsoft.com>
Date:   Mon Oct 23 17:03:59 2023 -0700

    Improve follow-up questions and pipe into context (Azure-Samples#832)

    * Add follow-up questions and parsing

    * Test breaking the e2e

    * Actually run tests

    * Fix runner

    * Add conditional

    * Fix the test

    * chat approach

commit ca01af9
Author: Anthony Shaw <anthony.p.shaw@gmail.com>
Date:   Mon Oct 23 09:27:32 2023 +1100

    Store an MD5 hash of uploaded/indexed file and check before prepdocs (Azure-Samples#835)

    * Hash the uploaded files locally and skip them if you provision a second time and they haven't changed

    * Overwrite the hash when it changes

    * Remove open mode parameter

    * fix f-string

    * reformat changes

    * Update prepdocs.py
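
A sketch of the hashing idea (illustrative rather than the exact prepdocs implementation; the sidecar-file convention matches the data/*.md5 entry added to .gitignore below):

```
import hashlib
from pathlib import Path

def already_uploaded(path: Path) -> bool:
    """Return True when the file's MD5 matches its stored .md5 sidecar,
    meaning it was indexed on a previous run and can be skipped."""
    md5 = hashlib.md5(path.read_bytes()).hexdigest()
    sidecar = path.with_suffix(path.suffix + ".md5")
    if sidecar.exists() and sidecar.read_text().strip() == md5:
        return True
    sidecar.write_text(md5)  # store (or overwrite) the hash for next time
    return False
```
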
vishalgtingre committed Nov 5, 2023
1 parent d226448 commit 8429f98
Showing 80 changed files with 2,392 additions and 873 deletions.
19 changes: 15 additions & 4 deletions .github/workflows/python-test.yaml
@@ -25,7 +25,7 @@ jobs:
python-version: ${{ matrix.python_version }}
architecture: x64
- name: Setup node
-        uses: actions/setup-node@v3
+        uses: actions/setup-node@v4
with:
node-version: 18
- name: Build frontend
@@ -40,14 +40,25 @@
- name: Lint with ruff
run: ruff .
- name: Check types with mypy
-        run: python3 -m mypy scripts/ app/
+        run: |
+          cd scripts/
+          python3 -m mypy .
+          cd ../app/
+          python3 -m mypy .
- name: Check formatting with black
run: black . --check --verbose
- name: Run Python tests
if: runner.os != 'Windows'
-        run: python3 -m pytest -s -vv --cov --cov-fail-under=78
+        run: python3 -m pytest -s -vv --cov --cov-fail-under=87
- name: Run E2E tests with Playwright
id: e2e
if: runner.os != 'Windows'
run: |
playwright install --with-deps
-          python3 -m pytest tests/e2e.py
+          python3 -m pytest tests/e2e.py --tracing=retain-on-failure
- name: Upload test artifacts
if: ${{ failure() && steps.e2e.conclusion == 'failure' }}
uses: actions/upload-artifact@v3
with:
name: playwright-traces
path: test-results
4 changes: 3 additions & 1 deletion .gitignore
@@ -144,4 +144,6 @@ cython_debug/
# NPM
npm-debug.log*
node_modules
-static/
+static/
+
+data/*.md5
14 changes: 13 additions & 1 deletion .vscode/settings.json
@@ -15,8 +15,20 @@
"editor.defaultFormatter": "esbenp.prettier-vscode",
"editor.formatOnSave": true
},
"files.exclude": {
"**/__pycache__": true,
"**/.coverage": true,
"**/.pytest_cache": true,
"**/.ruff_cache": true,
"**/.mypy_cache": true
},
"search.exclude": {
"**/node_modules": true,
"static": true
-    }
+    },
+    "python.testing.pytestArgs": [
+        "tests"
+    ],
+    "python.testing.unittestEnabled": false,
+    "python.testing.pytestEnabled": true
}
14 changes: 13 additions & 1 deletion CONTRIBUTING.md
@@ -108,9 +108,19 @@ playwright install --with-deps
Run the tests:

```
-python3 -m pytest tests/e2e.py
+python3 -m pytest tests/e2e.py --tracing=retain-on-failure
```

When a failure happens, the trace zip will be saved in the test-results folder.
You can view that using the Playwright CLI:

```
playwright show-trace test-results/<trace-zip>
```

You can also use the online trace viewer at https://trace.playwright.dev/


## <a name="style"></a> Code Style

This codebase includes several languages: TypeScript, Python, Bicep, Powershell, and Bash.
@@ -135,3 +145,5 @@ Run `black` to format a file:
```
python3 -m black <path-to-file>
```

If you followed the steps above to install the pre-commit hooks, then you can just wait for those hooks to run `ruff` and `black` for you.
80 changes: 14 additions & 66 deletions README.md
@@ -2,15 +2,15 @@
name: ChatGPT + Enterprise data
description: Chat with your data using OpenAI and Cognitive Search.
languages:
+- azdeveloper
+- typescript
 - python
-- typescript
 - bicep
-- azdeveloper
products:
+- azure
+- azure-cognitive-search
 - azure-openai
-- azure-cognitive-search
 - azure-app-service
-- azure
page_type: sample
urlFragment: azure-search-openai-demo
---
@@ -266,6 +266,10 @@ By default, the deployed Azure web app will only allow requests from the same or

For the frontend code, change `BACKEND_URI` in `api.ts` to point at the deployed backend URL, so that all fetch requests will be sent to the deployed backend.

For an alternate frontend that's written in Web Components and deployed to Static Web Apps, check out
[azure-search-openai-javascript](https://github.com/Azure-Samples/azure-search-openai-javascript) and its guide
on [using a different backend](https://github.com/Azure-Samples/azure-search-openai-javascript#using-a-different-backend).

## Running locally

You can only run locally **after** having successfully run the `azd up` command. If you haven't yet, follow the steps in [Azure deployment](#azure-deployment) above.
@@ -285,50 +289,22 @@ Once in the web app:
* Explore citations and sources
* Click on "settings" to try different options, tweak prompts, etc.

## Customizing the UI and data

Once you successfully deploy the app, you can start customizing it for your needs: changing the text, tweaking the prompts, and replacing the data. Consult the [app customization guide](docs/customization.md) as well as the [data ingestion guide](docs/data_ingestion.md) for more details.

## Productionizing

This sample is designed to be a starting point for your own production application,
but you should do a thorough review of the security and performance before deploying
-to production. Here are some things to consider:

-* **OpenAI Capacity**: The default TPM (tokens per minute) is set to 30K. That is equivalent
-  to approximately 30 conversations per minute (assuming 1K per user message/response).
-  You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity`
-  parameters in `infra/main.bicep` to your account's maximum capacity.
-  You can also view the Quotas tab in [Azure OpenAI studio](https://oai.azure.com/)
-  to understand how much capacity you have.
-* **Azure Storage**: The default storage account uses the `Standard_LRS` SKU.
-  To improve your resiliency, we recommend using `Standard_ZRS` for production deployments,
-  which you can specify using the `sku` property under the `storage` module in `infra/main.bicep`.
-* **Azure Cognitive Search**: The default search service uses the `Standard` SKU
-  with the free semantic search option, which gives you 1000 free queries a month.
-  Assuming your app will experience more than 1000 questions, you should either change `semanticSearch`
-  to "standard" or disable semantic search entirely in the `/app/backend/approaches` files.
-  If you see errors about search service capacity being exceeded, you may find it helpful to increase
-  the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`
-  or manually scaling it from the Azure Portal.
-* **Azure App Service**: The default app service plan uses the `Basic` SKU with 1 CPU core and 1.75 GB RAM.
-  We recommend using a Premium level SKU, starting with 1 CPU core.
-  You can use auto-scaling rules or scheduled scaling rules,
-  and scale up the maximum/minimum based on load.
-* **Authentication**: By default, the deployed app is publicly accessible.
-  We recommend restricting access to authenticated users.
-  See [Enabling authentication](#enabling-authentication) above for how to enable authentication.
-* **Networking**: We recommend deploying inside a Virtual Network. If the app is only for
-  internal enterprise use, use a private DNS zone. Also consider using Azure API Management (APIM)
-  for firewalls and other forms of protection.
-  For more details, read [Azure OpenAI Landing Zone reference architecture](https://techcommunity.microsoft.com/t5/azure-architecture-blog/azure-openai-landing-zone-reference-architecture/ba-p/3882102).
-* **Loadtesting**: We recommend running a loadtest for your expected number of users.
-  You can use the [locust tool](https://docs.locust.io/) with the `locustfile.py` in this sample
-  or set up a loadtest with Azure Load Testing.

+to production. Read through our [productionizing guide](docs/productionizing.md) for more details.

## Resources

* [Revolutionize your Enterprise Data with ChatGPT: Next-gen Apps w/ Azure OpenAI and Cognitive Search](https://aka.ms/entgptsearchblog)
* [Azure Cognitive Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search)
* [Azure OpenAI Service](https://learn.microsoft.com/azure/cognitive-services/openai/overview)
-* [Comparing Azure OpenAI and OpenAI](https://learn.microsoft.com/en-gb/azure/cognitive-services/openai/overview#comparing-azure-openai-and-openai/)
+* [Comparing Azure OpenAI and OpenAI](https://learn.microsoft.com/azure/cognitive-services/openai/overview#comparing-azure-openai-and-openai/)

## Clean up

@@ -346,18 +322,6 @@ The resource group and all the resources will be deleted.
### FAQ

<details><a id="ingestion-why-chunk"></a>
<summary>Why do we need to break up the PDFs into chunks when Azure Cognitive Search supports searching large documents?</summary>

Chunking allows us to limit the amount of information we send to OpenAI due to token limits. By breaking up the content, it allows us to easily find potential chunks of text that we can inject into OpenAI. The method of chunking we use leverages a sliding window of text such that sentences that end one chunk will start the next. This allows us to reduce the chance of losing the context of the text.
</details>
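
The sliding-window chunking described above works roughly like this (an illustrative sketch; the repository's actual implementation also respects sentence boundaries and token counts, and the sizes here are assumptions):

```
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks so that content near the end
    of one chunk also appears at the start of the next."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```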

<details><a id="ingestion-more-pdfs"></a>
<summary>How can we upload additional PDFs without redeploying everything?</summary>

To upload more PDFs, put them in the data/ folder and run `./scripts/prepdocs.sh` or `./scripts/prepdocs.ps1`. To avoid reuploading existing docs, move them out of the data folder. You could also implement checks to see whats been uploaded before; our code doesn't yet have such checks.
</details>

<details><a id="compare-samples"></a>
<summary>How does this sample compare to other Chat with Your Data samples?</summary>

@@ -397,22 +361,6 @@ Technology comparison:
In `infra/main.bicep`, change `chatGptModelName` to 'gpt-4' instead of 'gpt-35-turbo'. You may also need to adjust the capacity above that line depending on how much TPM your account is allowed.
</details>

<details><a id="chat-ask-diff"></a>
<summary>What is the difference between the Chat and Ask tabs?</summary>

The chat tab uses the approach programmed in [chatreadretrieveread.py](https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/app/backend/approaches/chatreadretrieveread.py).

- It uses the ChatGPT API to turn the user question into a good search query.
- It queries Azure Cognitive Search for search results for that query (optionally using the vector embeddings for that query).
- It then combines the search results and original user question, and asks ChatGPT API to answer the question based on the sources. It includes the last 4K of message history as well (or however many tokens are allowed by the deployed model).

The ask tab uses the approach programmed in [retrievethenread.py](https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/app/backend/approaches/retrievethenread.py).

- It queries Azure Cognitive Search for search results for the user question (optionally using the vector embeddings for that question).
- It then combines the search results and user question, and asks ChatGPT API to answer the question based on the sources.

</details>
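
In outline, the chat approach described above reduces to three steps (a simplified sketch with placeholder helpers, not the actual chatreadretrieveread.py code):

```
async def generate_search_query(history: list, question: str) -> str:
    # Placeholder: the real app asks the ChatGPT API to rewrite the
    # conversation into a concise search query.
    return question

async def retrieve_sources(query: str) -> list[str]:
    # Placeholder: the real app queries Azure Cognitive Search here,
    # optionally using vector embeddings for the query.
    return ["<source chunk 1>", "<source chunk 2>"]

async def chat_approach(history: list, question: str) -> dict:
    query = await generate_search_query(history, question)
    sources = await retrieve_sources(query)
    # The real app then asks the ChatGPT API to answer the original
    # question grounded in these sources, including as much recent
    # history as the deployed model's context window allows.
    return {"query": query, "sources": sources}
```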

<details><a id="azd-up-explanation"></a>
<summary>What does the `azd up` command do?</summary>

42 changes: 32 additions & 10 deletions app/backend/app.py
@@ -9,6 +9,7 @@

import aiohttp
import openai
from azure.core.exceptions import ResourceNotFoundError
from azure.identity.aio import DefaultAzureCredential
from azure.monitor.opentelemetry import configure_azure_monitor
from azure.search.documents.aio import SearchClient
@@ -39,6 +40,10 @@
CONFIG_BLOB_CONTAINER_CLIENT = "blob_container_client"
CONFIG_AUTH_CLIENT = "auth_client"
CONFIG_SEARCH_CLIENT = "search_client"
ERROR_MESSAGE = """The app encountered an error processing your request.
If you are an administrator of the app, view the full error in the logs. See aka.ms/appservice-logs for more information.
Error type: {error_type}
"""

bp = Blueprint("routes", __name__, static_folder="static")

@@ -69,9 +74,18 @@ async def assets(path):
# *** NOTE *** this assumes that the content files are public, or at least that all users of the app
# can access all the files. This is also slow and memory hungry.
@bp.route("/content/<path>")
-async def content_file(path):
+async def content_file(path: str):
# Remove page number from path, filename-1.txt -> filename.txt
if path.find("#page=") > 0:
path_parts = path.rsplit("#page=", 1)
path = path_parts[0]
logging.info("Opening file %s at page %s", path)
blob_container_client = current_app.config[CONFIG_BLOB_CONTAINER_CLIENT]
-    blob = await blob_container_client.get_blob_client(path).download_blob()
+    try:
+        blob = await blob_container_client.get_blob_client(path).download_blob()
+    except ResourceNotFoundError:
+        logging.exception("Path not found: %s", path)
+        abort(404)
if not blob.properties or not blob.properties.has_key("content_settings"):
abort(404)
mime_type = blob.properties["content_settings"]["content_type"]
@@ -83,6 +97,10 @@ async def content_file(path):
return await send_file(blob_file, mimetype=mime_type, as_attachment=False, attachment_filename=path)


def error_dict(error: Exception) -> dict:
return {"error": ERROR_MESSAGE.format(error_type=type(error))}


@bp.route("/ask", methods=["POST"])
async def ask():
if not request.is_json:
@@ -100,14 +118,18 @@ async def ask():
request_json["messages"], context=context, session_state=request_json.get("session_state")
)
return jsonify(r)
-    except Exception as e:
-        logging.exception("Exception in /ask")
-        return jsonify({"error": str(e)}), 500
+    except Exception as error:
+        logging.exception("Exception in /ask: %s", error)
+        return jsonify(error_dict(error)), 500


async def format_as_ndjson(r: AsyncGenerator[dict, None]) -> AsyncGenerator[str, None]:
-    async for event in r:
-        yield json.dumps(event, ensure_ascii=False) + "\n"
+    try:
+        async for event in r:
+            yield json.dumps(event, ensure_ascii=False) + "\n"
+    except Exception as e:
+        logging.exception("Exception while generating response stream: %s", e)
+        yield json.dumps(error_dict(e))


@bp.route("/chat", methods=["POST"])
@@ -134,9 +156,9 @@ async def chat():
response = await make_response(format_as_ndjson(result))
response.timeout = None # type: ignore
return response
-    except Exception as e:
-        logging.exception("Exception in /chat")
-        return jsonify({"error": str(e)}), 500
+    except Exception as error:
+        logging.exception("Exception in /chat: %s", error)
+        return jsonify(error_dict(error)), 500


# Send MSAL.js settings to the client UI