
Add section about productionizing #577

Merged
merged 9 commits into Azure-Samples:main
Sep 1, 2023
Conversation

pamelafox
Collaborator

Purpose

This PR adds a README section with tips for putting this template into production, based on customer experiences.

Does this introduce a breaking change?

[ ] Yes
[X] No

Pull Request Type

What kind of change does this Pull Request introduce?

[ ] Bugfix
[ ] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no API changes)
[X] Documentation content changes
[ ] Other... Please describe:

@charris-msft (Collaborator) left a comment

I LOVE THIS!!!

README.md Outdated
* **OpenAI Capacity**: The default TPM (tokens per minute) is set to 30K. That is equivalent
to approximately 30 conversations per minute (assuming 1K per user message/response).
You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity` parameters in `infra/main.bicep` to your account's maximum capacity.
You can also view the Quotas tab from Azure OpenAI studio to understand how much capacity you have.
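The capacity knobs named in the excerpt above might look roughly like this in `infra/main.bicep` (a sketch only; the parameter names come from the excerpt, the defaults and comments are assumptions):

```bicep
// Hypothetical excerpt from infra/main.bicep. Azure OpenAI deployment capacity
// is expressed in units of 1K TPM, so 30 here means 30K tokens per minute --
// roughly 30 one-round conversations per minute at ~1K tokens per message/response.
param chatGptDeploymentCapacity int = 30
param embeddingDeploymentCapacity int = 30
```

Raising these toward your account's quota (visible in the Quotas tab) is the production-scaling lever the tip describes.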
Collaborator:

"from" > "in"

README.md Outdated
You can also view the Quotas tab from Azure OpenAI studio to understand how much capacity you have.
* **Azure Storage**: The default storage account uses the `Standard_LRS` SKU.
We recommend using `Standard_ZRS` for production deployments,
which you can specify using the `sku` property in `infra/main.bicep`.
Collaborator:

...sku property under module storage in ...
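Following the comment above, the override would sit under the `storage` module in `infra/main.bicep`. A sketch, assuming a module shape like this (the module path and parameter names are guesses; check the actual file):

```bicep
// Sketch: overriding the storage SKU under `module storage` in infra/main.bicep.
module storage 'core/storage/storage-account.bicep' = {
  name: 'storage'
  params: {
    // ...other params omitted...
    sku: {
      name: 'Standard_ZRS' // default is Standard_LRS; ZRS adds zone redundancy
    }
  }
}
```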

README.md Outdated
You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity` parameters in `infra/main.bicep` to your account's maximum capacity.
You can also view the Quotas tab from Azure OpenAI studio to understand how much capacity you have.
* **Azure Storage**: The default storage account uses the `Standard_LRS` SKU.
We recommend using `Standard_ZRS` for production deployments,
Collaborator:

To improve your resiliency we recommend...

README.md Outdated
which you can specify using the `sku` property in `infra/main.bicep`.
* **Azure Cognitive Search**: The default search service uses the `Standard` SKU
with the free semantic search option. You should either change `semanticSearch` to "standard"
or disable semantic search entirely in the approaches files. If you see errors about search service capacity being exceeded, you may find it helpful to increase the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`.
Collaborator:

in the /app/backend/approaches files.

Collaborator:

I think it's in infra/core/search/search-services.bicep, or manually scaling it from the Azure Portal.
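Per the comment above, the replica setting would live in `infra/core/search/search-services.bicep`. A sketch under those assumptions (`replicaCount` and `partitionCount` are real `Microsoft.Search/searchServices` properties; the surrounding parameters are illustrative):

```bicep
// Sketch of infra/core/search/search-services.bicep.
param name string
param location string = resourceGroup().location
param replicaCount int = 1 // raise (e.g. to 2-3) if capacity errors appear

resource search 'Microsoft.Search/searchServices@2022-09-01' = {
  name: name
  location: location
  sku: {
    name: 'standard'
  }
  properties: {
    replicaCount: replicaCount
    partitionCount: 1
  }
}
```

As the comment notes, the same scaling can also be done manually from the Azure Portal.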

@chuwik (Collaborator) left a comment

Left a couple of nits, otherwise looking good!

README.md Outdated
* **OpenAI Capacity**: The default TPM (tokens per minute) is set to 30K. That is equivalent
to approximately 30 conversations per minute (assuming 1K per user message/response).
You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity` parameters in `infra/main.bicep` to your account's maximum capacity.
You can also view the Quotas tab from Azure OpenAI studio to understand how much capacity you have.
Collaborator:

Would be nice to use a link for Azure OpenAI studio

README.md Outdated
which you can specify using the `sku` property in `infra/main.bicep`.
* **Azure Cognitive Search**: The default search service uses the `Standard` SKU
with the free semantic search option. You should either change `semanticSearch` to "standard"
or disable semantic search entirely in the approaches files. If you see errors about search service capacity being exceeded, you may find it helpful to increase the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`.
Collaborator:

I think it's in infra/core/search/search-services.bicep, or manually scaling it from the Azure Portal.

@pamelafox pamelafox merged commit 9cdbc1c into Azure-Samples:main Sep 1, 2023
6 checks passed
HughRunyan pushed a commit to RMI/RMI_chatbot that referenced this pull request Mar 26, 2024
* Remove defaults for getenv

* Remove print

* missing output

* readme section

* Update README with productionizing tips

* Add networking section

* Review feedback from comments
ratkinsoncinz pushed a commit to cinzlab/azure-search-openai-demo that referenced this pull request Oct 6, 2024
3 participants