Add section about productionizing #577
Conversation
I LOVE THIS!!!
README.md
Outdated
* **OpenAI Capacity**: The default TPM (tokens per minute) is set to 30K. That is equivalent
to approximately 30 conversations per minute (assuming 1K per user message/response).
You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity` parameters in `infra/main.bicep` to your account's maximum capacity.
You can also view the Quotas tab from Azure OpenAI studio to understand how much capacity you have.
"from" > "in"
README.md
Outdated
You can also view the Quotas tab from Azure OpenAI studio to understand how much capacity you have.
* **Azure Storage**: The default storage account uses the `Standard_LRS` SKU.
We recommend using `Standard_ZRS` for production deployments,
which you can specify using the `sku` property in `infra/main.bicep`.
...`sku` property under module `storage` in ...
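To make that pointer concrete, a hypothetical sketch of how the `sku` property might sit under a `storage` module in `infra/main.bicep` (the module path and surrounding parameters are assumptions, not the repo's actual contents):

```bicep
// Hypothetical shape of the storage module in infra/main.bicep;
// only the SKU name needs to change for zone-redundant storage.
module storage 'core/storage/storage-account.bicep' = {
  name: 'storage'
  params: {
    sku: {
      name: 'Standard_ZRS' // default is Standard_LRS; ZRS tolerates a zone outage
    }
  }
}
```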
README.md
Outdated
You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity` parameters in `infra/main.bicep` to your account's maximum capacity.
You can also view the Quotas tab from Azure OpenAI studio to understand how much capacity you have.
* **Azure Storage**: The default storage account uses the `Standard_LRS` SKU.
We recommend using `Standard_ZRS` for production deployments,
To improve your resiliency we recommend...
README.md
Outdated
which you can specify using the `sku` property in `infra/main.bicep`.
* **Azure Cognitive Search**: The default search service uses the `Standard` SKU
with the free semantic search option. You should either change `semanticSearch` to "standard"
or disable semantic search entirely in the approaches files. If you see errors about search service capacity being exceeded, you may find it helpful to increase the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`.
in the `/app/backend/approaches` files.
I think it's in `infra/core/search/search-services.bicep`, or manually scaling it from the Azure Portal.
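Putting the two knobs from this thread together, a hedged sketch of the relevant parameters (names follow the review comments; the defaults shown are illustrative):

```bicep
// Hypothetical excerpt from infra/core/search/search-services.bicep;
// parameter names follow the review thread, values are illustrative.
param semanticSearch string = 'standard' // was the free option, which has monthly limits
param replicaCount int = 3               // raise this if you see capacity-exceeded errors
```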
Left a couple of nits, otherwise looking good!
README.md
Outdated
* **OpenAI Capacity**: The default TPM (tokens per minute) is set to 30K. That is equivalent
to approximately 30 conversations per minute (assuming 1K per user message/response).
You can increase the capacity by changing the `chatGptDeploymentCapacity` and `embeddingDeploymentCapacity` parameters in `infra/main.bicep` to your account's maximum capacity.
You can also view the Quotas tab from Azure OpenAI studio to understand how much capacity you have.
Would be nice to use a link for Azure OpenAI studio
README.md
Outdated
which you can specify using the `sku` property in `infra/main.bicep`.
* **Azure Cognitive Search**: The default search service uses the `Standard` SKU
with the free semantic search option. You should either change `semanticSearch` to "standard"
or disable semantic search entirely in the approaches files. If you see errors about search service capacity being exceeded, you may find it helpful to increase the number of replicas by changing `replicaCount` in `infra/core/search/search-services.bicep`.
I think it's in `infra/core/search/search-services.bicep`, or manually scaling it from the Azure Portal.
* Remove defaults for getenv
* Remove print
* missing output
* readme section
* Update README with productionizing tips
* Add networking section
* Review feedback from comments
Purpose
This PR adds a README section with tips for putting this template into production, based on customer experiences.
Does this introduce a breaking change?
Pull Request Type
What kind of change does this Pull Request introduce?