title | titleSuffix | description | services | author | manager | ms.service | ms.subservice | ms.topic | ms.date | ms.author | ms.custom |
---|---|---|---|---|---|---|---|---|---|---|---|
How to use private endpoints with Speech service | Azure Cognitive Services | Learn how to use Speech service with private endpoints provided by Azure Private Link | cognitive-services | alexeyo26 | nitinme | cognitive-services | speech-service | conceptual | 04/07/2021 | alexeyo | devx-track-azurepowershell |
Azure Private Link lets you connect to services in Azure by using a private endpoint. A private endpoint is a private IP address that's accessible only within a specific virtual network and subnet.
This article explains how to set up and use Private Link and private endpoints with Speech Services in Azure Cognitive Services. This article then describes how to remove private endpoints later, but still use the Speech resource.
Note
Before you proceed, review how to use virtual networks with Cognitive Services.
Setting up a Speech resource for the private endpoint scenarios requires performing the following tasks:
This article describes the usage of the private endpoints with Speech service. Usage of the VNet service endpoints is described here.
Private endpoints require a custom subdomain name for Cognitive Services. Use the following instructions to create one for your Speech resource.
Warning
A Speech resource with a custom domain name enabled uses a different way to interact with Speech service. You might have to adjust your application code for both of these scenarios: with private endpoint and without private endpoint.
When you turn on a custom domain name, the operation is not reversible. The only way to go back to the regional name is to create a new Speech resource.
If your Speech resource has a lot of associated custom models and projects created via Speech Studio, we strongly recommend trying the configuration with a test resource before you modify the resource used in production.
To create a custom domain name by using the Azure portal, follow these steps:
- Go to the Azure portal and sign in to your Azure account.
- Select the required Speech resource.
- In the Resource Management group on the left pane, select Networking.
- On the Firewalls and virtual networks tab, select Generate Custom Domain Name. A new right panel appears with instructions to create a unique custom subdomain for your resource.
- In the Generate Custom Domain Name panel, enter a custom domain name. Your full custom domain will look like: https://{your custom name}.cognitiveservices.azure.com. Remember that after you create a custom domain name, it cannot be changed. After you've entered your custom domain name, select Save.
- After the operation finishes, in the Resource Management group, select Keys and Endpoint. Confirm that the new endpoint name of your resource starts this way: https://{your custom name}.cognitiveservices.azure.com.
To create a custom domain name by using PowerShell, confirm that your computer has PowerShell version 7.x or later with the Azure PowerShell module version 5.1.0 or later. To see the versions of these tools, follow these steps:
- In a PowerShell window, enter:
  $PSVersionTable
  Confirm that the PSVersion value is 7.x or later. To upgrade PowerShell, follow the instructions at Installing various versions of PowerShell.
- In a PowerShell window, enter:
  Get-Module -ListAvailable Az
  If nothing appears, or if that version of the Azure PowerShell module is earlier than 5.1.0, follow the instructions at Install the Azure PowerShell module to upgrade.
Before you proceed, run Connect-AzAccount to create a connection with Azure.
Check whether the custom domain that you want to use is available. The following code confirms that the domain is available by using the Check Domain Availability operation in the Cognitive Services REST API.
Tip
The following code will not work in Azure Cloud Shell.
$subId = "Your Azure subscription Id"
$subdomainName = "custom domain name"
# Select the Azure subscription that contains the Speech resource.
# You can skip this step if your Azure account has only one active subscription.
Set-AzContext -SubscriptionId $subId
# Prepare the OAuth token to use in the request to the Cognitive Services REST API.
$Context = Get-AzContext
$AccessToken = (Get-AzAccessToken -TenantId $Context.Tenant.Id).Token
$token = ConvertTo-SecureString -String $AccessToken -AsPlainText -Force
# Prepare and send the request to the Cognitive Services REST API.
$uri = "https://management.azure.com/subscriptions/" + $subId + `
"/providers/Microsoft.CognitiveServices/checkDomainAvailability?api-version=2017-04-18"
$body = @{
subdomainName = $subdomainName
type = "Microsoft.CognitiveServices/accounts"
}
$jsonBody = $body | ConvertTo-Json
Invoke-RestMethod -Method Post -Uri $uri -ContentType "application/json" -Authentication Bearer `
-Token $token -Body $jsonBody | Format-List
If the desired name is available, you'll see a response like this:
isSubdomainAvailable : True
reason :
type :
subdomainName : my-custom-name
If the name is already taken, then you'll see the following response:
isSubdomainAvailable : False
reason : Sub domain name 'my-custom-name' is already used. Please pick a different name.
type :
subdomainName : my-custom-name
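For readers who script this check outside PowerShell, the following is a minimal Python sketch of how the same request could be assembled. It only builds the URL and JSON body used by the Check Domain Availability operation shown above; it does not send anything. The function name `build_check_domain_request` and the placeholder subscription ID are assumptions made for illustration, and you would still need to attach a bearer token obtained separately.

```python
import json

# Same api-version as the PowerShell example above.
API_VERSION = "2017-04-18"


def build_check_domain_request(subscription_id: str, subdomain_name: str):
    """Return (url, json_body) for the Check Domain Availability operation.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    url = (
        "https://management.azure.com/subscriptions/"
        + subscription_id
        + "/providers/Microsoft.CognitiveServices/checkDomainAvailability"
        + "?api-version="
        + API_VERSION
    )
    body = json.dumps(
        {
            "subdomainName": subdomain_name,
            "type": "Microsoft.CognitiveServices/accounts",
        }
    )
    return url, body


# Placeholder values; replace them with your own subscription ID and name.
url, body = build_check_domain_request(
    "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", "my-custom-name"
)
print(url)
print(body)
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before you run the call with your HTTP client of choice.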
To turn on a custom domain name for the selected Speech resource, use the Set-AzCognitiveServicesAccount cmdlet.
Warning
After the following code runs successfully, you'll create a custom domain name for your Speech resource. Remember that this name cannot be changed.
$resourceGroup = "Resource group name where Speech resource is located"
$speechResourceName = "Your Speech resource name"
$subdomainName = "custom domain name"
# Select the Azure subscription that contains the Speech resource.
# You can skip this step if your Azure account has only one active subscription.
$subId = "Your Azure subscription Id"
Set-AzContext -SubscriptionId $subId
# Set the custom domain name to the selected resource.
# WARNING: THIS CANNOT BE CHANGED OR UNDONE!
Set-AzCognitiveServicesAccount -ResourceGroupName $resourceGroup `
-Name $speechResourceName -CustomSubdomainName $subdomainName
This section requires the latest version of the Azure CLI. If you're using Azure Cloud Shell, the latest version is already installed.
Check whether the custom domain that you want to use is free. Use the Check Domain Availability method from the Cognitive Services REST API.
Copy the following code block, insert your preferred custom domain name, and save to the file subdomain.json.
{
"subdomainName": "custom domain name",
"type": "Microsoft.CognitiveServices/accounts"
}
Copy the file to your current folder or upload it to Azure Cloud Shell and run the following command. Replace xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx with your Azure subscription ID.
az rest --method post --url "https://management.azure.com/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/providers/Microsoft.CognitiveServices/checkDomainAvailability?api-version=2017-04-18" --body @subdomain.json
If the desired name is available, you'll see a response like this:
{
"isSubdomainAvailable": true,
"reason": null,
"subdomainName": "my-custom-name",
"type": null
}
If the name is already taken, then you'll see the following response:
{
"isSubdomainAvailable": false,
"reason": "Sub domain name 'my-custom-name' is already used. Please pick a different name.",
"subdomainName": "my-custom-name",
"type": null
}
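If you automate this check, you will probably want to act on the response rather than read it by eye. The sketch below assumes the response body has exactly the fields shown in the two samples above (`isSubdomainAvailable`, `reason`, `subdomainName`); the helper name `describe_availability` is hypothetical.

```python
import json


def describe_availability(response_text: str) -> str:
    """Summarize a Check Domain Availability response body (JSON text)."""
    data = json.loads(response_text)
    name = data.get("subdomainName")
    if data.get("isSubdomainAvailable"):
        return f"'{name}' is available"
    # The service normally explains why the name can't be used.
    reason = data.get("reason") or "no reason given"
    return f"'{name}' is not available: {reason}"


# Sample body copied from the "name is available" response above.
sample = (
    '{"isSubdomainAvailable": true, "reason": null, '
    '"subdomainName": "my-custom-name", "type": null}'
)
print(describe_availability(sample))
```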
To use a custom domain name with the selected Speech resource, use the az cognitiveservices account update command.
Select the Azure subscription that contains the Speech resource. If your Azure account has only one active subscription, you can skip this step. Replace xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx with your Azure subscription ID.
az account set --subscription xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
Set the custom domain name to the selected resource. Replace the sample parameter values with the actual ones and run the following command.
Warning
After successful execution of the following command, you'll create a custom domain name for your Speech resource. Remember that this name cannot be changed.
az cognitiveservices account update --name my-speech-resource-name --resource-group my-resource-group-name --custom-domain my-custom-name
We recommend using the private DNS zone attached to the virtual network with the necessary updates for the private endpoints. You can create a private DNS zone during the provisioning process. If you're using your own DNS server, you might also need to change your DNS configuration.
Decide on a DNS strategy before you provision private endpoints for a production Speech resource. And test your DNS changes, especially if you use your own DNS server.
Use one of the following articles to create private endpoints. These articles use a web app as a sample resource to make available through private endpoints.
- Create a private endpoint by using the Azure portal
- Create a private endpoint by using Azure PowerShell
- Create a private endpoint by using Azure CLI
Use these parameters instead of the parameters in the article that you chose:
Setting | Value |
---|---|
Resource type | Microsoft.CognitiveServices/accounts |
Resource | <your-speech-resource-name> |
Target sub-resource | account |
DNS for private endpoints: Review the general principles of DNS for private endpoints in Cognitive Services resources. Then confirm that your DNS configuration is working correctly by performing the checks described in the following sections.
This check is required.
Follow these steps to test the custom DNS entry from your virtual network:
- Log in to a virtual machine located in the virtual network to which you've attached your private endpoint.
- Open a Windows command prompt or a Bash shell, run nslookup, and confirm that it successfully resolves your resource's custom domain name.
  C:\>nslookup my-private-link-speech.cognitiveservices.azure.com
  Server: UnKnown
  Address: 168.63.129.16
  Non-authoritative answer:
  Name: my-private-link-speech.privatelink.cognitiveservices.azure.com
  Address: 172.28.0.10
  Aliases: my-private-link-speech.cognitiveservices.azure.com
- Confirm that the IP address matches the IP address of your private endpoint.
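The comparison in the last step can also be scripted. The following Python sketch uses only the standard library: it resolves a host name with the system resolver and compares the result with the private endpoint address you expect. The helper names are hypothetical, and the sample addresses are the ones from the nslookup outputs in this article. Run `resolves_to` from a machine inside the virtual network; it is defined here but not called, because it needs live DNS.

```python
import ipaddress
import socket


def addresses_match(resolved: str, expected: str) -> bool:
    """Compare two IP addresses after normalizing their textual form."""
    return ipaddress.ip_address(resolved) == ipaddress.ip_address(expected)


def is_private_address(address: str) -> bool:
    """True when the address belongs to a private (non-internet) range."""
    return ipaddress.ip_address(address).is_private


def resolves_to(host_name: str, expected_ip: str) -> bool:
    """Resolve host_name (IPv4) and check it against the expected endpoint IP."""
    return addresses_match(socket.gethostbyname(host_name), expected_ip)


# Offline checks using the addresses from the sample outputs.
print(is_private_address("172.28.0.10"))
print(addresses_match("172.28.0.10", "172.28.0.10"))
```

A private-range result alone doesn't prove that the name resolves to your endpoint, which is why the sketch compares against the specific address as the step describes.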
Perform this check only if you've turned on either the All networks option or the Selected Networks and Private Endpoints access option in the Networking section of your resource.
If you plan to access the resource by using only a private endpoint, you can skip this section.
- Log in to a computer attached to a network that's allowed to access the resource.
- Open a Windows command prompt or Bash shell, run nslookup, and confirm that it successfully resolves your resource's custom domain name.
  C:\>nslookup my-private-link-speech.cognitiveservices.azure.com
  Server: UnKnown
  Address: fe80::1
  Non-authoritative answer:
  Name: vnetproxyv1-weu-prod.westeurope.cloudapp.azure.com
  Address: 13.69.67.71
  Aliases: my-private-link-speech.cognitiveservices.azure.com
  my-private-link-speech.privatelink.cognitiveservices.azure.com
  westeurope.prod.vnet.cog.trafficmanager.net
Note
The resolved IP address points to a virtual network proxy endpoint, which dispatches the network traffic to the private endpoint for the Cognitive Services resource. The behavior will be different for a resource with a custom domain name but without private endpoints. See this section for details.
A Speech resource with a custom domain interacts with Speech Services in a different way. This is true for a custom-domain-enabled Speech resource both with and without private endpoints. Information in this section applies to both scenarios.
Follow instructions in this section to adjust existing applications and solutions to use a Speech resource with a custom domain name and a private endpoint turned on.
A Speech resource with a custom domain name and a private endpoint turned on uses a different way to interact with Speech Services. This section explains how to use such a resource with the Speech Services REST APIs and the Speech SDK.
Note
A Speech resource without private endpoints that uses a custom domain name also has a special way of interacting with Speech Services. This way differs from the scenario of a Speech resource that uses a private endpoint. This is important to consider because you may decide to remove private endpoints later. See Adjust an application to use a Speech resource without private endpoints later in this article.
We'll use my-private-link-speech.cognitiveservices.azure.com as a sample Speech resource DNS name (custom domain) for this section.
Speech service has REST APIs for Speech-to-text and Text-to-speech. Consider the following information for the private-endpoint-enabled scenario.
Speech-to-text has two REST APIs. Each API serves a different purpose, uses different endpoints, and requires a different approach when you're using it in the private-endpoint-enabled scenario.
The Speech-to-text REST APIs are:
- Speech-to-text REST API v3.0, which is used for Batch transcription and Custom Speech. v3.0 is a successor of v2.0
- Speech-to-text REST API for short audio, which is used for online transcription
Usage of the Speech-to-text REST API for short audio and the Text-to-speech REST API in the private endpoint scenario is the same. It's equivalent to the Speech SDK case described later in this article.
Speech-to-text REST API v3.0 uses a different set of endpoints, so it requires a different approach for the private-endpoint-enabled scenario.
The next subsections describe both cases.
Usually, Speech resources use Cognitive Services regional endpoints for communicating with the Speech-to-text REST API v3.0. These endpoints have the following naming format: {region}.api.cognitive.microsoft.com.
This is a sample request URL:
https://westeurope.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions
Note
See this article for Azure Government and Azure China endpoints.
After you turn on a custom domain for a Speech resource (which is necessary for private endpoints), that resource will use the following DNS name pattern for the basic REST API endpoint:
{your custom name}.cognitiveservices.azure.com
That means that in our example, the REST API endpoint name will be:
my-private-link-speech.cognitiveservices.azure.com
And the sample request URL needs to be converted to:
https://my-private-link-speech.cognitiveservices.azure.com/speechtotext/v3.0/transcriptions
This URL should be reachable from the virtual network with the private endpoint attached, provided that DNS resolution is configured correctly.
After you turn on a custom domain name for a Speech resource, you typically replace the host name in all request URLs with the new custom domain host name. All other parts of the request (like the path /speechtotext/v3.0/transcriptions in the earlier example) remain the same.
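As an illustration of that host-name swap, here is a minimal Python sketch built on the standard `urllib.parse` module. The function name `use_custom_domain` is an assumption made for this example; the URLs are the sample values used in this section.

```python
from urllib.parse import urlsplit, urlunsplit


def use_custom_domain(request_url: str, custom_host: str) -> str:
    """Replace only the host name; scheme, path, and query stay untouched."""
    parts = urlsplit(request_url)
    return urlunsplit(parts._replace(netloc=custom_host))


regional = "https://westeurope.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions"
custom = use_custom_domain(
    regional, "my-private-link-speech.cognitiveservices.azure.com"
)
print(custom)
```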
Tip
Some customers develop applications that use the region part of the regional endpoint's DNS name (for example, to send the request to the Speech resource deployed in the particular Azure region).
A custom domain for a Speech resource contains no information about the region where the resource is deployed. So the application logic described earlier will not work and needs to be altered.
The Speech-to-text REST API for short audio and the Text-to-speech REST API use two types of endpoints:
- Cognitive Services regional endpoints for communicating with the Cognitive Services REST API to obtain an authorization token
- Special endpoints for all other operations
Note
See this article for Azure Government and Azure China endpoints.
The detailed description of the special endpoints and how their URL should be transformed for a private-endpoint-enabled Speech resource is provided in this subsection about usage with the Speech SDK. The same principle described for the SDK applies for the Speech-to-text REST API for short audio and the Text-to-speech REST API.
Get familiar with the material in the subsection mentioned in the previous paragraph and see the following example. The example describes the Text-to-speech REST API. Usage of the Speech-to-text REST API for short audio is fully equivalent.
Note
When you're using the Speech-to-text REST API for short audio and Text-to-speech REST API in private endpoint scenarios, use a subscription key passed through the Ocp-Apim-Subscription-Key header. (See details for Speech-to-text REST API for short audio and Text-to-speech REST API.)
Using an authorization token and passing it to the special endpoint via the Authorization header will work only if you've turned on the All networks access option in the Networking section of your Speech resource. In other cases, you'll get either a Forbidden or a BadRequest error when trying to obtain an authorization token.
Text-to-speech REST API usage example
We'll use West Europe as a sample Azure region and my-private-link-speech.cognitiveservices.azure.com as a sample Speech resource DNS name (custom domain). The custom domain name my-private-link-speech.cognitiveservices.azure.com in our example belongs to the Speech resource created in the West Europe region.
To get the list of the voices supported in the region, perform the following request:
https://westeurope.tts.speech.microsoft.com/cognitiveservices/voices/list
See more details in the Text-to-speech REST API documentation.
For the private-endpoint-enabled Speech resource, the endpoint URL for the same operation needs to be modified. The same request will look like this:
https://my-private-link-speech.cognitiveservices.azure.com/tts/cognitiveservices/voices/list
See a detailed explanation in the Construct endpoint URL subsection for the Speech SDK.
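The following Python sketch puts the two points above together: the modified voices-list URL and the subscription key passed in the Ocp-Apim-Subscription-Key header. It only prepares the URL and headers and doesn't send a request. The function name `voices_list_request` and the placeholder key are assumptions for illustration.

```python
def voices_list_request(custom_host: str, subscription_key: str):
    """Return (url, headers) for the voices-list call on a private-endpoint-enabled resource."""
    # The "tts" offering moves from the host name into the URL path.
    url = f"https://{custom_host}/tts/cognitiveservices/voices/list"
    # Key-based authentication, as recommended in the note above.
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    return url, headers


url, headers = voices_list_request(
    "my-private-link-speech.cognitiveservices.azure.com",
    "YOUR_SUBSCRIPTION_KEY",  # placeholder, not a real key
)
print(url)
```

You could pass the returned values to an HTTP client, for example `urllib.request.Request(url, headers=headers)`, from a machine that can reach the private endpoint.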
Using the Speech SDK with a custom domain name and private-endpoint-enabled Speech resources requires you to review and likely change your application code.
We'll use my-private-link-speech.cognitiveservices.azure.com as a sample Speech resource DNS name (custom domain) for this section.
Usually in SDK scenarios (as well as in the Speech-to-text REST API for short audio and Text-to-speech REST API scenarios), Speech resources use the dedicated regional endpoints for different service offerings. The DNS name format for these endpoints is:
{region}.{speech service offering}.speech.microsoft.com
An example DNS name is:
westeurope.stt.speech.microsoft.com
All possible values for the region (first element of the DNS name) are listed in Speech service supported regions. (See this article for Azure Government and Azure China endpoints.) The following table presents the possible values for the Speech service offering (second element of the DNS name):
DNS name value | Speech service offering |
---|---|
commands | Custom Commands |
convai | Conversation Transcription |
s2s | Speech Translation |
stt | Speech-to-text |
tts | Text-to-speech |
voice | Custom Voice |
So the earlier example (westeurope.stt.speech.microsoft.com) stands for a Speech-to-text endpoint in West Europe.
Private-endpoint-enabled endpoints communicate with Speech service via a special proxy. Because of that, you must change the endpoint connection URLs.
A "standard" endpoint URL looks like:
{region}.{speech service offering}.speech.microsoft.com/{URL path}
A private endpoint URL looks like:
{your custom name}.cognitiveservices.azure.com/{speech service offering}/{URL path}
Example 1. An application is communicating by using the following URL (speech recognition using the base model for US English in West Europe):
wss://westeurope.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US
To use it in the private-endpoint-enabled scenario when the custom domain name of the Speech resource is my-private-link-speech.cognitiveservices.azure.com, you must modify the URL like this:
wss://my-private-link-speech.cognitiveservices.azure.com/stt/speech/recognition/conversation/cognitiveservices/v1?language=en-US
Notice the details:
- The host name westeurope.stt.speech.microsoft.com is replaced by the custom domain host name my-private-link-speech.cognitiveservices.azure.com.
- The second element of the original DNS name (stt) becomes the first element of the URL path and precedes the original path. So the original URL /speech/recognition/conversation/cognitiveservices/v1?language=en-US becomes /stt/speech/recognition/conversation/cognitiveservices/v1?language=en-US.
Example 2. An application uses the following URL to synthesize speech in West Europe by using a custom voice model:
https://westeurope.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId=974481cc-b769-4b29-af70-2fb557b897c4
The following equivalent URL uses a private endpoint, where the custom domain name of the Speech resource is my-private-link-speech.cognitiveservices.azure.com:
https://my-private-link-speech.cognitiveservices.azure.com/voice/cognitiveservices/v1?deploymentId=974481cc-b769-4b29-af70-2fb557b897c4
The same principle in Example 1 is applied, but the key element this time is voice.
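The transformation in both examples is mechanical, so it can be sketched as a small Python function. This is an illustration under stated assumptions, not an official utility: the name `to_private_endpoint_url` is hypothetical, and the function accepts only host names of the form {region}.{speech service offering}.speech.microsoft.com described above.

```python
from urllib.parse import urlsplit, urlunsplit

STANDARD_SUFFIX = ".speech.microsoft.com"


def to_private_endpoint_url(standard_url: str, custom_host: str) -> str:
    """Rewrite a standard Speech endpoint URL for a private-endpoint-enabled resource."""
    parts = urlsplit(standard_url)
    host = parts.hostname or ""
    if not host.endswith(STANDARD_SUFFIX):
        raise ValueError(f"Not a standard Speech endpoint host: {host!r}")
    labels = host[: -len(STANDARD_SUFFIX)].split(".")
    if len(labels) != 2:
        raise ValueError(f"Expected '<region>.<offering>' in host: {host!r}")
    _region, offering = labels
    # The offering (for example "stt") becomes the first element of the path.
    new_path = "/" + offering + parts.path
    return urlunsplit(parts._replace(netloc=custom_host, path=new_path))


CUSTOM_HOST = "my-private-link-speech.cognitiveservices.azure.com"
example_1 = to_private_endpoint_url(
    "wss://westeurope.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US",
    CUSTOM_HOST,
)
print(example_1)
```

The query string is carried over unchanged, and the region is discarded because the custom domain name carries no region information.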
Follow these steps to modify your code:
- Determine the application endpoint URL:
  - Turn on logging for your application and run it to log activity.
  - In the log file, search for SPEECH-ConnectionUrl. In matching lines, the value parameter contains the full URL that your application used to reach Speech Services.
  Example:
  (114917): 41ms SPX_DBG_TRACE_VERBOSE: property_bag_impl.cpp:138 ISpxPropertyBagImpl::LogPropertyAndValue: this=0x0000028FE4809D78; name='SPEECH-ConnectionUrl'; value='wss://westeurope.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?traffictype=spx&language=en-US'
  So the URL that the application used in this example is:
  wss://westeurope.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US
- Create a SpeechConfig instance by using a full endpoint URL:
  - Modify the endpoint that you just determined, as described in the earlier Construct endpoint URL section.
  - Modify how you create the instance of SpeechConfig. Most likely, your application is using something like this:
    var config = SpeechConfig.FromSubscription(subscriptionKey, azureRegion);
    This won't work for a private-endpoint-enabled Speech resource because of the host name and URL changes that we described in the previous sections. If you try to run your existing application without any modifications by using the key of a private-endpoint-enabled resource, you'll get an authentication error (401).
    To make it work, modify how you instantiate the SpeechConfig class and use "from endpoint"/"with endpoint" initialization. Suppose we have the following two variables defined:
    - subscriptionKey contains the key of the private-endpoint-enabled Speech resource.
    - endPoint contains the full modified endpoint URL (using the type required by the corresponding programming language). In our example, this variable should contain:
      wss://my-private-link-speech.cognitiveservices.azure.com/stt/speech/recognition/conversation/cognitiveservices/v1?language=en-US
    Create a SpeechConfig instance:
    C#:
    var config = SpeechConfig.FromEndpoint(endPoint, subscriptionKey);
    C++:
    auto config = SpeechConfig::FromEndpoint(endPoint, subscriptionKey);
    Java:
    SpeechConfig config = SpeechConfig.fromEndpoint(endPoint, subscriptionKey);
    Python:
    import azure.cognitiveservices.speech as speechsdk
    speech_config = speechsdk.SpeechConfig(endpoint=endPoint, subscription=subscriptionKey)
    Objective-C:
    SPXSpeechConfiguration *speechConfig = [[SPXSpeechConfiguration alloc] initWithEndpoint:endPoint subscription:subscriptionKey];
Tip
The query parameters specified in the endpoint URI are not changed, even if they're set by other APIs. For example, if the recognition language is defined in the URI as query parameter language=en-US, and is also set to ru-RU via the corresponding property, the language setting in the URI is used. The effective language is then en-US.
Parameters set in the endpoint URI always take precedence. Other APIs can override only parameters that are not specified in the endpoint URI.
After this modification, your application should work with the private-endpoint-enabled Speech resources. We're working on more seamless support of private endpoint scenarios.
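For the log-search step described earlier, a small script can pull the connection URLs out of a Speech SDK log file. The Python sketch below assumes the log lines follow the same `name='SPEECH-ConnectionUrl'; value='...'` layout as the sample line in this article; other SDK versions may format the entry differently. The function name `find_connection_urls` is hypothetical.

```python
import re

# Pattern based on the sample log line in this article (assumed layout).
CONNECTION_URL = re.compile(r"name='SPEECH-ConnectionUrl'; value='([^']*)'")


def find_connection_urls(log_text: str) -> list:
    """Return every SPEECH-ConnectionUrl value found in the log text."""
    return CONNECTION_URL.findall(log_text)


sample_line = (
    "(114917): 41ms SPX_DBG_TRACE_VERBOSE: property_bag_impl.cpp:138 "
    "ISpxPropertyBagImpl::LogPropertyAndValue: this=0x0000028FE4809D78; "
    "name='SPEECH-ConnectionUrl'; "
    "value='wss://westeurope.stt.speech.microsoft.com/speech/recognition/"
    "conversation/cognitiveservices/v1?traffictype=spx&language=en-US'"
)
print(find_connection_urls(sample_line))
```

To scan a real log, read the file into a string and pass it to the function; each returned URL is a candidate for the endpoint rewrite described in the Construct endpoint URL section.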
In this article, we've pointed out several times that enabling a custom domain for a Speech resource is irreversible. Such a resource will use a different way of communicating with Speech service, compared to the ones that are using regional endpoint names.
This section explains how to use a Speech resource with a custom domain name but without any private endpoints with the Speech Services REST APIs and Speech SDK. This might be a resource that was once used in a private endpoint scenario, but then had its private endpoints deleted.
Remember how a custom domain DNS name of the private-endpoint-enabled Speech resource is resolved from public networks. In this case, the IP address resolved points to a proxy endpoint for a virtual network. That endpoint is used for dispatching the network traffic to the private-endpoint-enabled Cognitive Services resource.
However, when all resource private endpoints are removed (or right after the enabling of the custom domain name), the CNAME record of the Speech resource is reprovisioned. It now points to the IP address of the corresponding Cognitive Services regional endpoint.
So the output of the nslookup command will look like this:
C:\>nslookup my-private-link-speech.cognitiveservices.azure.com
Server: UnKnown
Address: fe80::1
Non-authoritative answer:
Name: apimgmthskquihpkz6d90kmhvnabrx3ms3pdubscpdfk1tsx3a.cloudapp.net
Address: 13.93.122.1
Aliases: my-private-link-speech.cognitiveservices.azure.com
westeurope.api.cognitive.microsoft.com
cognitiveweprod.trafficmanager.net
cognitiveweprod.azure-api.net
apimgmttmdjylckcx6clmh2isu2wr38uqzm63s8n4ub2y3e6xs.trafficmanager.net
cognitiveweprod-westeurope-01.regional.azure-api.net
Compare it with the output from this section.
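One way to tell the two situations apart in a script is to look for a privatelink name in the resolution chain. The Python sketch below is a heuristic based only on the sample nslookup outputs in this article, where the private-endpoint-enabled resource shows a `privatelink` name and the resource without private endpoints doesn't. The helper names are hypothetical, and the alias list returned by `socket.gethostbyname_ex` depends on your resolver, so treat the result as a hint rather than proof.

```python
import socket


def mentions_private_link(names) -> bool:
    """True if any name in the resolution chain contains a privatelink label."""
    return any(".privatelink." in name.lower() for name in names)


def resolved_names(host_name: str) -> list:
    """Canonical name plus aliases reported by the system resolver (needs DNS)."""
    canonical, aliases, _addresses = socket.gethostbyname_ex(host_name)
    return [canonical, *aliases]


# Names copied from the sample outputs in this article.
with_private_endpoint = [
    "vnetproxyv1-weu-prod.westeurope.cloudapp.azure.com",
    "my-private-link-speech.cognitiveservices.azure.com",
    "my-private-link-speech.privatelink.cognitiveservices.azure.com",
    "westeurope.prod.vnet.cog.trafficmanager.net",
]
without_private_endpoint = [
    "apimgmthskquihpkz6d90kmhvnabrx3ms3pdubscpdfk1tsx3a.cloudapp.net",
    "my-private-link-speech.cognitiveservices.azure.com",
    "westeurope.api.cognitive.microsoft.com",
    "cognitiveweprod.trafficmanager.net",
]
print(mentions_private_link(with_private_endpoint))
print(mentions_private_link(without_private_endpoint))
```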
Speech-to-text REST API v3.0 usage is fully equivalent to the case of private-endpoint-enabled Speech resources.
In this case, usage of the Speech-to-text REST API for short audio and usage of the Text-to-speech REST API have no differences from the general case, with one exception. (See the following note.) You should use both APIs as described in the speech-to-text REST API for short audio and Text-to-speech REST API documentation.
Note
When you're using the Speech-to-text REST API for short audio and Text-to-speech REST API in custom domain scenarios, use a subscription key passed through the Ocp-Apim-Subscription-Key header. (See details for Speech-to-text REST API for short audio and Text-to-speech REST API.)
Using an authorization token and passing it to the special endpoint via the Authorization header will work only if you've turned on the All networks access option in the Networking section of your Speech resource. In other cases, you'll get either a Forbidden or a BadRequest error when trying to obtain an authorization token.
Using the Speech SDK with custom-domain-enabled Speech resources without private endpoints is equivalent to the general case as described in the Speech SDK documentation.
If you modified your code to work with a private-endpoint-enabled Speech resource, consider the following.
In the section on private-endpoint-enabled Speech resources, we explained how to determine the endpoint URL, modify it, and make it work through "from endpoint"/"with endpoint" initialization of the SpeechConfig class instance.
However, if you try to run the same application after having all private endpoints removed (allowing some time for the corresponding DNS record reprovisioning), you'll get a Not Found error (404). The reason is that the DNS record now points to the regional Cognitive Services endpoint instead of the virtual network proxy, and URL paths like /stt/speech/recognition/conversation/cognitiveservices/v1?language=en-US won't be found there.
You need to roll back your application to the standard instantiation of SpeechConfig in the style of the following code:
var config = SpeechConfig.FromSubscription(subscriptionKey, azureRegion);
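If the same application has to run against resources both with and without private endpoints, one option is to pick the initialization style from configuration. The Python sketch below only builds the keyword arguments; the helper name `speech_config_kwargs` is an assumption, and the commented-out last line shows how the result might be passed to the Speech SDK for Python, which this sketch doesn't import.

```python
def speech_config_kwargs(subscription_key, *, region=None, endpoint=None):
    """Choose SpeechConfig arguments for the current resource setup.

    Pass endpoint for a private-endpoint-enabled resource, or region for a
    resource without private endpoints (with or without a custom domain).
    """
    if endpoint is not None:
        # "From endpoint" initialization with the full modified URL.
        return {"subscription": subscription_key, "endpoint": endpoint}
    if region is None:
        raise ValueError("Provide either region or endpoint")
    # Standard initialization by key and region.
    return {"subscription": subscription_key, "region": region}


kwargs = speech_config_kwargs("YOUR_SUBSCRIPTION_KEY", region="westeurope")
print(kwargs)

# With the Speech SDK installed, the arguments could be used like this:
# speech_config = speechsdk.SpeechConfig(**kwargs)
```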
For pricing details, see Azure Private Link pricing.