For Azure, support multiple storage accounts and secondary endpoints (which are readonly) #13228
…to ES master.

Enhancements as discussed on https://github.com/craigwi/elasticsearch-cloud-azure/releases/tag/v2.7.1-craigwi:

1. supports multiple storage accounts; cloud.azure.storage.account and cloud.azure.storage.key may now be an array of accounts / keys; the arrays must be the same length; the account at an index must match the key at the same index.
2. an Azure repository specification ("type" : "azure") allows for two new settings. "account" specifies the name of the account to be used and must be one of the items in cloud.azure.storage.account. If "account" is not specified, the first item in the list of accounts is used. The other new setting "location_mode" may be used to specify the endpoint. This defaults to "primary_only" and may also be "primary_then_secondary", "secondary_only" or "secondary_then_primary".
3. when a repository is registered using "secondary_only" or "secondary_then_primary" as the "location_mode", the verification of the repository is limited to checking that the container specified exists; in particular the tests-* files are not created because the secondary endpoint is read only.

NOTE: for a given storage account, only one location_mode can be active at a time.

An example showing settings in elasticsearch.yml:

    cloud.azure.storage.account: [ "azstorageaccount1", "mystorage2" ]
    cloud.azure.storage.key: [ "", "" ]

A sample repository specification using the secondary endpoint:

    {
      "type": "azure",
      "settings": {
        "account" : "mystorage2",
        "container": "snapshots-20150701",
        "location_mode": "secondary_only"
      }
    }
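The pairing and lookup rules in the description above (arrays of equal length, account matched to key by index, first account used when "account" is omitted) can be sketched as follows. This is only an illustration in Python, not the plugin's Java code; the function names `pair_accounts` and `select_account` and the key values are hypothetical.

```python
def pair_accounts(accounts, keys):
    """Zip the parallel account/key arrays into ordered (account, key) pairs."""
    if len(accounts) != len(keys):
        raise ValueError(
            "cloud.azure.storage.account and cloud.azure.storage.key "
            "must have the same length"
        )
    if not accounts:
        raise ValueError("at least one storage account is required")
    return list(zip(accounts, keys))


def select_account(pairs, requested=None):
    """Pick the pair named by a repository's "account" setting.

    When no account is requested, the first configured account is used.
    """
    if requested is None:
        return pairs[0]
    for account, key in pairs:
        if account == requested:
            return account, key
    raise ValueError(f"unknown storage account: {requested!r}")


# Placeholder keys; real keys come from elasticsearch.yml.
pairs = pair_accounts(["azstorageaccount1", "mystorage2"], ["key1", "key2"])
print(select_account(pairs))                # first account is the default
print(select_account(pairs, "mystorage2"))  # explicit "account" setting
```

Rejecting mismatched array lengths at startup keeps an account from silently being paired with the wrong key.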
Hi @craigwi. Thank you for bringing that PR here. As I wrote in elastic/elasticsearch-cloud-azure#93 (comment), I think we should do it a bit differently unless I'm missing something. Pasting the discussion here:

Note that you also raised a valid point which is that we need to support in

Let's say that we can now create something like:

    cloud:
        azure:
            storage:
                azure1:
                    account: your_azure_storage_account1
                    key: your_azure_storage_key1
                    default: true
                azure2:
                    account: your_azure_storage_account2
                    key: your_azure_storage_key2
                azure3:
                    account: your_azure_storage_account3
                    key: your_azure_storage_key3

Then when we create the repo, we can specify which credentials we want to use:
I know that we need to make one of those repo

I think that with this you'll be able to define something really similar to what you are trying to achieve here.

elasticsearch.yml

Instead of:

    cloud.azure.storage.account: [ "azstorageaccount1", "mystorage2" ]
    cloud.azure.storage.key: [ "", "" ]

define:

    cloud.azure.storage.azure1.account: "azstorageaccount1"
    cloud.azure.storage.azure1.key: ""
    cloud.azure.storage.azure2.account: "mystorage2"
    cloud.azure.storage.azure2.key: ""

Usage

Instead of:

    PUT _snapshot/myrepo
    {
      "type": "azure",
      "settings": {
        "account" : "mystorage2",
        "container": "snapshots-20150701",
        "location_mode": "secondary_only"
      }
    }

Define:

    PUT _snapshot/myrepo
    {
      "type": "azure",
      "settings": {
        "credentials" : "azure2",
        "container": "snapshots-20150701",
        "readonly": true,
        "location_mode": "secondary_only"
      }
    }

That said, I think we can easily auto-detect that we are using the secondary endpoint here so we automatically set

I'm also wondering if we should not prefer an easier setting like

While reading the Azure Storage Replication documentation, I was also wondering if we really need this flag?
If I understand it correctly, it means to me that the azure client basically adds

In that case, would something like the following work?

    cloud.azure.storage.azure1.account: "myaccount"
    cloud.azure.storage.azure1.key: ""
    cloud.azure.storage.azure2.account: "myaccount-secondary"
    cloud.azure.storage.azure2.key: ""

    PUT _snapshot/myrepo
    {
      "type": "azure",
      "settings": {
        "credentials" : "azure2",
        "container": "snapshots-20150701",
        "readonly": true
      }
    }

I did not check. May be

What do you think? @imotov @skearns64 @ppf2 Feel free to add your thoughts here as well!
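The named-credentials idea from this comment can be sketched as a lookup keyed by the entry name (`azure1`, `azure2`, ...), falling back to the entry flagged `default: true`. This is an illustrative Python sketch only; `resolve_credentials` is a hypothetical name, and requiring exactly one default entry is an assumption, not something the comment states.

```python
def resolve_credentials(storage, name=None):
    """Return the credential entry a repository should use.

    storage maps an entry name to a dict with "account", "key" and an
    optional "default" flag, mirroring the proposed elasticsearch.yml layout.
    """
    if name is not None:
        if name not in storage:
            raise KeyError(f"no credentials named {name!r}")
        return storage[name]
    # Assumption: exactly one entry carries default: true.
    defaults = [entry for entry in storage.values() if entry.get("default")]
    if len(defaults) != 1:
        raise ValueError("exactly one credentials entry must be marked default")
    return defaults[0]


storage = {
    "azure1": {"account": "your_azure_storage_account1",
               "key": "your_azure_storage_key1", "default": True},
    "azure2": {"account": "your_azure_storage_account2",
               "key": "your_azure_storage_key2"},
}
print(resolve_credentials(storage)["account"])            # default entry
print(resolve_credentials(storage, "azure2")["account"])  # "credentials": "azure2"
```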
I think this answers my last question. The endpoint is different but the account must be kept as is, right?
Hi David,

To support those combinations, which I see no reason not to support, one passes the account AND the location mode. The client library uses the secondary endpoint as required. That is, the concept in Azure is ONE storage account and key, with the endpoints derived indirectly from the use case (and potentially other settings).

My conclusion on this: location_mode, as I have implemented it, is the correct way in Azure to use primary and secondary endpoints. The Java client library supports this and my proposed solution supports this.

Regarding the "readonly" support for repositories, I like the feature! While secondary endpoints in Azure are NECESSARILY readonly, we should also be able to specify that a primary endpoint is to be accessed readonly. That is, the concepts of location_mode and readonly-ness are mostly orthogonal. As noted elsewhere, location mode cases #2, #3 and #4 above are implicitly readonly and should be treated as if "readonly": true was set.

My conclusion on this: the new "readonly" setting does not eliminate the need for "location_mode".

As for the configuration in yml, independent of the above points, I am fine with either the azure1, azure2 approach or the arrays approach. I actually started with the approach you suggested, but found the array approach more like the rest of the settings in the yml file and super simple to implement. It is clear that in general there might be lots of settings per storage account; cf. https://azure.microsoft.com/en-us/documentation/articles/storage-configure-connection-string. However, it is extremely rare that one would set them differently for different storage accounts in one deployment of ES. Thus it would be reasonable to set, for example, the blob endpoint once for use with all storage accounts.

My conclusion on this: either approach is fine.

Let me know what you think.

Craig.
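To illustrate the point that one account name plus a location mode is enough to determine the endpoints, here is a rough Python sketch. It assumes the default public Azure blob host pattern, where the read-only secondary of a read-access geo-redundant account is reached by appending `-secondary` to the account name; in practice the client library does this derivation itself. `MODE_ORDER` and `blob_endpoints` are hypothetical names.

```python
# Order in which each location mode tries the two locations.
MODE_ORDER = {
    "primary_only": ("primary",),
    "primary_then_secondary": ("primary", "secondary"),
    "secondary_only": ("secondary",),
    "secondary_then_primary": ("secondary", "primary"),
}


def blob_endpoints(account, location_mode="primary_only"):
    """List the blob endpoints for an account, in the order they are tried.

    Assumes the default public-cloud host names; custom endpoints are ignored.
    """
    if location_mode not in MODE_ORDER:
        raise ValueError(f"unknown location_mode: {location_mode!r}")
    hosts = {
        "primary": f"https://{account}.blob.core.windows.net",
        "secondary": f"https://{account}-secondary.blob.core.windows.net",
    }
    return [hosts[location] for location in MODE_ORDER[location_mode]]


print(blob_endpoints("mystorage2", "secondary_only"))
```

The account and key stay the same in every mode; only the host changes, which is why a separate "myaccount-secondary" credentials entry is not needed.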
Hi Craig, spoke with @dadoonet and @skearns64, we will take the PR from here and modify as needed. Thx for the contribution!
Sounds good. Thanks for letting me know.
I'm closing this one in favor of #13779
Follow up for elastic#13228.

This commit adds support for a secondary storage account:

```yml
cloud:
    azure:
        storage:
            my_account1:
                account: your_azure_storage_account1
                key: your_azure_storage_key1
                default: true
            my_account2:
                account: your_azure_storage_account2
                key: your_azure_storage_key2
```

When creating a repository, you can choose which azure account you want to use for it:

```sh
curl -XPUT localhost:9200/_snapshot/my_backup1?pretty -d '{
    "type": "azure"
}'

curl -XPUT localhost:9200/_snapshot/my_backup2?pretty -d '{
    "type": "azure",
    "settings": {
        "account" : "my_account2",
        "location_mode": "secondary_only"
    }
}'
```

`location_mode` supports `primary_only` or `secondary_only`. Defaults to `primary_only`. Note that if you set it to `secondary_only`, it will force `read_only` to true.
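The `location_mode` rule stated in this commit message (two supported values, `primary_only` as the default, `secondary_only` forcing `read_only`) can be sketched as below. This is a Python illustration of the described behaviour, not the plugin's implementation; `effective_settings` is a hypothetical name.

```python
SUPPORTED_MODES = ("primary_only", "secondary_only")


def effective_settings(settings):
    """Apply the location_mode default and the read_only rule to repository settings."""
    result = dict(settings)  # leave the caller's dict untouched
    mode = result.setdefault("location_mode", "primary_only")
    if mode not in SUPPORTED_MODES:
        raise ValueError(f"unsupported location_mode: {mode!r}")
    if mode == "secondary_only":
        # The secondary endpoint cannot be written to, so read_only is forced on.
        result["read_only"] = True
    return result


print(effective_settings({"account": "my_account2",
                          "location_mode": "secondary_only"}))
```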
Follow up for #13228.

This commit adds support for a secondary storage account:

```yml
cloud:
    azure:
        storage:
            my_account1:
                account: your_azure_storage_account1
                key: your_azure_storage_key1
                default: true
            my_account2:
                account: your_azure_storage_account2
                key: your_azure_storage_key2
```

When creating a repository, you can choose which azure account you want to use for it:

```sh
curl -XPUT localhost:9200/_snapshot/my_backup1?pretty -d '{
    "type": "azure"
}'

curl -XPUT localhost:9200/_snapshot/my_backup2?pretty -d '{
    "type": "azure",
    "settings": {
        "account" : "my_account2",
        "location_mode": "secondary_only"
    }
}'
```

`location_mode` supports `primary_only` or `secondary_only`. Defaults to `primary_only`. Note that if you set it to `secondary_only`, it will force `read_only` to true.

(cherry picked from commit 79a4d9c)

# Conflicts:
#   docs/plugins/repository-azure.asciidoc
#   plugins/cloud-azure/src/main/java/org/elasticsearch/plugin/cloud/azure/CloudAzurePlugin.java
#   plugins/cloud-azure/src/test/java/org/elasticsearch/repositories/azure/AzureSnapshotRestoreTests.java
#   plugins/repository-azure/src/main/java/org/elasticsearch/cloud/azure/AzureRepositoryModule.java