Read-Only RA-GRS Repository #90

Closed
pickypg opened this issue Jun 1, 2015 · 3 comments

Comments

@pickypg
Member

pickypg commented Jun 1, 2015

Azure offers geo-redundant storage, which asynchronously replicates a storage account's data from one region to another. The variant that also allows reads from the secondary region is called RA-GRS. Quoting the linked Azure documentation:

When you enable read-only access to your data in the secondary region, your data is available on a secondary endpoint, in addition to the primary endpoint for your storage account. The secondary endpoint is similar to the primary endpoint, but appends the suffix -secondary to the account name. For example, if your primary endpoint for the Blob service is myaccount.blob.core.windows.net, then your secondary endpoint is myaccount-secondary.blob.core.windows.net. The access keys for your storage account are the same for both the primary and secondary endpoints.
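To make the naming rule in the quote concrete, here is a minimal sketch that derives the secondary host from a primary one. The function name `secondary_blob_endpoint` is hypothetical and only illustrates the "-secondary" suffix rule; it does not call any Azure API.

```python
def secondary_blob_endpoint(primary_host: str) -> str:
    """Return the RA-GRS secondary host for a primary storage host.

    Follows the rule quoted above: the account name (the first DNS label)
    gets the suffix "-secondary"; the rest of the host is unchanged.
    """
    account, sep, rest = primary_host.partition(".")
    if not sep or not account or not rest:
        raise ValueError(f"not a storage endpoint host: {primary_host!r}")
    if account.endswith("-secondary"):
        # Already a secondary host; leave it alone.
        return primary_host
    return f"{account}-secondary.{rest}"


print(secondary_blob_endpoint("myaccount.blob.core.windows.net"))
# prints "myaccount-secondary.blob.core.windows.net"
```

The access keys are the same for both hosts, so only the host name changes.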

It would be great if a snapshot repository could be supported this way. It works pretty simply: when you create the original repository in your local region, Azure automatically exposes the same data on a read-only secondary endpoint with the same account name plus "-secondary". The problem is that, as far as I know, there is no way to know when all of the copying has completed, which makes any restore somewhat risky. The secondary is also read-only, which naturally prevents the reading cluster from writing new snapshots, but that's not its purpose.

pickypg changed the title from "Read-Only Repository" to "Read-Only RA-GRS Repository" on Jun 1, 2015
@craigwi

craigwi commented Jun 1, 2015

Our use case for this is an active / passive pair of clusters in which the active cluster snapshots periodically and the passive cluster restores those snapshots as soon as they are ready. We use a custom marker file to know when a snapshot is done. The biggest cost here is copying the snapshot files from the storage account in the active cluster to the storage account in the passive cluster. We do the copy to ensure that in a super disaster we have a backup in the passive datacenter, and to avoid restoring from a storage account on the other side of the country.

As indicated by this feature request, we use read-access geo-redundant storage, so Azure asynchronously copies the files as they are written in the active cluster. Also, the passive cluster can become the active cluster at pretty much any time during failover.

For this to be most useful, the cloud azure plugin needs to support two storage account + key pairs: one used when the cluster is active, to write snapshots, and one used when the cluster is passive, to read from the remote -secondary account. When the cluster role switches, the usage of the storage accounts switches too.

This means that the use of the secondary endpoint needs to be specified when the snapshot repository is set; for example:

{
  "type" : "azure",
  "settings" : {
    "container" : "<snapshotContainerName>",
    "compress" : "true",
    "secondary" : "true"
  }
}
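
As a rough sketch of how the role switch could drive that request body: the helper below builds the repository definition for either role. The `secondary` setting is the one proposed in this comment, not a confirmed plugin option, and `repository_body`, `is_active`, and the container name `snapshots` are hypothetical names used for illustration only.

```python
import json


def repository_body(container: str, is_active: bool) -> dict:
    """Build the snapshot repository definition for the cluster's current role.

    Active cluster: writes snapshots to the primary endpoint.
    Passive cluster: reads from the -secondary endpoint, requested through
    the proposed (assumed) "secondary" setting.
    """
    settings = {"container": container, "compress": "true"}
    if not is_active:
        settings["secondary"] = "true"
    return {"type": "azure", "settings": settings}


# Definition the passive cluster would register for its repository.
print(json.dumps(repository_body("snapshots", is_active=False), indent=2))
```

On failover, the cluster that becomes active would re-register its repository with `is_active=True`, and the other cluster would do the reverse.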

Thanks for considering this,
Craig.

@craigwi

craigwi commented Jun 15, 2015

I built a prototype of this which works well, at least for our needs. I sent the code to our ES support contact Chris (@pickypg). You can reach me directly at craigwi@microsoft.com if you have any questions.

@dadoonet
Member

I believe this has been implemented now with elastic/elasticsearch#13779

Closing.
