Forbid updating a repository bucket/location if there are mounted searchable snapshots using it #89696
Pinging @elastic/es-distributed (Team:Distributed)
That sounds like the desired behaviour? The repository UUID identifies the repository in terms of its contents, so if you change the repository contents then it deserves a new UUID. Really the same thing should happen if you swap the repository out from under ES by fiddling with symlinks or mounting a different filesystem too, but we don't have a good way to detect that, so I think we'd report it as a repository corruption.
If we allow it to continue to get a new UUID, what should happen to the mounted searchable snapshots that referred to the old UUID (and may not be valid anymore, since the new repos may have different content)? Is it OK that they become Unassigned, and would we expect users to then unmount/delete them manually?
It's a bit tricky and I don't see a good general solution (i.e. one which covers extrinsic things like adjusting symlinks or mounting a different filesystem). I mean the data really just isn't there any more, so it does seem reasonable to fail the shards and require user intervention to fix things up.

I can see some value in preventing changes to the location of a repository while it contains mounted snapshots, but I can also think of cases where it might be useful to be able to update the location of a repository in this state too. For instance, maybe the user really wants to move the same repository elsewhere in their filesystem? Or for blob-store-like repositories, maybe they want to rename the underlying bucket (not possible in proper S3, but I think you can achieve this in MinIO etc). Also, what do we even mean by "location"? Certainly this includes the filesystem path for FS repositories and the bucket and base path for S3, but maybe also
We could perhaps try and read the repository UUID from the new location and permit the change only if it is the same as the old one. But we wouldn't do that if the user said
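The check proposed above can be sketched roughly as follows. This is a hypothetical model, not the actual `RepositoriesService` code: `read_uuid_at` stands in for reading `RepositoryData` from the new blob-store location, and the `verify` flag stands in for whatever escape hatch would skip the check.

```python
# Sketch of the proposed check: when a repository's location settings
# change, read the UUID stored at the new location and only accept the
# update if it matches the UUID we already know for this repository.

MISSING_UUID = "_na_"  # placeholder for "no RepositoryData at this location"

def can_update_repository(old_uuid, new_location, read_uuid_at, verify=True):
    """Return True if re-registering the repository at new_location should be allowed."""
    if not verify:
        # Escape hatch: the user explicitly skipped verification.
        return True
    new_uuid = read_uuid_at(new_location)
    if new_uuid == MISSING_UUID:
        # Empty/uninitialised location: not the same repository contents.
        return False
    return new_uuid == old_uuid

# Toy blob store mapping locations to the repository UUID found there.
contents = {"bucket-a": "uuid-1234", "bucket-b": "uuid-1234", "bucket-c": MISSING_UUID}
lookup = lambda loc: contents.get(loc, MISSING_UUID)

print(can_update_repository("uuid-1234", "bucket-b", lookup))                # True: same UUID
print(can_update_repository("uuid-1234", "bucket-c", lookup))                # False: empty location
print(can_update_repository("uuid-1234", "bucket-c", lookup, verify=False))  # True: escape hatch
```

Under this sketch, moving the same contents to a new bucket is accepted, while pointing the repository at an empty directory is rejected unless verification is skipped.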
Is this scenario representative? I think it isn't. I would prefer to reject changing the repository settings (we can argue which ones) when there are mounted searchable snapshots; that would make users aware of potential issues with the change.
I agree with @fcofdez that it's less confusing for users to prohibit changing a repository with mounted snapshots. I also agree with @DaveCTurner that there should be a way for somebody to "move" the repository without necessarily needing to unmount all snapshots and then re-mount them. For the moment, in https://github.com/elastic/elasticsearch/blob/v8.4.0/server/src/main/java/org/elasticsearch/repositories/RepositoriesService.java#L216 the
I think this isn't guaranteed - users can change the repository contents out from under us anyway, because they control the blob store/filesystem/whatever.
It won't be that common, but nor do I think it rare enough that we can forbid it entirely. I'm ok with rejecting this kind of thing by default, but I do think we should have an escape hatch. Maybe
Thinking a bit more @fcofdez and @DaveCTurner ... I see that Now about the escape hatch: to me, it means that the user can "move" or "re-register" a repository with different details, and the content is hopefully unchanged, in order to avoid having the user unmount all snapshots and re-mount them. In order to do that, the new repo needs to have the same UUID, else the searchable snapshots will be unassigned and complain. Now, the current behavior of
The use case @DaveCTurner has in mind could fit the first bullet point, since it will attempt to use the same UUID. As long as contents have not changed, it should in theory continue working (even if the user e.g. moved the repo from AWS to GCP). And now I wonder why the 2nd bullet point exists at all? The second bullet point could be done equivalently by deleting the repository and creating a new one with the same name. Thus, my belief is that

What do you think?
That's a good point (aside: we should have an escape hatch to permit unregistering a repository even if it's in use, ignoring the consequences, but that's a separate issue). So let's focus on the case of having the same underlying data (i.e. the same repo UUID) at a different location. I believe we should check it does really have the same UUID by default (i.e. if If
My understanding is that this behaviour is allowed to recover from corrupted repositories.
I'm still not convinced that this operation is common enough to add all this complexity to an already quite complex piece of our code. Currently we don't have a way to clone the repository contents into another bucket safely; I think we've discussed this in the past. But I might be missing some context.
A repository backup would let you do this.
Isn't that equivalently possible by deleting the repository and creating a new one with the same name? So far, I believe having the
In that case we'll be changing the current verify semantics, which is fine I guess. My concern here is that we're somehow incentivising messing around with the underlying repository contents, and that's been problematic in the past; but if we think the benefits outweigh the risks, we can proceed with this approach.
But there's still room for improvement in that area: #54944
I share the concern. On a positive note, what we are discussing is allowing somebody to "move" a repository, but only move it, rather than messing with its contents. We should add this word of caution to the documentation of the PUT API when it's used on an existing repository.
Thanks for the info! But at least repository backup seems to have sufficient wording as to how to manually take a safe backup. |
I meant this section within the PUT /_snapshot/my_repository API docs. |
I see. I think that section is typically for new repositories, rather than for registering on top of an existing repository; I think this is the main difference. For existing repositories, we would extend the PUT API's verification process to check that the UUID/generations are there as well. I don't think we would do anything backwards-incompatible here, since the existing "verify" semantics will be the same, but we would need to extend the documentation to mention the additional check that the PUT API will do on both pages (the one @fcofdez you mentioned, and the PUT API page).
I linked a support case here. I'm +1 on giving administrators some way to change repository details, perhaps with verification, while searchable snapshots are mounted. I'm not sure what updates are allowed today - it seems clear we'd want to permit updating (removing) The location update is also compelling. We provide a repository analyzer API so that folks can try to use searchable snapshots with other S3-like blob stores. It's not unreasonable for an on-prem administrator to migrate data from one device to another and want to point Elasticsearch to the new device with minimal disruption, or maybe when a system goes down. Maybe they want to update from
When an existing repository bucket/location is updated and points to an empty directory, the `RepositoryMetadata` is cleared, which generates a new repository UUID. If there are mounted searchable snapshots using the previous repository, those cannot be allocated since the old data is not available in the new repository location; additionally, the repository UUID in the mounted searchable snapshot is different. The following test reproduces the issue:
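The actual reproducing test was not captured above, but the failure mode it describes can be sketched with a toy model (hypothetical names; this does not reflect the real Elasticsearch classes):

```python
import uuid

# Toy model of the bug: re-registering a repository at an empty location
# clears the metadata and mints a fresh UUID, so mounted searchable
# snapshots that recorded the old UUID can no longer be allocated.

class Repository:
    def __init__(self, location, store):
        self.location = location
        # An empty location has no existing RepositoryData, so a new UUID
        # is generated and recorded for it.
        self.uuid = store.setdefault(location, str(uuid.uuid4()))

def can_allocate(mounted_snapshot_uuid, repo):
    """A mounted searchable snapshot allocates only against its repository UUID."""
    return mounted_snapshot_uuid == repo.uuid

store = {}
repo = Repository("bucket/old-path", store)
mounted_uuid = repo.uuid          # searchable snapshot mounted from this repo

print(can_allocate(mounted_uuid, repo))   # True: UUIDs match

# Update the repository to point at an empty directory: a new UUID appears.
repo = Repository("bucket/new-empty-path", store)
print(can_allocate(mounted_uuid, repo))   # False: the shard stays unassigned
```

In this model, as in the report, the mounted snapshot becomes unallocatable both because the data is absent at the new location and because the recorded UUID no longer matches.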