Cleaning up missing containers on Ozone #4689

xBis7 · 2023-05-09T17:07:09Z

xBis7
May 9, 2023
Collaborator

Ozone does not have sufficient functionality for cleaning up missing containers. Once a container goes missing, it stays in the system forever.

I have been looking into providing a way to address this issue and I have written a document. If anyone is interested in this, please take a look at the doc and put comments on it. I'd appreciate any feedback before finalizing the implementation.

doc: Ozone Missing Containers Cleanup

arp7 · 2023-05-18T22:15:35Z

arp7
May 18, 2023
Collaborator

Thanks for sharing this doc @xBis7

Do you have a jira for this?

6 replies

devmadhuu May 23, 2023
Collaborator

@xBis7 - I just had a quick glance at the doc for missing container cleanup. As I understand it, you are suggesting to take actions from Recon and send commands to SCM for cleaning up missing containers. When SCM or Recon receives container reports from DNs and a container is not mentioned in any of the periodic reports, SCM/Recon removes it from the system. Also, Recon is based on an eventual consistency model, so how will this cleanup command or background service help? Am I missing something here?

xBis7 May 23, 2023
Collaborator Author

@devmadhuu Thanks for reviewing the doc.

you are suggesting to take actions from Recon and send commands to SCM for cleaning up missing containers.

I'm saying that Recon should be able to pick up any changes in the SCM. Container changes usually get reported from the datanodes, but that's not possible when the datanode that has the container is dead. There is one task in Recon that regularly checks the SCM, the ContainerHealthTask, so we can use it to act on the change once it is picked up.

Once the ContainerHealthTask picks up that the container can no longer be found in the SCM, it can remove the container from Recon's tables.

The user will send a command that cleans up OM and SCM, and Recon will pick up the change in SCM and act accordingly.
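A minimal sketch of that reconciliation step, assuming Recon and the SCM each expose a set of container IDs. The class and method names are hypothetical and this is not the actual ContainerHealthTask code; it only shows the "in Recon but no longer in SCM" check.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; this is not Recon's ContainerHealthTask.
class ReconContainerSyncSketch {

    /** IDs that Recon still tracks but the SCM no longer reports. */
    static Set<Long> findStale(Set<Long> reconIds, Set<Long> scmIds) {
        Set<Long> stale = new TreeSet<>(reconIds);
        stale.removeAll(scmIds);
        return stale;
    }

    /** Drops stale IDs from Recon's (mutable) view and returns what was dropped. */
    static Set<Long> reconcile(Set<Long> reconIds, Set<Long> scmIds) {
        Set<Long> stale = findStale(reconIds, scmIds);
        reconIds.removeAll(stale);
        return stale;
    }
}
```

Recon only reacts to what the SCM reports here; it never decides on its own that a container should go away.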

xBis7 May 23, 2023
Collaborator Author

When SCM or Recon receives container reports from DNs and a container is not mentioned in any of the periodic reports, SCM/Recon removes it from the system.

That's true for replicas. The container is not removed completely. We can still find it in Recon-scm.db and in SCM's container state map.

sumitagrawl May 24, 2023
Collaborator

@xBis7
When none of the DNs are available, the replica count for the container will be "0". In this state, no action is taken. IMO, there should not be any automatic action because it may cause data loss. This needs special attention from the cluster admin, who should check and either recover the DNs or clean up the container.

Recon also should not perform any specific action automatically; it should report the container as missing, so that it's visible and someone can take action.

I think the action points below can work:

- From Recon, get all keys referring to the container (an API for this seems to be available in Recon)
- Remove those keys from OM with an already existing CLI or commands
- Delete the container (I think this is available but not exposed through the CLI; it can be exposed and used for the deletion)

Further, we may need to check whether the deleted state of the container gets synced to Recon.

xBis7 May 24, 2023
Collaborator Author

@sumitagrawl Thanks for the comment.

IMO, there should not be any automatic action because it may cause data loss.

I agree.

I think the action points below can work:

- From Recon, get all keys referring to the container (an API for this seems to be available in Recon)
- Remove those keys from OM with an already existing CLI or commands
- Delete the container (I think this is available but not exposed through the CLI; it can be exposed and used for the deletion)

Further, we may need to check whether the deleted state of the container gets synced to Recon.

This is what the doc says. The only difference in the doc is that this should happen as a background process, because it's too time- and resource-consuming for the client to wait for it to finish.

For example, key deletion also happens in a background service (see KeyDeletingService): the client marks the keys for deletion, and the background service picks them up and deletes them asynchronously.

From Recon, get all keys referring to the container (an API for this seems to be available in Recon)

That was the first implementation, but then we decided that we can't trust Recon, so we started getting the keys directly from the OM. We will go back to using Recon and do some testing to make sure it works as expected.

Recon also should not perform any specific action automatically; it should report the container as missing, so that it's visible and someone can take action.

Once the CLI command deletes the container in the SCM, Recon should pick up that the container no longer exists in the SCM. Only then, Recon should take action.

xBis7 · 2023-05-24T10:27:18Z

xBis7
May 24, 2023
Collaborator Author

@devmadhuu @sumitagrawl To sum up the doc, the current idea for cleaning up a missing container is as follows.

There is a CLI command ozone admin om containerCleanup --container-id=

- The user provides a container ID
- The command puts the ID in a missingContainersForCleanup table
- There is a background service that checks the table
- If there is an ID in the table, the service picks it up
- The service
  - gets from Recon all keys for the ID
  - deletes the keys in the OM
  - deletes the container in the SCM
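The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the `ReconClient`, `OmClient` and `ScmClient` interfaces and the class itself are hypothetical stand-ins, and an in-memory set plays the role of the missingContainersForCleanup table.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the proposed flow; none of these types exist in Ozone.
class MissingContainerCleanupSketch {
    interface ReconClient { List<String> keysForContainer(long containerId); }
    interface OmClient { void deleteKey(String key); }
    interface ScmClient { void deleteContainer(long containerId); }

    // In-memory stand-in for the missingContainersForCleanup table.
    private final Set<Long> missingContainersForCleanup = new LinkedHashSet<>();
    private final ReconClient recon;
    private final OmClient om;
    private final ScmClient scm;

    MissingContainerCleanupSketch(ReconClient recon, OmClient om, ScmClient scm) {
        this.recon = recon;
        this.om = om;
        this.scm = scm;
    }

    /** What the CLI command would do: record the ID and return immediately. */
    synchronized void requestCleanup(long containerId) {
        missingContainersForCleanup.add(containerId);
    }

    /** One pass of the background service; returns the IDs it processed. */
    synchronized List<Long> runOnce() {
        List<Long> processed = new ArrayList<>(missingContainersForCleanup);
        for (long containerId : processed) {
            // Get all keys for the ID from Recon, then delete them in the OM.
            for (String key : recon.keysForContainer(containerId)) {
                om.deleteKey(key);
            }
            // Delete the container in the SCM last, then clear the table entry.
            scm.deleteContainer(containerId);
            missingContainersForCleanup.remove(containerId);
        }
        return processed;
    }
}
```

The client only pays for `requestCleanup`; the expensive work happens in `runOnce`, which a scheduler on the server side would call periodically.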

Recon's periodic health task picks up that the unhealthy container no longer exists in the SCM and cleans up the tables.

The health task doesn't see that the container was deleted, only that it can't be found: it is gone entirely from the SCM, and there is a mismatch between Recon and the SCM.

We can't update a container's state when the datanode that has the container isn't available, but we can have an implementation that updates the state just in the SCM instead of deleting the container.

The background service idea is similar to KeyDeletingService. This process takes too long for the client to wait for it to finish, and a lot can change in the meantime. It should happen asynchronously on the server.

Community sync

What I got from the community meeting discussion is that before deleting the data from Recon, we should move it to new tables so that the user can inspect it at a later point in time. This way the data will still be available and the UI will appear normal again.

I also got that we might need to reconsider key deletion. Keys can spread across multiple containers and their remaining blocks might still be recoverable.
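To make the key-deletion concern concrete, here is a small sketch that classifies a key by how many of its containers are missing. The class, enum and method names are hypothetical, and it assumes we can list the container IDs that hold a key's blocks.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not part of Ozone.
class KeyImpactSketch {
    enum Impact { UNAFFECTED, PARTIALLY_LOST, FULLY_LOST }

    /** Classifies a key given the containers holding its blocks and the set of missing containers. */
    static Impact classify(List<Long> containersOfKey, Set<Long> missingContainers) {
        long lost = containersOfKey.stream().filter(missingContainers::contains).count();
        if (lost == 0) {
            return Impact.UNAFFECTED;
        }
        return lost == containersOfKey.size() ? Impact.FULLY_LOST : Impact.PARTIALLY_LOST;
    }
}
```

Only keys in the FULLY_LOST bucket are clearly safe to delete; PARTIALLY_LOST keys still have blocks in healthy containers that might be worth recovering.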

13 replies

xBis7 May 24, 2023
Collaborator Author

@devmadhuu

then did you check if DeadNodeHandler will be removing the container from SCM container store

Yes I have and it doesn't. The container is still there. DeadNodeHandler calls ContainerManager.removeContainerReplica() and removes container replicas but not the container itself.

ContainerManager.deleteContainer() isn't used. We will add tests to verify it works as expected since no one has used it before.
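A toy model of that behaviour, to show why the container lingers. This is not the real ContainerManager or DeadNodeHandler; `ToyContainerStore` and its methods are hypothetical and only mirror the distinction between removing replicas and deleting the container record.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only; the real SCM ContainerManager is far more involved.
class ToyContainerStore {
    // containerId -> datanodes currently holding a replica
    private final Map<Long, Set<String>> replicas = new HashMap<>();

    void addReplica(long containerId, String datanode) {
        replicas.computeIfAbsent(containerId, id -> new HashSet<>()).add(datanode);
    }

    /** What dead-node handling does in this model: drop that node's replicas only. */
    void onDeadNode(String datanode) {
        replicas.values().forEach(nodes -> nodes.remove(datanode));
    }

    boolean containerExists(long containerId) {
        return replicas.containsKey(containerId);
    }

    /** "Missing" here means: still registered, but no replicas left. */
    boolean isMissing(long containerId) {
        Set<String> nodes = replicas.get(containerId);
        return nodes != null && nodes.isEmpty();
    }

    /** Explicit deletion is a separate step that nothing in this model triggers automatically. */
    void deleteContainer(long containerId) {
        replicas.remove(containerId);
    }
}
```

After `onDeadNode`, the container still exists with zero replicas, which is the state the cleanup command is meant to resolve.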

devmadhuu May 24, 2023
Collaborator

@devmadhuu

then did you check if DeadNodeHandler will be removing the container from SCM container store

Yes I have and it doesn't. The container is still there. DeadNodeHandler calls ContainerManager.removeContainerReplica() and removes container replicas but not the container itself.

ContainerManager.deleteContainer() isn't used. We will add tests to verify it works as expected since no one has used it before.

So if we take care of this in DeadNodeHandler on the SCM and Recon side, containers will be removed from the memory held by ContainerManager. Once a container gets removed from the SCM DB, the periodic task "ReconStorageContainerManagerFacade#syncWithSCMContainerInfo", which runs at periodic intervals (5 mins, I guess), can be modified to delete the container in Recon by overriding deleteContainer as you suggested. Then why would we keep this manual work, where the user adds each container ID to the delete list for the background service? Anyway, "ReconStorageContainerManagerFacade#syncWithSCMContainerInfo" is also a kind of background sync.
Please let me know if I am missing something here.

xBis7 May 24, 2023
Collaborator Author

@devmadhuu

Then why would we keep this manual work, where the user adds each container ID to the delete list for the background service?

We don't want all missing containers to be deleted automatically in the SCM, because that would make them unrecoverable and the user might not want that. If the datanode can be recovered, then so can the container. If we delete it automatically, the user doesn't get the chance to try to recover the data.

The SCM is the source of truth for the datanodes. If a dead datanode comes back, it will check the containers it has and then check whether they exist in the SCM. If the containers cannot be found in the SCM, they will be considered unknown.

This is from the doc, regarding unknown containers:

Flag hdds.scm.unknown-container.action can be used to specify what will happen to the containers. 
The default value is WARN but if set to DELETE, then all the unknown containers will be cleaned up 
from the system. WARN keeps logging that there is an unknown container but takes no action on it.
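As a rough illustration of the two settings, the sketch below decides what happens to containers a datanode reports but the SCM does not know. Only the flag name and its WARN/DELETE values come from the doc; the class, the method names and the case-insensitive parsing are assumptions, and this is not how the SCM actually implements it.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the WARN/DELETE decision; not SCM code.
class UnknownContainerPolicySketch {
    enum Action { WARN, DELETE }

    /** Parses the configured value; an unset value falls back to WARN, the default named in the doc. */
    static Action parse(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return Action.WARN;
        }
        // Case-insensitive parsing is an assumption of this sketch.
        return Action.valueOf(configured.trim().toUpperCase(Locale.ROOT));
    }

    /** Containers the datanode reports that the SCM has no record of. */
    static Set<Long> findUnknown(Set<Long> reportedByDatanode, Set<Long> knownToScm) {
        Set<Long> unknown = new TreeSet<>(reportedByDatanode);
        unknown.removeAll(knownToScm);
        return unknown;
    }

    /** Returns the containers to clean up: all unknown ones for DELETE, none for WARN. */
    static Set<Long> toDelete(Set<Long> reportedByDatanode, Set<Long> knownToScm, Action action) {
        Set<Long> unknown = findUnknown(reportedByDatanode, knownToScm);
        if (action == Action.DELETE) {
            return unknown;
        }
        for (Long id : unknown) {
            System.out.println("WARN: unknown container " + id);
        }
        return new TreeSet<>();
    }
}
```

With DELETE, a container that was removed from the SCM while its datanode was down would be cleaned up as soon as the datanode returns, which is why automatic deletion in the SCM is risky.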

If the datanode is recoverable and all of its containers have automatically been deleted in the SCM, then I don't think the containers can be restored. We have to take further actions for the SCM to recognize them again and I don't think at this point there is a way in Ozone for the user to do that.

IMO, we should give the user the option to decide whether the datanode is recoverable and whether they want to delete the container or not.

devmadhuu May 24, 2023
Collaborator

Ok @xBis7, thanks for patiently answering all my doubts. Please invite @dombizita @kerneltime @smengcl and @swagle to get their opinions as well.

xBis7 May 24, 2023
Collaborator Author

@devmadhuu Thanks for spending the time to look into this.
