-
If this helps, I'm getting this (multiple times) in the datanode container logs after trying to delete a file:
-
Deletes are asynchronous in Ozone. Once the file is deleted, Ozone Manager marks the file as deleted. This deletion is propagated to the SCM, and the SCM then sends a delete request to the Datanodes. So it takes some time for the relevant background services to run and reclaim the blocks. There is a background service for key deletion at each layer:
- Ozone Manager
- SCM
- DataNode

Typically, each of these services runs at a fixed default interval.
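If you want a small test cluster to reclaim space faster, you can shorten these intervals. Below is a minimal sketch for the docker-config file used by the compose deployment; the property name is written from memory for Ozone 1.4, so treat it as an assumption and verify it against the ozone-default.xml of your version (OM and SCM have analogous interval settings):

```bash
# Hedged sketch: shorten the datanode block-deleting interval on a test
# cluster by appending to the compose deployment's docker-config file.
# Property name assumed from memory -- verify against ozone-default.xml
# for your Ozone version before relying on it.
cat >> docker-config <<'EOF'
OZONE-SITE.XML_ozone.block.deleting.service.interval=10s
EOF
```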
-
+1 to Aswin's suggestion. Additionally, could you share which version of Ozone you are running? I think 1.4.0 should have most/all of the known deletion improvements; see this comment for a list. There were two issues, HDDS-11492 and HDDS-11491, identified recently, but those apply to dense filesystem hierarchies, which does not look relevant here. You may actually have to wait up to 10 minutes for the space to get reclaimed. This is because SCM will read the block deletes from RocksDB, and they will not be flushed down from Ratis until every 10 minutes, per the default value of the corresponding configuration key.
This means the datanodes do not have any block delete operations to process. The deletions could still be running through OM or SCM, or it could be the issue fixed in HDDS-7156 if you are using a version earlier than 1.4.0. On a large cluster constantly undergoing operations, space reclamation taking a few minutes usually works out fine. However, on a small test cluster it can definitely be confusing behavior. I think we will have a set of tasks to improve Ozone's deletion flow coming up soon, which will likely involve adding Grafana dashboards that can help track deletions end to end through the system.
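Until those dashboards exist, one low-tech way to watch deletions move through the system is to poll the JMX endpoints that OM, SCM, and the datanodes already expose over HTTP. A sketch, assuming the default web ports of a compose deployment (OM 9874, SCM 9876, datanode 9882) and grepping metric names generically instead of relying on exact counter names:

```bash
# Hedged sketch: dump deletion-related JMX metrics from each component.
# Hostnames/ports assume the default docker-compose setup; adjust as needed.
for endpoint in localhost:9874 localhost:9876 localhost:9882; do
  echo "== $endpoint =="
  curl -s "http://$endpoint/jmx" | grep -i delet | head -20
done
```

If the datanode-side counters never move while the OM/SCM ones do, that points at the SCM-to-datanode leg of the pipeline.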
-
Hi! First of all, thank you both very much for your answers.
Being honest, long enough, hahaha. I sometimes waited for half an hour before giving up, and in the case of the HA deployment it was left with the file deleted for a whole weekend and it didn't free up the space. (What I mean by "file deleted" in this case is the key; I just sent the delete-object command using the aws s3api.)
Not at all :(. In fact, I saw the OM message telling the SCM which key to delete and which blocks store the data for that key. But it doesn't go further (no communication between the SCM and the datanodes for the deletion).
I'm using version 1.4, deployed using docker compose. These days I kept doing tests. I'm now running a fresh deployment on my computer, also using docker compose, and here it seems to work, but not always. During my tests I noticed that restarting the containers usually helps.

I'll now explain my last test. I deployed the containers using docker compose (fresh install, no old data in the containers and no custom config in the docker-config file), created the bucket, uploaded a big file (the Ubuntu ISO, 5.8GB), waited in the logs until it marked the containers as closed, and then proceeded to delete the file. The delete arrived properly in the logs. This (as you can see in the logs) was done at 08:58:07. I waited a bit more than half an hour and nothing happened, so I decided to restart the containers, as that had helped me in other tests. This was done at 09:36. And it helped: 6 minutes later Ozone started deleting the file.

As I said in the first comment, we also have a deployment with high availability on 3 different machines. It was left for the whole weekend with the file deleted, and in this case the file is still on the machines. I'm sorry for not giving much useful information. Sometimes it works, sometimes it doesn't, and I'm still trying to figure out what's going on. I'll keep posting my test results here if I find out something.

I'll write here the steps to reproduce my last test in case anyone can check if it works for them. The test was made on an Ubuntu 24 desktop PC using the default Ozone image (version 1.4).
1- Download the docker-compose.yaml and docker-config:
Thank you very much in advance!
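To make the steps easier to follow, here is the same flow as a shell sketch. The download URLs, dummy credentials, and bucket/key names are assumptions on my part (the compose files come from the apache/ozone-docker repo, and the S3 gateway listens on port 9878 by default), so adjust them as needed:

```bash
# Hedged sketch of the reproduction steps above; URLs and names are
# assumptions, not verified against a specific Ozone release.
curl -O https://raw.githubusercontent.com/apache/ozone-docker/latest/docker-compose.yaml
curl -O https://raw.githubusercontent.com/apache/ozone-docker/latest/docker-config
docker-compose up -d --scale datanode=3

# Ozone's S3 gateway accepts any credentials when security is disabled.
export AWS_ACCESS_KEY_ID=testuser
export AWS_SECRET_ACCESS_KEY=testsecret

aws s3api create-bucket --bucket test --endpoint-url http://localhost:9878
aws s3api put-object --bucket test --key ubuntu.iso \
    --body ubuntu-24.04-desktop-amd64.iso --endpoint-url http://localhost:9878
aws s3api delete-object --bucket test --key ubuntu.iso \
    --endpoint-url http://localhost:9878
```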
-
Hi! After a break, I gave it a try again and I've made some progress. I ran some tests again in my local deployment (uploaded around 1.5GB of files and then deleted them). It wasn't freeing up space, no matter how long I waited. But then I discovered this command:
Then I made the same test in my deployment across 3 different machines with High Availability (which had 6GB of space that wasn't being freed up), and it was the same: the moment I used the command, it started deleting the blocks. That deployment had been left in this state for around 20 days and hadn't freed up the space until I used the command, so I'm guessing it isn't a problem of not giving it enough time. But I'm guessing this is not expected to be done manually, right? Thank you very much!
-
Hi all!
I was testing whether there are upload limits with Ozone, so I tried to upload the Ubuntu ISO (a 6GB file) to my Ozone deployment (version 1.4, deployed using docker compose).
I uploaded the file using the aws s3api:
And after it uploaded correctly I tried deleting it:
It seemed like it went well; the list-objects call didn't show the file anymore.
But after checking the Recon server, the 3 datanodes showed that Ozone was using 5.7GB of storage.
I did the same test again (uploading and deleting) and the Recon server reported 11GB of storage used by Ozone.
I checked the storage on my machine using
df -h
and there was indeed less free storage than expected. I have Ozone deployed using docker, so I tried removing one datanode container, deleting its files, and deploying it back, and it then showed 0GB in the Recon server.
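For reference, that reset looks roughly like this under docker compose (the service name and the assumption that the datanode's data lives inside the container come from the default compose file; adjust to yours):

```bash
# Hedged sketch: recreate one datanode with empty state.
docker-compose stop datanode
docker-compose rm -f datanode
# If your compose file mounts a volume or host dir for datanode data,
# wipe it here before restarting.
docker-compose up -d datanode
```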
I don't really know how deletions work in Ozone, but looking for information I saw that the SCM is expected to mark containers for removal and the garbage collector should remove them once they are closed. But checking the container list with
ozone admin container list
it lists 15 containers, all 15 of them closed, but none marked for deletion. I thought it could be some communication problem between the SCM and the datanodes/containers because we run with High Availability, but it also happened to us on another machine where we have a single-node deployment.
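A quick way to summarize those container states (a sketch; the exact JSON layout of the list output can differ between Ozone versions, so the grep pattern is an assumption):

```bash
# Hedged sketch: count containers per state from the admin CLI's JSON output.
ozone admin container list | grep -o '"state"[^,]*' | sort | uniq -c
```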
Does anyone know if there is any configuration property to add to my docker-config or my ozone-site.xml so that the delete works? I'm kinda lost here and couldn't find anything related to this in the documentation.
Thank you very much in advance!