Can't auto recovery when encounter ENOSPC error from the filesystem #10134

Open
caipengbo opened this issue Jun 8, 2022 · 2 comments
Labels
bug (Confirmed RocksDB bugs), up-for-grabs (Up for grabs)

Comments

@caipengbo
Contributor

caipengbo commented Jun 8, 2022

As explained in the wiki's Background Error Handling section, an ENOSPC error from the filesystem is recovered automatically (this was introduced by @zhichao-cao in #8376).

In PR #8376, @ajkr asked in #8376 (comment) whether other errors have the same problem.

Yes. I encountered the EDQUOT (Disk Quota Exceeded, POSIX.1-2001) error in a production environment. Like the problem fixed in #8376, it leaves the DB unable to recover automatically after a background error occurs.

I encountered this error code because my service is deployed in a container. Now that more and more services are using container technology, I think this problem will become more common.

Can we treat EDQUOT as a NoSpace error? It seems to have the same properties as ENOSPC.
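To make the proposal concrete, here is a minimal sketch of the classification being asked for. `IsOutOfSpaceErrno` is a hypothetical helper written for illustration, not RocksDB's actual code, and it assumes a POSIX system where `<cerrno>` defines `EDQUOT`.

```cpp
#include <cerrno>

// Hypothetical helper (not part of RocksDB): returns true for errno values
// that mean "no room to write right now". Both conditions can clear once
// space is freed or the quota is raised, so both are candidates for
// automatic recovery.
bool IsOutOfSpaceErrno(int err) {
  return err == ENOSPC || err == EDQUOT;
}
```

With a check like this, `EDQUOT` would take the same recovery path as `ENOSPC` instead of being treated as an unrecoverable I/O error.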

@akankshamahajan15
Contributor

@anand1976 Do you have context on this one?

ShooterIT pushed a commit to apache/kvrocks that referenced this issue Jun 10, 2022
In #229, the issue where RocksDB could not recover from the No Space background
error was fixed. RocksDB itself addressed this in facebook/rocksdb#8376, but the
fix is incomplete: the same problem still occurs when an EDQUOT (Disk Quota
Exceeded) error is encountered (see facebook/rocksdb#10134 for details).

RocksDB cannot recover from this problem and must be restarted. This problem is
more likely to occur when kvrocks is deployed in a container.

To handle all versions of RocksDB, we manually resume the DB when we encounter
either of two retryable I/O errors: No space left on device and Disk Quota Exceeded.

For the Disk Quota Exceeded error, RocksDB did not expose a friendly interface,
so we did a string match.
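
The string match described above might look roughly like the sketch below. This is an illustration, not the actual kvrocks code: `IsRetryableSpaceError` is a hypothetical name, and the two patterns are assumed to be the usual Linux `strerror` texts for ENOSPC and EDQUOT as they would appear inside a status message.

```cpp
#include <string>

// Hypothetical helper: returns true if a status message mentions one of the
// two retryable space errors. The comparison is case-sensitive, so the
// patterns must match the OS error text exactly.
bool IsRetryableSpaceError(const std::string& status_message) {
  static const char* const kPatterns[] = {
      "No space left on device",  // assumed text for ENOSPC
      "Disk quota exceeded",      // assumed text for EDQUOT
  };
  for (const char* pattern : kPatterns) {
    if (status_message.find(pattern) != std::string::npos) {
      return true;
    }
  }
  return false;
}
```

In a real integration, the message would presumably come from the background error's `Status::ToString()`, and a match would be followed by a call to `DB::Resume()`. Matching on text is fragile across platforms and locales, which is why a dedicated status check for EDQUOT would be preferable.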
ajkr added the bug (Confirmed RocksDB bugs) and up-for-grabs (Up for grabs) labels on Jun 21, 2022
@meilihao

I hit this too. First, RocksDB 7.10.2 reported "Disk quota exceeded". The ZFS filesystem was then expanded successfully, but RocksDB still reports this error.
