riak_kv_bitcask_backend infinite loop recovery [JIRA: RIAK-1821] #1120
Comments
Hey @krestenkrab, did you see these:
I believe you are hitting the same problem. Let me know if it looks like a different one, or whether the new PRs help. They have not been reviewed yet, but we will try to fix this issue for our next release.
Indeed. Even our data files are corrupt for some reason. Here's an example of a corrupt
We've seen this problem twice in the last week, so there must be something wrong with the OS/file system we're running this on.
Wait, that's interesting. There was a change in Bitcask for Riak 2.0 that deals with data resurrection when merging tombstones. The new code will actually re-open old files and append tombstones to them during a merge, which is a first for Bitcask. It's tricky enough that you could be seeing a bug related to it. I can see you are using Bitcask with this new scheme on, since your tombstone has the "bitcask_tombstone2" token on it. Could you send a few of these corrupt data files our way and give us some context on the cluster activity?
Cluster activity is simple: it's a single-node, 64-partition system running plain open source Riak. I'll see if I can get the exact version from ops. I'll mail you a bad Bitcask file.
I noticed this text in Kafka's documentation, at the end of section 5.5, "Guarantees".
This might be what caused zeros to appear at the end of the data file.
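For anyone who wants to check their own files for the same symptom, here is a minimal sketch that counts the run of zero bytes at the end of a file. It only inspects raw bytes and knows nothing about the Bitcask record format; the function name and the chunk size are my own choices for illustration, not anything from Riak or Bitcask.

```python
import os


def count_trailing_zero_bytes(path, chunk_size=65536):
    """Return the number of consecutive 0x00 bytes at the end of the file."""
    size = os.path.getsize(path)
    zeros = 0
    with open(path, "rb") as f:
        pos = size
        # Walk backwards from the end of the file, one chunk at a time.
        while pos > 0:
            start = max(0, pos - chunk_size)
            f.seek(start)
            chunk = f.read(pos - start)
            stripped = chunk.rstrip(b"\x00")
            zeros += len(chunk) - len(stripped)
            if stripped:
                # Found a non-zero byte, so the zero run ends here.
                break
            pos = start
    return zeros
```

Calling `count_trailing_zero_bytes("path/to/data/file")` on a copy of a suspect data file returns 0 when the file does not end in zero padding. A non-zero result is only a hint that the tail was never properly written, not proof of what caused it.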
We have an installation in which disk corruption leaves Riak unable to restart.
We get an endless stream of these messages (see below). The same file seems to be retried again and again. Dunno how the <<>> key made it into the file, but it can't be converted using riak_kv_bitcask_backend:key_transform_to_1/1, and that seems to cause an infinite supervisor restart or something.