riak_kv_bitcask_backend infinite loop recovery [JIRA: RIAK-1821] #1120

Closed
krestenkrab opened this issue May 21, 2015 · 5 comments

@krestenkrab
Contributor

We have an installation in which disk corruption prevents Riak from restarting.

We get an endless stream of these messages (see below). The same file seems to be retried again and again. Dunno how the <<>> key made it into the file, but it can't be converted using riak_kv_bitcask_backend:key_transform_to_1/1, and that seems to cause an infinite supervisor restart loop or something.

console.log:2015-05-21 10:15:54.881 [warning] <0.572.0> Hintfile '/var/lib/riak/bitcask/799258707915337533392640142891717276374338109440/1974.bitcask.hint' invalid
console.log:2015-05-21 10:15:54.882 [error] <0.572.0> scan_key_files: error function_clause @ [{riak_kv_bitcask_backend,key_transform_to_1,[<<>>],[{file,"src/riak_kv_bitcask_backend.erl"},{line,99}]},{bitcask,'-scan_key_files/5-fun-0-',7,[{file,"src/bitcask.erl"},{line,1182}]},{bitcask_fileops,fold_keys_int_loop,5,[{file,"src/bitcask_fileops.erl"},{line,595}]},{bitcask_fileops,fold_file_loop,8,[{file,"src/bitcask_fileops.erl"},{line,720}]},{bitcask_fileops,fold_keys_loop,4,[{file,"src/bitcask_fileops.erl"},{line,575}]},{bitcask,scan_key_files,5,[{file,"src/bitcask.erl"},{line,1190}]},{bitcask,init_keydir_scan_key_files,4,[{file,"src/bitcask.erl"},{line,1283}]},{bitcask,init_keydir,4,[{file,"src/bitcask.erl"},{line,1235}]}]
@Basho-JIRA changed the title from "riak_kv_bitcask_backend infinite loop recovery" to "riak_kv_bitcask_backend infinite loop recovery [JIRA: RIAK-1821]" on May 21, 2015
@engelsanchez
Contributor

Hey @krestenkrab, did you see these:

I believe you are hitting the same problem. Let me know if it looks like a different problem, or whether the new PRs help. They have not been reviewed yet, but we will try to fix this issue for our next release.

@krestenkrab
Contributor Author

Indeed. Even our data files are corrupt for some reason. Here's an example of a corrupt .data file with just a single tombstone and a single key in it.

We've seen this problem twice in the last week, so there must be something wrong with the OS/file system we're running this on.

00000000  3b 86 87 8c 55 5d 1d 41  00 47 00 00 00 16 02 00  |;...U].A.G......|
00000010  09 6c 61 73 74 2d 73 65  65 6e 34 38 3a 39 61 3a  |.last-seen48:9a:|
00000020  33 39 3a 62 61 3a 34 64  3a 36 62 3a 30 30 3a 38  |39:ba:4d:6b:00:8|
00000030  62 3a 39 30 3a 37 38 3a  37 62 3a 35 31 3a 38 33  |b:90:78:7b:51:83|
00000040  3a 64 63 3a 39 64 3a 63  30 3a 65 63 3a 32 35 3a  |:dc:9d:c0:ec:25:|
00000050  64 64 3a 34 37 62 69 74  63 61 73 6b 5f 74 6f 6d  |dd:47bitcask_tom|
00000060  62 73 74 6f 6e 65 32 00  00 07 90 2d 95 f8 ec 55  |bstone2....-...U|
00000070  5d 1d 41 00 47 00 00 00  d1 02 00 09 6c 61 73 74  |].A.G.......last|
00000080  2d 73 65 65 6e 34 38 3a  39 61 3a 33 39 3a 62 61  |-seen48:9a:39:ba|
00000090  3a 34 64 3a 36 62 3a 30  30 3a 38 62 3a 39 30 3a  |:4d:6b:00:8b:90:|
000000a0  37 38 3a 37 62 3a 35 31  3a 38 33 3a 64 63 3a 39  |78:7b:51:83:dc:9|
000000b0  64 3a 63 30 3a 65 63 3a  32 35 3a 64 64 3a 34 37  |d:c0:ec:25:dd:47|
000000c0  35 01 00 00 00 25 83 6c  00 00 00 01 68 02 6d 00  |5....%.l....h.m.|
000000d0  00 00 08 23 09 fe f9 68  42 16 e8 68 02 62 00 00  |...#...hB..h.b..|
000000e0  c9 01 6e 05 00 41 99 d1  ce 0e 6a 00 00 00 01 00  |..n..A....j.....|
000000f0  00 00 14 01 7b 22 74 69  6d 65 22 3a 31 34 33 32  |....{"time":1432|
00000100  31 36 35 36 39 35 7d 00  00 00 86 00 00 05 98 00  |165695}.........|
00000110  02 87 41 00 02 e7 2b 15  47 54 73 38 6f 4a 6f 58  |..A...+.GTs8oJoX|
00000120  6f 63 68 52 31 33 47 6a  73 74 56 4f 31 00 00 00  |ochR13GjstVO1...|
00000130  00 0c 01 58 2d 52 69 61  6b 2d 4d 65 74 61 00 00  |...X-Riak-Meta..|
00000140  00 03 00 83 6a 00 00 00  06 01 69 6e 64 65 78 00  |....j.....index.|
00000150  00 00 03 00 83 6a 00 00  00 0d 01 63 6f 6e 74 65  |.....j.....conte|
00000160  6e 74 2d 74 79 70 65 00  00 00 15 00 83 6b 00 10  |nt-type......k..|
00000170  61 70 70 6c 69 63 61 74  69 6f 6e 2f 6a 73 6f 6e  |application/json|
00000180  00 00 00 06 01 4c 69 6e  6b 73 00 00 00 03 00 83  |.....Links......|
00000190  6a 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |j...............|
000001a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000320  00 00 00                                          |...|
00000323
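
For anyone reading along, here is a rough Python sketch that decodes the first entry of the dump above. It assumes the usual Bitcask data-file entry layout (big-endian 32-bit CRC, 32-bit timestamp, 16-bit key size, 32-bit value size, then key and value), and the way it splits the key (one tag byte, a 16-bit bucket length, the bucket, then the key) is only read off the bytes in the dump. The names `decode_entry` and `split_riak_key` are made up for this illustration and are not Bitcask or Riak functions.

```python
import struct

# First 0x6b bytes of the hexdump above: the tombstone entry.
DUMP = bytes.fromhex("""
3b 86 87 8c 55 5d 1d 41 00 47 00 00 00 16 02 00
09 6c 61 73 74 2d 73 65 65 6e 34 38 3a 39 61 3a
33 39 3a 62 61 3a 34 64 3a 36 62 3a 30 30 3a 38
62 3a 39 30 3a 37 38 3a 37 62 3a 35 31 3a 38 33
3a 64 63 3a 39 64 3a 63 30 3a 65 63 3a 32 35 3a
64 64 3a 34 37 62 69 74 63 61 73 6b 5f 74 6f 6d
62 73 74 6f 6e 65 32 00 00 07 90
""")

# Assumed entry header: crc32, timestamp, key size, value size (big-endian).
HEADER = struct.Struct(">IIHI")


def decode_entry(buf, offset=0):
    """Decode one entry starting at `offset` (hypothetical helper)."""
    crc, tstamp, keysz, valsz = HEADER.unpack_from(buf, offset)
    key_start = offset + HEADER.size
    val_start = key_start + keysz
    return {
        "crc": crc,
        "tstamp": tstamp,
        "keysz": keysz,
        "valsz": valsz,
        "key": buf[key_start:val_start],
        "value": buf[val_start:val_start + valsz],
    }


def split_riak_key(key):
    """Split a stored key into (tag, bucket, key); layout inferred from the dump."""
    tag = key[0]
    (bucket_len,) = struct.unpack_from(">H", key, 1)
    bucket = key[3:3 + bucket_len]
    return tag, bucket, key[3 + bucket_len:]


entry = decode_entry(DUMP)
tag, bucket, user_key = split_riak_key(entry["key"])

# What a header made entirely of zero bytes decodes to.
zero_header = HEADER.unpack(bytes(HEADER.size))
```

With this input, `entry` has a key size of 71 and a value size of 22, the bucket is `last-seen`, and the value is `bitcask_tombstone2` followed by four more bytes. `zero_header` comes out as all zeros, so a key size of 0. Since the dump is nothing but zero bytes from offset 0x191 onward, a scan that keeps reading past the last real entry would see an empty key there, which would be consistent with the `<<>>` in the stack trace, though that is a guess.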

@engelsanchez
Contributor

Wait, that's interesting. There was a change in Bitcask for Riak 2.0 that deals with data resurrection when merging tombstones. The new code will actually re-open old files and append tombstones to them during a merge, which is a first for Bitcask. It's tricky enough that you could be seeing a bug related to it. I can see you are using Bitcask with this new scheme on, since your tombstone has the "bitcask_tombstone2" token on it. Could you send a few of these corrupt data files our way and give us some context on the cluster activity?

basho/bitcask#156

@krestenkrab
Contributor Author

Cluster activity is simple: it's a single-node, 64-partition system running plain open source Riak. I'll see if I can get the exact version from ops.

I'll mail you a bad Bitcask file.

@krestenkrab
Contributor Author

I noticed this text in Kafka's documentation, at the end of section 5.5, "Guarantees":

Note that two kinds of corruption must be handled: truncation in which an unwritten block is lost due to a crash, and corruption in which a nonsense block is ADDED to the file. The reason for this is that in general the OS makes no guarantee of the write order between the file inode and the actual block data so in addition to losing written data the file can gain nonsense data if the inode is updated with a new size but a crash occurs before the block containing that data is not written. The CRC detects this corner case, and prevents it from corrupting the log (though the unwritten messages are, of course, lost).

This might be what caused zeros to appear at the end of the data file.
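
To make the Kafka-style idea concrete, here is a minimal Python sketch of a recovery scan that stops at the first entry whose CRC does not match, so that both a truncated tail and a run of nonsense (for example zero) bytes are cut off instead of being treated as entries. It is not how Bitcask itself recovers. The entry layout (32-bit CRC, 32-bit timestamp, 16-bit key size, 32-bit value size, key, value) and the assumption that the CRC covers every byte after the CRC field are stated assumptions, and `pack_entry` and `scan_valid_prefix` are hypothetical helper names.

```python
import struct
import zlib

# Assumed entry header: crc32, timestamp, key size, value size (big-endian).
HEADER = struct.Struct(">IIHI")


def pack_entry(tstamp, key, value):
    """Build one entry; the CRC is assumed to cover everything after itself."""
    body = struct.pack(">IHI", tstamp, len(key), len(value)) + key + value
    return struct.pack(">I", zlib.crc32(body)) + body


def scan_valid_prefix(buf):
    """Return (entries, valid_length) for the longest CRC-clean prefix of buf."""
    entries = []
    offset = 0
    while offset + HEADER.size <= len(buf):
        crc, _tstamp, keysz, valsz = HEADER.unpack_from(buf, offset)
        end = offset + HEADER.size + keysz + valsz
        if end > len(buf):
            break  # truncation: the entry runs past the end of the file
        if zlib.crc32(buf[offset + 4:end]) != crc:
            break  # nonsense block: the bytes do not match their checksum
        key_start = offset + HEADER.size
        entries.append((buf[key_start:key_start + keysz],
                        buf[key_start + keysz:end]))
        offset = end
    return entries, offset


good = pack_entry(1432165697, b"k1", b"v1") + pack_entry(1432165698, b"k2", b"v2")

# Case 1: the file gained a block of zeros after the last real entry.
entries, valid_len = scan_valid_prefix(good + bytes(402))

# Case 2: the last entry was cut short by one byte.
truncated_entries, truncated_len = scan_valid_prefix(good[:-1])
```

In the first case the scan returns both entries and `valid_len` equals `len(good)`, so the zero tail could be truncated away. In the second case only the first entry survives. An all-zero header fails the check here because the CRC-32 of ten zero bytes is not zero.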
