Nil value returned due to movekey version mismatch #995

Closed
simon-xe-wang opened this issue Aug 20, 2019 · 10 comments
Labels
area/data-loss: Issues related to data loss or corruption.
priority/P1: Serious issue that requires eventual attention (can wait a bit).
status/more-info-needed: The issue has been sent back to the reporter asking for clarifications.

Comments

@simon-xe-wang

What version of Go are you using (go version)?

$ go version

What version of Badger are you using?

v1.5.5

Does this issue reproduce with the latest master?

What are the hardware specifications of the machine (RAM, OS, Disk)?

What did you do?

  1. Keep writing keys.
    Each round writes 1.5 million keys (768-byte keys, 1024-byte values) and then repeatedly updates those keys, all in the same goroutine.

  2. Run value log GC every 5 minutes in another goroutine, and read the keys back to verify them.
    I can upload the test program if needed.
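
The shape of the workload described above can be sketched roughly as follows. This is not the attached test program: `kv`, `memKV`, and `runRounds` are hypothetical names, the in-memory map stands in for a Badger DB so the sketch runs on its own, and the value log GC loop and the 768/1024-byte sizes are left out. It only shows one goroutine rewriting the same keys while another reads them back and counts nil results.

```go
package main

import (
	"fmt"
	"sync"
)

// kv is a hypothetical minimal interface; a real test would wrap a Badger DB.
type kv interface {
	Set(key string, val []byte)
	Get(key string) []byte
}

// memKV is an in-memory stand-in so the sketch runs without Badger.
type memKV struct {
	mu sync.RWMutex
	m  map[string][]byte
}

func newMemKV() *memKV {
	return &memKV{m: make(map[string][]byte)}
}

func (s *memKV) Set(key string, val []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[key] = val
}

func (s *memKV) Get(key string) []byte {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.m[key]
}

// runRounds populates numKeys keys, then updates them for the remaining
// rounds in the calling goroutine while a second goroutine reads every key
// back. It returns how many reads of an already-written key came back nil.
func runRounds(db kv, rounds, numKeys int) int {
	key := func(i int) string { return fmt.Sprintf("key-%08d", i) }

	// Round 0: write every key once before verification starts.
	for i := 0; i < numKeys; i++ {
		db.Set(key(i), []byte("round-0"))
	}

	done := make(chan struct{})
	nilReads := 0 // only touched by the verifier until wg.Wait returns
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for {
			for i := 0; i < numKeys; i++ {
				if db.Get(key(i)) == nil {
					nilReads++
				}
			}
			select {
			case <-done:
				return
			default:
			}
		}
	}()

	// Remaining rounds: keep overwriting the same keys.
	for r := 1; r < rounds; r++ {
		for i := 0; i < numKeys; i++ {
			db.Set(key(i), []byte(fmt.Sprintf("round-%d", r)))
		}
	}
	close(done)
	wg.Wait()
	return nilReads
}
```

Calling `runRounds(newMemKV(), 3, 100)` against the in-memory store should report no nil reads; the issue is that the equivalent loop against Badger eventually did.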

What did you expect to see?

What did you see instead?

After a while (between 3 and 15 hours in my case), a read returned a nil value, which is a data-loss issue.

@simon-xe-wang
Author

After tracing through the code, the reason it returns a nil value is that the original key has a newer version, v2, in the LSM tree, while the move key only has the older version, v1. v2 is lost.

See yieldItemValue in iterator.go:

    if vs.Version != item.Version() {
        return nil, nil, nil
    }
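
The mismatch described above can be illustrated with a toy model. Everything here (`entry`, `store`, `read`) is hypothetical and far simpler than Badger's real data structures; it only shows why a version comparison like the one quoted yields nil when the moved copy lags behind the version recorded in the LSM tree.

```go
package main

// entry is a hypothetical, simplified stand-in for one versioned record.
type entry struct {
	version uint64
	value   []byte
}

// store is a toy model with two lookups per key: the latest entry known to
// the LSM tree, and the copy relocated by value log GC under a move key.
type store struct {
	lsm   map[string]entry
	moved map[string]entry
}

// read mimics the quoted check: if the moved copy's version differs from the
// version the LSM tree holds, the caller gets nil instead of a value.
func (s *store) read(key string) []byte {
	item, ok := s.lsm[key]
	if !ok {
		return nil
	}
	vs, ok := s.moved[key]
	if !ok {
		return nil
	}
	if vs.version != item.version {
		return nil // LSM tree has v2, move key only has v1
	}
	return vs.value
}
```

With the LSM entry at version 2 and the moved entry at version 1, `read` returns nil even though a value for the key exists, which matches the symptom reported here.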

@manishrjain
Contributor

@ibrahim can you look into this?

@jarifibrahim
Contributor

@SimonXW Can you share the test program?

@jarifibrahim added the area/data-loss, priority/P1, and status/more-info-needed labels on Aug 21, 2019
@simon-xe-wang
Author

main.txt

@simon-xe-wang
Author

I just uploaded the test code. Rename it to .go.

@simon-xe-wang simon-xe-wang reopened this Aug 21, 2019
@jarifibrahim
Contributor

@SimonXW I ran your program twice: the first run lasted 10 hours and the second 3 hours. I didn't see any nil values in either. Can you run the program again and confirm the issue?

@simon-xe-wang
Author

I ran it 3 or 4 times and got a nil value every time, after anywhere from 3 to 15 hours. Also, which version are you running? I ran it on 1.5.4 plus the patch for #907.

You may have to run it longer.

@jarifibrahim
Contributor

I tried it on v1.5.5 (as mentioned in the issue). I'll try it once again with 1.5.4.

@markadev

markadev commented Sep 3, 2019

Hi, I'm a colleague of @SimonXW. I think that we were hitting a bug in 1.5 that was fixed with this commit: af99e5f

After upgrading to 1.6.0, I no longer see the nil value problem with this test program.

@jarifibrahim
Contributor

Thanks @markadev. Since this issue is no longer reproducible on 1.6.0, I'm going to close it.


4 participants