Lost writes after 5M unique keys are inserted in two partitions. #112

Closed
uniphil opened this issue Dec 18, 2024 · 4 comments
Labels
bug, reproduced

Comments


uniphil commented Dec 18, 2024

Describe the bug

Fjall appears to sometimes lose writes after closing and reopening a keyspace with two partitions.

To Reproduce

My repro is here: https://github.com/uniphil/kv-for-likes-test/blob/fjall-lost-write-repro/fjall/src/lost-write.rs
With data file here: https://github.com/uniphil/kv-for-likes-test/blob/fjall-lost-write-repro/likes5M-anon.txt

It has 5M entries total between the two partitions -- at 4M it wasn't reliably reproducing.

  • Run with cargo run --release --bin lost-write

  • It fails every time for me with an output like

    olde-mbp:fjall phil$ cargo run --release --bin lost-write
       Compiling kv-for-likes_fjall v0.1.0 (/Users/phil/code/kv-for-likes/fjall)
        Finished `release` profile [optimized] target(s) in 2.06s
         Running `target/release/lost-write`
    done in 24.8s. likes: 4932348, unlikes: 67652
    checking likes: OK:   4932348 == 4932348
    checking unlikes: OK:   67652 == 67652
    reopening keyspace...
    checking likes: OK:   4932348 == 4932348
    checking unlikes: FAIL: 67004 != 67652
    

With this size of data file, it's always the unlikes count that's wrong after reopening. With the full 17.5M set I think both were wrong (but I'd need to confirm). The number of lost writes to unlikes varies across runs.
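For comparing runs, the gap in the output above works out to 67652 - 67004 = 648 missing keys. A minimal std-only sketch for pulling that number out of a check line follows. `lost_writes` is a hypothetical helper, not part of the repro, and it assumes the left-hand number is the count read back and the right-hand number is the expected count.

```rust
/// Parse a check line such as "checking unlikes: FAIL: 67004 != 67652"
/// and return how many keys are missing (expected minus found).
/// Assumption: the left number is the count read back from the partition,
/// the right number is the expected count. Returns None on any other shape.
fn lost_writes(line: &str) -> Option<u64> {
    // The two counts are whatever follows the last ": " separator.
    let counts = line.rsplit(": ").next()?;
    // Accept both the FAIL form ("!=") and the OK form ("==").
    let (found, expected) = counts
        .split_once("!=")
        .or_else(|| counts.split_once("=="))?;
    let found: u64 = found.trim().parse().ok()?;
    let expected: u64 = expected.trim().parse().ok()?;
    // saturating_sub avoids underflow if more keys are found than expected.
    Some(expected.saturating_sub(found))
}

fn main() {
    for line in [
        "checking likes: OK:   4932348 == 4932348",
        "checking unlikes: FAIL: 67004 != 67652",
    ] {
        println!("{line} -> lost: {:?}", lost_writes(line));
    }
}
```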

I wasn't seeing this issue before when I had set a larger block size.

Expected behavior

The number of keys in a partition shouldn't change after closing and reopening it.

Screenshots

n/a

Desktop (please complete the following information):

  • OS: macOS 10.15.7 (yeah i know)
  • cargo/rust: 1.82.0
  • fjall: 2.4.1

uniphil commented Dec 18, 2024

confirmed the above repro also on macos 14.7.1 / rust 1.81, so it's not only due to my machine being old and sad.

edit: also confirmed with fjall 2.4.2 on the old machine.

edit 2: also confirmed repro on my next-most-convenient machine... a raspberry pi 4b (linux raspberrypi) on an XFS volume. (mac machines were APFS). so likely not FS, OS, or architecture specific.

marvin-j97 (Collaborator) commented

@uniphil I think I tracked it down to the journal truncation, which is kind of too convoluted anyway.

Can you try this branch? https://github.com/fjall-rs/fjall/tree/fix/journal-truncation
That works for me, though I don't fully understand where the previous implementation failed.
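For anyone else who wants to test the branch, a dependency entry along these lines in `Cargo.toml` should pull it in. This is a sketch using Cargo's standard git-dependency syntax; it assumes fjall is declared as a plain dependency in your manifest.

```toml
[dependencies]
# Temporarily track the fix branch instead of the crates.io release.
fjall = { git = "https://github.com/fjall-rs/fjall.git", branch = "fix/journal-truncation" }
```

Running `cargo update -p fjall` afterwards refreshes the lock file if an older revision is already pinned.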

marvin-j97 added the bug and reproduced labels Dec 18, 2024

uniphil commented Dec 18, 2024

yup it worked! 🎉

olde-mbp:fjall phil$ cargo run --release --bin lost-write
    Updating git repository `https://github.com/fjall-rs/fjall.git`
     Locking 1 package to latest compatible version
      Adding fjall v2.4.2 (https://github.com/fjall-rs/fjall.git?branch=fix/journal-truncation#4f042af8)
   Compiling fjall v2.4.2 (https://github.com/fjall-rs/fjall.git?branch=fix/journal-truncation#4f042af8)
   Compiling kv-for-likes_fjall v0.1.0 (/Users/phil/code/kv-for-likes/fjall)
    Finished `release` profile [optimized] target(s) in 9.09s
     Running `target/release/lost-write`
done in 23.3s. likes: 4932348, unlikes: 67652
checking likes: OK:   4932348 == 4932348
checking unlikes: OK:   67652 == 67652
reopening keyspace...
checking likes: OK:   4932348 == 4932348
checking unlikes: OK:   67652 == 67652

marvin-j97 (Collaborator) commented

https://crates.io/crates/fjall/2.4.3
