Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OnFlushCompleted is called before flush completed #5892

Closed
yiwu-arbug opened this issue Oct 8, 2019 · 5 comments · Fixed by tikv/rocksdb#127
Closed

OnFlushCompleted is called before flush completed #5892

yiwu-arbug opened this issue Oct 8, 2019 · 5 comments · Fixed by tikv/rocksdb#127
Assignees

Comments

@yiwu-arbug
Copy link
Contributor

yiwu-arbug commented Oct 8, 2019

Expected behavior

When OnFlushCompleted is called, we assume the corresponding memtable has finished flush - the data has been persisted as SST file, and the LSM has been updated with the result.

Actual behavior

When there are concurrent flush job on the same CF, OnFlushCompleted can be called before the flush result being install to LSM. For example:

  1. Flush job for memtable A starts (could be triggered by full memtable)
  2. Flush job for memtable B starts (could be a manual flush on the same CF, with just a few keys).
  3. Flush job B finish ahead of A, but TryInstallMemtableFlushResults failed because A has not finish.
  4. Flush job B call OnFlushCompleted before the flush finish.
  5. Flush job A install result for both A and B and then finishes.

Just want to confirm if this is expected behavior. If so, how about we expose WaitForFlushMemTable API to allow user to wait for the flush after OnFlushCompleted? If not expect, any suggestion on how to fix it?

Steps to reproduce the behavior

Reproduced with the demo test: https://github.com/yiwu-arbug/rocksdb/commit/3d56d68de6e3e6c61f2fe80aff1f7195476ac6d9

@yiwu-arbug
Copy link
Contributor Author

yiwu-arbug commented Oct 8, 2019

@siying @riversand963 Is this expected behavior? Thanks.

@riversand963
Copy link
Contributor

Thanks @yiwu-arbug for reporting. Probably not expected behavior. The assumption used to be one flush job per column family. Since atomic flush, it has changed such that multiple flush jobs can work on different (but consecutive) memtables of the same column family.
I am on business trip today, and will take a deeper look after I get back tomorrow.

@riversand963 riversand963 self-assigned this Oct 9, 2019
@yiwu-arbug
Copy link
Contributor Author

@riversand963 thanks for taking a look. It seems that the behavior is not introduced by the recent atomic flush feature, but exists since when OnFlushCompleted callback was added (28c82ff). The logic of TryInstallMemtableFlushResults was added since 1ca0584#diff-2d060c305c681ac95dfc4d335c010e14R72, where attempt to commit flush result will return immediately if there's an unfinished memtable flush before it in the queue.

@yiwu-arbug
Copy link
Contributor Author

@riversand963 I'm working on a fix. Should I call OnFlushCompleted after flush result committed (it could be call from a different flush thread), or should I keep the existing behavior and add a OnFlushResultCommitted callback? Thanks.

@riversand963
Copy link
Contributor

@yiwu-arbug I looked at API comments and original implementation. It seems that the contract is: NotifyOnFlushCompleted() is called after flush succeeds and committed to MANIFEST. I prefer to keep this behavior (fixing current code). Any thoughts @siying ?

yiwu-arbug pushed a commit to tikv/rocksdb that referenced this issue Oct 30, 2019
…ebook#5908) (#127)

Summary:
When there are concurrent flush job on the same CF, `OnFlushCompleted` can be called before the flush result being install to LSM. Fixing the issue by passing `FlushJobInfo` through `MemTable`, and the thread who commit the flush result can fetch the `FlushJobInfo` and fire `OnFlushCompleted` on behave of the thread actually writing the SST.

Fix facebook#5892
Pull Request resolved: facebook#5908

Test Plan: Add new test. The test will fail without the fix.

Differential Revision: D17916144

Pulled By: riversand963

fbshipit-source-id: e18df67d9533b5baee52ae3605026cdeb05cbe10
merryChris pushed a commit to merryChris/rocksdb that referenced this issue Nov 18, 2019
…ebook#5908)

Summary:
When there are concurrent flush job on the same CF, `OnFlushCompleted` can be called before the flush result being install to LSM. Fixing the issue by passing `FlushJobInfo` through `MemTable`, and the thread who commit the flush result can fetch the `FlushJobInfo` and fire `OnFlushCompleted` on behave of the thread actually writing the SST.

Fix facebook#5892
Pull Request resolved: facebook#5908

Test Plan: Add new test. The test will fail without the fix.

Differential Revision: D17916144

Pulled By: riversand963

fbshipit-source-id: e18df67d9533b5baee52ae3605026cdeb05cbe10
Connor1996 pushed a commit to tikv/rocksdb that referenced this issue Nov 19, 2019
…ebook#5908) (#127) (#130)

Summary:
When there are concurrent flush job on the same CF, `OnFlushCompleted` can be called before the flush result being install to LSM. Fixing the issue by passing `FlushJobInfo` through `MemTable`, and the thread who commit the flush result can fetch the `FlushJobInfo` and fire `OnFlushCompleted` on behave of the thread actually writing the SST.

Fix facebook#5892
Pull Request resolved: facebook#5908

Test Plan: Add new test. The test will fail without the fix.

Differential Revision: D17916144

Pulled By: riversand963

fbshipit-source-id: e18df67d9533b5baee52ae3605026cdeb05cbe10
Signed-off-by: Yi Wu <yiwu@pingcap.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants