Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid using Get interface to obtain metadata when iterating data during migration #906

Merged
merged 4 commits into from
Sep 24, 2022

Conversation

caipengbo
Copy link
Contributor

Fixed #904

src/slot_migrate.cc Outdated Show resolved Hide resolved
git-hulk
git-hulk previously approved these changes Sep 22, 2022
@caipengbo caipengbo removed the request for review from ChrisZMF September 22, 2022 07:46
@datavisorxiaobiaozhao
Copy link

The problem we had was that when we migrated, we had a lot of disk reads, but with this optimized version, these abnormal disk reads disappeared, so I think GetMetaData calls the Get interface

@caipengbo
Copy link
Contributor Author

The problem we had was that when we migrated, we had a lot of disk reads, but with this optimized version, these abnormal disk reads disappeared, so I think GetMetaData calls the Get interface.

@datavisorxiaobiaozhao Yes. During the iteration, we set the cache not to be filled:

https://github.com/apache/incubator-kvrocks/blob/e8721a270d23345f385af6db0617c3549d6e6812/src/slot_migrate.cc#L273-L277

so that when we call DB::Get(), we are likely to read the disk.

@shangxiaoxiong
Copy link
Member

shangxiaoxiong commented Sep 23, 2022

There may be some problem. The Get interface may aim to get the latest expire time of the key.

@caipengbo
Copy link
Contributor Author

caipengbo commented Sep 23, 2022

There may be some problem. The Get interface may aim to get the latest expire time of the key.

@shangxiaoxiong Oh Yes, I missed that. Then we only need GetMetaData for those with expiration time. Or we can record the time snapshot was created and subtract the expiration time to determine whether it is expired!

@shangxiaoxiong
Copy link
Member

During sending snapshot, exipre time of some key may be changed to larger by clients. GetMetaData is needed to deal with this case.

@shangxiaoxiong
Copy link
Member

l think GetMetaData is needed for complex key with expire time to get the latest expire time.

@caipengbo
Copy link
Contributor Author

caipengbo commented Sep 23, 2022

During sending snapshot, exipre time of some key may be changed to larger by clients. GetMetaData is needed to deal with this case.

During sending snapshot:

  • for simple type, user changes to expire are incremental data and will be synchronized through WAL.
  • for complex types, we really should get the latest metadata.

@shangxiaoxiong
Copy link
Member

During sending snapshot, exipre time of some key may be changed to larger by clients. GetMetaData is needed to deal with this case.

During sending snapshot, user changes to expire are incremental data and will be synchronized through WAL.

Simple key is ok but complex key is not.

@shangxiaoxiong shangxiaoxiong requested review from git-hulk and removed request for xiaobiaozhao September 23, 2022 07:42
@git-hulk
Copy link
Member

git-hulk commented Sep 23, 2022

During sending snapshot, exipre time of some key may be changed to larger by clients. GetMetaData is needed to deal with this case.

During sending snapshot, user changes to expire are incremental data and will be synchronized through WAL.

Simple key is ok but complex key is not.

It can't resolve this issue even we use the latest snapshot since the change may come after.

@shangxiaoxiong
Copy link
Member

Yeah. Using GetMetaData with latest snapshot can't solve the problem completely. Just a little improvement.

@caipengbo
Copy link
Contributor Author

Indeed, the whole process is not atomic.

@git-hulk
Copy link
Member

git-hulk commented Sep 23, 2022

Yes, the root cause is the snapshot can be different between writers and background threads. I prefer to use current implementation, then we can try to seek another way to resolve this issue completely. How do you think? @caipengbo @shangxiaoxiong

@shangxiaoxiong
Copy link
Member

shangxiaoxiong commented Sep 23, 2022

Current implementation makes slotmigration faster. Faster slotmigration is, smaller risk of losing key is. On the whole, current implementation may be good.

@caipengbo
Copy link
Contributor Author

I think OK, I will continue to think about how to solve this kind of problem.

@xiaobiaozhao
Copy link
Contributor

I think this is the best solution for now, merge this pr first, and then contionue to look for new solutions.

@caipengbo
Copy link
Contributor Author

caipengbo commented Sep 23, 2022

Thanks all for the great discussion, I'll explain the problems with this implementation in #904

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid using Get interface to obtain Metadata when iterating data during migration
6 participants