Avoid using Get interface to obtain Metadata when iterating data during migration #904
Comments
Yes, good catch.
I have tested and confirmed that this is an issue. The fix works and can speed up migration by 10-100 times. Looking forward to your PR.
More than 10 times? Are you serious?
The original purpose of using the Get interface was this: the snapshot is created at the beginning of the migration. If a slot takes a long time to migrate, users may change the number of subkeys in complex types or change the expiration time in the meantime. If we use the metadata from a very old snapshot to determine the expiration time, we may make a mistake, and the subkeys will not be migrated to the target. The WAL only records metadata changes, not subkeys, so in this case the subkeys will be completely lost. However, as we discussed in #906, even if we get the latest snapshot, we still can't prevent users from modifying the metadata of complex types, so we can still misjudge. As of now, this problem is unavoidable, and we welcome discussion of effective solutions.
Motivation
When migrating data, I found that for the same number of keys (about 100,000, i.e. 10W), the migration time varied widely: sometimes very short, sometimes very long. When the migration took a long time, the CPU was fully utilized.
Looking at the flame graph, we can see a deep call stack on the left. Zooming in on that stack:
When we parsed and sent the full data (SlotMigrate::SendSnapshot()), we used the rocksdb::DB::Get() interface to get the metadata, which was CPU-intensive.
How much does this affect migration speed?
rocksdb::DB::Get() to read disk, which greatly affects the performance!
Solution
In fact, we do not need to get the value through the rocksdb::DB::Get() interface. When we iterate over the data, each key comes with its value, so we can read the value directly from the iterator and then decode it into metadata.
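To make the difference concrete, here is a minimal, self-contained sketch of the two access patterns. It is not the Kvrocks code: FakeStore, CollectWithGet and CollectFromIterator are hypothetical names, and a std::map stands in for the RocksDB column family. In real RocksDB the second pattern corresponds to reading iter->value() while looping over a rocksdb::Iterator, instead of calling rocksdb::DB::Get() for each key.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for the metadata column family.
class FakeStore {
 public:
  void Put(const std::string& key, const std::string& value) { data_[key] = value; }

  // Point lookup. Counts calls so the two strategies can be compared.
  bool Get(const std::string& key, std::string* value) {
    ++get_calls_;
    auto it = data_.find(key);
    if (it == data_.end()) return false;
    *value = it->second;
    return true;
  }

  const std::map<std::string, std::string>& data() const { return data_; }
  std::size_t get_calls() const { return get_calls_; }

 private:
  std::map<std::string, std::string> data_;
  std::size_t get_calls_ = 0;
};

using Entries = std::vector<std::pair<std::string, std::string>>;

// Pattern described in the issue: iterate the keys, then look each key up again.
Entries CollectWithGet(FakeStore& store) {
  Entries out;
  for (const auto& kv : store.data()) {
    std::string value;
    if (store.Get(kv.first, &value)) out.emplace_back(kv.first, value);
  }
  return out;
}

// Proposed pattern: the iterator already holds the value, so no lookup is needed.
Entries CollectFromIterator(const FakeStore& store) {
  Entries out;
  for (const auto& kv : store.data()) out.emplace_back(kv.first, kv.second);
  return out;
}
```

Both functions return the same entries, but the second one performs no point lookups. Note that in this sketch the two reads always agree because nothing modifies the store in between; the snapshot-staleness concern raised in the comments is not modeled here.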