Reload sharding data #268

Merged

Conversation

DifferentialOrange
Member

@DifferentialOrange DifferentialOrange commented Mar 24, 2022

deps: bump cartridge dependency

Bump the cartridge rock dependency to 2.7.4 in tests. Version 2.7.4 fixes
a critical bug in which cartridge hot reload killed the box.watchable
fiber and caused Tarantool to segfault [1].

  1. Don't kill box.watchable system fiber on reload cartridge#1741

crud: do not change input tuple object

If a crud request takes a tuple as an input argument (insert, upsert and
replace operations) and its bucket_id is empty, the module fills this
field in place and thereby modifies the caller's tuple. This patch fixes
this behavior.

After this patch, the performance of insert, upsert and replace
decreased by 5%.
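
A minimal sketch of the idea, written in Python rather than the module's Lua and not taken from the patch itself. The name `set_bucket_id` and the field position `BUCKET_ID_FIELDNO` are assumptions for illustration. The point is that the tuple is copied before `bucket_id` is written, which is where an extra cost of this kind would come from.

```python
from typing import Any, List

# Hypothetical 0-based position of bucket_id in the tuple (an assumption).
BUCKET_ID_FIELDNO = 2


def set_bucket_id(tuple_: List[Any], bucket_id: int) -> List[Any]:
    """Return a tuple with bucket_id filled in, leaving the input untouched."""
    if tuple_[BUCKET_ID_FIELDNO] is not None:
        # bucket_id is already set, so nothing needs to be copied.
        return tuple_
    # Shallow copy: the caller's object is never written to.
    filled = list(tuple_)
    filled[BUCKET_ID_FIELDNO] = bucket_id
    return filled


original = [1, "Alice", None]
filled = set_bucket_id(original, 477)
```

Here `original` still has an empty `bucket_id` after the call, while `filled` carries the computed value.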

ddl: store sharding info hashes on storage

Compute and store sharding key and sharding func hashes on storages.
Hashes are updated with on_replace triggers.
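
The trigger-based update can be sketched as follows, in Python rather than Lua. Everything here is an assumption made for illustration: the class `StorageMetadataHash`, the record layout, and the use of CRC32 over JSON as the hash. The actual hash function and storage layout used by the module are not stated in this description.

```python
import json
import zlib
from typing import Any, Dict, Optional


def compute_hash(value: Any) -> int:
    # Stand-in hash (an assumption): CRC32 over a canonical JSON encoding.
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return zlib.crc32(encoded)


class StorageMetadataHash:
    """Keeps one sharding key hash per space, refreshed on every replace."""

    def __init__(self) -> None:
        self._hashes: Dict[str, int] = {}

    def on_replace(self, old: Optional[dict], new: Optional[dict]) -> None:
        # Mirrors the (old, new) shape of an on_replace trigger.
        if new is None:
            # The record was deleted: drop the stored hash, if any.
            if old is not None:
                self._hashes.pop(old["space_name"], None)
            return
        self._hashes[new["space_name"]] = compute_hash(new["sharding_key"])

    def get(self, space_name: str) -> Optional[int]:
        return self._hashes.get(space_name)


cache = StorageMetadataHash()
cache.on_replace(None, {"space_name": "customers", "sharding_key": ["id"]})
```

Because the hash is recomputed inside the trigger, a reader of the cache never has to rescan the sharding definitions on the request path.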

ddl: rename sharding_metadata_cache

Rename sharding_metadata_cache to router_metadata_cache to distinguish it
from storage_metadata_hash.

ddl: fetch sharding info hashes to router

Fetch sharding info hashes to router on ddl schema load. Hashes are
stored in router metadata cache together with sharding info.

ddl: fail on sharding info mismatch

Return error if router sharding info differs from storage sharding info.
Comparison is based on sharding hash values. Hashes are provided with
each relevant request.

Hashes are extracted together with sharding key and sharding func
definitions on router during request execution.

After this patch, the performance of insert requests decreased by 5%,
the performance of select requests decreased by 1.5%.
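
A hedged sketch of the comparison described above, in Python rather than Lua. The function name `check_sharding_hash`, the error class, and the dictionary keys are hypothetical; only the idea that the request carries the router's hashes and the storage compares them with its own comes from this description.

```python
from typing import Dict, Optional


class ShardingHashMismatchError(Exception):
    """Raised when router and storage disagree about sharding info."""


def check_sharding_hash(
    request_hashes: Dict[str, Optional[int]],
    storage_hashes: Dict[str, Optional[int]],
) -> None:
    # Compare both hashes; any difference means the router's info is stale
    # (or the storage's info changed after the router loaded it).
    for kind in ("sharding_key_hash", "sharding_func_hash"):
        if request_hashes.get(kind) != storage_hashes.get(kind):
            raise ShardingHashMismatchError(
                "Sharding hash mismatch for " + kind
            )


storage = {"sharding_key_hash": 111, "sharding_func_hash": 222}
check_sharding_hash(dict(storage), storage)  # equal hashes: no error
```

Comparing two small integers per request is cheap, but the hashes still have to be extracted and sent with every call, which fits the small slowdown reported above.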

ddl: reload and retry on sharding info mismatch

If a sharding info mismatch happens, sharding info is reloaded on the
router. After that, the request is retried with the new sharding info
(except for pairs requests: due to their nature, they must be retried
manually).

There are no detectable performance drops introduced in this patch.
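
The reload-and-retry flow can be sketched like this, in Python rather than Lua. The helper `call_with_reload`, the error class, and the default of three retries are assumptions for illustration; the description does not state the actual retry limit.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class ShardingHashMismatchError(Exception):
    """Raised by a request when router and storage hashes differ."""


def call_with_reload(
    request: Callable[[], T],
    reload_sharding_info: Callable[[], None],
    max_retries: int = 3,  # hypothetical limit
) -> T:
    for attempt in range(max_retries + 1):
        try:
            return request()
        except ShardingHashMismatchError:
            if attempt == max_retries:
                # Reloading did not help: surface the error to the caller.
                raise
            # Refresh the router's view, then try the same request again.
            reload_sharding_info()
    raise AssertionError("unreachable")
```

A streaming call such as pairs cannot be wrapped this way once part of the result has been handed to the caller, which is presumably why it has to be retried manually.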

ddl: deprecate manual sharding schema reload

Since sharding schema reloads are expected to be processed automatically
after this patchset, there should be no ordinary cases where a user needs
to reload sharding info manually. Thus the methods for manual sharding
schema reload are deprecated and will be removed in future releases.

I didn't forget about

  • Tests
  • Changelog
  • Documentation

Closes #212

@DifferentialOrange DifferentialOrange changed the title from "Differential orange/gh 212 reload sharding keys" to "Reload sharding data" Mar 24, 2022
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-212-reload-sharding-keys branch from 4381677 to 67da2cf Compare March 25, 2022 15:46
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-212-reload-sharding-keys branch 10 times, most recently from 8f5ca7c to 18657b4 Compare April 18, 2022 07:36
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-212-reload-sharding-keys branch from 18657b4 to ce8b497 Compare April 18, 2022 13:28
Bump `cartridge` rock dependency to 2.7.4 in tests. Version 2.7.4 fixes
critical bug related to cartridge hot reload killing `box.watchable`
fiber and causing Tarantool to segfault [1].

1. tarantool/cartridge#1741
If crud request uses tuple as input argument (insert, upsert and replace
operations) and its bucket_id is empty, the module will fill this field
and damage input argument tuple. This patch fixes this behavior.

After this patch, performance of insert, upsert and replace has
decreased by 5%.

Part of #212
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-212-reload-sharding-keys branch from ce8b497 to 6aaa008 Compare April 18, 2022 19:39
@DifferentialOrange DifferentialOrange marked this pull request as ready for review April 18, 2022 19:42
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-212-reload-sharding-keys branch 3 times, most recently from 2ba2e2c to 544618a Compare April 18, 2022 20:20
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-212-reload-sharding-keys branch 2 times, most recently from 128a338 to 2053ff8 Compare April 19, 2022 07:22
@DifferentialOrange
Member Author

I have responded to all current comments with comment, fix or both.

@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-212-reload-sharding-keys branch from 2053ff8 to f28a4c7 Compare April 19, 2022 15:58
Member

@Totktonada Totktonada left a comment

Thanks for the patchset!

I have no objections (aside from several nits above). Feel free to merge.

I think that we should revisit our implementation of schema reloading in the future and make it less coupled with other logic. So I assigned #253 to you as the first step in this direction.

@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-212-reload-sharding-keys branch from f28a4c7 to ccb467d Compare April 20, 2022 07:51
Compute and store sharding key and sharding func hashes on storages.
Hashes are updated with on_replace triggers.

Part of #212
Rename sharding_metadata_cache to router_metadata_cache to distinguish it
from storage_metadata_hash.

Part of #212
Fetch sharding info hashes to router on ddl schema load. Hashes are
stored in router metadata cache together with sharding info.

Part of #212
Return error if router sharding info differs from storage sharding info.
Comparison is based on sharding hash values. Hashes are provided with
each relevant request.

Hashes are extracted together with sharding key and sharding func
definitions on router during request execution.

After this patch, the performance of insert requests decreased by 5%,
the performance of select requests decreased by 1.5%.

Part of #212
If a sharding info mismatch happens, sharding info is reloaded on the
router. After that, the request is retried with the new sharding info
(except for pairs requests: due to their nature, they must be retried
manually).

There are no detectable performance drops introduced in this patch.

Closes #212
Since sharding schema reloads are expected to be processed automatically
after this patchset, there should be no ordinary cases where a user needs
to reload sharding info manually. Thus the methods for manual sharding
schema reload are deprecated and will be removed in future releases.

Follows up #212
@DifferentialOrange DifferentialOrange force-pushed the DifferentialOrange/gh-212-reload-sharding-keys branch from ccb467d to b751c96 Compare April 20, 2022 08:22
@DifferentialOrange DifferentialOrange merged commit 093868d into master Apr 20, 2022
@DifferentialOrange DifferentialOrange deleted the DifferentialOrange/gh-212-reload-sharding-keys branch April 20, 2022 09:30
DifferentialOrange added a commit that referenced this pull request May 6, 2022
After sharding hash info comparison was introduced [1], requests with ddl
and an explicit bucket_id in options started to fail with a
"Sharding hash mismatch" error. It affected the following methods:
- insert
- insert_object
- replace
- replace_object
- upsert
- upsert_object
- count

The situation is as follows. Due to a code mistake, the router did not
pass a sharding hash with a request if bucket_id was specified. If there
was any ddl information for the space on the storage, this caused a hash
mismatch error. Since a sharding info reload couldn't fix the broken hash
extraction, the request failed after a number of retries. This patch
fixes this behavior by skipping hash comparison if sharding info wasn't
used (we already do it in other methods).

1. #268

Closes #278
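
The fix described in this commit message can be sketched as follows, in Python rather than Lua. The function `build_request_opts`, the option name `skip_sharding_hash_check`, and the dictionary keys are hypothetical names used only for illustration. The idea shown is the one stated above: when bucket_id is given explicitly, sharding info is not used, so the storage is told not to compare hashes.

```python
from typing import Any, Callable, Dict, List, Optional


def build_request_opts(
    tuple_: List[Any],
    explicit_bucket_id: Optional[int],
    sharding_info: Dict[str, Any],
    calc_bucket_id: Callable[[List[Any], Dict[str, Any]], int],
) -> Dict[str, Any]:
    if explicit_bucket_id is not None:
        # Sharding info was not used, so there is no hash to send and
        # nothing meaningful for the storage to compare against.
        return {
            "bucket_id": explicit_bucket_id,
            "skip_sharding_hash_check": True,
        }
    # Sharding info was used: pass its hashes along for comparison.
    return {
        "bucket_id": calc_bucket_id(tuple_, sharding_info),
        "sharding_key_hash": sharding_info["sharding_key_hash"],
        "sharding_func_hash": sharding_info["sharding_func_hash"],
        "skip_sharding_hash_check": False,
    }
```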
Successfully merging this pull request may close these issues.

Make automated invalidation of caches on router on schema reload or ddl sharding keys update
2 participants