-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storages: Make DMFile ready for new column indexes/types #8756
Conversation
5d665a7
to
683de5a
Compare
683de5a
to
a03986d
Compare
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
e19a5ea
to
7e3b18c
Compare
/run-all-tests |
/run-unit-test |
#else | ||
// ExtendColumnStat is not enabled yet because it cause downgrade compatibility, wait | ||
// to be released with other binary format changes. | ||
writeExtendColumnStatToBuffer(tmp_buffer), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we write the new format data at the end (and also write old format in the old place)? In this way it may be possible to keep downgrade compatibility as new format will be regarded as extra data and will be discarded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion! The ColumnStat is small enough to ignore its impact on performance and data size. We can keep it until we actually need an incompatible storage format changed by other components.
I'll change it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some tests, I found the downgrade compatibility can not be satisfied because there is a default branch 😂
tiflash/dbms/src/Storages/DeltaMerge/File/DMFile.cpp
Lines 1055 to 1059 in d9dc96b
default: | |
throw Exception( | |
ErrorCodes::INCORRECT_DATA, | |
"MetaBlockType {} is not recognized", | |
magic_enum::enum_name(handle->type)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can just logging instread of throwing exception here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. What kind of downgrade compatibility do we need to have? How about we first land the code that will not throw error (and will not change the format at all) for a few major versions, and then land the compatible format? In this way we will have downgrade compatibility for 1 major version. I think Vector will not land pingcap/tiflash in 6 months, so that there could be enough time for us to make the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is only downgrade compatibility requirement between patch versions under the same minor version. And no requirement across minor or major versions now. But it is good to have.
I've removed the exception thrown branch in this PR because the correctness of bytes read is protected by checksum. I think ignoring unknown meta block types without exception or logging is an acceptable behavior.
/run-all-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
[LGTM Timeline notifier]Timeline:
|
/hold |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: CalvinNeo, JinheLin, Lloyd-Pottiger The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/unhold |
/run-all-tests |
/run-integration-test |
/run-unit-test |
@JaySon-Huang: Your PR was out of date, I have automatically updated it for you. At the same time I will also trigger all tests for you: /run-all-tests
If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
What problem does this PR solve?
Issue Number: ref #6233, close #8768
Problem Summary:
In the near future, we may add new types (vector type) or new indexes. But currently the ColumnStat is not flexible enough for extending new fields for the index size, array size, etc.
What is changed and how it works?
ExtendColumnStat
meta block under DMFile meta v2, which accept protobuf base ColumnStatColumnStat
to support vec type.size0.dat
) into X.merged fileCheck List
Tests
Side effects
Documentation
Release note