-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: apply column-aware row encoding & enable schema change #8394
Conversation
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
…ncoding-for-table Signed-off-by: Bugen Zhao <i@bugenzhao.com>
// TODO: If we do some `Delete` after schema change, we cannot ensure the encoded value | ||
// with the new version of serializer is the same as the old one, even if they can be | ||
// decoded into the same value. The table is now performing consistency check on the raw | ||
// bytes, so we need to turn off the check here. We may turn it on if we can compare the | ||
// decoded row. | ||
let state_table = | ||
StateTableInner::from_table_catalog_inconsistent_op(table_catalog, store, vnodes).await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @hzxa21 😰
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc. @hzxa21 Sanity check is disabled here
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Do existing e2e/unit tests cover schema change cases? |
Oops, I forgot. I'll add some. |
…ncoding-for-table
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Codecov Report
@@ Coverage Diff @@
## main #8394 +/- ##
==========================================
- Coverage 71.62% 71.61% -0.02%
==========================================
Files 1135 1135
Lines 185458 185541 +83
==========================================
+ Hits 132835 132875 +40
- Misses 52623 52666 +43
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Done. PTAL 🥺 |
} | ||
|
||
pub async fn handlle_conflict<'a, S: StateStore>( | ||
&mut self, | ||
buffer: MaterializeBuffer, | ||
table: &StateTable<S>, | ||
table: &StateTableInner<S, SD>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we extract a trait for the state table such as StateTableRead
and change the type to impl StateTableRead
to erase the S
and SD
here 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a good idea. We can do this refactor in the future.
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job!
pub mod column_aware_row_encoding; | ||
|
||
pub type Result<T> = std::result::Result<T, ValueEncodingError>; | ||
|
||
/// The kind of all possible `ValueRowSerde`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we run fast enough, perhaps we could remove the Basic
encoding in 0.18. (Because the storage's side work has break the compatibility anyway. Of course just for this time 🥵)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking of still using the basic value-encoding for the tables that (temporarily) not supports schema change, like internal tables or normal materialized views (except for Index). The reason is that "column ID" does not make much sense for these tables with immutable schema, and we can save more storage costs.
@@ -41,6 +41,9 @@ message StorageTableDesc { | |||
uint32 retention_seconds = 5; | |||
repeated uint32 value_indices = 6; | |||
uint32 read_prefix_len_hint = 7; | |||
// Whether the table is versioned. If `true`, column-aware row encoding will be used | |||
// to be compatible with schema changes. | |||
bool versioned = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that a table contains both basic encoded and column-aware encoded rows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't consider backward compatibility, then nope. Currently, we follow the version
field of TableCatalog
to decide the encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any concerns, but I may be better to have another senior's approval 😀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
// Decide which serializer to use based on whether the table is versioned or not. | ||
let row_serde = if versioned { | ||
ColumnAwareSerde::new(&column_ids, schema.into()).into() | ||
} else { | ||
BasicSerde::new(&column_ids, schema.into()).into() | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just found that there's something else we need to update in the storage java binding. cc @wenym1 Is it possible to reuse some code of StorageTable
here?
Signed-off-by: Bugen Zhao i@bugenzhao.comI hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Enable column-aware row encoding for
MaterializeExecutor
and allStorageTable
usages.StateTable
will decide which encoding format is used according to whetherversion
field is set. Note that this is only set for regular tables.versioned
field toTableDesc
.StorageTable
will similarly decide the format by this.ALTER TABLE [ADD|DROP] COLUMN
in production.Checklist For Contributors
./risedev check
(or alias,./risedev c
)Checklist For Reviewers
Documentation
Types of user-facing changes
Please keep the types that apply to your changes, and remove the others.
ALTER TABLE [ADD|DROP] COLUMN
Release note
Initial support for
ALTER TABLE [ADD|DROP] COLUMN
on regular tables (without connectors).