-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FIX](resize) fix array and map offsets resize with default value #25669
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
// specific language governing permissions and limitations | ||
// under the License. | ||
|
||
#include <gtest/gtest-message.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: 'gtest/gtest-message.h' file not found [clang-diagnostic-error]
#include <gtest/gtest-message.h>
^
return dataTypes; | ||
} | ||
|
||
TEST(ResizeTest, ScalaTypeTest) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: all parameters should be named in a function [readability-named-parameter]
TEST(ResizeTest, ScalaTypeTest) { | |
TEST(ResizeTest /*unused*/, ScalaTypeTest /*unused*/) { |
} | ||
} | ||
|
||
TEST(ResizeTest, ArrayTypeTest) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: all parameters should be named in a function [readability-named-parameter]
TEST(ResizeTest, ArrayTypeTest) { | |
TEST(ResizeTest /*unused*/, ArrayTypeTest /*unused*/) { |
} | ||
} | ||
|
||
TEST(ResizeTest, ArrayTypeWithValuesTest) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: all parameters should be named in a function [readability-named-parameter]
TEST(ResizeTest, ArrayTypeWithValuesTest) { | |
TEST(ResizeTest /*unused*/, ArrayTypeWithValuesTest /*unused*/) { |
col_a->insert(af); | ||
col_a->insert(af); | ||
|
||
col_a->resize(10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: 10 is a magic number; consider replacing it with a named constant [readability-magic-numbers]
col_a->resize(10);
^
MutableColumnPtr b = a->create_column(); | ||
b->insert_range_from(*col_a, 0, 10); | ||
EXPECT_EQ(b->size(), 10); | ||
ColumnMap* col_map = reinterpret_cast<ColumnMap*>(b.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
ColumnMap* col_map = reinterpret_cast<ColumnMap*>(b.get()); | |
auto* col_map = reinterpret_cast<ColumnMap*>(b.get()); |
} | ||
} | ||
|
||
TEST(ResizeTest, StructTypeTest) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: all parameters should be named in a function [readability-named-parameter]
TEST(ResizeTest, StructTypeTest) { | |
TEST(ResizeTest /*unused*/, StructTypeTest /*unused*/) { |
|
||
DataTypePtr a = std::make_shared<DataTypeStruct>(dataTypes); | ||
auto col_a = a->create_column(); | ||
col_a->resize(10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: 10 is a magic number; consider replacing it with a named constant [readability-magic-numbers]
col_a->resize(10);
^
auto col_a = a->create_column(); | ||
col_a->resize(10); | ||
MutableColumnPtr b = a->create_column(); | ||
b->insert_range_from(*col_a, 0, 10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: 10 is a magic number; consider replacing it with a named constant [readability-magic-numbers]
b->insert_range_from(*col_a, 0, 10);
^
col_a->resize(10); | ||
MutableColumnPtr b = a->create_column(); | ||
b->insert_range_from(*col_a, 0, 10); | ||
EXPECT_EQ(b->size(), 10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: 10 is a magic number; consider replacing it with a named constant [readability-magic-numbers]
EXPECT_EQ(b->size(), 10);
^
run buildall |
be/src/vec/columns/column_array.cpp
Outdated
@@ -422,7 +422,8 @@ void ColumnArray::reserve(size_t n) { | |||
|
|||
//please check you real need size in data column, because it's maybe need greater size when data is string column | |||
void ColumnArray::resize(size_t n) { | |||
get_offsets().resize(n); | |||
auto last_off = get_offsets().back(); | |||
get_offsets().resize_fill(n, last_off); | |||
get_data().resize(n); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this get_data().resize(n)
correct? for example if this array contains 3 element s[1, 2]
[1, 2]
[1, 2]
, so the inner data size is 6 instead of 3. so after doing ::resize(3)
, the inner data size became 3 and lost 3 nested element.Is this semantic right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should not call get_data().resize(n). Just append offsets with the same value.
TeamCity be ut coverage result: |
be/src/vec/columns/column_array.cpp
Outdated
@@ -422,7 +422,8 @@ void ColumnArray::reserve(size_t n) { | |||
|
|||
//please check you real need size in data column, because it's maybe need greater size when data is string column | |||
void ColumnArray::resize(size_t n) { | |||
get_offsets().resize(n); | |||
auto last_off = get_offsets().back(); | |||
get_offsets().resize_fill(n, last_off); | |||
get_data().resize(n); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should not call get_data().resize(n). Just append offsets with the same value.
@@ -422,7 +422,8 @@ void ColumnArray::reserve(size_t n) { | |||
|
|||
//please check you real need size in data column, because it's maybe need greater size when data is string column | |||
void ColumnArray::resize(size_t n) { | |||
get_offsets().resize(n); | |||
auto last_off = get_offsets().back(); | |||
get_offsets().resize_fill(n, last_off); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to check if resize_fill will modify old offsets value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (n > old_size) {
this->reserve(n);
std::fill(t_end(), t_end() + n - old_size, value);
}
this->c_end = this->c_start + this->byte_size(n);
be/src/vec/columns/column_map.cpp
Outdated
@@ -450,7 +450,8 @@ void ColumnMap::reserve(size_t n) { | |||
} | |||
|
|||
void ColumnMap::resize(size_t n) { | |||
get_offsets().resize(n); | |||
auto last_off = get_offsets().back(); | |||
get_offsets().resize_fill(n, last_off); | |||
keys_column->resize(n); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should not call resize on keys and values column
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
@@ -422,7 +422,8 @@ void ColumnArray::reserve(size_t n) { | |||
|
|||
//please check you real need size in data column, because it's maybe need greater size when data is string column | |||
void ColumnArray::resize(size_t n) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there the same problem for ColumnStruct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
struct has no offsets array so do not has this problem
(From new machine)TeamCity pipeline, clickbench performance test result: |
run p1 |
run buildall |
run compile |
run beut |
run clickbench |
(From new machine)TeamCity pipeline, clickbench performance test result: |
TeamCity be ut coverage result: |
run p0 |
run p1 |
run external |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR approved by anyone and no changes requested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR approved by at least one committer and no changes requested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Proposed changes
Issue Number: close #xxx
fix array and map offsets resize with default value
if we create_column() then resize() , offsets in array\map may has wrong default value to push
Further comments
If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...