Closed
Labels: api-review, area/sql/function, kind/design
Description
Support Bitmap Intersect
Add support for the aggregate function bitmap_intersect. It is mainly used to compute the intersection of bitmaps across grouped data.
bitmap_intersect
Calculates the intersection of bitmap columns and returns a bitmap object.
bitmap_intersect(expr)
Parameters
The expr column type must be bitmap.
Return value
bitmap object
Example
table schema
create table bitmap_intersect_test (
tag varchar(20),
user_id bitmap bitmap_union
)
AGGREGATE KEY(tag)
DISTRIBUTED BY HASH(tag) BUCKETS 3;
Find the users who have all three tags a, b, and c:
select bitmap_to_string(bitmap_intersect(user_id)) from
(
select bitmap_union(user_id) user_id from bitmap_intersect_test
where tag in ('a', 'b', 'c')
group by tag
) a
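To make the semantics of the example query concrete, here is a minimal sketch that models it in plain C++. It is not engine code: `Bitmap` is a `std::set` standing in for a real bitmap type, and `union_by_tag` and `intersect_all` are hypothetical helper names introduced only for this illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Assumption: a std::set of ids stands in for a real bitmap type.
using Bitmap = std::set<unsigned>;

// Models the inner query: keep rows whose tag is wanted, group by tag,
// and union the user ids of each group (the role of bitmap_union).
std::map<std::string, Bitmap> union_by_tag(
        const std::vector<std::pair<std::string, unsigned>>& rows,
        const std::set<std::string>& wanted_tags) {
    std::map<std::string, Bitmap> groups;
    for (const auto& row : rows) {
        if (wanted_tags.count(row.first) > 0) {
            groups[row.first].insert(row.second);
        }
    }
    return groups;
}

// Models the outer query: intersect the per-tag bitmaps
// (the role of bitmap_intersect). Returns an empty set if there are no groups.
Bitmap intersect_all(const std::map<std::string, Bitmap>& groups) {
    Bitmap result;
    bool first = true;
    for (const auto& group : groups) {
        if (first) {
            result = group.second;  // the first group seeds the result
            first = false;
            continue;
        }
        Bitmap kept;
        for (unsigned id : result) {
            if (group.second.count(id) > 0) {
                kept.insert(id);
            }
        }
        result = std::move(kept);
    }
    return result;
}
```

With rows (a,1), (a,2), (b,2), (b,3), (c,2), (c,4), only user 2 carries all three tags. Note that a tag with no rows produces no group at all, so the query intersects only the tags that actually occur in the table.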
Design
Semantic analysis
The argument (child expression) of bitmap_intersect must be of type bitmap.
class FunctionCallExpr {
    void analyze() {
        if (fnName.equals("bitmap_intersect")) {
            ...
            // Reject any argument that is not a bitmap.
            if (!fn.getChild(0).isBitmapType()) {
                throw new AnalysisException("the child type of " + fnName + " must be bitmap");
            }
            ...
        }
    }
}
Function implementation
The function for each stage of `bitmap_intersect` is declared in the function set.
Function definition
FunctionName: bitmap_intersect,
InputType: bitmap,
OutputType: bitmap,
IntermediateType: varchar
init
Directly reuse the current bitmap init function
update
merge
Intersect the bitmaps of each group on the current node.
void BitmapFunctions::bitmap_intersect(FunctionContext* ctx, const StringVal& src, StringVal* dst) {
    // A NULL input does not change the accumulated result.
    if (src.is_null) {
        return;
    }
    auto dst_bitmap = reinterpret_cast<BitmapValue*>(dst->ptr);
    // Zero length means src points to an in-memory aggregate object;
    // otherwise src holds a serialized bitmap.
    if (src.len == 0) {
        (*dst_bitmap) &= *reinterpret_cast<BitmapValue*>(src.ptr);
    } else {
        (*dst_bitmap) &= BitmapValue((char*) src.ptr);
    }
}
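The accumulate step above can be sketched outside the engine as follows. This is an illustration under stated assumptions, not the proposed implementation: `Bitmap` is a `std::set` standing in for `BitmapValue`, and `IntersectState` and `update` are hypothetical names. The issue does not describe how the initial state interacts with `&=`; because an empty bitmap intersected with anything stays empty, this sketch assumes the state is seeded from the first input.

```cpp
#include <cassert>
#include <set>

// Assumption: a std::set of ids stands in for BitmapValue.
using Bitmap = std::set<unsigned>;

// In-place intersection, playing the role of BitmapValue's operator&=.
void intersect_in_place(Bitmap& dst, const Bitmap& src) {
    for (auto it = dst.begin(); it != dst.end();) {
        if (src.count(*it) == 0) {
            it = dst.erase(it);  // drop ids that are missing from src
        } else {
            ++it;
        }
    }
}

// Hypothetical aggregation state. The seeded flag records whether any
// input has been absorbed yet (an assumption of this sketch).
struct IntersectState {
    bool seeded = false;
    Bitmap value;
};

// Absorbs one input: the first input is copied, later inputs are intersected.
void update(IntersectState& state, const Bitmap& src) {
    if (!state.seeded) {
        state.value = src;
        state.seeded = true;
        return;
    }
    intersect_in_place(state.value, src);
}
```

Feeding {1,2,3}, {2,3,4}, and {3,5} through `update` leaves the state holding {3}. Without the seeding step, a state that starts as an empty set would remain empty no matter what it receives.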
serialize
finalize
Directly reuse the current bitmap serialization function
Query plan
mysql> explain select bitmap_intersect(user_id) from (select bitmap_union(user_id) user_id from bitmap_intersect_test where tag in ('a', 'b', 'c') group by tag ) a;
+----------------------------------------------------------------------------------------+
| Explain String |
+----------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:<slot 8> |
| PARTITION: UNPARTITIONED |
| |
| RESULT SINK |
| |
| 6:AGGREGATE (merge finalize) |
| | output: bitmap_intersect(<slot 7>) |
| | group by: |
| | tuple ids: 5 |
| | |
| 5:EXCHANGE |
| tuple ids: 4 |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: <slot 2> |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 05 |
| UNPARTITIONED |
| |
| 2:AGGREGATE (update serialize) |
| | output: bitmap_intersect(<slot 5>) |
| | group by: |
| | tuple ids: 4 |
| | |
| 4:AGGREGATE (merge finalize) |
| | output: bitmap_union(<slot 3>) |
| | group by: <slot 2> |
| | tuple ids: 2 |
| | |
| 3:EXCHANGE |
| tuple ids: 1 |
| |
| PLAN FRAGMENT 2 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 03 |
| HASH_PARTITIONED: <slot 2> |
| |
| 1:AGGREGATE (update serialize) |
| | STREAMING |
| | output: bitmap_union(`user_id`) |
| | group by: `tag` |
| | tuple ids: 1 |
| | |
| 0:OlapScanNode |
| TABLE: bitmap_intersect_test |
| PREAGGREGATION: ON |
| PREDICATES: `tag` IN ('a', 'b', 'c') |
| partitions=1/1 |
| rollup: bitmap_intersect_test |
| tabletRatio=100/100 |
| numNodes=6 |
| tuple ids: 0 |
+----------------------------------------------------------------------------------------+
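The plan runs bitmap_intersect in two phases: node 2 (update serialize) intersects the per-tag bitmaps that were hash-partitioned to each node, and node 6 (merge finalize) intersects those partial results. The sketch below simulates that split to show why it gives the same answer as a single pass: intersection is associative and commutative. All names here (`Partial`, `absorb`, `local_phase`, `final_phase`) are hypothetical, `std::set` stands in for a bitmap, and skipping nodes that received no groups is an assumption of this sketch rather than something the issue specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// Assumption: a std::set of ids stands in for a real bitmap type.
using Bitmap = std::set<unsigned>;

// Hypothetical partial aggregation state held by one node.
struct Partial {
    bool seeded = false;  // false until the node has absorbed an input
    Bitmap value;
};

// Absorbs one bitmap: the first input seeds the state, later ones intersect.
void absorb(Partial& state, const Bitmap& src) {
    if (!state.seeded) {
        state.value = src;
        state.seeded = true;
        return;
    }
    Bitmap kept;
    for (unsigned id : state.value) {
        if (src.count(id) > 0) {
            kept.insert(id);
        }
    }
    state.value = kept;
}

// Phase 1: route each tag's bitmap to a node by hashing the tag,
// and let every node intersect what it receives.
std::vector<Partial> local_phase(const std::map<std::string, Bitmap>& per_tag,
                                 std::size_t num_nodes) {
    assert(num_nodes > 0);
    std::vector<Partial> partials(num_nodes);
    std::hash<std::string> hasher;
    for (const auto& entry : per_tag) {
        absorb(partials[hasher(entry.first) % num_nodes], entry.second);
    }
    return partials;
}

// Phase 2: intersect the partial results, skipping nodes that saw no input.
Bitmap final_phase(const std::vector<Partial>& partials) {
    Partial total;
    for (const Partial& partial : partials) {
        if (partial.seeded) {
            absorb(total, partial.value);
        }
    }
    return total.value;
}
```

For per-tag bitmaps a={1,2,3}, b={2,3,4}, c={2,3,9}, the result is {2,3} regardless of how many nodes the tags are spread across.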