Skip to content

Conversation

@zhbinbin
Copy link
Contributor

The original Doris bitmap aggregation function has poor performance on the intersection and union set of bitmap cardinality of more than one billion. There are two reasons for this. The first is that when the bitmap cardinality is large, if the data size exceeds 1g, the network / disk IO time consumption will increase; The second point is that all the sink data of the back-end be instance are transferred to the top node for intersection and union calculation, which leads to the pressure on the top single node and becomes the bottleneck.

My solution is to create a fixed schema table based on the Doris fragmentation rule, and hash fragment the ID range based on the bitmap, that is, cut the ID range vertically to form a small cube. Such bitmap blocks will become smaller and evenly distributed on all back-end be instances. Based on the schema table, some new high-performance udaf aggregation functions are developed. All Scan nodes participate in intersection and union calculation, and top nodes only summarize

The design goal is that the base number of bitmap is more than 10 billion, and the response time of cross union set calculation of 100 dimensional granularity is within 5 s

directoryPath: "contrib/",
children:[],
children:[
"udaf-bitmap-manual.md",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove md

directoryPath: "contrib/",
children:[],
children:[
"udaf-bitmap-manual.md",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

-->


#Bitmap longitudinal cutting udaf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a space at the beginning of title




##Custom udaf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

@EmmyMiao87
Copy link
Contributor

This name of udaf is not accurate. Maybe 'udaf_orthogonal_bitmap' is better? Or?

// specific language governing permissions and limitations
// under the License.

#ifndef DORIS_CONTRIB_UDF_SRC_UDAF_BITMAP_BITMAP_VALUE_H
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this file same as the be/src/util/bitmap_value.h ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it can not be reused, so it is copy, but a small change.


namespace doris_udf {

class CustomBitmapFunctions {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class name should best reflect the meaning of dealing with orthogonal bitmap.

@@ -0,0 +1,209 @@
---
{
"title": "BITMAP正交计算UDAF",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

正交的BITMAP计算UDAF

under the License.
-->

# BITMAP正交计算UDAF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

新udaf需要在doris定义聚合函数时注册函数符号,函数符号通过动态库.so的方式被加载。

### bitmap_orthogonal_intersect

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

首先需要有函数的介绍,就是这个函数的行为是什么?是用来干啥的

求交集函数
bitmap_orthogonal_intersect(bitmap_column, column_to_filter, filter_values)

参数:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

每个参数的介绍是需要包含,每个参数是什么意思的的,比如
第一个参数类型是bitmap,是待求交集的列。


Doris原有的Bitmap聚合函数设计比较通用,但对亿级别以上bitmap大基数的交集和并集计算性能较差。排查后端be的bitmap聚合函数逻辑,发现主要有两个原因。一是当bitmap基数较大时,如数据大小超过1g,网络/磁盘IO处理时间比较长;二是后端be实例在scan数据后全部传输到顶层节点进行求交和并运算,给顶层单节点带来压力,成为处理瓶颈。

解决方案是建表时增加hid列,罐库时hid列按照bitmap列的range划分,并且按hid均匀分桶。这样按range划分的聚合bitmap数据会均匀地分布在所有后端be实例上。在schema表的基础上,优化udaf聚合函数,使其在所有扫描节点参与分布式正交并算,然后在顶层节点进行汇总,如此会大大提高计算效率。
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

可以先总说,解决思路是什么。比如思路是将 bitmap列的值先按照range划分,不同range的值存储在不同的分桶中。保证不同分桶之间的bitmap值是正交的。然后再说怎么详细,最后说为什么这么做可以加速查询

}

void OrthogonalBitmapFunctions::bitmap_count_merge(FunctionContext* context, const StringVal& src, StringVal* dst) {
if (dst->len != sizeof(int64_t)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will dst be bitmap value?

```
libudaf_bitmap.so产出目录:
```
output/contrib/udf/lib/udaf_bitmap/libudaf_bitmap.so
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

名字好像是? libudaf_orthogonal_bitmap.so?

## 源码及编译
源代码:
```
contrib/udf/src/udaf_bitmap/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

名称统一一下,比如 都用udaf_orthogonal_bitmap

定义:
```
drop FUNCTION bitmap_orthogonal_intersect(BITMAP,BIGINT,BIGINT, ...);
CREATE AGGREGATE FUNCTION bitmap_orthogonal_intersect(BITMAP,BIGINT,BIGINT, ...) RETURNS BITMAP INTERMEDIATE varchar(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

注意文档中的名称统一

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个是指哪个名称?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

比如前面都是 orthogonal_bitmap 那么这里也最好是 orthogonal_bitmap

contrib/udf/src/udaf_bitmap/
|-- bitmap_value.h
|-- CMakeLists.txt
|-- custom_bitmap_function.cpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
|-- custom_bitmap_function.cpp
|-- orthogonal_bitmap_function.cpp

@zhbinbin zhbinbin force-pushed the udaf_bitmap branch 3 times, most recently from 9f7314f to d348d33 Compare July 31, 2020 09:57
定义:
```
drop FUNCTION bitmap_orthogonal_intersect(BITMAP,BIGINT,BIGINT, ...);
CREATE AGGREGATE FUNCTION bitmap_orthogonal_intersect(BITMAP,BIGINT,BIGINT, ...) RETURNS BITMAP INTERMEDIATE varchar(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

比如前面都是 orthogonal_bitmap 那么这里也最好是 orthogonal_bitmap


Doris原有的Bitmap聚合函数设计比较通用,但对亿级别以上bitmap大基数的交并集计算性能较差。排查后端be的bitmap聚合函数逻辑,发现主要有两个原因。一是当bitmap基数较大时,如bitmap大小超过1g,网络/磁盘IO处理时间比较长;二是后端be实例在scan数据后全部传输到顶层节点进行求交和并运算,给顶层单节点带来压力,成为处理瓶颈。

解决思路是将bitmap列的值按照range划分,不同range的值存储在不同的分桶中,保证了不同分桶的bitmap值是正交的。这样,数据分布更均匀,一个查询会扫描所有分桶,在每个分桶中将正交的BITMAP进行聚合计算,然后把计算结果传输至顶层节点汇总,如此会大大提高计算效率,解决了顶层单节点计算瓶颈问题。
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这一句有点歧义 在每个分桶中将正交的BITMAP进行聚合计算 是否改为 先分别对不同分桶中的正交bitmap进行聚合计算, 然后顶层节点直接将聚合计算后的值合并汇总,并输出

Doris原有的Bitmap聚合函数设计比较通用,但对亿级别以上bitmap大基数的交并集计算性能较差。排查后端be的bitmap聚合函数逻辑,发现主要有两个原因。一是当bitmap基数较大时,如bitmap大小超过1g,网络/磁盘IO处理时间比较长;二是后端be实例在scan数据后全部传输到顶层节点进行求交和并运算,给顶层单节点带来压力,成为处理瓶颈。

解决思路是将bitmap列的值按照range划分,不同range的值存储在不同的分桶中,保证了不同分桶的bitmap值是正交的。这样,数据分布更均匀,一个查询会扫描所有分桶,在每个分桶中将正交的BITMAP进行聚合计算,然后把计算结果传输至顶层节点汇总,如此会大大提高计算效率,解决了顶层单节点计算瓶颈问题。

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里可以加一个总说,

第一步:建表。这一步主要是为了xxx
第二步:编译 udaf,也就是编译 xxx, 是为了xxx
第三步:将udaf 注册到doris中
第四部:如何使用

然后再针对每项分别说。

optimize docs and code
@EmmyMiao87 EmmyMiao87 added area/udf Issues or PRs related to the UDF good first issue approved Indicates a PR has been approved by one committer. labels Aug 14, 2020
Copy link
Contributor

@EmmyMiao87 EmmyMiao87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@EmmyMiao87 EmmyMiao87 merged commit f924282 into apache:master Aug 19, 2020
acelyc111 pushed a commit to acelyc111/incubator-doris that referenced this pull request Aug 21, 2020
The original Doris bitmap aggregation function has poor performance on the intersection and union set of bitmap cardinality of more than one billion. There are two reasons for this. The first is that when the bitmap cardinality is large, if the data size exceeds 1g, the network / disk IO time consumption will increase; The second point is that all the sink data of the back-end be instance are transferred to the top node for intersection and union calculation, which leads to the pressure on the top single node and becomes the bottleneck.

My solution is to create a fixed schema table based on the Doris fragmentation rule, and hash fragment the ID range based on the bitmap, that is, cut the ID range vertically to form a small cube. Such bitmap blocks will become smaller and evenly distributed on all back-end be instances. Based on the schema table, some new high-performance udaf aggregation functions are developed. All Scan nodes participate in intersection and union calculation, and top nodes only summarize

The design goal is that the base number of bitmap is more than 10 billion, and the response time of cross union set calculation of 100 dimensional granularity is within 5 s.

There are three udaf functions in this commit: orthogonal_bitmap_intersect_count, orthogonal_bitmap_union_count, orthogonal_bitmap_intersect.
@EmmyMiao87 EmmyMiao87 mentioned this pull request Sep 1, 2020
@yangzhg yangzhg mentioned this pull request Feb 9, 2021
eldenmoon pushed a commit to eldenmoon/incubator-doris that referenced this pull request Aug 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. area/udf Issues or PRs related to the UDF

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants