Commit
add doc
morningman committed May 12, 2022
1 parent 8aa6e7c commit 2bb1e36
Showing 2 changed files with 25 additions and 1 deletion.
12 changes: 12 additions & 0 deletions docs/en/admin-manual/multi-tenant.md
@@ -133,6 +133,18 @@ Node resource division refers to setting tags for BE nodes in a Doris cluster, a
In this way, we achieve physical resource isolation for different users' queries by dividing nodes into groups and restricting which groups each user may use. Furthermore, we can create different users for different business departments and restrict each user to a specific resource group, so as to avoid resource interference between departments. For example, suppose a business table in the cluster needs to be shared by all 9 business departments, but we want to avoid resource preemption between departments as much as possible. We can create 3 replicas of this table and store them in 3 resource groups. Next, we create 9 users for the 9 business departments and restrict every 3 users to one resource group. In this way, the number of departments competing for the same resources is reduced from 9 to 3.
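
The arithmetic of the 9-department example can be checked with a small sketch. This is only a toy model in Python, not Doris code; the group names and user names are hypothetical.

```python
from collections import Counter

# Hypothetical names: three resource groups and nine department users.
groups = ["group_a", "group_b", "group_c"]
users = [f"dept_user_{i}" for i in range(1, 10)]

# Restrict every three consecutive users to one resource group.
users_per_group = len(users) // len(groups)
assignment = {user: groups[i // users_per_group] for i, user in enumerate(users)}

# Count how many users compete for resources inside each group.
contention = Counter(assignment.values())
print(dict(contention))
```

Each group ends up serving 3 users, so a department only competes with 2 others instead of 8.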

Resource groups can also be used to isolate online and offline tasks. For example, we can divide nodes into two resource groups, Online and Offline. The table data is still stored as 3 replicas, of which 2 are stored in the Online resource group and 1 is stored in the Offline resource group. The Online resource group is mainly used for high-concurrency, low-latency online data services, while large queries or offline ETL operations can be executed on nodes in the Offline resource group. This makes it possible to provide online and offline services simultaneously in a single unified cluster.
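
As a rough illustration of the Online/Offline layout, the sketch below models one piece of table data with 2 replicas in the Online group and 1 in the Offline group, and lists which replicas a user restricted to a given group can read. It is a toy model in Python; the host names `be1` to `be3` and the helper `readable_replicas` are hypothetical.

```python
# Hypothetical replica layout: BE host -> resource group tag.
replicas = {"be1": "online", "be2": "online", "be3": "offline"}


def readable_replicas(user_tag):
    """Return the hosts whose replica a user limited to `user_tag` can read."""
    return sorted(host for host, tag in replicas.items() if tag == user_tag)


online_hosts = readable_replicas("online")    # serves low-latency queries
offline_hosts = readable_replicas("offline")  # serves large queries and ETL
print(online_hosts, offline_hosts)
```

Because the two host sets do not overlap, heavy offline work never runs on the nodes that serve online traffic.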

4. Resource group assignment for load jobs

The resources used by load jobs (including insert, broker load, routine load, stream load, etc.) can be divided into two parts:
1. Computing resources: responsible for reading the data source, transforming the data, and distributing it.
2. Write resources: responsible for encoding and compressing the data and writing it to disk.

The write resources must be the nodes where the replicas are located, whereas the computing work can in theory be done on any node. Therefore, resource groups are assigned to a load job in two steps:
1. Use the user-level resource tag to limit the resource groups that the computing resources can use.
2. Use the resource tag of the replica to limit the resource groups that the write resources can use.

So if you want all the resources used by a load operation to be limited to the resource group where the data is located, you only need to set the user-level resource tag to the same value as the resource tag of the replica.
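
The two-step rule can be sketched as a toy model. This is Python written for illustration, not Doris internals; the `Backend` class, the helper functions, and the host and group names are all hypothetical. It assumes that an empty user tag set means the user is not restricted to any group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    host: str  # hypothetical BE host name
    tag: str   # resource group tag of this BE


def compute_candidates(backends, user_tags):
    """BEs allowed to read, transform and distribute data for a load job.

    Assumption: an empty user tag set means no restriction.
    """
    if not user_tags:
        return list(backends)
    return [b for b in backends if b.tag in user_tags]


def write_nodes(backends, replica_hosts):
    """BEs that must encode, compress and write: those holding a replica."""
    return [b for b in backends if b.host in replica_hosts]


def load_stays_in_data_groups(backends, user_tags, replica_hosts):
    """True if every compute candidate is in a group that also holds a replica."""
    data_groups = {b.tag for b in write_nodes(backends, replica_hosts)}
    return all(b.tag in data_groups
               for b in compute_candidates(backends, user_tags))


backends = [
    Backend("be1", "group_a"),
    Backend("be2", "group_a"),
    Backend("be3", "group_b"),
]
replica_hosts = {"be1", "be2"}  # replicas live only in group_a

same_tag = load_stays_in_data_groups(backends, {"group_a"}, replica_hosts)
other_tag = load_stays_in_data_groups(backends, {"group_b"}, replica_hosts)
no_tag = load_stays_in_data_groups(backends, set(), replica_hosts)
print(same_tag, other_tag, no_tag)
```

Only the case where the user tag matches the replica tag keeps the whole load job inside the group that stores the data; with a different tag or no tag, computing work may land on nodes outside that group.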

## Single query resource limit

14 changes: 13 additions & 1 deletion docs/zh-CN/admin-manual/multi-tenant.md
@@ -134,6 +134,18 @@ FE does not take part in processing or computing user data, so its resource consumption

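Resource groups can also be used to isolate online and offline tasks. For example, we can divide nodes into two resource groups, Online and Offline. The table data is still stored as 3 replicas, of which 2 are stored in the Online resource group and 1 is stored in the Offline resource group. The Online resource group is mainly used for high-concurrency, low-latency online data services, while large queries or offline ETL operations can be executed on nodes in the Offline resource group. This makes it possible to provide online and offline services simultaneously in a single unified cluster.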

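4. Resource group assignment for load jobs

The resources used by load jobs (including insert, broker load, routine load, stream load, etc.) can be divided into two parts:
1. Computing resources: responsible for reading the data source, transforming the data, and distributing it.
2. Write resources: responsible for encoding and compressing the data and writing it to disk.

The write resources must be the nodes where the data replicas are located, whereas the computing work can in theory be done on any node. Therefore, resource groups are assigned to a load job in two steps:
1. Use the user-level resource tag to limit the resource groups that the computing resources can use.
2. Use the resource tag of the replica to limit the resource groups that the write resources can use.

So if you want all the resources used by a load operation to be limited to the resource group where the data is located, you only need to set the user-level resource tag to the same value as the resource tag of the replica.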

## Single query resource limit

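The resource group method described earlier provides resource isolation and limits at the node level. Within a resource group, however, resource preemption can still occur. Take the earlier example in which 3 business departments are placed in the same resource group: although this reduces the degree of resource competition, queries from these 3 departments can still affect one another.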
@@ -217,4 +229,4 @@ Tag division and CPU limits are new features in version 0.15. To ensure it is possible, from

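Once the data has been redistributed, we can start setting resource tag permissions for users. By default, a user's `resource_tags.location` property is empty, which means the user can access BEs with any tag, so the preceding steps do not affect normal queries from existing users. When the `resource_tags.location` property is non-empty, the user is restricted to BEs with the specified tags.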

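With the 4 steps above, we can start using the resource division feature fairly smoothly after upgrading an existing cluster.
With the 4 steps above, we can start using the resource division feature fairly smoothly after upgrading an existing cluster.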
