KassieZ committed Nov 18, 2024
1 parent f50f8b5 commit 5448d58
Showing 7 changed files with 77 additions and 98 deletions.
2 changes: 0 additions & 2 deletions common_docs_zh/gettingStarted/what-is-new.mdx
@@ -31,5 +31,3 @@ import Latest from './demo-block/latest.tsx'





44 changes: 6 additions & 38 deletions community/join-community.md
@@ -31,38 +31,6 @@ We have graduated from Apache incubator successfully and become an Top-Level Pro

<hr />

## 🙌 More Developers Join Us
[![Monthly Active Contributors](https://contributor-overtime-api.apiseven.com/contributors-svg?chart=contributorMonthlyActivity&repo=apache/doris)](https://www.apiseven.com/en/contributor-graph?chart=contributorMonthlyActivity&repo=apache/doris)













[![Contributor over time](https://contributor-overtime-api.apiseven.com/contributors-svg?chart=contributorOverTime&repo=apache/doris)](https://www.apiseven.com/en/contributor-graph?chart=contributorOverTime&repo=apache/doris)

















## 🌟 More Stars on Github
<a href="https://star-history.com/#apache/doris&Date">
@@ -87,25 +55,25 @@ We have graduated from Apache incubator successfully and become an Top-Level Pro



##### We deeply appreciate 🔗[community contributors](https://github.com/apache/doris/graphs/contributors) for their contribution to Apache Doris.
**We deeply appreciate 🔗[community contributors](https://github.com/apache/doris/graphs/contributors) for their contribution to Apache Doris.**




<hr />

# Don't Miss Out on the Latest News and Events
## Don't Miss Out on the Latest News and Events

Learn our latest techniques, get inspiration from our rich use cases, and see what the community has been up to!


- ### Join our heated discussions - 💬 [Slack](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2kl08hzc0-SPJe4VWmL_qzrFd2u2XYQA) 📇 [Github](https://github.com/apache/doris)
- Join our heated discussions - 💬 [Slack](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2kl08hzc0-SPJe4VWmL_qzrFd2u2XYQA) 📇 [Github](https://github.com/apache/doris)

- ### Use cases and tech insight - 📭 [Twitter](https://twitter.com/doris_apache)
- Use cases and tech insight - 📭 [Twitter](https://twitter.com/doris_apache)

- ### Come and connect with us - 🌐 [LinkedIn](https://www.linkedin.com/company/doris-apache/)
- Come and connect with us - 🌐 [LinkedIn](https://www.linkedin.com/company/doris-apache/)

- ### Events Videos - ▶️ [YouTube](https://www.youtube.com/@Select_DB) 📺 [Bilibili](https://space.bilibili.com/362350065)
- Events Videos - ▶️ [YouTube](https://www.youtube.com/@Select_DB) 📺 [Bilibili](https://space.bilibili.com/362350065)



6 changes: 3 additions & 3 deletions gettingStarted/what-is-apache-doris.md
@@ -21,15 +21,15 @@ specific language governing permissions and limitations
under the License.
-->

# What's Apache Doris


Apache Doris is an MPP-based real-time data warehouse known for its high query speed. For queries on large datasets, it returns results in under a second. It supports both high-concurrency point queries and high-throughput complex analysis. It can be used for report analysis, ad-hoc queries, unified data warehousing, and data lake query acceleration. On top of Apache Doris, users can build applications for user behavior analysis, A/B testing, log analysis, user profile analysis, and e-commerce order analysis.

Apache Doris, formerly known as Palo, was initially created to support Baidu's ad reporting business. It was officially open-sourced in 2017 and donated by Baidu to the Apache Software Foundation in July 2018, where it was operated by members of the incubator project management committee under the guidance of Apache mentors. In June 2022, Apache Doris graduated from the Apache incubator as a Top-Level Project. By 2024, the Apache Doris community has gathered more than 600 contributors from hundreds of companies in different industries, with over 120 monthly active contributors.

Apache Doris has a wide user base. It has been used in production environments of over 4000 companies worldwide, including giants such as TikTok, Baidu, Cisco, Tencent, and NetEase. It is also widely used across industries from finance, retailing, and telecommunications to energy, manufacturing, medical care, etc.

# Usage Scenarios
## Usage Scenarios

The figure below shows what Apache Doris can do in a data pipeline. Data sources, after integration and processing, are ingested into the Apache Doris real-time data warehouse and offline data lakehouses such as Hive, Iceberg, and Hudi. Apache Doris can be used for the following purposes:

@@ -88,7 +88,7 @@ The query engine of Apache Doris is fully vectorized, with all memory structures

![Query engine](/images/apache-doris-query-engine-2.png)

Apache Doris uses **adaptive query execution** technology to dynamically adjust the execution plan based on runtime statistics. For example, it can generate a runtime filter and push it to the probe side. Specifically, it pushes the filters to the lowest-level scan node on the probe side, which largely reduces the data amount to be processed and increases join performance. The runtime filter of Apache Doriz supports In/Min/Max/Bloom Filter.
Apache Doris uses **adaptive query execution** technology to dynamically adjust the execution plan based on runtime statistics. For example, it can generate a runtime filter and push it to the probe side. Specifically, it pushes the filters to the lowest-level scan node on the probe side, which largely reduces the data amount to be processed and increases join performance. The runtime filter of Apache Doris supports In/Min/Max/Bloom Filter.

The query **optimizer** of Apache Doris is a combination of CBO and RBO. RBO supports constant folding, subquery rewriting, and predicate pushdown while CBO supports join reorder. The Apache Doris CBO is under continuous optimization for more accurate statistics collection and inference as well as a more accurate cost model.
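
To make the runtime-filter step described above concrete, here is a minimal Python sketch of the idea, not Doris's implementation. It assumes integer join keys and models only the In and Min/Max filter types (no Bloom filter); the names `RuntimeFilter`, `build_runtime_filter`, `scan_probe`, and `hash_join` are hypothetical. The filter is built from the build side of a hash join and applied inside the probe-side scan, so rows that cannot join are dropped before they reach the join.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeFilter:
    """Min/max bounds plus an exact IN set taken from the build side."""
    lo: int
    hi: int
    keys: frozenset

    def may_match(self, key: int) -> bool:
        # Cheap range check first, then the exact membership test.
        return self.lo <= key <= self.hi and key in self.keys


def build_runtime_filter(build_keys) -> RuntimeFilter:
    """Summarize the (non-empty) build-side join keys into a filter."""
    keys = frozenset(build_keys)
    return RuntimeFilter(min(keys), max(keys), keys)


def scan_probe(rows, key_of, runtime_filter):
    """Probe-side scan node: drop rows the filter rules out."""
    return [row for row in rows if runtime_filter.may_match(key_of(row))]


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Inner hash join; returns the joined pairs and the number of
    probe rows that survived the scan-level runtime filter."""
    table = {}
    for row in build_rows:
        table.setdefault(build_key(row), []).append(row)
    if not table:
        # Empty build side: nothing can join, so nothing needs scanning.
        return [], 0
    runtime_filter = build_runtime_filter(table.keys())
    survivors = scan_probe(probe_rows, probe_key, runtime_filter)
    joined = [(p, b) for p in survivors for b in table[probe_key(p)]]
    return joined, len(survivors)
```

In this sketch the saving shows up as the second return value: only probe rows whose key appears on the build side leave the scan, which is the reduction in processed data that the paragraph on adaptive query execution refers to.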

Original file line number Diff line number Diff line change
@@ -31,39 +31,6 @@ under the License.




## 🙌 More Developers Join Us

[![Monthly Active Contributors](https://contributor-overtime-api.apiseven.com/contributors-svg?chart=contributorMonthlyActivity&repo=apache/doris)](https://www.apiseven.com/en/contributor-graph?chart=contributorMonthlyActivity&repo=apache/doris)















[![Contributor over time](https://contributor-overtime-api.apiseven.com/contributors-svg?chart=contributorOverTime&repo=apache/doris)](https://www.apiseven.com/en/contributor-graph?chart=contributorOverTime&repo=apache/doris)













## 🌟 More Users Recognize Us

<a href="https://star-history.com/#apache/doris&Date">
@@ -81,14 +48,6 @@ under the License.











**We deeply appreciate the 🔗[community contributors](https://github.com/apache/doris/graphs/contributors) for their strong support of Apache Doris!**


Original file line number Diff line number Diff line change
@@ -243,7 +243,7 @@ CANCEL BUILD INDEX ON table_name (job_id1,jobid_2,...);

:::tip

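`BUILD INDEX` generates an asynchronous task. On each BE, multiple threads execute index build tasks; the thread count can be set with the BE parameter `alter_index_worker_count`, and the default value is 3.
`BUILD INDEX` generates an asynchronous task. On each BE, multiple threads execute index deletion tasks; the thread count can be set with the BE parameter `alter_index_worker_count`, and the default value is 3.

In versions before 2.0.12, `BUILD INDEX` keeps retrying until it succeeds; starting from these two versions, failure and timeout mechanisms prevent endless retries.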

Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
---
{
"title": "Test",
"language": "zh-CN"
}
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->


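## Title

### level -3 heading

#### level 4

1. Before creating an EKS cluster, make sure [the following command-line tools are installed in your environment](https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html)
- Install and configure AWS CLI, the AWS command-line tool.
- Install eksctl, the EKS cluster command-line tool.
- Install kubectl, the Kubernetes cluster command-line tool.
2. Create an EKS cluster. The following two methods are supported:
- [Quickly create an EKS cluster with eksctl](https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/getting-started-eksctl.html)
- [Manually create an EKS cluster with the AWS console and AWS CLI](https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/getting-started-console.html)



1. Add the custom resource StarRocksCluster.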

```Bash
kubectl apply -f https://raw.githubusercontent.com/StarRocks/starrocks-kubernetes-operator/main/deploy/starrocks.com_starrocksclusters.yaml
```

:::tip
zhengwei
:::


Original file line number Diff line number Diff line change
@@ -47,7 +47,7 @@ under the License.

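- Supports phrase queries with `MATCH_PHRASE`
- Supports specifying the word distance `slop`
- Supports phrase+prefix `MATCH_PHRASE_PREFIX`
- Supports phrase + prefix `MATCH_PHRASE_PREFIX`

- Supports regular-expression queries on tokens with `MATCH_REGEXP`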

@@ -175,7 +175,7 @@ table_properties;
<details>
<summary>ignore_above</summary>

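**Specifies the length limit for non-tokenized string indexes (no parser specified)**
<p>- Strings longer than the ignore_above setting are not indexed. For string arrays, ignore_above applies to each array element separately, and string elements longer than ignore_above are not indexed.</p>
<p>- The default is 256, in bytes.</p>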

@@ -243,7 +243,7 @@ CANCEL BUILD INDEX ON table_name (job_id1,jobid_2,...);

:::tip

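`BUILD INDEX` generates an asynchronous task. On each BE, multiple threads execute index build tasks; the thread count can be set with the BE parameter `alter_index_worker_count`, and the default value is 3.

In versions before 2.0.12 and 2.1.4, `BUILD INDEX` keeps retrying until it succeeds; starting from these two versions, failure and timeout mechanisms prevent endless retries. The 3.0 compute-storage decoupled mode does not yet support this command.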

@@ -265,7 +265,7 @@ ALTER TABLE table_name DROP INDEX idx_name;

:::tip

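`DROP INDEX` deletes the index definition, so newly written data no longer writes the index. It also generates an asynchronous task that performs the index deletion; on each BE, multiple threads execute index build tasks, the thread count can be set with the BE parameter `alter_index_worker_count`, and the default value is 3.
`DROP INDEX` deletes the index definition, so newly written data no longer writes the index. It also generates an asynchronous task that performs the index deletion; on each BE, multiple threads execute index deletion tasks, the thread count can be set with the BE parameter `alter_index_worker_count`, and the default value is 3.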

:::

@@ -287,30 +287,30 @@ SELECT * FROM table_name WHERE content MATCH_ALL 'keyword1 keyword2';

-- 2. Full-text phrase matching, done with MATCH_PHRASE
-- 2.1 Rows whose content column contains both keyword1 and keyword2, with keyword2 immediately following keyword1
-- 'keyword1 keyword2', 'wordx keyword1 keyword2', and 'wordx keyword1 keyword2 wordy' match, because they all contain keyword1 keyword2 and keyword2 immediately follows keyword1
-- 'keyword1 wordx keyword2' does not match, because the word wordx sits between keyword1 and keyword2
-- 'keyword2 keyword1' does not match, because the order of keyword1 and keyword2 is reversed
SELECT * FROM table_name WHERE content MATCH_PHRASE 'keyword1 keyword2';

-- 2.2 Rows whose content column contains both keyword1 and keyword2, where the word distance (slop) between keyword1 and keyword2 is at most 3
-- 'keyword1 keyword2', 'keyword1 a keyword2', and 'keyword1 a b c keyword2' all match, because the number of words between keyword1 and keyword2 is 0, 1, and 3 respectively, none more than 3
-- 'keyword1 a b c d keyword2' does not match, because there are 4 words between keyword1 and keyword2, more than 3
-- 'keyword2 keyword1', 'keyword2 a keyword1', and 'keyword2 a b c keyword1' also match, because the order of keyword1 and keyword2 is no longer required when slop > 0. This behavior follows ES and differs from what intuition suggests, so Doris lets you append a plus sign (+) after slop to require that keyword1 come before keyword2
SELECT * FROM table_name WHERE content MATCH_PHRASE 'keyword1 keyword2 ~3';
-- With a plus sign after slop, 'keyword1 a b c keyword2' matches but 'keyword2 a b c keyword1' does not
SELECT * FROM table_name WHERE content MATCH_PHRASE 'keyword1 keyword2 ~3+';

-- 2.3 Prefix-match the last word, keyword2, while preserving word order; by default 50 prefix terms are looked up (controlled by the session variable inverted_index_max_expansions)
-- 'keyword1 keyword2abc' matches, because keyword1 is identical and the last word keyword2abc starts with keyword2
-- 'keyword1 keyword2' also matches, because keyword2 is a prefix of itself
-- 'keyword1 keyword3' does not match, because keyword3 does not start with keyword2
-- 'keyword1 keyword3abc' does not match either, because keyword3abc does not start with keyword2
SELECT * FROM table_name WHERE content MATCH_PHRASE_PREFIX 'keyword1 keyword2';

-- 2.4 With a single word, the query degenerates into a prefix query; by default 50 prefix terms are looked up (controlled by the session variable inverted_index_max_expansions)
SELECT * FROM table_name WHERE content MATCH_PHRASE_PREFIX 'keyword1';

-- 2.5 Regex-match the tokenized words; by default 50 terms are matched (controlled by the session variable inverted_index_max_expansions)
-- The matching rules are similar to MATCH_PHRASE_PREFIX, except that the prefix becomes a regular expression
SELECT * FROM table_name WHERE content MATCH_REGEXP 'key*';
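
The matching rules spelled out in the comments above can be checked against a small Python model. This is a sketch of the documented examples only, not how Doris evaluates these operators: it assumes whitespace tokenization, handles two-word phrases for `MATCH_PHRASE`, measures slop as the number of words between the two terms, and ignores the `inverted_index_max_expansions` limit. The function names `match_phrase` and `match_phrase_prefix` are hypothetical.

```python
def match_phrase(text, first, second, slop=0, ordered=False):
    """Two-word phrase match.

    slop is the maximum number of words allowed between the terms.
    With slop == 0, `second` must directly follow `first`.
    With slop > 0, either order matches unless ordered=True,
    which models the trailing '+' in a query such as '~3+'.
    """
    tokens = text.split()
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    for i in first_positions:
        for j in second_positions:
            if i == j:
                continue
            words_between = abs(j - i) - 1
            if words_between > slop:
                continue
            if j > i or (slop > 0 and not ordered):
                return True
    return False


def match_phrase_prefix(text, phrase):
    """All words but the last must match exactly and consecutively;
    the last word is matched as a prefix of the following token."""
    tokens = text.split()
    words = phrase.split()
    if not words:
        return False
    exact, prefix = words[:-1], words[-1]
    n = len(exact)
    for start in range(len(tokens) - n):
        if tokens[start:start + n] == exact and tokens[start + n].startswith(prefix):
            return True
    return False
```

For example, `match_phrase('keyword2 a b c keyword1', 'keyword1', 'keyword2', slop=3)` is true while the same call with `ordered=True` is false, mirroring the `~3` and `~3+` queries. With a single-word phrase, `match_phrase_prefix` reduces to a plain prefix test on each token, as in case 2.4.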

