-
Notifications
You must be signed in to change notification settings - Fork 755
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: Optimize StringLikeSearch faster #8458
Comments
Use the newest version Databend. Try to config max_storage_io_requests to different value. The like search query has best performance when set the value equal or greater the CPU core. cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c
8 Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
#release databend-query start
./target/release/databend-query databend> select version();
+------------------------------------------------------------------------------------+
| version() |
+------------------------------------------------------------------------------------+
| 8.0.26-v0.8.43-nightly-ef64046(rust-1.66.0-nightly-2022-10-26T02:09:40.432768335Z) |
+------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
databend> set max_storage_io_requests=1;
Query OK, 0 rows affected (0.00 sec)
databend> select count() from orders where o_comment like '%pending%deposits%';
+---------+
| count() |
+---------+
| 161870 |
+---------+
1 row in set (3.26 sec)
Read 15000000 rows, 806.23 MiB in 3.251 sec., 4.61 million rows/sec., 248.00 MiB/sec.
databend> set max_storage_io_requests=2;
Query OK, 0 rows affected (0.01 sec)
databend> select count() from orders where o_comment like '%pending%deposits%';
+---------+
| count() |
+---------+
| 161870 |
+---------+
1 row in set (1.75 sec)
Read 15000000 rows, 806.23 MiB in 1.746 sec., 8.59 million rows/sec., 461.72 MiB/sec.
databend> set max_storage_io_requests=4;
Query OK, 0 rows affected (0.01 sec)
databend> select count() from orders where o_comment like '%pending%deposits%';
+---------+
| count() |
+---------+
| 161870 |
+---------+
1 row in set (1.02 sec)
Read 15000000 rows, 806.23 MiB in 1.008 sec., 14.88 million rows/sec., 799.94 MiB/sec.
databend> set max_storage_io_requests=8;
Query OK, 0 rows affected (0.01 sec)
databend> select count() from orders where o_comment like '%pending%deposits%';
+---------+
| count() |
+---------+
| 161870 |
+---------+
1 row in set (0.85 sec)
Read 15000000 rows, 806.23 MiB in 0.843 sec., 17.8 million rows/sec., 956.77 MiB/sec.
databend> set max_storage_io_requests=16;
Query OK, 0 rows affected (0.01 sec)
databend> select count() from orders where o_comment like '%pending%deposits%';
+---------+
| count() |
+---------+
| 161870 |
+---------+
1 row in set (0.83 sec)
Read 15000000 rows, 806.23 MiB in 0.818 sec., 18.33 million rows/sec., 985.34 MiB/sec.
databend> set max_storage_io_requests=1000;
Query OK, 0 rows affected (0.01 sec)
databend> select count() from orders where o_comment like '%pending%deposits%';
+---------+
| count() |
+---------+
| 161870 |
+---------+
1 row in set (0.84 sec)
Read 15000000 rows, 806.23 MiB in 0.833 sec., 18 million rows/sec., 967.69 MiB/sec.
|
Try to query databend avg cost 0.088s clickhouse avg cost 0.087s 'root'@mysqldb 11:12:12 [tpch]> select count() from orders where o_orderkey>36901;
+----------+
| count() |
+----------+
| 14990771 |
+----------+
1 row in set (0.10 sec)
Read 15000000 rows, 114.44 MiB in 0.088 sec., 170.41 million rows/sec., 1.27 GiB/sec.
default@mysqldb 11:12:59 [tpch]> select count() from orders where o_orderkey>36901;
+----------+
| count() |
+----------+
| 14990771 |
+----------+
1 row in set (0.09 sec)
Read 14991808 rows, 114.38 MiB in 0.087953918 sec., 170450712 rows/sec., 1.27 GiB/sec.
So now, the |
record: In databend position is faster than like. ck is same.
'root'@mysqldb 09:24:58 [tpch]> select version();
+------------------------------------------------------------------------------------+
| version() |
+------------------------------------------------------------------------------------+
| 8.0.26-v0.8.43-nightly-ccd5ae2(rust-1.66.0-nightly-2022-11-01T02:50:57.392225465Z) |
+------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
'root'@mysqldb 09:25:47 [tpch]> select count() from orders where o_clerk like '%lerk#00000177%';
+---------+
| count() |
+---------+
| 15086 |
+---------+
1 row in set (0.42 sec)
Read 15000000 rows, 329.02 MiB in 0.420 sec., 35.72 million rows/sec., 783.40 MiB/sec.
'root'@mysqldb 09:25:51 [tpch]> select count() from orders where position('lerk#00000177' in o_clerk)>0;
+---------+
| count() |
+---------+
| 15086 |
+---------+
1 row in set (0.23 sec)
Read 15000000 rows, 329.02 MiB in 0.222 sec., 67.47 million rows/sec., 1.45 GiB/sec.
default@mysqldb 09:25:26 [tpch]> select version();
+-----------+
| version() |
+-----------+
| 22.8.5.29 |
+-----------+
1 row in set (0.00 sec)
Read 1 rows, 1.00 B in 0.001141889 sec., 875 rows/sec., 875.74 B/sec.
default@mysqldb 09:25:34 [tpch]> select count() from orders where o_clerk like '%lerk#00000177%';
+---------+
| count() |
+---------+
| 15086 |
+---------+
1 row in set (0.11 sec)
Read 15000000 rows, 343.32 MiB in 0.117024642 sec., 128178131 rows/sec., 2.87 GiB/sec.
default@mysqldb 09:25:37 [tpch]> select count() from orders where position('lerk#00000177' in o_clerk)>0;
+---------+
| count() |
+---------+
| 15086 |
+---------+
1 row in set (0.13 sec)
Read 15000000 rows, 343.32 MiB in 0.123843314 sec., 121120789 rows/sec., 2.71 GiB/sec.
|
This result has some mistake. Test again: ck: 0.023 --databend-quer
mysql> select count() from orders where o_orderkey>36901;
+----------+
| count() |
+----------+
| 14990771 |
+----------+
1 row in set (0.09 sec)
Read 15000000 rows, 114.44 MiB in 0.076 sec., 197.11 million rows/sec., 1.47 GiB/sec.
--clickhouse
databend-arch :) select count() from orders where o_orderkey>36901;
SELECT count()
FROM orders
WHERE o_orderkey > 36901
Query id: 31305d2a-82dc-4279-b227-10c34607ae56
┌──count()─┐
│ 14990771 │
└──────────┘
1 row in set. Elapsed: 0.023 sec. Processed 14.99 million rows, 119.93 MB (648.38 million rows/s., 5.19 GB/s.)
|
Summary
After multiple executions: 2x
The text was updated successfully, but these errors were encountered: