Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bitmap scan support #4634

Closed
rafal-hawrylak opened this issue Jun 1, 2020 · 5 comments
Closed

Bitmap scan support #4634

rafal-hawrylak opened this issue Jun 1, 2020 · 5 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) community/request Issues created by external users kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments

@rafal-hawrylak
Copy link

rafal-hawrylak commented Jun 1, 2020

Jira Link: DB-2547
Some queries that are supposed to be using secondary indexes are ran with sequence scans which ends with very high times for even not big tables (2GB).

Including query plans.
query_by_secondary_index.txt

@yugabyte-ci yugabyte-ci added the community/request Issues created by external users label Jun 1, 2020
@ddorian
Copy link
Contributor

ddorian commented Jun 1, 2020

Hi @rafal-hawrylak

Can you also paste the table schema that you used ?

@ddorian ddorian added the area/ysql Yugabyte SQL (YSQL) label Jun 1, 2020
@rafal-hawrylak
Copy link
Author

rafal-hawrylak commented Jun 1, 2020

Yes.

CREATE TABLE schema.a1 (
    id uuid NOT NULL,
    f_id_1 uuid NOT NULL,
    f_id_2 uuid NOT NULL,
    t integer NOT NULL,
    sequence integer NOT NULL,
    PRIMARY KEY (id)
);

CREATE INDEX a1_f_id_1_idx ON schema.a1 (f_id_1);
CREATE INDEX a1_f_id_2_idx ON schema.a1 (f_id_2);

@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 9, 2022
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature and removed kind/bug This issue is a bug labels Aug 23, 2022
@sushantrmishra
Copy link

sushantrmishra commented Feb 23, 2023

PostgreSQL plan:

postgres=#  EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM a1 WHERE f_id_1 = 'fc12ef23-27f3-43d9-8c85-30e50d694b1c' OR f_id_2 = 'fc12ef23-27f3-43d9-8c85-30e50d694b1c';
                                                              QUERY PLAN

--------------------------------------------------------------------------------------------------------------
------------------------
 Bitmap Heap Scan on a1  (cost=8.38..18.96 rows=10 width=56) (actual time=0.012..0.014 rows=0 loops=1)
   Recheck Cond: ((f_id_1 = 'fc12ef23-27f3-43d9-8c85-30e50d694b1c'::uuid) OR (f_id_2 = 'fc12ef23-27f3-43d9-8c8
5-30e50d694b1c'::uuid))
   Buffers: shared hit=2
   ->  BitmapOr  (cost=8.38..8.38 rows=10 width=0) (actual time=0.010..0.011 rows=0 loops=1)
         Buffers: shared hit=2
         ->  Bitmap Index Scan on a1_f_id_1_idx  (cost=0.00..4.19 rows=5 width=0) (actual time=0.006..0.007 ro
ws=0 loops=1)
               Index Cond: (f_id_1 = 'fc12ef23-27f3-43d9-8c85-30e50d694b1c'::uuid)
               Buffers: shared hit=1
         ->  Bitmap Index Scan on a1_f_id_2_idx  (cost=0.00..4.19 rows=5 width=0) (actual time=0.002..0.002 ro
ws=0 loops=1)
               Index Cond: (f_id_2 = 'fc12ef23-27f3-43d9-8c85-30e50d694b1c'::uuid)
               Buffers: shared hit=1
 Planning Time: 0.474 ms
 Execution Time: 0.147 ms
(13 rows)

@sushantrmishra
Copy link

sushantrmishra commented Feb 23, 2023

YB plan:

EXPLAIN (ANALYZE, dist) SELECT * FROM a1 WHERE f_id_1 = 'fc12ef23-27f3-43d9-8c85-30e50d694b1c' OR
f_id_2 = 'fc12ef23-27f3-43d9-8c85-30e50d694b1c';
                                                             QUERY PLAN

--------------------------------------------------------------------------------------------------------------
-------------------------
Seq Scan on a1  (cost=0.00..0.00 rows=1 width=56) (actual time=0.497..0.497 rows=0 loops=1)
  Remote Filter: ((f_id_1 = 'fc12ef23-27f3-43d9-8c85-30e50d694b1c'::uuid) OR (f_id_2 = 'fc12ef23-27f3-43d9-8c
85-30e50d694b1c'::uuid))
  Storage Table Read Requests: 1
  Storage Table Execution Time: 1.000 ms
Planning Time: 0.057 ms
Execution Time: 0.549 ms
Storage Read Requests: 1
Storage Write Requests: 0
Storage Execution Time: 1.000 ms
Peak Memory Usage: 14 kB
(10 rows)

@sushantrmishra
Copy link

Repro with data in the table:

yugabyte=# create table bitmap_scan_test (h int primary key , r1 int, r2 int);
CREATE TABLE

yugabyte=# insert into bitmap_scan_test select  i, i%10, i%100  from generate_series(1, 1000000) i;
INSERT 0 1000000
Time: 6511.770 ms (00:06.512)
yugabyte=# explain (analyze, dist) select * from bitmap_scan_test where r1 = 10 and r2 = 50;
                                                  QUERY PLAN

--------------------------------------------------------------------------------------------------------------
-
 Seq Scan on bitmap_scan_test  (cost=0.00..0.00 rows=1 width=12) (actual time=691.823..691.823 rows=0 loops=1)
   Remote Filter: ((r1 = 10) AND (r2 = 50))
   Storage Table Read Requests: 1
   Storage Table Execution Time: 692.005 ms
 Planning Time: 2.006 ms
 Execution Time: 691.873 ms
 Storage Read Requests: 1
 Storage Write Requests: 0
 Storage Execution Time: 692.005 ms
 Peak Memory Usage: 14 kB
(10 rows)

Time: 695.738 ms

After analyze

yugabyte=# explain (analyze, dist) select * from bitmap_scan_test where r1 = 10 and r2 = 50;
QUERY PLAN



Seq Scan on bitmap_scan_test (cost=0.00..105000.00 rows=1 width=12) (actual time=637.028..637.029 rows=0 loo
ps=1)
Remote Filter: ((r1 = 10) AND (r2 = 50))
Storage Table Read Requests: 1
Storage Table Execution Time: 637.005 ms
Planning Time: 2.080 ms
Execution Time: 637.074 ms
Storage Read Requests: 1
Storage Write Requests: 0
Storage Execution Time: 637.005 ms
Peak Memory Usage: 14 kB
(10 rows)

Time: 642.372 ms
yugabyte=#

@yugabyte-ci yugabyte-ci assigned vbalvida and tanujnay112 and unassigned vbalvida Jun 13, 2023
@yugabyte-ci yugabyte-ci changed the title Bitmap scan support LRT QE Bitmap scan support Oct 24, 2023
@yugabyte-ci yugabyte-ci changed the title LRT QE Bitmap scan support Bitmap scan support Oct 24, 2023
timothy-e added a commit that referenced this issue Mar 12, 2024
Summary:
== Bitmap Scans ==
Bitmap Scans use multiple indexes to answer a query, with only one scan of the main table. Each index produces a “bitmap” indicating which rows of the main table are interesting. Multiple bitmaps can be combined with AND or OR operators to create a final bitmap which is used to collect rows from the main table.

In this diff, initial support for Yugabyte bitmap scans is added. Yugabyte bitmap scans:
* **have two nodes:** `Bitmap Index Scan` shared with PG, and `YB Bitmap Table Scan`.
* **support exact results only**: When PG approaches work_mem, it switches to a lossy bitmap that identifies specific pages of the heap relation to access. When YB approaches work_mem, it simply switches to a full sequential scan.
* **support arbitary nesting of logical operands and index conditions**. Assuming the same indexes and support for conditions, any arrangement of ANDs, ORs, and index scans that Postgres supports are also supported by Yugabyte because the code is reused. Note that Yugabyte is currently extremely unlikely to use an `AND` condition because the planner doesn't account for the cost of each bitmap index access being half the cost of a normal index access. (Bitmap index scans only need to access the ybctids, but normal index scans access ybctids + use them to query the main table.)
* **allow conditions with `AND` to be used as a filter**. This currently happens for all `AND` conditions, see the above bullet point. But with a hack to the planner, `Bitmap And` and filters are both valid options.
* **are controlled by `enable_bitmapscan`**
* **are controlled by pg hint plan's `/*+ BitmapScan(tab) */` syntax **
* ** only print the recheck condition when recheck is required**. This is contrary to PG behavior, but since fetching unnecessary rows has a higher cost in YB, it’s better to clearly communicate that.
* **show distributed storage stats when `EXPLAIN (ANALYZE, DIST)` is run**. `Storage Index Read Requests` are shown under `Bitmap Index Scans` on a secondary index, `Storage Table Read Requests` on`Bitmap Index Scans` on a primary index or `YB Bitmap Table Scans`.
* **do not use unbounded memory**. Similar to Postgres, each bitmap is bounded by `work_mem` (although PG follows this bound loosely, i.e. if it can't lossify the bitmap enough it will exceed it). In the current draft, if Yugabyte exceeds work_mem, we stop collecting ybctids and run a full table scan. There is some output in the explain plan to indicate this occurred.
* **output average ybctid size when `EXPLAIN (ANALYZE, DEBUG)` is run**. This can be used to get an estimate of how much memory a bitmap scan might take up.

Yugabyte bitmap scans do **not** currently:
* **have an accurate cost model:** [[ #20573 | #20573 ]]
* **spill to disk when exceeding work_mem:** [[ #20576 | #20576 ]]
* **parallelise bitmap table scans**: [[ #20575 | #20575 ]]
* **support GIN indexes**: [[ #20574 | #20574 ]]
* **skip fetching ybctids from YB Bitmap Table Scan**: [[ #21036 | #21036 ]]
* **parallelise bitmap index scans**: [[ #21037 | #21037 ]]

See the design document here: https://docs.google.com/document/d/1pRUsXZDywzrZTQpTfXZNTZckh-8tTtkfOcurUoybTr0/edit?usp=sharing
Jira: DB-2547

Test Plan:
```lang=sh
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbBitmapScans'
```

Stress test on a 10 million row table:
```lang=sql
CREATE TABLE large (a INT, b INT);
CREATE INDEX ON large (a ASC);
CREATE INDEX ON large (b ASC);
INSERT INTO large SELECT i, i FROM generate_series(1, 10000000) i;
```

Bitmap Scan:
```
EXPLAIN (ANALYZE, DIST) /*+ BitmapScan(large) */ SELECT * FROM large WHERE a > 0 OR b > 0;
NOTICE:  exceeded work_mem, switching to full table scan
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 YB Bitmap Table Scan on large  (cost=6.91..11.21 rows=10 width=8) (actual time=233.920..10990.324 rows=10000000 loops=1)
   Storage Table Read Requests: 9767
   Storage Table Read Execution Time: 6288.534 ms
   Storage Table Rows Scanned: 10000000
   Exceeded work_mem: true
   ->  BitmapOr  (cost=6.91..6.91 rows=20 width=0) (actual time=231.783..231.783 rows=0 loops=1)
         ->  Bitmap Index Scan on large_a_idx  (cost=0.00..3.45 rows=10 width=0) (actual time=231.771..231.771 rows=0 loops=1)
               Index Cond: (a > 0)
               Storage Index Read Requests: 171
               Storage Index Read Execution Time: 199.258 ms
               Storage Index Rows Scanned: 175104
         ->  Bitmap Index Scan on large_b_idx  (cost=0.00..3.45 rows=10 width=0) (actual time=0.001..0.001 rows=0 loops=1)
               Index Cond: (b > 0)
 Planning Time: 10.735 ms
 Execution Time: 15493.938 ms
 Storage Read Requests: 9938
 Storage Read Execution Time: 6487.792 ms
 Storage Rows Scanned: 10175104
 Storage Write Requests: 0
 Catalog Read Requests: 8
 Catalog Read Execution Time: 6.969 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 6494.761 ms
 Peak Memory Usage: 100 kB
(25 rows)
```

Sequential Scan:
```
EXPLAIN (ANALYZE, DIST) SELECT * FROM large WHERE a > 0 OR b > 0;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Seq Scan on large  (cost=0.00..105.00 rows=1000 width=8) (actual time=2.721..9896.584 rows=10000000 loops=1)
   Storage Filter: ((a > 0) OR (b > 0))
   Storage Table Read Requests: 4885
   Storage Table Read Execution Time: 7578.189 ms
   Storage Table Rows Scanned: 10000000
 Planning Time: 0.082 ms
 Execution Time: 14737.837 ms
 Storage Read Requests: 4885
 Storage Read Execution Time: 7578.189 ms
 Storage Rows Scanned: 10000000
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 7578.189 ms
 Peak Memory Usage: 24 kB
(16 rows)
```

Bitmap scan with high work_mem:
```
SET work_mem TO '1GB';
EXPLAIN (ANALYZE, DIST) /*+ BitmapScan(large) */ SELECT * FROM large WHERE a > 0 OR b > 0;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 YB Bitmap Table Scan on large  (cost=6.91..11.21 rows=10 width=8) (actual time=76779.066..162561.046 rows=10000000 loops=1)
   Storage Table Read Requests: 9766
   Storage Table Read Execution Time: 78226.718 ms
   Storage Table Rows Scanned: 10000000
   ->  BitmapOr  (cost=6.91..6.91 rows=20 width=0) (actual time=73603.559..73603.559 rows=10000000 loops=1)
         ->  Bitmap Index Scan on large_a_idx  (cost=0.00..3.45 rows=10 width=0) (actual time=36774.895..36774.895 rows=10000000 loops=1)
               Index Cond: (a > 0)
               Storage Index Read Requests: 9766
               Storage Index Read Execution Time: 11136.148 ms
               Storage Index Rows Scanned: 10000000
         ->  Bitmap Index Scan on large_b_idx  (cost=0.00..3.45 rows=10 width=0) (actual time=36828.655..36828.655 rows=10000000 loops=1)
               Index Cond: (b > 0)
               Storage Index Read Requests: 9766
               Storage Index Read Execution Time: 11419.616 ms
               Storage Index Rows Scanned: 10000000
 Planning Time: 0.153 ms
 Execution Time: 169818.045 ms
 Storage Read Requests: 29298
 Storage Read Execution Time: 100782.482 ms
 Storage Rows Scanned: 30000000
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 100782.482 ms
 Peak Memory Usage: 1217840 kB
(26 rows)
```

Reviewers: tnayak, amartsinchyk

Reviewed By: tnayak

Subscribers: smishra, ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D32015
@github-project-automation github-project-automation bot moved this to Done in YQL-beta Mar 13, 2024
asrinivasanyb pushed a commit to asrinivasanyb/yugabyte-db that referenced this issue Mar 18, 2024
Summary:
== Bitmap Scans ==
Bitmap Scans use multiple indexes to answer a query, with only one scan of the main table. Each index produces a “bitmap” indicating which rows of the main table are interesting. Multiple bitmaps can be combined with AND or OR operators to create a final bitmap which is used to collect rows from the main table.

In this diff, initial support for Yugabyte bitmap scans is added. Yugabyte bitmap scans:
* **have two nodes:** `Bitmap Index Scan` shared with PG, and `YB Bitmap Table Scan`.
* **support exact results only**: When PG approaches work_mem, it switches to a lossy bitmap that identifies specific pages of the heap relation to access. When YB approaches work_mem, it simply switches to a full sequential scan.
* **support arbitary nesting of logical operands and index conditions**. Assuming the same indexes and support for conditions, any arrangement of ANDs, ORs, and index scans that Postgres supports are also supported by Yugabyte because the code is reused. Note that Yugabyte is currently extremely unlikely to use an `AND` condition because the planner doesn't account for the cost of each bitmap index access being half the cost of a normal index access. (Bitmap index scans only need to access the ybctids, but normal index scans access ybctids + use them to query the main table.)
* **allow conditions with `AND` to be used as a filter**. This currently happens for all `AND` conditions, see the above bullet point. But with a hack to the planner, `Bitmap And` and filters are both valid options.
* **are controlled by `enable_bitmapscan`**
* **are controlled by pg hint plan's `/*+ BitmapScan(tab) */` syntax **
* ** only print the recheck condition when recheck is required**. This is contrary to PG behavior, but since fetching unnecessary rows has a higher cost in YB, it’s better to clearly communicate that.
* **show distributed storage stats when `EXPLAIN (ANALYZE, DIST)` is run**. `Storage Index Read Requests` are shown under `Bitmap Index Scans` on a secondary index, `Storage Table Read Requests` on`Bitmap Index Scans` on a primary index or `YB Bitmap Table Scans`.
* **do not use unbounded memory**. Similar to Postgres, each bitmap is bounded by `work_mem` (although PG follows this bound loosely, i.e. if it can't lossify the bitmap enough it will exceed it). In the current draft, if Yugabyte exceeds work_mem, we stop collecting ybctids and run a full table scan. There is some output in the explain plan to indicate this occurred.
* **output average ybctid size when `EXPLAIN (ANALYZE, DEBUG)` is run**. This can be used to get an estimate of how much memory a bitmap scan might take up.

Yugabyte bitmap scans do **not** currently:
* **have an accurate cost model:** [[ yugabyte#20573 | yugabyte#20573 ]]
* **spill to disk when exceeding work_mem:** [[ yugabyte#20576 | yugabyte#20576 ]]
* **parallelise bitmap table scans**: [[ yugabyte#20575 | yugabyte#20575 ]]
* **support GIN indexes**: [[ yugabyte#20574 | yugabyte#20574 ]]
* **skip fetching ybctids from YB Bitmap Table Scan**: [[ yugabyte#21036 | yugabyte#21036 ]]
* **parallelise bitmap index scans**: [[ yugabyte#21037 | yugabyte#21037 ]]

See the design document here: https://docs.google.com/document/d/1pRUsXZDywzrZTQpTfXZNTZckh-8tTtkfOcurUoybTr0/edit?usp=sharing
Jira: DB-2547

Test Plan:
```lang=sh
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbBitmapScans'
```

Stress test on a 10 million row table:
```lang=sql
CREATE TABLE large (a INT, b INT);
CREATE INDEX ON large (a ASC);
CREATE INDEX ON large (b ASC);
INSERT INTO large SELECT i, i FROM generate_series(1, 10000000) i;
```

Bitmap Scan:
```
EXPLAIN (ANALYZE, DIST) /*+ BitmapScan(large) */ SELECT * FROM large WHERE a > 0 OR b > 0;
NOTICE:  exceeded work_mem, switching to full table scan
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 YB Bitmap Table Scan on large  (cost=6.91..11.21 rows=10 width=8) (actual time=233.920..10990.324 rows=10000000 loops=1)
   Storage Table Read Requests: 9767
   Storage Table Read Execution Time: 6288.534 ms
   Storage Table Rows Scanned: 10000000
   Exceeded work_mem: true
   ->  BitmapOr  (cost=6.91..6.91 rows=20 width=0) (actual time=231.783..231.783 rows=0 loops=1)
         ->  Bitmap Index Scan on large_a_idx  (cost=0.00..3.45 rows=10 width=0) (actual time=231.771..231.771 rows=0 loops=1)
               Index Cond: (a > 0)
               Storage Index Read Requests: 171
               Storage Index Read Execution Time: 199.258 ms
               Storage Index Rows Scanned: 175104
         ->  Bitmap Index Scan on large_b_idx  (cost=0.00..3.45 rows=10 width=0) (actual time=0.001..0.001 rows=0 loops=1)
               Index Cond: (b > 0)
 Planning Time: 10.735 ms
 Execution Time: 15493.938 ms
 Storage Read Requests: 9938
 Storage Read Execution Time: 6487.792 ms
 Storage Rows Scanned: 10175104
 Storage Write Requests: 0
 Catalog Read Requests: 8
 Catalog Read Execution Time: 6.969 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 6494.761 ms
 Peak Memory Usage: 100 kB
(25 rows)
```

Sequential Scan:
```
EXPLAIN (ANALYZE, DIST) SELECT * FROM large WHERE a > 0 OR b > 0;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Seq Scan on large  (cost=0.00..105.00 rows=1000 width=8) (actual time=2.721..9896.584 rows=10000000 loops=1)
   Storage Filter: ((a > 0) OR (b > 0))
   Storage Table Read Requests: 4885
   Storage Table Read Execution Time: 7578.189 ms
   Storage Table Rows Scanned: 10000000
 Planning Time: 0.082 ms
 Execution Time: 14737.837 ms
 Storage Read Requests: 4885
 Storage Read Execution Time: 7578.189 ms
 Storage Rows Scanned: 10000000
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 7578.189 ms
 Peak Memory Usage: 24 kB
(16 rows)
```

Bitmap scan with high work_mem:
```
SET work_mem TO '1GB';
EXPLAIN (ANALYZE, DIST) /*+ BitmapScan(large) */ SELECT * FROM large WHERE a > 0 OR b > 0;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 YB Bitmap Table Scan on large  (cost=6.91..11.21 rows=10 width=8) (actual time=76779.066..162561.046 rows=10000000 loops=1)
   Storage Table Read Requests: 9766
   Storage Table Read Execution Time: 78226.718 ms
   Storage Table Rows Scanned: 10000000
   ->  BitmapOr  (cost=6.91..6.91 rows=20 width=0) (actual time=73603.559..73603.559 rows=10000000 loops=1)
         ->  Bitmap Index Scan on large_a_idx  (cost=0.00..3.45 rows=10 width=0) (actual time=36774.895..36774.895 rows=10000000 loops=1)
               Index Cond: (a > 0)
               Storage Index Read Requests: 9766
               Storage Index Read Execution Time: 11136.148 ms
               Storage Index Rows Scanned: 10000000
         ->  Bitmap Index Scan on large_b_idx  (cost=0.00..3.45 rows=10 width=0) (actual time=36828.655..36828.655 rows=10000000 loops=1)
               Index Cond: (b > 0)
               Storage Index Read Requests: 9766
               Storage Index Read Execution Time: 11419.616 ms
               Storage Index Rows Scanned: 10000000
 Planning Time: 0.153 ms
 Execution Time: 169818.045 ms
 Storage Read Requests: 29298
 Storage Read Execution Time: 100782.482 ms
 Storage Rows Scanned: 30000000
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 100782.482 ms
 Peak Memory Usage: 1217840 kB
(26 rows)
```

Reviewers: tnayak, amartsinchyk

Reviewed By: tnayak

Subscribers: smishra, ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D32015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) community/request Issues created by external users kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue
Projects
Status: Done
Development

No branches or pull requests

8 participants