@zhiqiang-hhhh
Contributor

cherry pick from #57243

This PR introduces **ANN index-only scan** to avoid reading the array
column when it is not included in the projection.
Currently, nearly **half of the total CPU cost** is spent on reading the
array column.

![img_v3_02rb_21c39870-9c59-4565-8c38-ff1acd93938g](https://github.com/user-attachments/assets/71b73a8d-05a2-40e1-8202-b5870b9aa7b6)

In addition, the data pages of the array column consume a large amount
of memory.
By adopting index-only scan, both **CPU usage** and **memory footprint**
can be significantly reduced.
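The idea can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Doris implementation: `ToySegment`, `ToyAnnIndex`, `top_k`, and the column name `embedding` are all hypothetical, and the "index" here is a brute-force search standing in for a real ANN structure. The point it shows is that the index already yields row ids and distances, so the array column only has to be materialized when the projection asks for it.

```python
from dataclasses import dataclass


@dataclass
class ToySegment:
    """Hypothetical segment with an id column and a vector (array) column."""
    ids: list
    vectors: list
    vector_reads: int = 0  # how many times the array column was materialized

    def read_vectors(self, row_ids):
        # Stands in for the expensive step: decoding array data pages.
        self.vector_reads += 1
        return [self.vectors[r] for r in row_ids]


class ToyAnnIndex:
    """Brute-force stand-in for an ANN index; it holds its own copy of the vectors."""

    def __init__(self, vectors):
        self._vectors = [tuple(v) for v in vectors]

    def search(self, query, k):
        # Squared L2 distance to every vector, then keep the k closest rows.
        scored = [
            (sum((a - b) ** 2 for a, b in zip(vec, query)), row)
            for row, vec in enumerate(self._vectors)
        ]
        scored.sort()
        return [(row, dist) for dist, row in scored[:k]]


def top_k(segment, index, query, k, projection):
    """Answer a top-k query, touching the array column only if it is projected."""
    hits = index.search(query, k)
    row_ids = [row for row, _ in hits]
    columns = {
        "id": [segment.ids[r] for r in row_ids],
        "distance": [dist for _, dist in hits],  # comes from the index itself
    }
    if "embedding" in projection:
        columns["embedding"] = segment.read_vectors(row_ids)
    return {name: columns[name] for name in projection}
```

In this sketch, a query that projects only `id` and `distance` leaves `vector_reads` at zero, while adding `embedding` to the projection forces one read of the array column.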

---
(concurrency = 30, 60, 80)

**Before this PR:**

```
"conc_qps_list": [336.4794, 635.5969, 601.952],
"conc_latency_p99_list": [0.7159, 0.1690, 0.2239],
"conc_latency_p95_list": [0.0957, 0.1442, 0.1904],
"conc_latency_avg_list": [0.0889, 0.0940, 0.1320]
```

**After this PR:**

```
"conc_qps_list": [501.4105, 832.0482, 899.8662],
"conc_latency_p99_list": [0.0763, 0.1302, 0.1780],
"conc_latency_p95_list": [0.0666, 0.1041, 0.1465],
"conc_latency_avg_list": [0.0597, 0.0716, 0.0881]
```

This represents a **20%–50% improvement** in QPS and latency.
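As a quick check on that range, the relative changes can be recomputed from the numbers reported above. The snippet below is a minimal sketch; the variable names and the `pct_change` helper are illustrative, and only the QPS and average-latency lists are copied from the results.

```python
# Values copied from the "Before this PR" / "After this PR" blocks,
# ordered by concurrency level (30, 60, 80).
before = {
    "qps": [336.4794, 635.5969, 601.952],
    "latency_avg": [0.0889, 0.0940, 0.1320],
}
after = {
    "qps": [501.4105, 832.0482, 899.8662],
    "latency_avg": [0.0597, 0.0716, 0.0881],
}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Positive values mean higher throughput after the PR.
qps_gain = [pct_change(b, a) for b, a in zip(before["qps"], after["qps"])]

# Negative values mean lower (better) average latency after the PR.
latency_avg_change = [
    pct_change(b, a) for b, a in zip(before["latency_avg"], after["latency_avg"])
]

for level, gain, lat in zip((30, 60, 80), qps_gain, latency_avg_change):
    print(f"concurrency={level}: QPS {gain:+.1f}%, avg latency {lat:+.1f}%")
```

With these inputs, QPS rises by roughly 31% to 50% and average latency falls by roughly 24% to 33%, which is consistent with the stated range. The reported p99 at concurrency 30 (0.7159 before, 0.0763 after) drops far more than that and sits outside it.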

---

The chart below shows that **CPU cost decreases by ~17%** and **memory
consumption decreases by ~50%** after this optimization.
(Left side: after this PR)

![Performance comparison](https://github.com/user-attachments/assets/a2fcfcc3-52e3-4c22-95ae-79612ffb197c)

@zhiqiang-hhhh
Contributor Author

run buildall

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 8, 2025
@github-actions
Contributor

github-actions bot commented Nov 8, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented Nov 8, 2025

PR approved by anyone and no changes requested.

@yiguolei yiguolei merged commit 1731994 into apache:branch-4.0 Nov 8, 2025
22 of 27 checks passed

Labels

approved Indicates a PR has been approved by one committer. reviewed
