[feat](index) Ann Index #54276
HappenLee
left a comment
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
airborne12
left a comment
LGTM
dataroaring
left a comment
LGTM
### What problem does this PR solve?
This PR adds `CREATE INDEX` and `BUILD INDEX` SQL syntax for ANN indexes, i.e.:
```sql
CREATE INDEX [IF NOT EXISTS] <ann_index_name>
ON <table_name> (<column_name>)
USING ANN
[PROPERTIES ("<key>" = "<value>"[ , ...])]
[COMMENT '<index_comment>']

BUILD INDEX <ann_index_name> ON <table_name> [partition_list]
```
Related PR: #54276
### What problem does this PR solve?
This PR introduces the ANN index to Doris.
This pull request introduces foundational support for ANN (Approximate Nearest Neighbor) vector index functionality in the storage engine, including new runtime structures, configuration options, and initial integration with the build system. The changes lay the groundwork for ANN-based search and statistics collection, and begin integrating ANN index support into various storage and query execution paths.
The ANN index implementation is based on faiss.
Faiss can return distances directly, so this PR uses a virtual slot ref to return results from the index.
Each Doris data segment gets its own faiss index if the table is created with an ANN index, and new segments generated by compaction get a faiss index automatically.
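The per-segment layout described above can be sketched roughly as follows. This is an illustration only, not Doris code: `SegmentIndex`, `Hit`, `search_segment`, and `search_all` are hypothetical names, and a brute-force squared-L2 scan stands in for the faiss index that a real segment would hold. The point it shows is that each segment answers the query on its own and hands back distances together with row ids, so a caller can merge the per-segment results into a global top-k without recomputing any distance.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one segment's ANN index. In this sketch the
// "index" is just the raw vectors, scanned by brute force.
struct SegmentIndex {
    uint32_t segment_id;
    std::size_t dim;
    std::vector<float> vectors;  // row-major: rows * dim floats
};

// One search result. The distance is produced by the index itself.
struct Hit {
    uint32_t segment_id;
    uint32_t row_id;
    float distance;  // squared L2 distance to the query
};

// Orders hits by distance, breaking ties by (segment_id, row_id).
inline bool closer(const Hit& a, const Hit& b) {
    if (a.distance != b.distance) return a.distance < b.distance;
    if (a.segment_id != b.segment_id) return a.segment_id < b.segment_id;
    return a.row_id < b.row_id;
}

// Keeps only the k closest hits, sorted ascending by distance.
inline void keep_top_k(std::vector<Hit>& hits, std::size_t k) {
    const std::size_t keep = std::min(k, hits.size());
    std::partial_sort(hits.begin(),
                      hits.begin() + static_cast<std::ptrdiff_t>(keep),
                      hits.end(), closer);
    hits.resize(keep);
}

// Top-k search inside a single segment (assumes query.size() == seg.dim).
std::vector<Hit> search_segment(const SegmentIndex& seg,
                                const std::vector<float>& query,
                                std::size_t k) {
    std::vector<Hit> hits;
    const std::size_t rows = seg.dim == 0 ? 0 : seg.vectors.size() / seg.dim;
    for (std::size_t r = 0; r < rows; ++r) {
        float d = 0.0f;
        for (std::size_t j = 0; j < seg.dim; ++j) {
            const float diff = seg.vectors[r * seg.dim + j] - query[j];
            d += diff * diff;
        }
        hits.push_back(Hit{seg.segment_id, static_cast<uint32_t>(r), d});
    }
    keep_top_k(hits, k);
    return hits;
}

// Global top-k: query every segment, then merge the per-segment results.
std::vector<Hit> search_all(const std::vector<SegmentIndex>& segments,
                            const std::vector<float>& query,
                            std::size_t k) {
    std::vector<Hit> merged;
    for (const SegmentIndex& seg : segments) {
        const std::vector<Hit> local = search_segment(seg, query, k);
        merged.insert(merged.end(), local.begin(), local.end());
    }
    keep_top_k(merged, k);
    return merged;
}
```

Calling `search_all(segments, query, k)` returns at most `k` hits in ascending distance order. Because every `Hit` already carries its distance, a query layer could expose that value as a column without evaluating the distance function again, which is the role the virtual slot ref plays in this PR.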
Currently, `CREATE INDEX` and `BUILD INDEX` are not supported for ANN indexes; to use one, add the index definition to the table's DDL.
ANN Index Feature Integration:
- Adds `AnnIndexStats`, `AnnIndexParam`, `RangeSearchParams`, `RangeSearchResult`, and others in `ann_search_params.h`, as well as `RangeSearchRuntimeInfo` for managing ANN range search context.
- Extends `StorageReadOptions` and `RowsetReaderContext` to include `ann_topn_runtime` for passing ANN runtime information through the storage read path.
- Extends `OlapReaderStatistics` for monitoring ANN index operations.

Build System and Dependency Updates:
- Adds `doris-faiss` and `doris-openblas` as submodules for ANN/vector index support, and integrates the new `Vector` library into the build process and as a dependency for relevant targets.

Index Handling and Schema Integration:
- Updates `SegmentFlusher` to use `has_extra_index()` (supporting both inverted and ANN indexes) instead of `has_inverted_index()`.

Configuration:
- Adds `opm_threads_limit` to control the maximum number of OpenMP threads used per Doris thread, which is relevant for vectorized/ANN computation.

These changes set up the infrastructure required for future development of ANN vector index features, including search, filtering, and statistics collection.
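The switch from `has_inverted_index()` to `has_extra_index()` can be pictured with a small sketch. Everything here is an assumption made for illustration: `IndexType`, `IndexMeta`, and `TabletSchemaSketch` are hypothetical types, and the real Doris schema classes are likely to differ. Only the two predicate names and the stated meaning (covers both inverted and ANN indexes) come from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical index kinds; the real Doris enum may contain other values.
enum class IndexType { BITMAP, INVERTED, ANN };

// Hypothetical per-index metadata.
struct IndexMeta {
    IndexType type;
};

// Hypothetical, simplified schema holding a list of index definitions.
struct TabletSchemaSketch {
    std::vector<IndexMeta> indexes;

    bool has_index_of(IndexType wanted) const {
        return std::any_of(indexes.begin(), indexes.end(),
                           [wanted](const IndexMeta& m) { return m.type == wanted; });
    }

    bool has_inverted_index() const { return has_index_of(IndexType::INVERTED); }
    bool has_ann_index() const { return has_index_of(IndexType::ANN); }

    // True when the schema defines an inverted index or an ANN index,
    // as the PR description states for has_extra_index().
    bool has_extra_index() const { return has_inverted_index() || has_ann_index(); }
};
```

With a predicate like this, a caller that previously checked only for inverted indexes would also take the index-handling path for a table that defines nothing but an ANN index.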
Release note
None