Skip to content

[Proposal] Delete index storage structure design for batch delete #4233

@ZhangYu0123

Description

@ZhangYu0123

Is your feature request related to a problem? Please describe.
The detail delete index design in storage level for batch delete function feature #4051 .
We discuss the concrete delete index structure and update the design document here.

Describe the solution you'd like
When importing and deleting according to the key, we will generate a bitmap index of the delete key for these keys. The content of the index is: the deleted line number (ordinal).
(1) The specific format of delete index footer page(DeleteIndexFooterPB):

             +--------------------+
             |        type        |
             |--------------------|
             |  uncompressed_size |
             |--------------------|
             |    content_bytes   |
             |--------------------|
             |      num_items     |
             |--------------------|
             |      checksum      |
             +--------------------+

-type: The type is DELETE_INDEX_PAGE, which means that the page of the index type is deleted
-uncompressed_size: Uncompressed page size
-content_bytes: Data size of index content
-num_items: Store the number of deleted entries recorded in the index
-checksum: Checksum of index page content

(2) SegmentFooterPB changes:
Add PagePointerPB delete_index_page to SegmentFooterPB, because delete index is Segment level index.

(3) PageTypePB changes:
Add DELETE_INDEX_PAGE to PageTypePB to describe the type of delete index.
Add DeleteIndexFooterPB to PageFooterPB as optional.

(4) encoding
The encoding format of index using roaring bitmap format.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions