[Subtask] [Improvement][AQE][LocalOrder] Merge continuous ShuffleDataSegment into single one #301

@zuston

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the subtask

Currently, the LocalOrderSegmentSplitter splits the index file into multiple ShuffleDataSegments, but the split scope is limited to the range of the local order.

For example:
Suppose the blocks are laid out as follows:

block-a (taskId-1)
block-b (taskId-2)
block-c (taskId-1)
block-d (taskId-2)

When the reader wants to get the taskId range [1, 3), the strategy returns two ShuffleDataSegments. Because they are contiguous, it would be better to merge them into a single segment to reduce the number of network interactions.
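
To make the proposal concrete, here is a minimal sketch of the merge step, written in Python for brevity rather than in the project's own language. The `Segment` class and `merge_contiguous` function are hypothetical names used only for illustration; `Segment` reduces a ShuffleDataSegment to an assumed `offset` and `length` within the data file and ignores any per-block metadata the real class may carry.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a ShuffleDataSegment: a byte range in the data file."""
    offset: int
    length: int


def merge_contiguous(segments: List[Segment]) -> List[Segment]:
    """Merge segments whose byte ranges touch end-to-start into one segment."""
    merged: List[Segment] = []
    # Sort by offset so that adjacent ranges end up next to each other.
    for seg in sorted(segments, key=lambda s: s.offset):
        if merged and merged[-1].offset + merged[-1].length == seg.offset:
            # The previous segment ends exactly where this one starts: extend it.
            last = merged[-1]
            merged[-1] = Segment(last.offset, last.length + seg.length)
        else:
            # There is a gap (or this is the first segment): start a new one.
            merged.append(seg)
    return merged
```

Under these assumptions, two segments that cover adjacent byte ranges collapse into one read request, while segments separated by a gap are left untouched.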

Parent issue

#137

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
