[GIE] Refine Partitioner trait to better support parallel scan #2753

Closed
BingqingLyu opened this issue May 26, 2023 · 0 comments · Fixed by #2744

Is your feature request related to a problem? Please describe.

Refine Partitioner trait to better support parallel scan.

Currently, the Partitioner trait defines the get_worker_partitions function to specify the list of partitions that a worker (i.e., a thread) is going to process:

 fn get_worker_partitions(&self, job_workers: usize, worker_id: u32) -> GraphProxyResult<Option<Vec<u64>>>;

However, there may be m workers and n partitions with m > n, in which case each worker needs to scan only part of a partition. The trait should be refined to take this case into account.
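One way to picture the m > n case is the sketch below. It is not the GraphScope implementation: `PartitionSlice` and `assign_slices` are hypothetical names introduced here, and the chunking scheme (each worker scans the `chunk_index`-th of `chunk_count` roughly equal chunks of one partition) is an assumption made only for illustration.

```rust
/// Hypothetical type: one worker's share of a partition. The worker scans the
/// `chunk_index`-th of `chunk_count` roughly equal chunks of `partition_id`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PartitionSlice {
    pub partition_id: u64,
    pub chunk_index: u32,
    pub chunk_count: u32,
}

/// Hypothetical helper: decide what `worker_id` (out of `job_workers`) scans,
/// given the partitions available to this job.
pub fn assign_slices(job_workers: u32, worker_id: u32, partitions: &[u64]) -> Vec<PartitionSlice> {
    let n = partitions.len() as u32;
    if job_workers == 0 || worker_id >= job_workers || n == 0 {
        return Vec::new();
    }
    if job_workers <= n {
        // At most as many workers as partitions: hand out whole partitions round-robin.
        partitions
            .iter()
            .enumerate()
            .filter(|(i, _)| (*i as u32) % job_workers == worker_id)
            .map(|(_, &p)| PartitionSlice { partition_id: p, chunk_index: 0, chunk_count: 1 })
            .collect()
    } else {
        // More workers than partitions: several workers share one partition,
        // and each of them scans a distinct chunk of it.
        let idx = worker_id % n;
        let chunk_index = worker_id / n;
        // Number of workers w in [0, job_workers) with w % n == idx.
        let chunk_count = (job_workers - idx + n - 1) / n;
        vec![PartitionSlice {
            partition_id: partitions[idx as usize],
            chunk_index,
            chunk_count,
        }]
    }
}
```

With 5 workers and 2 partitions, workers 0, 2, and 4 would share the first partition in three chunks, while workers 1 and 3 would split the second in two. How a chunk maps onto actual storage (for example, a vertex-id range) is left open here.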

@BingqingLyu BingqingLyu self-assigned this May 26, 2023
BingqingLyu added a commit that referenced this issue Jun 8, 2023
… trait to better support parallel processing in Runtime (#2744)

## What do these changes do?

Redesign the `PartitionInfo`, `ClusterInfo`, and `Router` traits to better
support parallel processing in Runtime, where:
* `PartitionInfo` is used to query the partition information when the
data has been partitioned.
* `ClusterInfo` is used to query the cluster information when the system
is running on a cluster.
* `Router` is used to route the data to the destination worker so that
it can be properly processed, with `PartitionInfo` and `ClusterInfo` as
input.
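
The division of responsibilities described above can be sketched roughly as follows. Only the trait names `PartitionInfo`, `ClusterInfo`, and `Router` come from this description; every method, type alias, and struct below (`partition_of`, `worker_of`, `ModuloPartition`, `RoundRobinCluster`, `DefaultRouter`) is a hypothetical stand-in and may differ from the actual signatures in #2744.

```rust
pub type PartitionId = u32;
pub type WorkerId = u32;

/// Hypothetical shape: answers "which partition holds this vertex?"
pub trait PartitionInfo {
    fn partition_of(&self, vertex_id: u64) -> PartitionId;
}

/// Hypothetical shape: answers "which worker in the cluster serves this partition?"
pub trait ClusterInfo {
    fn worker_num(&self) -> u32;
    fn worker_of(&self, partition: PartitionId) -> WorkerId;
}

/// Hypothetical shape: routes data to the worker that should process it.
pub trait Router {
    fn route(&self, vertex_id: u64) -> WorkerId;
}

/// Toy partitioning for illustration: vertex id modulo the partition count.
/// Assumes `partition_num > 0`.
pub struct ModuloPartition {
    pub partition_num: u32,
}

impl PartitionInfo for ModuloPartition {
    fn partition_of(&self, vertex_id: u64) -> PartitionId {
        (vertex_id % self.partition_num as u64) as PartitionId
    }
}

/// Toy cluster layout for illustration: partitions spread round-robin over workers.
/// Assumes `workers > 0`.
pub struct RoundRobinCluster {
    pub workers: u32,
}

impl ClusterInfo for RoundRobinCluster {
    fn worker_num(&self) -> u32 {
        self.workers
    }
    fn worker_of(&self, partition: PartitionId) -> WorkerId {
        partition % self.workers
    }
}

/// A router that takes `PartitionInfo` and `ClusterInfo` as input and composes them.
pub struct DefaultRouter<P: PartitionInfo, C: ClusterInfo> {
    pub partition_info: P,
    pub cluster_info: C,
}

impl<P: PartitionInfo, C: ClusterInfo> Router for DefaultRouter<P, C> {
    fn route(&self, vertex_id: u64) -> WorkerId {
        let partition = self.partition_info.partition_of(vertex_id);
        self.cluster_info.worker_of(partition)
    }
}
```

Keeping the partition lookup and the cluster lookup behind separate traits means the routing logic does not need to know how the data was partitioned or how workers are laid out, which is the separation the description suggests.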


## Related issue number

Fixes #2753