Add function to merge sampler outputs #252

Merged 8 commits into pyg-team:master on Sep 15, 2023

Conversation

@kgajdamo (Contributor) commented Sep 5, 2023

This code is part of the overall distributed training support for PyG.

Description

Distributed training requires merging the results from all machines after each layer. The algorithms that run afterwards additionally require these results to be sorted according to the sampling order. This PR introduces a function whose purpose is to handle the merge and sort operations in parallel; a minimal sketch of the idea follows.
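A minimal sketch of the merge-and-sort idea, assuming each machine returns its sampled nodes together with their positions in the global sampling order. The function name and argument layout below are illustrative only; the actual pyg-lib kernel is a parallel C++ implementation:

```python
from typing import List, Tuple

import torch


def merge_sampler_outputs(
    partition_outputs: List[Tuple[torch.Tensor, torch.Tensor]],
) -> torch.Tensor:
    # Concatenate the (nodes, order) pairs gathered from all machines:
    nodes = torch.cat([out[0] for out in partition_outputs])
    order = torch.cat([out[1] for out in partition_outputs])
    # Restore the global sampling order so that downstream algorithms
    # see a deterministic, sorted layout:
    perm = torch.argsort(order)
    return nodes[perm]
```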

Other distributed PRs:
pytorch_geometric DistLoader: #7869
pytorch_geometric DistSampler: #7974
pyg-lib: #246

codecov bot commented Sep 6, 2023

Codecov Report

Merging #252 (0ed8216) into master (3a4d436) will increase coverage by 0.80%.
The diff coverage is 98.21%.

@@            Coverage Diff             @@
##           master     #252      +/-   ##
==========================================
+ Coverage   83.50%   84.30%   +0.80%     
==========================================
  Files          28       30       +2     
  Lines         970     1026      +56     
==========================================
+ Hits          810      865      +55     
- Misses        160      161       +1     
| Files Changed | Coverage Δ |
| --- | --- |
| ...lib/csrc/sampler/cpu/dist_merge_outputs_kernel.cpp | 97.29% <97.29%> (ø) |
| pyg_lib/csrc/sampler/dist_merge_outputs.cpp | 100.00% <100.00%> (ø) |


@rusty1s rusty1s enabled auto-merge (squash) September 15, 2023 12:32
@rusty1s rusty1s merged commit 3bf8802 into pyg-team:master Sep 15, 2023
10 checks passed
rusty1s added a commit that referenced this pull request Sep 15, 2023 (#254)

This code is part of the overall distributed training support for PyG.

This PR is complementary to [#246](#246).

## Description
Perform global-to-local mappings using the mapper and create `(row, col)`
based on `sampled_nodes_with_duplicates` and `sampled_nbrs_per_node`, as
sketched below.
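A hedged sketch of this step, assuming `sampled_nodes_with_duplicates` stores each source node's sampled neighbors consecutively and `sampled_nbrs_per_node` holds the per-node neighbor counts. The mapping logic and the row/col orientation are illustrative, not the actual kernel:

```python
from typing import List, Tuple

import torch


def relabel(
    seed: torch.Tensor,
    sampled_nodes_with_duplicates: torch.Tensor,
    sampled_nbrs_per_node: List[int],
) -> Tuple[torch.Tensor, torch.Tensor]:
    # Global-to-local mapping: seed nodes come first, newly sampled
    # nodes follow in the order in which they were encountered:
    mapping = {n: i for i, n in enumerate(seed.tolist())}
    for n in sampled_nodes_with_duplicates.tolist():
        if n not in mapping:
            mapping[n] = len(mapping)

    # Source node `i` owns the next `sampled_nbrs_per_node[i]` entries
    # of `sampled_nodes_with_duplicates`:
    row, col, offset = [], [], 0
    for i, num_nbrs in enumerate(sampled_nbrs_per_node):
        nbrs = sampled_nodes_with_duplicates[offset:offset + num_nbrs]
        for n in nbrs.tolist():
            row.append(mapping[n])  # local ID of the sampled neighbor
            col.append(i)           # local ID of the source node
        offset += num_nbrs
    return torch.tensor(row), torch.tensor(col)
```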

**Other distributed PRs:**
pytorch_geometric DistLoader:
[#7869](pyg-team/pytorch_geometric#7869)
pytorch_geometric DistSampler:
[#7974](pyg-team/pytorch_geometric#7974)
pyg-lib [MERGED]: [#246](#246)
pyg-lib: [#252](#252)
pyg-lib: [#253](#253)

---------

Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>
rusty1s added a commit to pyg-team/pytorch_geometric that referenced this pull request Oct 9, 2023
This code is part of the overall distributed training support for PyG.

`DistNeighborSampler` leverages the `NeighborSampler` class from
`pytorch_geometric` and the `neighbor_sample` function from `pyg-lib`.
However, because distributed training requires synchronizing the results
between machines after each layer, the part of the code responsible for
sampling was implemented in Python (see the sketch after this paragraph).
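Illustrative only: a simplified layer-wise driver loop showing why the sampling logic sits in Python. `sample_one_hop` stands in for the per-machine pyg-lib call and `merge_outputs` for the cross-machine synchronization step; both are hypothetical callables, not real PyG or pyg-lib APIs:

```python
from typing import Callable, List

import torch


def distributed_sample(
    seed: torch.Tensor,
    num_neighbors: List[int],
    sample_one_hop: Callable[[torch.Tensor, int], torch.Tensor],
    merge_outputs: Callable[[torch.Tensor], torch.Tensor],
) -> torch.Tensor:
    # Layer-by-layer sampling: after each hop, results from all machines
    # must be merged and sorted before the next hop can start:
    srcs, sampled = seed, [seed]
    for num in num_neighbors:
        local_out = sample_one_hop(srcs, num)  # per-machine sampling
        srcs = merge_outputs(local_out)        # cross-machine sync
        sampled.append(srcs)
    return torch.cat(sampled)
```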

Added support for the following sampling methods:
- node, edge, negative, disjoint, temporal

**TODOs:**

- [x] finish hetero part
- [x] subgraph sampling

**This PR should be merged together with other distributed PRs:**
pyg-lib: [#246](pyg-team/pyg-lib#246),
[#252](pyg-team/pyg-lib#252)
GraphStore/FeatureStore:
#8083
DistLoaders:
1.  #8079
2.  #8080
3.  #8085

---------

Co-authored-by: JakubPietrakIntel <jakub.pietrak@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ZhengHongming888 <hongming.zheng@intel.com>
Co-authored-by: Jakub Pietrak <97102979+JakubPietrakIntel@users.noreply.github.com>
Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>