Enable edge based temporal sampling in torch_geometric.distributed #8428

Merged on Nov 29, 2023 (5 commits)

Contributor

@kgajdamo kgajdamo commented Nov 23, 2023

This PR enables edge-based temporal sampling in distributed training, for both node and link sampling.

Note on how edge temporal data is defined:
In distributed training, each partition needs its own vector storing the time information of the edges it contains. (I mention this to point out that it works differently from node-based temporal sampling, where a single vector can be shared by all partitions because we operate on node IDs.)
Why:
Each partition has its own edge_index in COO format, which the neighbor sampler later converts to CSR/CSC format. Global edge IDs are therefore not available during sampling, so we could not look up the correct time for a specific edge in a shared vector. The edge time information must be local to each partition.
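
To illustrate the point about local edge times, here is a minimal pure-Python sketch, not the actual torch_geometric.distributed code. The helper name `coo_to_csc_with_time` and the toy values are hypothetical. It shows that converting one partition's COO edges to CSC reorders them by local position, so a partition-local time vector can simply be permuted alongside, with no global edge IDs involved.

```python
from typing import List, Tuple


def coo_to_csc_with_time(
    num_nodes: int,
    row: List[int],
    col: List[int],
    edge_time: List[int],
) -> Tuple[List[int], List[int], List[int]]:
    """Convert one partition's COO edges to CSC, permuting edge_time alongside.

    Hypothetical helper for illustration only; not the PyG implementation.
    """
    # Order local edge positions by destination node (then source, for determinism).
    perm = sorted(range(len(col)), key=lambda e: (col[e], row[e]))

    # Build the column pointer array from per-destination edge counts.
    colptr = [0] * (num_nodes + 1)
    for c in col:
        colptr[c + 1] += 1
    for i in range(num_nodes):
        colptr[i + 1] += colptr[i]

    # Apply the same permutation to source nodes and to the local edge times.
    csc_row = [row[e] for e in perm]
    csc_time = [edge_time[e] for e in perm]
    return colptr, csc_row, csc_time


# Toy partition with 3 local nodes and 3 local edges (made-up values).
row = [0, 1, 2]
col = [1, 0, 1]
edge_time = [10, 20, 30]

colptr, csc_row, csc_time = coo_to_csc_with_time(3, row, col, edge_time)
```

Because the permutation is defined only over the partition's own edge positions, a vector indexed by global edge IDs would have nothing to line up with after the conversion.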

Changes made:

  • added support for the edge_time argument
  • seed_time must be specified (a requirement for edge-level temporal sampling)
  • added unit tests
  • unit tests for the link sampler are in PR #8375
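
The seed_time requirement listed above can be sketched as follows. This is an illustrative pure-Python sketch, not the sampler added in this PR: the function name `sample_temporal_neighbors` is hypothetical, and the rule that an edge is eligible when its time is less than or equal to the seed time is an assumption about the temporal constraint.

```python
from typing import List, Optional


def sample_temporal_neighbors(
    colptr: List[int],
    row: List[int],
    edge_time: List[int],
    seed: int,
    seed_time: Optional[int] = None,
) -> List[int]:
    """Return the neighbors of `seed` reachable via edges no newer than seed_time.

    Hypothetical helper; assumes CSC layout with a partition-local edge_time.
    """
    if seed_time is None:
        # Without a seed time there is nothing to compare edge times against.
        raise ValueError("seed_time must be specified for edge-level temporal sampling")

    start, end = colptr[seed], colptr[seed + 1]
    # Assumed constraint: keep an edge only if edge_time <= seed_time.
    return [row[e] for e in range(start, end) if edge_time[e] <= seed_time]


# Toy CSC partition (made-up values): node 1 has incoming edges from 0 and 2.
colptr = [0, 1, 3, 3]
row = [1, 0, 2]
edge_time = [20, 10, 30]

early = sample_temporal_neighbors(colptr, row, edge_time, seed=1, seed_time=15)
late = sample_temporal_neighbors(colptr, row, edge_time, seed=1, seed_time=30)
```

With a seed time of 15 only the edge stamped 10 qualifies; with a seed time of 30 both incoming edges of node 1 qualify.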


codecov bot commented Nov 23, 2023

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison is base (1ba2743) 87.69% compared to head (4d6c30f) 88.37%.

❗ Current head 4d6c30f differs from pull request most recent head e449483. Consider uploading reports for the commit e449483 to get more accurate results

Files                                                  Patch %   Lines
...rch_geometric/distributed/dist_neighbor_sampler.py  93.75%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8428      +/-   ##
==========================================
+ Coverage   87.69%   88.37%   +0.68%     
==========================================
  Files         478      478              
  Lines       29400    29403       +3     
==========================================
+ Hits        25781    25985     +204     
+ Misses       3619     3418     -201     


@rusty1s rusty1s changed the title from "Enable edge based temporal distributed training for homo" to "Enable edge based temporal sampling in torch_geometric.distributed" Nov 29, 2023
@rusty1s rusty1s enabled auto-merge (squash) November 29, 2023 15:52
@rusty1s rusty1s merged commit 12a2fb2 into pyg-team:master Nov 29, 2023