[REVIEW] Add Pure DGL Dataloading benchmark #3660

Merged
merged 12 commits into from
Sep 26, 2023

Conversation

Member

@VibhuJawa VibhuJawa commented Jun 14, 2023

This PR adds the DGL data loading benchmark:

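Arguments supported:

  • dataset_path: path to the dataset
  • replication_factors: comma-separated factors by which the number of edges is replicated (e.g. "1,2,4")
  • fanouts: comma-separated fanouts, each written as per-layer neighbor counts joined by underscores (e.g. "10_10_10")
  • batch_sizes: comma-separated batch sizes (e.g. "512,1024")

Example invocation:

python3 dgl_dataloading.py --dataset_path "/datasets/abarghi/ogbn_papers100M" \
--replication_factors "1,2,4" \
--fanouts "25_25,10_10_10,5_10_20" \
--batch_sizes "512,1024"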
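The argument strings above can be turned into Python lists along the following lines. This is a minimal sketch, not the PR's actual parsing code: the helper names `parse_int_list`, `parse_fanouts`, and `build_parser` are hypothetical, the defaults are made up for illustration, and it assumes commas separate values and underscores separate per-layer fanouts, as the example command suggests.

```python
import argparse


def parse_int_list(text):
    """Turn a string such as "1,2,4" into [1, 2, 4]."""
    return [int(item) for item in text.split(",")]


def parse_fanouts(text):
    """Turn a string such as "25_25,10_10_10" into [[25, 25], [10, 10, 10]]."""
    return [[int(n) for n in fanout.split("_")] for fanout in text.split(",")]


def build_parser():
    """Build a parser for the four arguments listed in the PR description."""
    parser = argparse.ArgumentParser(description="Sketch of the benchmark CLI")
    parser.add_argument("--dataset_path", type=str, required=True)
    # The defaults below are assumptions for illustration only.
    parser.add_argument("--replication_factors", type=parse_int_list, default=[1])
    parser.add_argument("--fanouts", type=parse_fanouts, default=[[10, 10, 10]])
    parser.add_argument("--batch_sizes", type=parse_int_list, default=[512])
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
example_args = build_parser().parse_args(
    [
        "--dataset_path", "/path/to/dataset",
        "--replication_factors", "1,2,4",
        "--fanouts", "25_25,10_10_10,5_10_20",
        "--batch_sizes", "512,1024",
    ]
)
print(example_args.replication_factors)
print(example_args.fanouts)
print(example_args.batch_sizes)
```

Passing the converters through `type=` keeps validation inside `argparse`, so a malformed value such as "10_x" fails at startup rather than partway through a long benchmark run.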

This produces the following results on a V100:

| Fanout | Batch Size | Data Loading Time per Epoch (s) | Data Loading Time per Batch (s) | Number of Edges | Number of Batches | Replication Factor |
|---|---|---|---|---|---|---|
| [25, 25] | 512 | 9.48 | 0.0031 | 1615685872 | 3022 | 1 |
| [25, 25] | 1024 | 6.39 | 0.0042 | 1615685872 | 1511 | 1 |
| [10, 10, 10] | 512 | 15.91 | 0.0053 | 1615685872 | 3022 | 1 |
| [10, 10, 10] | 1024 | 11.64 | 0.0077 | 1615685872 | 1511 | 1 |
| [5, 10, 20] | 512 | 17.73 | 0.0059 | 1615685872 | 3022 | 1 |
| [5, 10, 20] | 1024 | 13.52 | 0.0089 | 1615685872 | 1511 | 1 |
| [25, 25] | 512 | 19.44 | 0.0032 | 3231371744 | 6043 | 2 |
| [25, 25] | 1024 | 12.98 | 0.0043 | 3231371744 | 3022 | 2 |
| [10, 10, 10] | 512 | 32.88 | 0.0054 | 3231371744 | 6043 | 2 |
| [10, 10, 10] | 1024 | 24.35 | 0.0081 | 3231371744 | 3022 | 2 |
| [5, 10, 20] | 512 | 38.35 | 0.0063 | 3231371744 | 6043 | 2 |
| [5, 10, 20] | 1024 | 28.93 | 0.0096 | 3231371744 | 3022 | 2 |
| [25, 25] | 512 | 37.31 | 0.0031 | 6462743488 | 12085 | 4 |
| [25, 25] | 1024 | 25.15 | 0.0042 | 6462743488 | 6043 | 4 |
| [10, 10, 10] | 512 | 64.29 | 0.0053 | 6462743488 | 12085 | 4 |
| [10, 10, 10] | 1024 | 47.13 | 0.0078 | 6462743488 | 6043 | 4 |
| [5, 10, 20] | 512 | 72.90 | 0.0060 | 6462743488 | 12085 | 4 |
| [5, 10, 20] | 1024 | 56.70 | 0.0094 | 6462743488 | 6043 | 4 |
| [25, 25] | 512 | 80.99 | 0.0034 | 12925486976 | 24169 | 8 |
| [25, 25] | 1024 | 50.89 | 0.0042 | 12925486976 | 12085 | 8 |
| [10, 10, 10] | 512 | 129.49 | 0.0054 | 12925486976 | 24169 | 8 |
| [10, 10, 10] | 1024 | 93.66 | 0.0078 | 12925486976 | 12085 | 8 |
| [5, 10, 20] | 512 | 143.45 | 0.0059 | 12925486976 | 24169 | 8 |
| [5, 10, 20] | 1024 | 110.22 | 0.0091 | 12925486976 | 12085 | 8 |
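The per-batch column appears to be the per-epoch time divided by the number of batches. A small sketch that checks this against a few rows of the table above; the row values are copied from the table, and the function name `time_per_batch` is hypothetical.

```python
# (fanout, batch_size, epoch_time, num_batches, replication_factor),
# copied from the results table.
rows = [
    ([25, 25], 512, 9.48, 3022, 1),
    ([25, 25], 1024, 6.39, 1511, 1),
    ([10, 10, 10], 512, 15.91, 3022, 1),
    ([25, 25], 512, 19.44, 6043, 2),
]


def time_per_batch(epoch_time, num_batches):
    """Average data loading time for one batch within an epoch."""
    return epoch_time / num_batches


for fanout, batch_size, epoch_time, num_batches, replication in rows:
    per_batch = round(time_per_batch(epoch_time, num_batches), 4)
    print(fanout, batch_size, replication, per_batch)
```

Read this way, the per-batch time for a given fanout and batch size stays roughly constant as the replication factor grows, while the per-epoch time scales with the number of batches.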

@VibhuJawa VibhuJawa added non-breaking Non-breaking change feature request New feature or request improvement Improvement / enhancement to an existing function and removed feature request New feature or request improvement Improvement / enhancement to an existing function labels Jun 14, 2023
@VibhuJawa VibhuJawa marked this pull request as draft June 14, 2023 23:40
@BradReesWork BradReesWork modified the milestones: 23.08, 23.10 Jul 24, 2023
@VibhuJawa VibhuJawa marked this pull request as ready for review August 16, 2023 20:21
@VibhuJawa VibhuJawa changed the title [WIP] Add Pure DGL Dataloading benchmark [REVIEW] Add Pure DGL Dataloading benchmark Aug 16, 2023
Member

@alexbarghi-nv alexbarghi-nv left a comment

👍

@alexbarghi-nv alexbarghi-nv changed the base branch from branch-23.08 to branch-23.10 August 17, 2023 15:48
@VibhuJawa
Member Author

I would like to get a look from @tingyu66 here too

Member

@tingyu66 tingyu66 left a comment

Looks good to me. If I'm being nitpicky, I'd like to fix the random seed (with dgl.seed() and torch.manual_seed()) for all benchmarks.
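One way to apply this suggestion is a single helper that is called once at the start of each benchmark. This is a sketch under assumptions, not the change that was made in the PR: the name `seed_everything` is hypothetical, and the imports are guarded only so that the snippet also runs on a machine without torch or DGL installed.

```python
import random


def seed_everything(seed):
    """Seed Python's RNG, plus torch and DGL when they are importable."""
    random.seed(seed)
    try:
        import torch

        torch.manual_seed(seed)
    except ImportError:
        pass  # torch is not installed; nothing to seed
    try:
        import dgl

        dgl.seed(seed)
    except ImportError:
        pass  # DGL is not installed; nothing to seed


seed_everything(42)
```

Seeding both libraries matters because DGL's samplers and PyTorch keep separate random state, so seeding only one of them may still leave run-to-run variation in the sampled neighborhoods.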

@VibhuJawa
Member Author

@tingyu66 , Added seed with commit: f99a518

@alexbarghi-nv , I think we should be good to merge.

@VibhuJawa
Member Author

VibhuJawa commented Aug 25, 2023

Just dropping updated numbers for awareness as we use this in comparisons.

 python3 dgl_benchmark.py --dataset_path "/datasets/abarghi/ogbn_papers100M" --replication_factors "2,4" --fanouts "10_10_10" --batch_sizes "512,1024"
Running dgl dataloading benchmark with the following parameters:
Dataset path = /datasets/abarghi/ogbn_papers100M
Replication factors = [2, 4]
Fanouts = [[10, 10, 10]]
Batch sizes = [512, 1024]
Use UVA = True
==============================================
Loading edge index for edge type paper__cites__paperfor replication factor = 2
Graph Data compiled
Replication factor = 2
G has 3231371744 edges and took  47.45 seconds to load
Creating dataloader
Time to create dataloader = 112.72 seconds
Dataloading completed
Fanout = [10, 10, 10], batch_size = 512
Time taken 23.96  seconds for num batches 6043
==============================================
Creating dataloader
Time to create dataloader = 0.01 seconds
Dataloading completed
Fanout = [10, 10, 10], batch_size = 1024
Time taken 16.75  seconds for num batches 3022
==============================================
Benchmark completed for replication factor =  2
==============================================
Loading edge index for edge type paper__cites__paperfor replication factor = 4
Graph Data compiled
Replication factor = 4
G has 6462743488 edges and took  126.01 seconds to load
Creating dataloader
Time to create dataloader = 213.80 seconds
Dataloading completed
Fanout = [10, 10, 10], batch_size = 512
Time taken 48.75  seconds for num batches 12085
==============================================
Creating dataloader
Time to create dataloader = 0.01 seconds
Dataloading completed
Fanout = [10, 10, 10], batch_size = 1024
Time taken 33.81  seconds for num batches 6043
==============================================
Benchmark completed for replication factor =  4
==============================================
Benchmark completed for all replication factors
==============================================
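The "Time taken ... for num batches ..." lines in the log above can be produced by timing one full pass over a dataloader. The following is a minimal sketch of such a measurement, not the benchmark's actual code: `time_epoch` is a hypothetical name, and a plain Python generator stands in for the real DGL dataloader.

```python
import time


def time_epoch(batches):
    """Iterate once over `batches` and return (elapsed_seconds, num_batches)."""
    start = time.perf_counter()
    num_batches = 0
    for _ in batches:
        num_batches += 1
    elapsed = time.perf_counter() - start
    return elapsed, num_batches


# Stand-in for a real dataloader: any iterable of batches works here.
fake_loader = (list(range(i, i + 4)) for i in range(0, 20, 4))
elapsed, num_batches = time_epoch(fake_loader)
print(f"Time taken {elapsed:.2f} seconds for num batches {num_batches}")
```

When sampling runs on the GPU, work may still be queued when the loop exits, so a real measurement would likely need a call such as `torch.cuda.synchronize()` before the final timer read.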

@copy-pr-bot

copy-pr-bot bot commented Aug 30, 2023

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@alexbarghi-nv
Member

/ok to test

@BradReesWork
Member

/ok to test

@BradReesWork
Member

/okay to test

@BradReesWork
Member

/merge

@rapids-bot rapids-bot bot merged commit b199bf0 into rapidsai:branch-23.10 Sep 26, 2023
57 checks passed
Labels
feature request New feature or request non-breaking Non-breaking change
4 participants