LRGB: Long Range Graph Benchmark datasets #5935

vijaydwivedi75 · 2022-11-09T07:37:41Z

This PR adds the LRGB datasets from the paper Long Range Graph Benchmark. The original dataset source is in this repo.

The Long Range Graph Benchmark (LRGB) is a collection of 5 graph learning datasets that arguably require long-range reasoning to achieve strong performance on a given task. The 5 datasets in this benchmark can be used to prototype new models that capture long-range dependencies in graphs.

| Dataset | Domain | Task |
| -- | -- | -- |
| PascalVOC-SP | Computer Vision | Node Classification |
| COCO-SP | Computer Vision | Node Classification |
| PCQM-Contact | Quantum Chemistry | Link Prediction |
| Peptides-func | Chemistry | Graph Classification |
| Peptides-struct | Chemistry | Graph Regression |

The torch_geometric.datasets.LRGBDataset can be used to access any of the 5 datasets in the benchmark.
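A minimal usage sketch follows. The helper `load_lrgb` and the `LRGB_TASKS` lookup are hypothetical names introduced here for illustration, and the `root`, `name`, and `split` arguments are assumed to follow the usual PyG dataset conventions; check the merged `lrgb.py` for the exact signature.

```python
# Dataset name -> (domain, task), copied from the table in this PR description.
LRGB_TASKS = {
    "PascalVOC-SP": ("Computer Vision", "Node Classification"),
    "COCO-SP": ("Computer Vision", "Node Classification"),
    "PCQM-Contact": ("Quantum Chemistry", "Link Prediction"),
    "Peptides-func": ("Chemistry", "Graph Classification"),
    "Peptides-struct": ("Chemistry", "Graph Regression"),
}


def load_lrgb(name, root="data/LRGB", split="train"):
    """Hypothetical helper: validate the name, then build the dataset.

    The keyword arguments passed to LRGBDataset are an assumption based on
    common PyG dataset conventions, not a confirmed signature.
    """
    if name not in LRGB_TASKS:
        raise ValueError(f"Unknown LRGB dataset {name!r}; "
                         f"expected one of {sorted(LRGB_TASKS)}")
    # Imported lazily so the lookup table is usable without PyG installed.
    from torch_geometric.datasets import LRGBDataset
    return LRGBDataset(root=root, name=name, split=split)
```

Calling `load_lrgb("Peptides-func")` would download and process the dataset on first use, so expect it to take a while and to need network access.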

codecov · 2022-11-09T07:43:13Z

Codecov Report

Merging #5935 (3a9e039) into master (15dbdaf) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #5935   +/-   ##
=======================================
  Coverage   84.54%   84.54%           
=======================================
  Files         361      361           
  Lines       19877    19877           
=======================================
  Hits        16806    16806           
  Misses       3071     3071

EdisonLeeeee · 2022-11-09T09:14:17Z

Thanks for adding these! Left some initial comments.

torch_geometric/datasets/lrgb.py

vijaydwivedi75 · 2022-11-09T11:04:13Z

Thanks @EdisonLeeeee for your comments!
I have included the following revisions in an updated commit.

- Merged process steps for the superpixels and Peptides datasets
- Removed 'Task' column from the Stats
- Simplified contiguous label through LabelEncoder.
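For readers unfamiliar with the contiguous-label step, here is a rough sketch of the idea in plain Python. It mirrors what scikit-learn's `LabelEncoder` does (sorted unique classes mapped to `0..n-1`) but is not the code from this PR; the function name `encode_contiguous` is hypothetical.

```python
def encode_contiguous(labels):
    """Map arbitrary label ids to contiguous ids 0..n-1.

    Hypothetical helper illustrating the technique; the classes are sorted
    first, so the mapping is deterministic for a given label set.
    """
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    encoded = [index[label] for label in labels]
    return encoded, classes


# Example: sparse original ids become dense ids.
encoded, classes = encode_contiguous([7, 3, 7, 90])
print(encoded)  # [1, 0, 1, 2]
print(classes)  # [3, 7, 90]
```

Keeping `classes` around makes it possible to map predictions back to the original ids via `classes[i]`.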

vijaydwivedi75 · 2022-11-11T07:59:47Z

Hi @EdisonLeeeee, I have updated the code based on the feedback. Thank you!
It is ready from my side.

EdisonLeeeee · 2022-11-12T02:49:41Z

LGTM. Thanks for the update!

EdisonLeeeee · 2022-11-12T09:18:09Z

A gentle ping @rusty1s :)

EdisonLeeeee · 2022-11-15T13:10:16Z

I'm also wondering if we could add the dataset information (as you summarized in this PR) into docstring so that users can choose specific datasets according to their tasks. FYI:

pytorch_geometric/torch_geometric/datasets/md17.py

Lines 46 to 51 in 41fd354

    
    +----------------+-----------------+------------------------------+-----------+
    | Molecule       | Level of Theory | Name                         | #Examples |
    +================+=================+==============================+===========+
    | Benzene        | DFT             | :obj:`benzene`               | 49,863    |
    +----------------+-----------------+------------------------------+-----------+
    | Benzene        | DFT FHI-aims    | :obj:`benzene FHI-aims`      | 627,983   |
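As an illustration of the docstring format being suggested, the sketch below renders rows as a reStructuredText grid table like the md17 one. The function `rst_grid_table` is a hypothetical helper, and the example rows are taken from the table in this PR description; the table that actually landed in the `lrgb.py` docstring may use different columns.

```python
def rst_grid_table(header, rows):
    """Render a header and rows as a reStructuredText grid table (sketch)."""
    table = [header] + list(rows)
    # Each column is as wide as its longest cell.
    widths = [max(len(str(r[i])) for r in table) for i in range(len(header))]

    def border(ch):
        return "+" + "+".join(ch * (w + 2) for w in widths) + "+"

    def line(cells):
        padded = (f" {str(c).ljust(w)} " for c, w in zip(cells, widths))
        return "|" + "|".join(padded) + "|"

    # The header is separated from the body by a row of '=' characters.
    out = [border("-"), line(header), border("=")]
    for row in rows:
        out.append(line(row))
        out.append(border("-"))
    return "\n".join(out)


print(rst_grid_table(
    ("Dataset", "Domain", "Task"),
    [
        ("PascalVOC-SP", "Computer Vision", "Node Classification"),
        ("Peptides-struct", "Chemistry", "Graph Regression"),
    ],
))
```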

vijaydwivedi75 · 2022-11-16T05:38:48Z

Done, added dataset info in docstring!

rusty1s

Thank you for adding <3

This PR adds the LRGB datasets from the paper [Long Range Graph Benchmark](https://openreview.net/pdf?id=in7XC5RcjEn). The original dataset source is [in this repo](http://github.com/vijaydwivedi75/lrgb).

The Long Range Graph Benchmark (LRGB) is a collection of 5 graph learning datasets that arguably require long-range reasoning to achieve strong performance in a given task. The 5 datasets in this benchmark can be used to prototype new models that can capture long range dependencies in graphs.

| Dataset | Domain | Task |
| -- | -- | -- |
| PascalVOC-SP | Computer Vision | Node Classification |
| COCO-SP | Computer Vision | Node Classification |
| PCQM-Contact | Quantum Chemistry | Link Prediction |
| Peptides-func | Chemistry | Graph Classification |
| Peptides-struct | Chemistry | Graph Regression |

The `torch_geometric.datasets.LRGBDataset` can be used to access any of the 5 datasets in the benchmark.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jintang Li <cnljt@outlook.com>
Co-authored-by: Jinu Sunil <jinu.sunil@gmail.com>
Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>

vijaydwivedi75 and others added 4 commits November 7, 2022 03:46

add LRGB (Long Range Graph Benchmark) datasets

b4197cb

Merge branch 'pyg-team:master' into lrgb

e50b28c

add changelog

3b0fed6

[pre-commit.ci] auto fixes from pre-commit.com hooks

a23ea18

for more information, see https://pre-commit.ci

vijaydwivedi75 and others added 7 commits November 9, 2022 15:50

doc reformat

ea09f4e

Merge branch 'lrgb' of https://github.com/vijaydwivedi75/pytorch_geom…

eac7892

…etric into lrgb

[pre-commit.ci] auto fixes from pre-commit.com hooks

62a11c7

for more information, see https://pre-commit.ci

doc formatting

298a7cb

[pre-commit.ci] auto fixes from pre-commit.com hooks

7710206

for more information, see https://pre-commit.ci

doc formatting

9ffee9c

minor doc format

60335e8

EdisonLeeeee assigned vijaydwivedi75 Nov 9, 2022

EdisonLeeeee added the labels feature, 1 - Priority P1, and dataset Nov 9, 2022

EdisonLeeeee reviewed Nov 9, 2022

torch_geometric/datasets/lrgb.py

torch_geometric/datasets/lrgb.py (outdated)

EdisonLeeeee reviewed Nov 9, 2022

torch_geometric/datasets/lrgb.py (outdated)

vijaydwivedi75 and others added 2 commits November 9, 2022 19:01

revised lrgb.py

e0d96c3

[pre-commit.ci] auto fixes from pre-commit.com hooks

66790ca

for more information, see https://pre-commit.ci

vijaydwivedi75 and others added 2 commits November 9, 2022 19:20

minor typo

7e48150

[pre-commit.ci] auto fixes from pre-commit.com hooks

adb98bd

for more information, see https://pre-commit.ci

EdisonLeeeee reviewed Nov 9, 2022

vijaydwivedi75 and others added 5 commits November 11, 2022 15:41

cleanup

4b8502a

updated changelog

0bfcf3d

minor

088f0d2

changelog

e68b6f9

[pre-commit.ci] auto fixes from pre-commit.com hooks

2f4452e

for more information, see https://pre-commit.ci

vijaydwivedi75 added 2 commits November 11, 2022 16:00

Merge branch 'master' into lrgb

6bae86f

update changelog

2dae036

EdisonLeeeee reviewed Nov 12, 2022

torch_geometric/datasets/lrgb.py (outdated)

vijaydwivedi75 and others added 3 commits November 12, 2022 15:27

Merge branch 'master' into lrgb

58f3c10

avoid multiple calls of label_remap_coco

3d619fe

[pre-commit.ci] auto fixes from pre-commit.com hooks

e8e8a91

for more information, see https://pre-commit.ci

EdisonLeeeee reviewed Nov 12, 2022

CHANGELOG.md (outdated)

EdisonLeeeee reviewed Nov 12, 2022

torch_geometric/datasets/lrgb.py (outdated)

EdisonLeeeee approved these changes Nov 12, 2022

vijaydwivedi75 and others added 2 commits November 12, 2022 17:18

Update CHANGELOG.md

f013aa9

Co-authored-by: Jintang Li <cnljt@outlook.com>

naming

d771d89

vijaydwivedi75 and others added 5 commits November 16, 2022 13:09

data info in docstring

26b858d

data info in docstring

4b9b5d4

minor formatting

30feeca

Merge branch 'master' into lrgb

a3a8469

[pre-commit.ci] auto fixes from pre-commit.com hooks

28a0eff

for more information, see https://pre-commit.ci

wsad1 approved these changes Nov 17, 2022

Merge branch 'master' into lrgb

e618bb3

rusty1s approved these changes Nov 17, 2022

torch_geometric/datasets/lrgb.py (outdated)

torch_geometric/datasets/lrgb.py (outdated)

rusty1s and others added 3 commits November 17, 2022 12:43

Update torch_geometric/datasets/lrgb.py

2cd95c8

Update torch_geometric/datasets/lrgb.py

4e2c168

[pre-commit.ci] auto fixes from pre-commit.com hooks

3a9e039

for more information, see https://pre-commit.ci

rusty1s merged commit 964a447 into pyg-team:master Nov 17, 2022
