Add missing test labels in HGBDataset #5233

Merged (3 commits) on Aug 19, 2022

EdisonLeeeee
Contributor

Hi, I found a bug when using torch_geometric.datasets.HGBDataset. Here is the code to reproduce it:

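from collections import Counter

import torch
from torch_geometric.datasets import HGBDataset

datasets = ['acm', 'dblp', 'freebase', 'imdb']
labeled_types = ['paper', 'author', 'book', 'movie']  # labeled node type per dataset

for name, node_type in zip(datasets, labeled_types):
    data = HGBDataset("~/data/pygdata/HGB", name)[0]
    # Labels of the labeled node type, restricted to the test split.
    test_labels = data[node_type].y[data[node_type].test_mask]
    if name != 'imdb':
        # Count how often each label value occurs among the test nodes.
        print(name, Counter(test_labels.tolist()))
    else:
        # imdb is multi-label, so check whether any test label is set at all.
        print(name, torch.any(test_labels))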

The output is:

acm Counter({-1: 1059})
dblp Counter({-1: 1420})
freebase Counter({-1: 2784})
imdb tensor(False)

This indicates that the test labels are missing in all four datasets: every test node carries the placeholder label -1 (or, for imdb, no label at all).
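
The check above can be made reusable with a small helper that counts the labels selected by a mask and flags the case where every selected label is the placeholder. This is only a minimal sketch: the names `masked_label_counts` and `all_missing` are hypothetical and not part of PyG, it assumes a single-label node type where -1 marks an unassigned label (as the output above suggests), and it works on plain Python lists such as those returned by `.tolist()`.

```python
from collections import Counter


def masked_label_counts(labels, mask):
    """Count the labels at positions where the mask is truthy."""
    return Counter(y for y, m in zip(labels, mask) if m)


def all_missing(labels, mask, placeholder=-1):
    """Return True if the mask selects at least one position and
    every selected label equals the placeholder."""
    counts = masked_label_counts(labels, mask)
    return bool(counts) and set(counts) == {placeholder}


# Toy data standing in for data[node_type].y.tolist() and
# data[node_type].test_mask.tolist(); not taken from any HGB dataset.
labels = [0, 2, -1, -1, 1]
test_mask = [False, False, True, True, False]

print(masked_label_counts(labels, test_mask))
print(all_missing(labels, test_mask))
```

For the multi-label imdb case a different test is needed, which is why the reproduction script uses torch.any there.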

I checked the code in process and found that it never assigns values to the test labels. After adding the missing assignment, the bug is fixed and the output becomes:

acm Counter({2: 364, 0: 355, 1: 340})
dblp Counter({2: 404, 0: 404, 3: 354, 1: 258})
freebase Counter({1: 1372, 2: 591, 4: 347, 0: 146, 6: 141, 5: 127, 3: 60})
imdb tensor(True)
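
To illustrate the kind of omission described here, the sketch below builds a label vector and split masks from (node id, label) pairs. It is not the actual patch from this PR and does not reproduce the real process method, which reads the HGB files and works on tensors; `build_labels` and its arguments are hypothetical names chosen for illustration, and plain lists stand in for tensors. The point is that the test branch has to write into the label vector as well as set the mask, otherwise the test nodes keep the placeholder.

```python
def build_labels(num_nodes, train_pairs, test_pairs, placeholder=-1):
    """Build labels and split masks from (node_id, label) pairs.

    Hypothetical helper for illustration only.
    """
    y = [placeholder] * num_nodes
    train_mask = [False] * num_nodes
    test_mask = [False] * num_nodes

    for node_id, label in train_pairs:
        y[node_id] = label
        train_mask[node_id] = True

    for node_id, label in test_pairs:
        # If this assignment is omitted, test nodes are masked correctly
        # but their labels stay at the placeholder value.
        y[node_id] = label
        test_mask[node_id] = True

    return y, train_mask, test_mask


# Toy input: nodes 0 and 1 are training nodes, node 3 is a test node,
# node 2 is unlabeled.
y, train_mask, test_mask = build_labels(4, [(0, 1), (1, 0)], [(3, 2)])
print(y, train_mask, test_mask)
```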

codecov bot commented Aug 18, 2022

Codecov Report

Merging #5233 (591ba57) into master (cd72c22) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #5233   +/-   ##
=======================================
  Coverage   83.13%   83.13%           
=======================================
  Files         336      336           
  Lines       18523    18523           
=======================================
  Hits        15399    15399           
  Misses       3124     3124           

rusty1s changed the title from "fix test labels missing in HGBDataset" to "Add missing test labels in HGBDataset" on Aug 19, 2022
Member

rusty1s left a comment

Thank you!

rusty1s enabled auto-merge (squash) on August 19, 2022, 11:11
rusty1s merged commit 7e1a629 into pyg-team:master on Aug 19, 2022