add GLEM model, TAGDataset and example of GLEM #9662
base: master
Conversation
for more information, see https://pre-commit.ci
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #9662      +/-   ##
==========================================
- Coverage   87.54%   86.92%   -0.62%
==========================================
  Files         482      483       +1
  Lines       31414    31585     +171
==========================================
- Hits        27501    27455      -46
- Misses       3913     4130     +217

☔ View full report in Codecov by Sentry.
LGTM, just get CI green.
@rusty1s @akihironitta ready for your reviews
Could we have type annotations throughout the PR? Also, I'd suggest splitting this PR into smaller ones.
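As an illustration of that request, a minimal sketch of the kind of annotations meant, using a hypothetical training helper (none of these names come from the PR):

from typing import Tuple

import torch
from torch import Tensor
from torch.nn import Module
from torch.nn.functional import cross_entropy
from torch.optim import Optimizer


def train_step(model: Module, x: Tensor, y: Tensor,
               optimizer: Optimizer) -> Tuple[float, float]:
    # Hypothetical example of a fully annotated helper; returns (loss, accuracy).
    optimizer.zero_grad()
    out = model(x)
    loss = cross_entropy(out, y)
    loss.backward()
    optimizer.step()
    acc = (out.argmax(dim=-1) == y).float().mean()
    return float(loss), float(acc)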
examples/llm/glem.py
Outdated
if em_phase == 'gnn':
    gnn_test_acc = max(gnn_test_acc, final_test_acc)
    model.gnn = model.gnn.to('cpu', non_blocking=True)
    em_phase = 'lm'
else:
    lm_test_acc = max(lm_test_acc, final_test_acc)
    model.lm = model.lm.to('cpu', non_blocking=True)
    em_phase = 'gnn'
torch.cuda.empty_cache()
print(f'Best GNN acc: {gnn_test_acc}, LM acc: {lm_test_acc}')
This is the same comment as #9467 (comment), but we shouldn't pick the best metric evaluated on the test set at the end of every EM step.
Hi Akihiro,
Thanks for reviewing the code.
I think the case is different here. I agree that we should not pick the best test metric after every epoch, but the metric is still required after every EM step. Since the E-step is LM training and the M-step is GNN training, both steps need a certain number of epochs, and we need to run full inference after every E- and M-step to find out which model has the better result.
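To make the described flow concrete, here is a minimal sketch of that alternation, assuming hypothetical helpers `train_one_epoch` and `full_inference` (the actual example script has its own training and inference code):

# Hypothetical outline of the alternation described above; `train_one_epoch`
# and `full_inference` are illustrative placeholders, not the PR's real API.
def em_loop(model, num_em_steps: int, epochs_per_phase: int) -> None:
    phase = 'lm'  # E-step first: train the language model
    for _ in range(num_em_steps):
        module = model.lm if phase == 'lm' else model.gnn
        for _ in range(epochs_per_phase):
            train_one_epoch(module)
        # Full inference after every E/M step to compare the two modules.
        acc = full_inference(module)
        print(f'{phase} accuracy after this EM step: {acc:.4f}')
        phase = 'gnn' if phase == 'lm' else 'lm'  # alternate E- and M-steps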
@ECMGit I think @akihironitta's point is that, to find out which model has the better result, you should only use validation accuracy, not test accuracy, since using the test accuracy could be viewed as a form of loosely fitting the model to the test set.
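A hedged illustration of that suggestion, reusing the shape of the snippet above; `final_val_acc` and `evaluate` are illustrative stand-ins, not the script's actual variables or functions:

# Illustrative variant of the phase-switching block above that tracks
# validation accuracy instead of test accuracy. `final_val_acc` and
# `evaluate` are hypothetical stand-ins for the script's own code.
if em_phase == 'gnn':
    gnn_val_acc = max(gnn_val_acc, final_val_acc)
    model.gnn = model.gnn.to('cpu', non_blocking=True)
    em_phase = 'lm'
else:
    lm_val_acc = max(lm_val_acc, final_val_acc)
    model.lm = model.lm.to('cpu', non_blocking=True)
    em_phase = 'gnn'
torch.cuda.empty_cache()

# After the EM loop finishes, evaluate only the model selected by
# validation accuracy on the test split, once.
best_model = model.gnn if gnn_val_acc >= lm_val_acc else model.lm
test_acc = evaluate(best_model, split='test')
print(f'Val acc: GNN {gnn_val_acc:.4f}, LM {lm_val_acc:.4f}; '
      f'test acc of selected model: {test_acc:.4f}')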
done
I haven't had a look outside the example script yet, but this addition is exciting! 🚀
for more information, see https://pre-commit.ci
LGTM @akihironitta @rusty1s, let us know if anything else is needed.
reopened #9591
Feature summary: