
Implementing graphGPS as a global attention wrapper around Hydra MPNNs #315

Merged
merged 11 commits into ORNL:main on Dec 13, 2024

Conversation

Collaborator

@ArCho48 ArCho48 commented Dec 5, 2024

This PR incorporates graphGPS within the HydraGNN framework.
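For context, GraphGPS layers interleave a local MPNN convolution with global multi-head attention over all nodes. Below is a minimal sketch of the idea using PyTorch Geometric's GPSConv; it is illustrative only, and hidden_dim and the GINEConv choice are assumptions rather than the exact HydraGNN wiring.

import torch
from torch_geometric.nn import GINEConv, GPSConv

hidden_dim = 64  # hypothetical hidden dimension

# local message-passing block wrapped by GPS global attention
local_mpnn = GINEConv(
    torch.nn.Sequential(
        torch.nn.Linear(hidden_dim, hidden_dim),
        torch.nn.ReLU(),
        torch.nn.Linear(hidden_dim, hidden_dim),
    ),
    edge_dim=hidden_dim,
)
gps_layer = GPSConv(hidden_dim, conv=local_mpnn, heads=4)

# usage: out = gps_layer(x, edge_index, batch=batch, edge_attr=edge_attr)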

# Test only models
@pytest.mark.parametrize(
"global_attn_engine",
["GPS"],
Collaborator

@ArCho48
We also need to try "None" as the global attention mechanism.

Collaborator Author

[Screenshot: 2024-12-05 at 2:28:47 PM]

I thought these lines do that

Collaborator

You are right. :) My apologies

Collaborator Author

Cool. No probs :)

Collaborator

@allaffa allaffa left a comment

@ArCho48
The tests need to check the MPNNs both with and without global attention mechanisms.
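For illustration, one way to cover both cases is to parametrize the attention engine alongside the model type. This is a sketch only; the model names below are placeholders, not necessarily the ones in the test suite.

import pytest

@pytest.mark.parametrize("global_attn_engine", [None, "GPS"])
@pytest.mark.parametrize("mpnn_type", ["GIN", "GAT", "PNA"])
def test_model(mpnn_type, global_attn_engine):
    # build the config with/without the global attention engine and run training
    ...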

@@ -101,11 +102,19 @@ def __init__(self, dirpath, config, dist=False, sampling=None):

rx = list(nsplit((dirfiles), self.world_size))[self.rank]

+        # LPE
+        self.transform = AddLaplacianEigenvectorPE(
Collaborator

@ArCho48

The Laplacian matrix used to calculate the eigenvectors is D - A, right, where D is the degree matrix and A is the adjacency matrix? So, although this is a positional encoding, it is based solely on connectivity, correct?

Collaborator Author

Indeed, LapPE is a purely topological encoding.
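A small sketch to illustrate the point (illustrative values; k and attr_name are assumptions): the eigenvectors of L = D - A are computed from edge_index alone, so two graphs with identical connectivity receive the same encoding, up to the usual sign ambiguity, regardless of node positions.

import torch
from torch_geometric.data import Data
from torch_geometric.transforms import AddLaplacianEigenvectorPE

edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
g1 = Data(edge_index=edge_index, num_nodes=3, pos=torch.rand(3, 3))
g2 = Data(edge_index=edge_index, num_nodes=3, pos=torch.rand(3, 3))

transform = AddLaplacianEigenvectorPE(k=2, attr_name="pe")
pe1, pe2 = transform(g1).pe, transform(g2).pe  # same encoding up to sign; pos is ignored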

Collaborator Author

Btw, @RylieWeaver, we couldn't get GPS to run on LennardJones, as PyTorch throws an error when backpropagating manually computed gradients (which I think is done for LJ) through the attention layers.

Collaborator

Btw, @RylieWeaver, we couldn't get GPS to run on LennardJones, as PyTorch throws an error when backpropagating manually computed gradients (which I think is done for LJ) through the attention layers.

Yes, LJ uses compute_grad_energy to compute forces as the gradient of energy with respect to position. It is also a general functionality that could be used for other force/energy prediction examples.
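For context, a minimal sketch of that pattern (illustrative names, not the actual compute_grad_energy implementation): forces are the negative gradient of the predicted energy with respect to atomic positions.

import torch

def forces_from_energy(model, data):
    data.pos.requires_grad_(True)
    energy = model(data)  # per-graph energy prediction
    forces = -torch.autograd.grad(energy.sum(), data.pos, create_graph=True)[0]
    return energy, forces

Any operation between pos and energy that detaches the graph or does not support this differentiation would surface an error at the autograd.grad call.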

As a first guess, I believe the error comes from a lack of gradient flow to the position from the Laplacian Eigenvectors.

If this is the case, a potentially easy fix might be to do the Laplacian computation with torch.no_grad(), like this:

with torch.no_grad():
    data = self.transform(data)

Collaborator Author

Well, that could be a problem too, but the error is mainly due to the incompatibility of gradient flow with the torch MultiheadAttention layer. I will send you the exact error shortly.

Collaborator

Hmm, yes, please let me know what comes up.

Collaborator

@ArCho48 Is this error still coming up? Have you had the opportunity to compute the Laplacian with torch.no_grad()?

Collaborator Author

Hi Rylie, I think we decided to skip integrating LJ with GPS in this PR. I can work on this (and add your fix if it works) in the next PR on GPS. @allaffa, do you have any suggestions?

# encode node positional embeddings
x = self.pos_emb(data.pe)
# if node features are available, generate embeddings, concatenate with positional embeddings, and map to hidden dim
if self.input_dim:
Collaborator

@ArCho48

Do we expect input_dim to ever not be defined? Is this a new type of functionality in Hydra where there are solely graph features and no node features?

Collaborator Author

@ArCho48 ArCho48 Dec 5, 2024

Graph features and/or edge features without node features can be a possibility.
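For illustration, a hypothetical sketch of how that branch can be handled (names like NodeEncoder, node_emb, and fuse are assumptions, not the actual HydraGNN code): the Laplacian PE is always embedded, and node features are fused in only when input_dim is set.

import torch
from torch import nn

class NodeEncoder(nn.Module):
    def __init__(self, input_dim, pe_dim, hidden_dim):
        super().__init__()
        self.input_dim = input_dim
        self.pos_emb = nn.Linear(pe_dim, hidden_dim)
        if input_dim:
            self.node_emb = nn.Linear(input_dim, hidden_dim)
            self.fuse = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, data):
        pe = self.pos_emb(data.pe)  # encode node positional embeddings
        if self.input_dim:  # node features available
            x = self.node_emb(data.x.float())
            return self.fuse(torch.cat([x, pe], dim=-1))
        return pe  # graph/edge features only, no node features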

@@ -53,10 +58,12 @@ def __init__(
assert self.input_args == "inv_node_feat, equiv_node_feat, edge_index"
assert self.conv_args == "inv_node_feat, edge_index"

-    def get_conv(self, input_dim, _):
+    def get_conv(self, input_dim, _, edge_dim=None):
+        if not edge_dim:
Collaborator

@ArCho48

I am not sure, but it may be better to handle this inside __init__() instead, by adding a line in the if self.global_attn_engine: section that sets self.edge_dim with:

self.another_dim = self.embed_dim = self.edge_embed_dim = hidden_dim  # ensure that all inputs to GPS have the same dimensionality

or something like it. This would get rid of the if/else in every stack's get_conv function.

Collaborator Author

I cannot change self.edge_dim between lines 134-152 of base because I need the original edge_dim in line 167. But I think I can get rid of the if/else in get_conv, since edge_dim is always passed to get_conv through the edge_embed_dim variable. Thanks for the suggestion.
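For clarity, a sketch of what the simplified hook could look like once edge_dim is always supplied through edge_embed_dim (illustrative only; GINEConv is an arbitrary example of a conv that accepts edge_dim, not the actual stack):

import torch
from torch_geometric.nn import GINEConv

def get_conv(self, input_dim, _, edge_dim):
    # edge_dim is always provided (edge_embed_dim), so no None check is needed
    mlp = torch.nn.Sequential(
        torch.nn.Linear(input_dim, input_dim),
        torch.nn.ReLU(),
        torch.nn.Linear(input_dim, input_dim),
    )
    return GINEConv(mlp, edge_dim=edge_dim)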

Collaborator

Hmm, good point

@ArCho48 ArCho48 requested a review from RylieWeaver December 10, 2024 03:40
@allaffa allaffa self-requested a review December 12, 2024 03:53
@allaffa allaffa merged commit 1419dc4 into ORNL:main Dec 13, 2024
2 checks passed
@ArCho48 ArCho48 deleted the graphgps branch December 13, 2024 03:56