pq #304
```diff
@@ -378,6 +378,10 @@ def forward(
         assert z.dim() == 1 and z.dtype == torch.long
         batch = torch.zeros_like(z) if batch is None else batch

+        # trick to incorporate SPICE pqs
+        # set charge: true in yaml ((?) currently I do it)
+        q = extra_args["pq"]
```
I would instead move to something like removing `q` and `s` from the forward call of the representation models and sending everything as `extra_args`, or `atomic_labels`:

```python
atomic_labels = {}
if q is not None:
    atomic_labels["total_charges"] = q
if s is not None:
    atomic_labels["spin"] = s
# run the potentially wrapped representation model
x, v, z, pos, batch = self.representation_model(
    z, pos, batch, box=box, atomic_labels=atomic_labels, extra_args=extra_args
)
```
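As a self-contained illustration of that label-collection pattern (framework-free; the helper name `build_atomic_labels` is invented here for demonstration, the real change would live inside the model's `forward`):

```python
def build_atomic_labels(q=None, s=None):
    """Collect whichever optional global labels are present into one dict."""
    atomic_labels = {}
    if q is not None:
        atomic_labels["total_charges"] = q
    if s is not None:
        atomic_labels["spin"] = s
    return atomic_labels

# Note that q=0.0 is a present label (a neutral molecule), not a missing one,
# which is why the checks are "is not None" rather than truthiness tests.
print(build_atomic_labels(q=0.0))        # {'total_charges': 0.0}
print(build_atomic_labels(q=1.0, s=2))   # {'total_charges': 1.0, 'spin': 2}
```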
It seems nice.

You could avoid creating a tensor of zeros with the charges here (torchmd-net/torchmdnet/models/tensornet.py, lines 240 to 243 in fae79bd, and line 496 in fae79bd). If you set `q` (please let's change the name of that variable) to something like this:

```python
if q is None:
    q = 0
else:
    q = q[batch, None, None, None]
```
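A runnable sketch of that indexing trick, outside TensorNet (the shapes and values are illustrative only — it just demonstrates how the per-molecule charges broadcast against per-atom tensors without materializing a zeros tensor):

```python
import torch

# Per-molecule total charges and the atom -> molecule batch vector.
q = torch.tensor([1.0, -1.0])    # two molecules
batch = torch.tensor([0, 0, 1])  # three atoms: the first two belong to molecule 0

# Index q with batch to expand it to one value per atom, then add singleton
# dimensions so it broadcasts against per-atom feature tensors.
if q is None:
    q_atoms = 0  # a plain scalar also broadcasts in arithmetic
else:
    q_atoms = q[batch, None, None, None]

print(q_atoms.shape)  # torch.Size([3, 1, 1, 1])
```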
I previously tested this method for injecting partial charges and found it didn't work at all. The loss got much worse. In contrast, the method in #289 does work and makes the loss better.
We already had this discussion. This PR is to build an agnostic method to deal in general with partial charges, total charge, spin, or whatever. If your method gives better results than this one, we can decide later on, but I am running my experiments with SPICE PubChem and at the moment it seems fine to me. You can ignore what is happening inside TensorNet at this time if that's the case.
Yes, we did already have this discussion. :) My point is that this shouldn't be merged until we do a full evaluation of both PRs and determine which one works better. Also, #289 truly is a generic method that supports arbitrary sets of global and per-atom properties. This one is hardcoded to just a single property which must be called `pq`, and it isn't clear how it could be generalized to support multiple properties.
Sorry Peter, perhaps I was not clear enough: I meant agnostic in the sense that we are trying to come up with a unified way to feed the models with all these extra attributes, rename them to have clearer names, be compatible with priors, etc., regardless of how they are manipulated inside the models. Regarding this way of dealing with them, I have tests with total charges, partial charges, and spins, and the results are pretty convincing (in our setting at least, not using any prior; I cannot specify further since I am writing a preprint, but I will be happy to discuss privately if you want to reproduce it). But yes, this is not going to be merged at the moment.
What do you think about this setup in the yaml file:
The keys of the dict will be used to define which args of
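The yaml example itself did not survive in this comment, but for concreteness, here is a purely hypothetical sketch of the shape such a config section could take (every key and option name below is invented for illustration, not the PR's actual schema):

```yaml
# Hypothetical sketch only -- names are invented.
# Idea: each key under extra_labels names one extra argument to forward
# to the model, with its own per-label options.
extra_labels:
  total_charge:
    per_atom: false
  partial_charges:
    per_atom: true
```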
You are assuming the extra_args are just going to be multiplied by a constant (as Guillem does now), but this might not be the case in the future, when perhaps we lean towards Peter's strategy of embeddings, or use some kind of small MLP to compute the prefactors.
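A hedged sketch of the two strategies contrasted above — direct scaling by the raw label versus a small MLP that computes a learned prefactor. Module names, shapes, and the hidden size are invented for illustration:

```python
import torch
import torch.nn as nn

# (a) The label multiplies the features directly, as a constant prefactor.
def constant_prefactor(x, label):
    # x: (num_atoms, feat); label: (num_atoms,), already broadcast per atom
    return x * label[:, None]

# (b) A small MLP maps the label to a learned prefactor, so the dependence
#     on the label need not be linear.
class MLPPrefactor(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.SiLU(), nn.Linear(hidden, 1)
        )

    def forward(self, x, label):
        return x * self.net(label[:, None])

x = torch.randn(4, 8)
label = torch.tensor([1.0, 1.0, -1.0, 0.0])
out_a = constant_prefactor(x, label)
out_b = MLPPrefactor()(x, label)
print(out_a.shape, out_b.shape)  # torch.Size([4, 8]) torch.Size([4, 8])
```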
Here are training curves for a pair of models (TensorNet with one interaction layer). They used identical settings except for the handling of partial charges: "pq" used the method in this PR, while "embedding" used the method in #289. The first plot shows the total training loss; to give a sense of the errors in physical units, the second shows the L1 loss for energies on the validation set. [plots not reproduced here]
Add small hack to be able to test partial charges with SPICE. This can be the base PR to come up with a solution to total charge, partial charges, spins, etc.