Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nojump #4031

Merged
merged 60 commits into from
Mar 28, 2023
Merged

Nojump #4031

merged 60 commits into from
Mar 28, 2023

Conversation

jvermaas
Copy link
Contributor

@jvermaas jvermaas commented Feb 19, 2023

Fixes #3703

Changes made in this Pull Request:

  • Defines a "NoJump" transform, using the same logic as the position averaging transformation. Rather than adding frames together, the nojump transformation looks "back" to extrapolate what the new unwrapped coordinates should be. Uses the algorithm from Kulke & Vermaas

PR Checklist

  • Tests? Yup. It works with non-orthogonal boxes.
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello there first time contributor! Welcome to the MDAnalysis community! We ask that all contributors abide by our Code of Conduct and that first time contributors introduce themselves on the developer mailing list so we can get to know you. You can learn more about participating here. Please also add yourself to package/AUTHORS as part of this PR.

@codecov
Copy link

codecov bot commented Feb 19, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.01 🎉

Comparison is base (c4f998a) 93.57% compared to head (95f08a4) 93.58%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4031      +/-   ##
===========================================
+ Coverage    93.57%   93.58%   +0.01%     
===========================================
  Files          191      192       +1     
  Lines        25086    25132      +46     
  Branches      4051     4056       +5     
===========================================
+ Hits         23473    23519      +46     
  Misses        1093     1093              
  Partials       520      520              
Impacted Files Coverage Δ
package/MDAnalysis/transformations/__init__.py 100.00% <100.00%> (ø)
package/MDAnalysis/transformations/nojump.py 100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@jvermaas jvermaas marked this pull request as ready for review February 20, 2023 07:52
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Please see inline for quick drive-by comments. (Literally while on shuttle to conference center…)

=======================================================================================

Unwraps the trajectory such that atoms never move more than half a periodic box length.
The algorithm used is based on Kulke & Vermaas 2022, 10.1021/acs.jctc.2c00327
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cite using the bookstore Tex extension (sorry, search for examples)

package/MDAnalysis/transformations/nojump.py Show resolved Hide resolved
package/MDAnalysis/transformations/nojump.py Outdated Show resolved Hide resolved
import pytest
from numpy.testing import assert_array_almost_equal

import MDAnalysis as md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We traditionally import as mda, please change for consistency

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Test if the nojump transform is returning the correct
values when iterating forwards over the sample trajectory.
"""
ref_matrix_fwd1 = np.asarray([-3.42261, 11.28495, -1.37211])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the red values and how did you obtain them? Add a comment for provenance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My workflow for testing this was probably non-ideal. I basically made water orthogonal and non-orthogonal water boxes, checked that the results were equivalent to what I got from VMD/fastpbc. I then ran the algorithm on existing triclinic systems, and did some sanity checks. These are two atomic positions that happened to move the most. Comments are now included to highlight this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jvermaas May I suggest making a test case with a single particle that travels at a constant speed through the box? That way it should be easy to verify the correcness of the unwrapping. You can do this directly in MDAnalysis using

import MDAnalysis 
import numpy as np

N = 1
u = mda.Universe.empty(N, trajectory=True)
coordinates = np.empty((1000,  # number of frames
                        u.atoms.n_atoms,
                        3))
coordinates = ... # generate wrapped coordinates for a particle moving at a constant speed in the box. 
# I can help you with this if you need. 
u.load_new(coordinates, order='fac')

You can then make this universe a test fixture.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh... That is cleverer @hmacdope . I'll let you know how I make out. I'll probably need to look at how to add PBC information to a universe, but there is probably documentation somewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to make this comment a blocking review @jvermaas, let me know if you need any help, feel free to just ping me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like @hmacdope 's idea.

$$ \mathbf{r}(t) = \mathbf{v}_0 t + \mathbf{r}_0 $$

with some odd angle of the initial velocity and a smallish box and then wrap the straight shot trajectory with the wrap() transform.

Copy link
Member

@hmacdope hmacdope Feb 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE setting PBC the following should work to give you a 10x10x10 orthorhombic box using the dimensions argument.

import MDAnalysis 
import numpy as np

N = 1
u = mda.Universe.empty(N, trajectory=True)
coordinates = np.empty((1000,  # number of frames
                        u.atoms.n_atoms,
                        3))

u.load_new(coordinates, order='fac', in_memory=True, dimensions=[10,10,10, 90.0, 90.0,90.0])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I scrapped the old benchmark and replaced it with a synthetic one. I think its still intelligible, and I can explain why the non-orthogonal number is what it is.

@jvermaas jvermaas requested a review from orbeckst February 20, 2023 18:49
Copy link
Member

@hmacdope hmacdope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking on a constant velocity test trajectory.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes.

As a side note: The user can almost certainly break the algorithm by breaking the frame sequence

[ts for ts in u.trajectory[[0, 5, 2, 10, 10, 3]] ]

but I don't know what we could do about it. We could check that ts.frame is monotonically increasing (or decreasing — u.trajectory[::-1] should work thanks to time reversibility... NPT MD is technically reversible, isn't it?) Could we check and then raise an error??

Comment on lines 78 to 80
@due.dcite(Doi("10.1021/acs.jctc.2c00327"),
description="Works through the orthogonal case, "
"and proposes the non-orthogonal approach.", path=__name__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't quite remember how the description shows up. The best thing for you would be to try it out https://docs.mdanalysis.org/stable/documentation_pages/references.html#citations-using-duecredit and see if you're happy with how your paper is presented to users.

package/MDAnalysis/transformations/nojump.py Outdated Show resolved Hide resolved
"Currently jumping between frames with a step of more than 1."
"This might be fine, but depending on the trajectory stride,"
"this might be inaccurate.",
Warning,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We typically use UserWarning instead of bare Warning.

(We have a few dedicated warning classes in https://docs.mdanalysis.org/stable/documentation_pages/exceptions.html but we try to not proliferate more)

Comment on lines 96 to 99
if self.prev is None:
self.prev = ts.positions @ np.linalg.inv(ts.triclinic_dimensions)
self.old_frame = ts.frame
return ts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work when I go through the trajectory a second time:

u.add_transformation(NoJump) 
[ts for ts in u.trajectory]          
[ts for ts in u.trajectory[2:]]     # once the trajectory resets, will prev point to the correct starting value?

I think there should be a test that it's still correct the second time around. Keeping state is tricky business and our readers' behavior is not always obvious (for instance, they should rewind when you start iteration).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes ZERO sense to unwrap an already unwrapped trajectory, and so ideally the transformation should only ever act once... I think. I could adopt something similar to what positionaveraging does, and try and remember what direction the steps are going in. However, in principle, unwrapping an already unwrapped trajectory is basically a no-op, since the round term will always be zero. I suppose I'd just need to enact a reset.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the trajectory will not stay unwrapped. It’s not in memory, the coordinates are read from disk again. (Sorry if I misunderstood your comment… typing quickly before they really force me to switch to airplane mode)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if the user is using the MemoryReader then changes are permanent but that’s handled automatically with transformations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, ok. Now it'll throw a warning that you are going in the opposite direction (or with an unequal stride). When it encounters this, it will reset the unwrapping procedure.

Comment on lines 1 to 2
###

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove


import numpy as np
import pytest
from numpy.testing import assert_array_almost_equal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use assert_allclose nowadays but haven't updated everything.

"""
# These were determined based on determining what the TRICLINIC system
# would return on a validated version that worked on triclinic and
# orthogonal systems for a small water trajectory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mention VMD and fastpbc explicitly (include version numbers if you have!); we have other cases that we checked against a VMD implementation. We don't consider it shameful to give credit to other software ;-). More importantly, the more detail is provided for validation, the more confidence someone has in the correctness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the TRICLINIC system just was too ugly. The synthetic system is easier to understand.

Test if the nojump transform is returning the correct
values when iterating forwards over the sample trajectory.
"""
ref_matrix_fwd1 = np.asarray([-3.42261, 11.28495, -1.37211])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like @hmacdope 's idea.

$$ \mathbf{r}(t) = \mathbf{v}_0 t + \mathbf{r}_0 $$

with some odd angle of the initial velocity and a smallish box and then wrap the straight shot trajectory with the wrap() transform.

# orthogonal systems for a small water trajectory.
# These are specific to atoms 166 and 362, which moved the most in the
# trajectory.
ref_matrix_fwd1 = np.asarray([-3.42261, 11.28495, -1.37211])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of calling these variables "matrix", call them something like ref_last_pos_166 or something that makes clearer what they are.

Comment on lines 47 to 54
size = (
nojump_universes.trajectory.ts.positions.shape[0],
nojump_universes.trajectory.ts.positions.shape[1],
len(nojump_universes.trajectory),
)
parr = np.empty(size)
for ts in nojump_universes.trajectory:
parr[..., ts.frame] = ts.positions.copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can replace the explicit loop with

transformed_coordinates = u.trajectory.timeseries()

By default, it has shape (n_frames, n_atoms, 3) so you have to change the lines below.

@jvermaas jvermaas requested a review from hmacdope February 21, 2023 06:10
# Step is 1 unit every 3 steps. After 99 steps from the origin,
# we'll end up at 33. However, since the unit cell is non-orthogonal,
# we'll end up at a distorted place.
assert_allclose(transformed_coordinates[99], 33 * np.array([0.5, 1, np.sqrt(3)/2]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, match the whole array.

  • Do we know the exact for of the skew factor 33 * np.array([0.5, 1, np.sqrt(3)/2] is correct? If so I would love to see the working in the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof. This is a pain to do for how I constructed the non-orthogonal case. The z position shows this the most clearly.

[[ 0.          0.          0.        ]
 [ 0.33333334  0.33333334  0.33333334]
 [ 0.6666667   0.6666667   0.6666667 ]
 [ 0.5         1.          0.8660254 ]
 [ 0.8333334   1.3333334   1.1993587 ]
 [ 1.1666667   1.6666667   1.5326921 ]
 [ 1.          2.          1.7320508 ]
...snip...

The jumps aren't all the same size, since the skewed box means that a "constant" displacement in wrapped space meets up with the next box over in a weird place. Its too late at night to think about how this works.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to change your box vectors freely to make the working easier if it helps. 😀

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, except that you REALLY want to test on non-orthogonal boxes, since its harder. The path I chose is just inherently jagged. I've made another test based on a straight line.

transformed_coordinates = u.trajectory.timeseries()[0]
# Step is 1 unit every 3 steps. After 99 steps from the origin,
# we'll end up at 33.
assert_allclose(transformed_coordinates[99], 33 * np.ones(3))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this test cover the whole array, matching along its whole length.

@jvermaas
Copy link
Contributor Author

jvermaas commented Mar 5, 2023

No worries. @hmacdope . I must have been too hasty trying to come up with a non-invertible test-case. This one ought to work, and I think I fixed the conflicts.

@jvermaas
Copy link
Contributor Author

@hmacdope , just fixed the new conflicts in the authorlist and the changelog. If you get a chance, I think we are good to go now.

@hmacdope
Copy link
Member

Can you fix the unused import? See the output of black linter. I'll try and review ASAP.

Darker complains, so I guess the import has to go.
@hmacdope
Copy link
Member

OMG im so sorry, that import should be there, apologies. Just revert that change.

This reverts commit 47b0e39.
@jvermaas
Copy link
Contributor Author

From context I thought it was fine. Linter just doesn't like it. :D

@amangupta201
Copy link

Hey can you please assign me this issue?

@hmacdope
Copy link
Member

Hey can you please assign me this issue?

No this is a current in progress PR. Try and look for another issue on the issue tracker.

Copy link
Member

@hmacdope hmacdope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking really good, almost there :)

package/MDAnalysis/transformations/nojump.py Show resolved Hide resolved
reference = mda.Universe.empty(Natom, trajectory=True)
reference.load_new(coordinates, order="fac")
towrap = mda.Universe.empty(Natom, trajectory=True)
towrap.load_new(coordinates, order="fac")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reference and towrap are the same? Can't you just use copy in that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I'll be wrapping and unwrapping towrap, and need these to be distinct memory locations so I only mutate one set. My understanding of numpy copy semantics is that you aren't actually sure that you create distinct memory locations for a "trajectory".

)


def test_nojump_constantvel(nojump_constantvel_universe):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should do this for both an orthogonal and nonorthogonal box

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is done in test_nojump_orthogonal_fwd. Its just a single allclose call in that case, since I can use a single linspace to do the matrix outer product.

@jvermaas
Copy link
Contributor Author

Ok, @hmacdope , I think we are good now. I used copy to clone the universe so that the wrapping/unwrapping is shown to be reversible against the wrapping algorithm used, and now we actually have a test that will yield a non-invertible box dimension, although codecov isn't being rerun for some reason. I'll fix the authors again, but hopefully for the last time.

@jvermaas jvermaas requested a review from hmacdope March 27, 2023 14:06
Copy link
Member

@hmacdope hmacdope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the hard work @jvermaas! Looks great to me. Thank you so much as this is something we have been needing for a long time! @orbeckst have all your concerns been addressed?

@github-actions
Copy link

github-actions bot commented Mar 28, 2023

Linter Bot Results:

Hi @jvermaas! Thanks for making this PR. We linted your code and found the following:

There are currently no issues detected! 🎉

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good, juts minor nit-picks

  • use modern f-strings
  • please add a sentence or two to the doc string about how the algorithm behaves when there are jumps in the trajectory or under which situations it fails; also mention that it works best when using every frame

package/MDAnalysis/transformations/nojump.py Show resolved Hide resolved
package/MDAnalysis/transformations/nojump.py Outdated Show resolved Hide resolved
package/MDAnalysis/transformations/nojump.py Outdated Show resolved Hide resolved
jvermaas and others added 3 commits March 28, 2023 14:58
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Added a bit more to the docstring that highlights what can go wrong when applying the transform to a trajectory that is missing box dimensions.
@jvermaas
Copy link
Contributor Author

@orbeckst , I think we are good now. I've been using C-style formatting for so long ("%03d" is so powerful for keeping NAMD trajectories sorted) that I forget that f-strings exist.

@orbeckst
Copy link
Member

Thank you @jvermaas , a very valuable contribution.

@hmacdope all yours now.

@hmacdope hmacdope merged commit 1207578 into MDAnalysis:develop Mar 28, 2023
orbeckst added a commit that referenced this pull request Dec 21, 2023
Update of AUTHORS and CHANGELOG with inferred author contributions.

*  Removed duplicate mattwthompson in 0.20.0 changelog entry.: mattwthompson was placed twice by accident, this removes this duplication.

* Addition of missing authors.

   An retrospective analysis of the authors found via `git shortlog -s -n --email --branches="develop"` found several commits by authors which were not present in the `AUTHORS.md` file.

    - Zhenbo Li: commited via PR: Started tests for gnm. #803 and Make Travis run tests on OSX. #771,
    - Jenna M. Swarthout via PR Update CoC according to suggestions from current CoC committee #4289 and Point to new reporting form link (owned by conduct@mdanalysis.org) #4298,
    - Bradley Dice via PR   Fix GSD Windows compatibility #2384 ,
    - David Minh via PR #2668

   There seemed to be no indications in the above mentioned PRs that the author did not want to be in the authors file, it looks like it was just forgotten.

* Addition of missing entries from the changelog

   Continued cross referencing of the git shortlog output but also accounting for the changelog identified several individuals that had not been included in the changelog entries for the release they contributed under. They were added to the relevant entry of the changelog based on the merge commit date.

   Please note that for Tone Bengsten, we a) had no github handle (so they were assigned @tbengtsen), and b) no specific commit. Given we know that this individual was including alongside the encore merge, they were assigned to the 0.16.0 release.

* Update CHANGELOG
* PR #1218
* PR #1284 and #1408
* PR #4109
* based on git history
* PRs #803 and #771 (author addition, changelog addition)
* PR #2255 and #2221
* PR #1225
* PR #4289 and #4298
* PR #4031
* PR #4085
* PR #3635
* PR #2356
* PR #2559
* No GH handle - Encore author addition @tbengtsen
* PR #4184
* PR #2614
* PR #2219
* PR #2384
* PR #2668
* Add missing entry for Jenna

Thanks to @fiona-naughton for helping out with digging into this data :)

Co-authored-by: Fiona Naughton <fiona@mdanalysis.org>
Co-authored-by: Oliver Beckstein <orbeckst@mdanalysis.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Nojump transform using the “hybrid unwrap” scheme
5 participants