-
Notifications
You must be signed in to change notification settings - Fork 661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nojump #4031
Nojump #4031
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello there first time contributor! Welcome to the MDAnalysis community! We ask that all contributors abide by our Code of Conduct and that first time contributors introduce themselves on the developer mailing list so we can get to know you. You can learn more about participating here. Please also add yourself to package/AUTHORS
as part of this PR.
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## develop #4031 +/- ##
===========================================
+ Coverage 93.57% 93.58% +0.01%
===========================================
Files 191 192 +1
Lines 25086 25132 +46
Branches 4051 4056 +5
===========================================
+ Hits 23473 23519 +46
Misses 1093 1093
Partials 520 520
... and 1 file with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Please see inline for quick drive-by comments. (Literally while on shuttle to conference center…)
======================================================================================= | ||
|
||
Unwraps the trajectory such that atoms never move more than half a periodic box length. | ||
The algorithm used is based on Kulke & Vermaas 2022, 10.1021/acs.jctc.2c00327 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cite using the bookstore Tex extension (sorry, search for examples)
import pytest | ||
from numpy.testing import assert_array_almost_equal | ||
|
||
import MDAnalysis as md |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We traditionally import as mda, please change for consistency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Test if the nojump transform is returning the correct | ||
values when iterating forwards over the sample trajectory. | ||
""" | ||
ref_matrix_fwd1 = np.asarray([-3.42261, 11.28495, -1.37211]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the red values and how did you obtain them? Add a comment for provenance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My workflow for testing this was probably non-ideal. I basically made water orthogonal and non-orthogonal water boxes, checked that the results were equivalent to what I got from VMD/fastpbc. I then ran the algorithm on existing triclinic systems, and did some sanity checks. These are two atomic positions that happened to move the most. Comments are now included to highlight this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jvermaas May I suggest making a test case with a single particle that travels at a constant speed through the box? That way it should be easy to verify the correcness of the unwrapping. You can do this directly in MDAnalysis using
import MDAnalysis
import numpy as np
N = 1
u = mda.Universe.empty(N, trajectory=True)
coordinates = np.empty((1000, # number of frames
u.atoms.n_atoms,
3))
coordinates = ... # generate wrapped coordinates for a particle moving at a constant speed in the box.
# I can help you with this if you need.
u.load_new(coordinates, order='fac')
You can then make this universe a test fixture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooh... That is cleverer @hmacdope . I'll let you know how I make out. I'll probably need to look at how to add PBC information to a universe, but there is probably documentation somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to make this comment a blocking review @jvermaas, let me know if you need any help, feel free to just ping me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like @hmacdope 's idea.
with some odd angle of the initial velocity and a smallish box and then wrap the straight shot trajectory with the wrap()
transform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RE setting PBC the following should work to give you a 10x10x10 orthorhombic box using the dimensions
argument.
import MDAnalysis
import numpy as np
N = 1
u = mda.Universe.empty(N, trajectory=True)
coordinates = np.empty((1000, # number of frames
u.atoms.n_atoms,
3))
u.load_new(coordinates, order='fac', in_memory=True, dimensions=[10,10,10, 90.0, 90.0,90.0])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I scrapped the old benchmark and replaced it with a synthetic one. I think its still intelligible, and I can explain why the non-orthogonal number is what it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocking on a constant velocity test trajectory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fixes.
- Integrate into docs by adding to package/doc/sphinx/source/documentation_pages/trajectory_transformations.rst and package/doc/sphinx/source/documentation_pages/transformations. Then check that docs look ok https://mdanalysis--4031.org.readthedocs.build/en/4031/index.html
- I think @hmacdope 's idea of creating a simple trajectory as a test case is really good. Can we do that?
- As with any transformations that keep state we need to tell the user under which circumstances it should work and under which it will fail. FOr example, positionaveraging has the
check_reset
flag and explains different cases. We need to have some documentation that outlines when/how to use it. If something does not make sense (see below) then we should explicitly say so, especially of we're not able to warn or raise an error. - Other comments inline.
As a side note: The user can almost certainly break the algorithm by breaking the frame sequence
[ts for ts in u.trajectory[[0, 5, 2, 10, 10, 3]] ]
but I don't know what we could do about it. We could check that ts.frame is monotonically increasing (or decreasing — u.trajectory[::-1]
should work thanks to time reversibility... NPT MD is technically reversible, isn't it?) Could we check and then raise an error??
@due.dcite(Doi("10.1021/acs.jctc.2c00327"), | ||
description="Works through the orthogonal case, " | ||
"and proposes the non-orthogonal approach.", path=__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't quite remember how the description shows up. The best thing for you would be to try it out https://docs.mdanalysis.org/stable/documentation_pages/references.html#citations-using-duecredit and see if you're happy with how your paper is presented to users.
"Currently jumping between frames with a step of more than 1." | ||
"This might be fine, but depending on the trajectory stride," | ||
"this might be inaccurate.", | ||
Warning, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We typically use UserWarning instead of bare Warning.
(We have a few dedicated warning classes in https://docs.mdanalysis.org/stable/documentation_pages/exceptions.html but we try to not proliferate more)
if self.prev is None: | ||
self.prev = ts.positions @ np.linalg.inv(ts.triclinic_dimensions) | ||
self.old_frame = ts.frame | ||
return ts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this work when I go through the trajectory a second time:
u.add_transformation(NoJump)
[ts for ts in u.trajectory]
[ts for ts in u.trajectory[2:]] # once the trajectory resets, will prev point to the correct starting value?
I think there should be a test that it's still correct the second time around. Keeping state is tricky business and our readers' behavior is not always obvious (for instance, they should rewind when you start iteration).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes ZERO sense to unwrap an already unwrapped trajectory, and so ideally the transformation should only ever act once... I think. I could adopt something similar to what positionaveraging does, and try and remember what direction the steps are going in. However, in principle, unwrapping an already unwrapped trajectory is basically a no-op, since the round term will always be zero. I suppose I'd just need to enact a reset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the trajectory will not stay unwrapped. It’s not in memory, the coordinates are read from disk again. (Sorry if I misunderstood your comment… typing quickly before they really force me to switch to airplane mode)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only if the user is using the MemoryReader then changes are permanent but that’s handled automatically with transformations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, ok. Now it'll throw a warning that you are going in the opposite direction (or with an unequal stride). When it encounters this, it will reset the unwrapping procedure.
### | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove
|
||
import numpy as np | ||
import pytest | ||
from numpy.testing import assert_array_almost_equal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use assert_allclose
nowadays but haven't updated everything.
""" | ||
# These were determined based on determining what the TRICLINIC system | ||
# would return on a validated version that worked on triclinic and | ||
# orthogonal systems for a small water trajectory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mention VMD and fastpbc explicitly (include version numbers if you have!); we have other cases that we checked against a VMD implementation. We don't consider it shameful to give credit to other software ;-). More importantly, the more detail is provided for validation, the more confidence someone has in the correctness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the TRICLINIC system just was too ugly. The synthetic system is easier to understand.
Test if the nojump transform is returning the correct | ||
values when iterating forwards over the sample trajectory. | ||
""" | ||
ref_matrix_fwd1 = np.asarray([-3.42261, 11.28495, -1.37211]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like @hmacdope 's idea.
with some odd angle of the initial velocity and a smallish box and then wrap the straight shot trajectory with the wrap()
transform.
# orthogonal systems for a small water trajectory. | ||
# These are specific to atoms 166 and 362, which moved the most in the | ||
# trajectory. | ||
ref_matrix_fwd1 = np.asarray([-3.42261, 11.28495, -1.37211]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of calling these variables "matrix", call them something like ref_last_pos_166
or something that makes clearer what they are.
size = ( | ||
nojump_universes.trajectory.ts.positions.shape[0], | ||
nojump_universes.trajectory.ts.positions.shape[1], | ||
len(nojump_universes.trajectory), | ||
) | ||
parr = np.empty(size) | ||
for ts in nojump_universes.trajectory: | ||
parr[..., ts.frame] = ts.positions.copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can replace the explicit loop with
transformed_coordinates = u.trajectory.timeseries()
By default, it has shape (n_frames, n_atoms, 3)
so you have to change the lines below.
# Step is 1 unit every 3 steps. After 99 steps from the origin, | ||
# we'll end up at 33. However, since the unit cell is non-orthogonal, | ||
# we'll end up at a distorted place. | ||
assert_allclose(transformed_coordinates[99], 33 * np.array([0.5, 1, np.sqrt(3)/2])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, match the whole array.
- Do we know the exact for of the skew factor
33 * np.array([0.5, 1, np.sqrt(3)/2]
is correct? If so I would love to see the working in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof. This is a pain to do for how I constructed the non-orthogonal case. The z position shows this the most clearly.
[[ 0. 0. 0. ]
[ 0.33333334 0.33333334 0.33333334]
[ 0.6666667 0.6666667 0.6666667 ]
[ 0.5 1. 0.8660254 ]
[ 0.8333334 1.3333334 1.1993587 ]
[ 1.1666667 1.6666667 1.5326921 ]
[ 1. 2. 1.7320508 ]
...snip...
The jumps aren't all the same size, since the skewed box means that a "constant" displacement in wrapped space meets up with the next box over in a weird place. Its too late at night to think about how this works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to change your box vectors freely to make the working easier if it helps. 😀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, except that you REALLY want to test on non-orthogonal boxes, since its harder. The path I chose is just inherently jagged. I've made another test based on a straight line.
transformed_coordinates = u.trajectory.timeseries()[0] | ||
# Step is 1 unit every 3 steps. After 99 steps from the origin, | ||
# we'll end up at 33. | ||
assert_allclose(transformed_coordinates[99], 33 * np.ones(3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this test cover the whole array, matching along its whole length.
No worries. @hmacdope . I must have been too hasty trying to come up with a non-invertible test-case. This one ought to work, and I think I fixed the conflicts. |
@hmacdope , just fixed the new conflicts in the authorlist and the changelog. If you get a chance, I think we are good to go now. |
Can you fix the unused import? See the output of black linter. I'll try and review ASAP. |
Darker complains, so I guess the import has to go.
OMG im so sorry, that import should be there, apologies. Just revert that change. |
This reverts commit 47b0e39.
From context I thought it was fine. Linter just doesn't like it. :D |
Hey can you please assign me this issue? |
No this is a current in progress PR. Try and look for another issue on the issue tracker. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking really good, almost there :)
reference = mda.Universe.empty(Natom, trajectory=True) | ||
reference.load_new(coordinates, order="fac") | ||
towrap = mda.Universe.empty(Natom, trajectory=True) | ||
towrap.load_new(coordinates, order="fac") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reference
and towrap
are the same? Can't you just use copy
in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I'll be wrapping and unwrapping towrap
, and need these to be distinct memory locations so I only mutate one set. My understanding of numpy copy semantics is that you aren't actually sure that you create distinct memory locations for a "trajectory".
) | ||
|
||
|
||
def test_nojump_constantvel(nojump_constantvel_universe): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should do this for both an orthogonal and nonorthogonal box
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is done in test_nojump_orthogonal_fwd
. Its just a single allclose call in that case, since I can use a single linspace to do the matrix outer product.
Ok, @hmacdope , I think we are good now. I used copy to clone the universe so that the wrapping/unwrapping is shown to be reversible against the wrapping algorithm used, and now we actually have a test that will yield a non-invertible box dimension, although codecov isn't being rerun for some reason. I'll fix the authors again, but hopefully for the last time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linter Bot Results:Hi @jvermaas! Thanks for making this PR. We linted your code and found the following: There are currently no issues detected! 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good, juts minor nit-picks
- use modern f-strings
- please add a sentence or two to the doc string about how the algorithm behaves when there are jumps in the trajectory or under which situations it fails; also mention that it works best when using every frame
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Added a bit more to the docstring that highlights what can go wrong when applying the transform to a trajectory that is missing box dimensions.
@orbeckst , I think we are good now. I've been using C-style formatting for so long ("%03d" is so powerful for keeping NAMD trajectories sorted) that I forget that f-strings exist. |
Update of AUTHORS and CHANGELOG with inferred author contributions. * Removed duplicate mattwthompson in 0.20.0 changelog entry.: mattwthompson was placed twice by accident, this removes this duplication. * Addition of missing authors. An retrospective analysis of the authors found via `git shortlog -s -n --email --branches="develop"` found several commits by authors which were not present in the `AUTHORS.md` file. - Zhenbo Li: commited via PR: Started tests for gnm. #803 and Make Travis run tests on OSX. #771, - Jenna M. Swarthout via PR Update CoC according to suggestions from current CoC committee #4289 and Point to new reporting form link (owned by conduct@mdanalysis.org) #4298, - Bradley Dice via PR Fix GSD Windows compatibility #2384 , - David Minh via PR #2668 There seemed to be no indications in the above mentioned PRs that the author did not want to be in the authors file, it looks like it was just forgotten. * Addition of missing entries from the changelog Continued cross referencing of the git shortlog output but also accounting for the changelog identified several individuals that had not been included in the changelog entries for the release they contributed under. They were added to the relevant entry of the changelog based on the merge commit date. Please note that for Tone Bengsten, we a) had no github handle (so they were assigned @tbengtsen), and b) no specific commit. Given we know that this individual was including alongside the encore merge, they were assigned to the 0.16.0 release. * Update CHANGELOG * PR #1218 * PR #1284 and #1408 * PR #4109 * based on git history * PRs #803 and #771 (author addition, changelog addition) * PR #2255 and #2221 * PR #1225 * PR #4289 and #4298 * PR #4031 * PR #4085 * PR #3635 * PR #2356 * PR #2559 * No GH handle - Encore author addition @tbengtsen * PR #4184 * PR #2614 * PR #2219 * PR #2384 * PR #2668 * Add missing entry for Jenna Thanks to @fiona-naughton for helping out with digging into this data :) Co-authored-by: Fiona Naughton <fiona@mdanalysis.org> Co-authored-by: Oliver Beckstein <orbeckst@mdanalysis.org>
Fixes #3703
Changes made in this Pull Request:
PR Checklist