Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transformations not applied to some frames with ChainReader #3343

Open
caioss opened this issue May 28, 2021 · 4 comments · May be fixed by #3344
Open

Transformations not applied to some frames with ChainReader #3343

caioss opened this issue May 28, 2021 · 4 comments · May be fixed by #3344
Assignees
Labels
Component-Transformations On-the-fly transformations defect Format-ChainReader ChainReader virtual trajectory reader

Comments

@caioss
Copy link
Contributor

caioss commented May 28, 2021

Expected behavior

If an Universe is created with more than one trajectory file and a MDAnalysis.transformations.fit.fit_rot_trans transformation is added to it, it is expected that all frames get fitted on-the-fly when we iterate over the trajectory.

Actual behavior

Whenever a ChainReader is used to load the trajectories, the transformation isn't applied to some frames. It seems that there is a pattern on which frames are fitted and which are not. I reproduced this behavior with independent XTC and DCD files.

Code to reproduce the behavior

This code results in some frames with low and others with high RMSD values. It was expected that all frames had low RMSD.

import MDAnalysis as mda
from MDAnalysis.analysis.rms import rmsd
from MDAnalysis.transformations.fit import fit_rot_trans
from MDAnalysis.tests.datafiles import PSF, DCD

reference = mda.Universe(PSF, DCD)
simul = mda.Universe(PSF, [DCD])  # Force the usage of a ChainReader (wrong behavior)
# simul = mda.Universe(PSF, DCD, DCD)  # Wrong behavior
# simul = mda.Universe(PSF, [DCD, DCD])  # Wrong behavior
# simul = mda.Universe(PSF, DCD)  # Correct behavior

# Un-fit the reference
reference.atoms.rotateby(100, [0, 0, 1])
reference.atoms.translate([10, 10, 10])

# Fitting transform
ref_sel = reference.select_atoms("name CA")
fit_sel = simul.select_atoms("name CA")
transform = fit_rot_trans(fit_sel, ref_sel)
simul.trajectory.add_transformations(transform)

for ts in simul.trajectory[:5]:
    print(rmsd(fit_sel.positions, ref_sel.positions))

Output:

27.879789573782062
0.42571286245739176
27.965311103398278
0.7381344995787231
28.00443710176041

If we don't use the ChainReader by un-commenting the following line, all frames get fitted.

simul = mda.Universe(PSF, DCD)  # Expected behavior

Output:

1.280204933315509e-06
0.4234302855368412
0.5936585854007296
0.7368312603393679
0.8277202818678987

If we keep using the ChainReader, but explicitly apply the transformation in the trajectory loop, all frames get fitted.

for ts in simul.trajectory[:5]:
    ts = transform(ts)
    print(rmsd(fit_sel.positions, ref_sel.positions))

Output:

1.3684785310354626e-06
0.4234302377147153
0.5936586440093501
0.7368312953807447
0.8277203745278706

Just for completeness, here is a case without the fitting stuff and the correct behavior (the example trajectory was already fitted).

import MDAnalysis as mda
from MDAnalysis.analysis.rms import rmsd
from MDAnalysis.tests.datafiles import PSF, DCD

simul = mda.Universe(PSF, [DCD])
ref_sel = reference.select_atoms("name CA")
fit_sel = simul.select_atoms("name CA")

for ts in simul.trajectory[:5]:
    print(rmsd(fit_sel.positions, ref_sel.positions))

Output:

0.0
0.4257129087755455
0.5951654476464195
0.7384635729638336
0.8350828168683272

Current version of MDAnalysis

  • Which version are you using?
    The above codes were tested on versions 1.1.1 and 2.0.0-dev0.
  • Which version of Python ()?
    Versions 3.6.9 and 3.8.5.
  • Which operating system?
    Ubuntu 18.04 and 20.04
@mnmelo
Copy link
Member

mnmelo commented May 30, 2021

Hi @caioss, thanks for the super-detailed report.

I haven't yet found the ultimate cause, but I've tracked it down to a mismatch of the Timesteps that are used for the alignment. It seems that the mobile atoms are considered still as if in the previous timestep.

@orbeckst orbeckst added Component-Transformations On-the-fly transformations defect Format-ChainReader ChainReader virtual trajectory reader labels May 30, 2021
@mnmelo
Copy link
Member

mnmelo commented May 30, 2021

Ok, found the problem.

The mobile group involved in the transformation holds a reference to universe.trajectory.ts. This is a ChainReader Timestep. When the ChainReader requests a frame from the subreader -- and before it is set as the ChainReader's current ts -- the transformation is applied. Because the mobile group reflects the ChainReader's ts it will be wrongly using the not-yet-updated timestep.

The issue here is that ChainReader._read_frame is fetching subreader frames via indexing (__getitem__), which triggers a transformation. This should be done via subreader._read_frame too, to avoid premature transformation (and to be consistent with the initial call to ChainReader._read_frame, which in principle also expects no transform to take place.)

@mnmelo mnmelo self-assigned this May 30, 2021
@mnmelo
Copy link
Member

mnmelo commented May 30, 2021

Ok, this goes a bit deeper in that the ChainReader defers all transformations to the subreaders, and does no transform on its own. Alternatives are:

  1. to instead do all transformations at the ChainReader level; the subreaders wouldn't be aware of transformations. (Any obvious drawbacks here? Possibly if we have abstraction-breaking, Reader-aware transformations?);
  2. to make the ChainReader ts be a direct pointer to the active subreader's ts. I find this interesting and perhaps more elegant than what we have now, in which the ChainReader ts has to be explicitly set after advancing the subreader.

@mnmelo
Copy link
Member

mnmelo commented May 31, 2021

In solving this I implemented option 2. above, and created a shallow ChainReader.Timestep class that defers all requests to the subreader.ts objects, except for ts.frame, which reflects the global frame index rather than the subreader's.

It seems to work, but highlights a limitation (that already exists) in that frame-aware transforms never get to see the global frame number, only the subreader's. I imagine the position-averaging transform might already suffer from this when used on ChainReaders.

To counter this it seems that option 1. is the best way forward. Is there any drawback to moving the transforms to be applied at the ChainReader level rather than at the subreader level?

mnmelo added a commit that referenced this issue May 31, 2021
Closes #3343

Takes into account over-transformation of SingleFrameReaders, which
never re-create their ts when re-accessed.
@mnmelo mnmelo linked a pull request May 31, 2021 that will close this issue
4 tasks
mnmelo added a commit that referenced this issue May 31, 2021
Closes #3343

Takes into account over-transformation of SingleFrameReaders, which
never re-create their ts when re-accessed.
mnmelo added a commit that referenced this issue Jun 8, 2021
Closes #3343

Takes into account over-transformation of SingleFrameReaders, which
never re-create their ts when re-accessed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component-Transformations On-the-fly transformations defect Format-ChainReader ChainReader virtual trajectory reader
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants