Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change _single_frame by using timeseries #4028

Closed
wants to merge 12 commits into from

Conversation

SophiaRuan
Copy link
Contributor

@SophiaRuan SophiaRuan commented Feb 15, 2023

Fixes #3993

Changes made in this Pull Request:

  • changed _single_frame() function, using .timeseries instead of atom group positions.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello there first time contributor! Welcome to the MDAnalysis community! We ask that all contributors abide by our Code of Conduct and that first time contributors introduce themselves on the developer mailing list so we can get to know you. You can learn more about participating here. Please also add yourself to package/AUTHORS as part of this PR.

Copy link
Member

@hmacdope hmacdope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _single_frame method is called repeatedly over the trajectory by run() to access the values in each frame. The equivalent is done in timeseries() so you don't need to call timeseries every frame, only once. _prepare() would be a good place to do this. Have a look at the AnalysisBase class for hints and check the results of any failing tests for more clues.

@SophiaRuan
Copy link
Contributor Author

The _single_frame method is called repeatedly over the trajectory by run() to access the values in each frame. The equivalent is done in timeseries() so you don't need to call timeseries every frame, only once. _prepare() would be a good place to do this. Have a look at the AnalysisBase class for hints and check the results of any failing tests for more clues.

Thank you for the hints! I submitted another change to this pull request.

@SophiaRuan
Copy link
Contributor Author

Hi, I tried a few times to solve the array shape issue in _prepare(). I tried to make the _position_array to be exactly the same shape as the original one. However, I still got errors after a few attempts. Is the shape of the array the only issue that causes the error?

@SophiaRuan
Copy link
Contributor Author

@hmacdope I have tried to change the _prepare() by using _trajectory.timeseries(). I also tried to add the start, stop, step to solve the array shape issue. Is there any other thing that I didn't notice? Thank you for your help!

@orbeckst
Copy link
Member

@SophiaRuan , since PR #4004 was closed, your PR is now considered the active one addressing issue #3993 .

# self.results.timeseries not set here
start, stop, step = self._trajectory.check_slice_indices(self.start, self.stop, self.step)
self._position_array = self._trajectory.timeseries()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use the arguments to timeseries to control start, stop and step.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful of the inclusive range bug (#3893)

start, stop, step = self._trajectory.check_slice_indices(self.start, self.stop, self.step)
self._position_array = self._trajectory.timeseries()
self._position_array = self._position_array[start: stop: step]
self._position_array = np.transpose(self._position_array, (1, 0, 2))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use the order argument to timeseries to control the output order so that you don't need this transpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your suggestions! I will try to fix them.

@SophiaRuan
Copy link
Contributor Author

Hi @hmacdope @orbeckst, I already looked into the #3893 and added the timeseries arguments start, stop, step and order. However, the shape is still not right. What else did I ignore?

@hmacdope
Copy link
Member

hmacdope commented Mar 3, 2023

I think it's to do with the timeseries indexing being inclusive. See my other comment. You may need to add one to the index

@hmacdope
Copy link
Member

hmacdope commented Mar 3, 2023

Sorry I meant remove one.

@SophiaRuan
Copy link
Contributor Author

@hmacdope I've tried to remove 1 from stop but there's even 1 more test that failed in msd_test.py.

@codecov
Copy link

codecov bot commented Mar 7, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -8.16 ⚠️

Comparison is base (55a5abd) 93.54% compared to head (1262570) 85.38%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4028      +/-   ##
===========================================
- Coverage    93.54%   85.38%   -8.16%     
===========================================
  Files          191      177      -14     
  Lines        25065    23957    -1108     
  Branches      4042     3460     -582     
===========================================
- Hits         23446    20456    -2990     
- Misses        1099     3022    +1923     
+ Partials       520      479      -41     
Impacted Files Coverage Δ
package/MDAnalysis/analysis/msd.py 93.33% <100.00%> (-6.67%) ⬇️
package/MDAnalysis/coordinates/ParmEd.py 0.00% <0.00%> (-100.00%) ⬇️
package/MDAnalysis/topology/ParmEdParser.py 0.00% <0.00%> (-100.00%) ⬇️
package/MDAnalysis/converters/ParmEdParser.py 13.95% <0.00%> (-83.73%) ⬇️
package/MDAnalysis/converters/RDKitParser.py 14.50% <0.00%> (-82.47%) ⬇️
package/MDAnalysis/converters/RDKit.py 16.66% <0.00%> (-82.08%) ⬇️
package/MDAnalysis/converters/ParmEd.py 13.79% <0.00%> (-76.44%) ⬇️
package/MDAnalysis/coordinates/H5MD.py 20.91% <0.00%> (-72.17%) ⬇️
package/MDAnalysis/auxiliary/EDR.py 28.26% <0.00%> (-70.66%) ⬇️
package/MDAnalysis/analysis/hole2/hole.py 14.21% <0.00%> (-68.69%) ⬇️
... and 54 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@SophiaRuan
Copy link
Contributor Author

Hi @hmacdope and @orbeckst , I made some changes by slicing the array with n_particles and dim_fac. There are more tests passed, but still, some checks were not successful. Could you take a look? Thank you very much!

@MohitKumar020291
Copy link
Contributor

@SophiaRuan As an advice I will say try removing _single_frame as it is one of the objective of issue , give start , stop , step in __init__ . After defining _position_array into _prepare , you may need give these coordinates into conclude . Please keep it as beginner advice. Thanks.

@SophiaRuan
Copy link
Contributor Author

@SophiaRuan As an advice I will say try removing _single_frame as it is one of the objective of issue , give start , stop , step in __init__ . After defining _position_array into _prepare , you may need give these coordinates into conclude . Please keep it as beginner advice. Thanks.

Hi @MohitKumar020291, thank you for your suggestion! I'll look into that!

@SophiaRuan
Copy link
Contributor Author

@SophiaRuan As an advice I will say try removing _single_frame as it is one of the objective of issue , give start , stop , step in __init__ . After defining _position_array into _prepare , you may need give these coordinates into conclude . Please keep it as beginner advice. Thanks.

Hi @MohitKumar020291 @hmacdope, I am not sure if I need to remove _single_frame as I thought this function is a standard function defined in Analysisbase for collecting data across different frames of the trajectory.

@MohitKumar020291
Copy link
Contributor

@SophiaRuan There is something like that the issue want to insert Reader.timeseries but _single_frame uses Results.timeseries. I am also bit confused that run uses _single_frame so , how can we remove that? But I think (not sure) that it may solved by doing variations in run . Please check closed https://github.com/MDAnalysis/mdanalysis/pull/4004 . Although it is closed but I had good advices from mentors. Thanks.

@orbeckst
Copy link
Member

@SophiaRuan I am going to close your PR, following the discussion in #3993. Sorry that we asked you to do all the work. I really appreciated how you asked important questions that ultimately convinced us that we had not thought the suggested approach through.

Please ping @hmacdope and myself once you have a new PR up so that we can give you feedback.

@SophiaRuan
Copy link
Contributor Author

[ENH]: Add a progress bar to postprocessing in _conclude() functions in MSD.py #4070

No problem! I learned a lot from this issue as well! I will try the following issue #4070.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ENH]: Change MSD code to accumulate values using .timeseries.
4 participants