Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve xdr cython #3892

Merged
merged 29 commits into from
Nov 30, 2022
Merged

Improve xdr cython #3892

merged 29 commits into from
Nov 30, 2022

Conversation

hmacdope
Copy link
Member

Fixes #3891

Changes made in this Pull Request:

  • Improves cython in libmdaxdr

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@hmacdope hmacdope self-assigned this Oct 28, 2022
@hmacdope hmacdope added the CZI-performance performance track of CZIEOSS4 grant label Oct 28, 2022
@codecov
Copy link

codecov bot commented Oct 28, 2022

Codecov Report

Base: 94.39% // Head: 94.38% // Decreases project coverage by -0.01% ⚠️

Coverage data is based on head (7202bea) compared to base (9e19533).
Patch coverage: 90.90% of modified lines in pull request are covered.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3892      +/-   ##
===========================================
- Coverage    94.39%   94.38%   -0.02%     
===========================================
  Files          194      194              
  Lines        25152    25226      +74     
  Branches      3399     3402       +3     
===========================================
+ Hits         23743    23809      +66     
- Misses        1360     1368       +8     
  Partials        49       49              
Impacted Files Coverage Δ
package/MDAnalysis/coordinates/XDR.py 99.24% <ø> (-0.05%) ⬇️
package/MDAnalysis/lib/formats/libmdaxdr.pyx 87.92% <88.23%> (-1.14%) ⬇️
package/MDAnalysis/coordinates/TRR.py 100.00% <100.00%> (ø)
package/MDAnalysis/coordinates/XTC.py 100.00% <100.00%> (ø)
package/MDAnalysis/coordinates/timestep.pyx 100.00% <0.00%> (+0.68%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@richardjgowers
Copy link
Member

@hmacdope does this get rid of the yellow lines from the annotation? One lazy way of smoking out Python bits is to put code blocks into with nogil: statements, as Cython throws a tantrum when there's Python (slow) sections inside nogil blocks.

@hmacdope hmacdope marked this pull request as ready for review November 6, 2022 10:39
@hmacdope
Copy link
Member Author

hmacdope commented Nov 6, 2022

This is ready for a first review.

Reading directly into a memoryview of ts.positions in the read_direct functions is a modest improvement (5% or so.)

As I see it the main remaining problem is in the TRReader which allocates enough memory for positions, velocities and forces every frame despite them frequently being absent.

This is a bit of a chicken and eg problem though and I don't see an easy way around this without the ability to "read" from a TRR frame whether there are forces, velocities present etc.

I had a browse of the XDR source but couldn't see a flag that readily indicates presence or absence of data, do_trn (xdrfile_trr.c) only checks that the input memory is not NULL or that the effective size of the array is 0.

Query @tylerjereddy @kain88-de as to whether it may be possible to read some kind of flagged state from the file?

Cheers

@kain88-de
Copy link
Member

There is no flag known to me, the format is really basic. However, we already generated a map of the different frames to support random seeks. Nothing is stopping us from extending that map to also include information about the velocities and forces for each frame.

@hmacdope
Copy link
Member Author

hmacdope commented Nov 9, 2022

There is no flag known to me, the format is really basic. However, we already generated a map of the different frames to support random seeks. Nothing is stopping us from extending that map to also include information about the velocities and forces for each frame.

Okay, thanks for that. That was my understanding also that there were no flags. Thanks for confirming.

@hmacdope
Copy link
Member Author

hmacdope commented Nov 9, 2022

This now has a "fat timestep" with positions, velocities and forces at every frame, as the arrays are allocated anyway in reading. These are then read using memoryviews.

Its about 5-8% more efficient in my testing.

@hmacdope
Copy link
Member Author

@orbeckst I have added an IOError raised on retcode != EOK and no errors seem to come up in tests, so seems relatively harmless.

I may have to #pragma nocover it perhaps, unless someone knows how to make a botched file, which seems a bit excessive.

@hmacdope hmacdope requested a review from orbeckst November 20, 2022 22:19
Copy link
Member

@richardjgowers richardjgowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I'd make the direct versions the only implementations, but this would be. a break so it'll have to be a 3.0 thing

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor things + note that TRR and XTC reader may now file more frequently (CHANGELOG and add versionchanged where user visible)

* Improve C content of libxdr Cython, add `read_direct` methods to read
coordinates, velocities and forces directly into memoryviews of `Timestep`
attributes, make `TRR` timestep have positions, velocities and forces on
`__init__`. (Issue #3891 PR #3892).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add under Fixes: XTC and TRR readers now fail with IOError when a status except EOK (=0) is detected on reading a frame instead of silently continuing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some readers had enough redundancy built in to avoid not continuing with retcode != EOK but at least its now done consistently.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the pattern that protected some branches

        if return_code != EOK and return_code != EENDOFFILE \
           and return_code != EINTEGER:
            raise IOError('TRR read error = {}'.format(
                **error_message[return_code]))**

        if return_code == EENDOFFILE or return_code == EINTEGER:
            self.reached_eof = True
            raise StopIteration

# can only be EOK at this point but it was a bit opaque. 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, good to know that we probably do not change a lot of behavior. Nevertheless, the current approach is cleaner.

(Btw, while somewhere I came across my own explanation of why TRR stops with EINTEGER and I think it was related to not being able to read the magic number.)

package/MDAnalysis/lib/formats/libmdaxdr.pyx Outdated Show resolved Hide resolved
package/MDAnalysis/lib/formats/libmdaxdr.pyx Outdated Show resolved Hide resolved
package/MDAnalysis/lib/formats/libmdaxdr.pyx Outdated Show resolved Hide resolved
package/MDAnalysis/lib/formats/libmdaxdr.pyx Outdated Show resolved Hide resolved
@hmacdope
Copy link
Member Author

@richardjgowers I'm going to make the executive decision to not implement buffers, because they are only relevant for methods that we plan to deprecate? Thoughts? LMK if you would prefer to do it.

@richardjgowers
Copy link
Member

@hmacdope yeah, because the buffers we're using are already the Timestep ones right? That sounds fine

@hmacdope hmacdope requested a review from orbeckst November 27, 2022 01:04
@orbeckst
Copy link
Member

Should be good to go for a squash-merge.

@hmacdope
Copy link
Member Author

With two approvals I will go ahead and merge. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

Improve XDR typing and performance
5 participants