Issue 4039 large dcd #4048
Merged · 7 commits · Mar 29, 2023
Conversation

richardjgowers (Member):
Fixes #4039

Changes made in this Pull Request:

  • use fio_size_t for variables related to filesize

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Commits:

  • skipped by pytest by default unless LARGEDCD env var set
  • use fio_size_t for all variables related to filesizes
  • fixes for #4039
@IAlibay (Member) commented Mar 1, 2023

Yup, this works re: the bugfix; let's just make sure we squash merge.

@IAlibay (Member) previously requested changes, Mar 1, 2023:

Do we actually care about that skipif so much?

yield newf, nreps_reqs


@pytest.mark.skipif(not os.environ.get('LARGEDCD', False),
Member: That's kinda confusing logic, and it looks undocumented. Are we really expecting to use it?

Member Author: I can just remove the test if you like; it was handy while I was fixing the bug.

@IAlibay (Member), Mar 1, 2023: The test would be good to keep; I just don't really know why you'd need a skipif that isn't documented. Did we not already have a high-memory flag from the EDR tests? Can we just use that instead?

Member: We should run the test in at least one runner every time.

And as I said in the original issue, eventually every reader should be tested with a large trajectory so that we have a better chance of catching this kind of issue.

@IAlibay (Member) commented Mar 1, 2023

FYI the lint failure is optional; I might make it print comments instead when I get time to play with the GH API again.

codecov bot commented Mar 1, 2023

Codecov Report

Patch coverage: 100.00%, with no change in project coverage.

Comparison: base (5794c82) 93.57% vs. head (9cbe8e8) 93.57%.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #4048   +/-   ##
========================================
  Coverage    93.57%   93.57%           
========================================
  Files          192      192           
  Lines        25133    25135    +2     
  Branches      4056     4056           
========================================
+ Hits         23517    23521    +4     
+ Misses        1095     1094    -1     
+ Partials       521      520    -1     
Impacted Files Coverage Δ
package/MDAnalysis/lib/formats/libdcd.pyx 90.85% <100.00%> (-0.26%) ⬇️

... and 1 file with indirect coverage changes


@orbeckst (Member) left a review:

Minor comments

@@ -436,3 +437,38 @@ def test_pathlib():
# we really only care that pathlib
# object handling worked
assert u.atoms.n_atoms == 3341


@pytest.fixture
Member: Make this a module-level fixture so that it really only runs once? Unfortunately it will need to use the tmpdir factory.
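As an editorial illustration of that suggestion, here is a minimal sketch of a module-scoped variant, assuming the file is built by repeatedly re-writing the small PSF/DCD reference trajectory (~3.8 MB) from MDAnalysisTests.datafiles until it passes ~2.1 GB; the names large_dcdfile, newf, and nreps_reqs mirror the diff, everything else is illustrative rather than the PR's actual code:

import pytest
import MDAnalysis as mda
from MDAnalysisTests.datafiles import PSF, DCD


@pytest.fixture(scope="module")
def large_dcdfile(tmpdir_factory):
    # Module-scoped so the >2 GB file is only written once per test module;
    # module-scoped fixtures cannot use the function-scoped tmpdir fixture,
    # hence tmpdir_factory.
    newf = str(tmpdir_factory.mktemp("large_dcd").join("jabba.dcd"))
    fsize = 3.8  # MB, approximate size of the reference DCD trajectory
    nreps_reqs = int(2100 // fsize)  # repeats needed to pass ~2.1 GB
    u = mda.Universe(PSF, DCD)
    with mda.Writer(newf, n_atoms=u.atoms.n_atoms) as w:
        for _ in range(nreps_reqs):
            for ts in u.trajectory:
                w.write(u.atoms)
    yield newf, nreps_reqs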

@@ -391,7 +391,9 @@ cdef class DCDFile:
         if frame == 0:
             offset = self._header_size
         else:
-            offset = self._header_size + self._firstframesize + self._framesize * (frame - 1)
+            offset = self._header_size
+            offset += self._firstframesize
Member: Does this ensure that the overflow cannot happen?

By the way, frames was declared as int in the method's signature. Should that be changed too, or is that a Python int with unlimited size?

Member Author: I've reproduced the exact bug (with the contentious test) and this fixes it. I've not looked at the raw C and followed all the types, but by eye, promoting some variables to the correct datatype seemed to jiggle it into place.

Member: @orbeckst the def(blah: int) syntax in Cython allows it to switch between a C int and a Python int depending on how much it knows about the types. I think @richardjgowers' approach of changing the size of the directly declared C types is the correct one.
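To make the overflow concrete, a small illustrative sketch (the header and frame sizes are made up, not the real DCD layout; fio_size_t is typically a 64-bit offset type on platforms with large-file support): with 32-bit arithmetic the seek offset wraps negative once the target byte position passes 2 GiB, while 64-bit arithmetic stays correct.

import numpy as np

header_size = 276        # hypothetical header size in bytes
firstframesize = 50_000  # hypothetical size of the first frame
framesize = 50_000       # hypothetical size of subsequent frames
frame = 50_000           # a frame index well past the 2 GiB boundary

# 32-bit arithmetic: the product wraps around (NumPy may warn about the
# overflow), so the offset handed to fseek() would come out negative.
bad = (np.int32(header_size) + np.int32(firstframesize)
       + np.int32(framesize) * np.int32(frame - 1))

# 64-bit arithmetic, i.e. the effect of promoting the variables to a
# 64-bit type such as fio_size_t.
good = (np.int64(header_size) + np.int64(firstframesize)
        + np.int64(framesize) * np.int64(frame - 1))

print(bad, good)  # a negative number vs. 2500000276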

fsize = 3.8 # mb
nreps_reqs = int(2100 // fsize) # times to duplicate traj to hit 2.1Gb

newf = str(tmpdir / "jabba.dcd")
Member: Name approved!

@pytest.fixture
def large_dcdfile(tmpdir):
# creates a >2Gb DCD file
fsize = 3.8 # mb
Member: To be super-flexible, get the size from DCD itself. Totally optional.
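A sketch of that optional tweak, deriving the repeat count from the reference file's actual size rather than the hard-coded 3.8 MB (the ~2.1 GB target from the snippet above is kept; the variable names are illustrative):

import os
from MDAnalysisTests.datafiles import DCD

target_bytes = int(2.1 * 1024 ** 3)           # aim a little past the 2 GiB boundary
fsize_bytes = os.path.getsize(DCD)            # actual size of the reference trajectory
nreps_reqs = -(-target_bytes // fsize_bytes)  # ceiling division: repeats needed to exceed the target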



@pytest.mark.skipif(
not os.environ.get("LARGEDCD", False), reason="Skipping large file test"
Member: If the env var is supposed to skip the test, then better to call it SKIPLARGEFILETESTS or something like that. In any case, update CI so that it runs somewhere.
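For reference, a sketch of what an opt-in gate along these lines could look like (the environment variable name and the test name here are illustrative, not what the PR ships; the PR itself uses LARGEDCD):

import os

import pytest


@pytest.mark.skipif(
    not os.environ.get("ENABLE_LARGEFILE_TESTS", False),
    reason="set ENABLE_LARGEFILE_TESTS=1 to run tests that create >2 GB files",
)
def test_seek_large_dcd(large_dcdfile):
    # The body would exercise seeking in the >2 GB file produced by the
    # large_dcdfile fixture; elided here.
    ...

Running it on CI then only requires exporting that variable in at least one job's environment.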


@orbeckst (Member) commented Mar 1, 2023

@IAlibay if you don't want to be in charge please assign someone else, but given that this is related to releases etc I thought you'd be the best person.

@IAlibay (Member) commented Mar 1, 2023

> @IAlibay if you don't want to be in charge please assign someone else, but given that this is related to releases etc I thought you'd be the best person.

No worries, I'm happy to be in charge of merging; I'll make sure I don't forget to release.

@IAlibay (Member) commented Mar 1, 2023

So one thing to be aware of here: there are only 14 GB of disk space available on GitHub runners. We'll need to be absolutely sure we clean up space, especially when dealing with pytest-xdist.
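One way to address the disk-space concern (a sketch, not part of the PR as shown): have the fixture delete the big file in its teardown, so each worker cleans up immediately rather than waiting for pytest's temporary-directory housekeeping.

import os

import pytest


@pytest.fixture(scope="module")
def large_dcdfile(tmpdir_factory):
    newf = str(tmpdir_factory.mktemp("large_dcd").join("jabba.dcd"))
    # ... write the >2 GB trajectory here, as in the sketch above ...
    yield newf
    # Teardown: remove the ~2 GB file right away so a 14 GB runner (or
    # several pytest-xdist workers, each with its own copy) does not run
    # out of disk.
    if os.path.exists(newf):
        os.remove(newf)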

@richardjgowers (Member Author):
I'm not sure it's a good idea to run a large-file test for every format on every run. They're slow to create, for one (2 GB of I/O), and it's probably not necessary.

I'm (we're) not going to have time to solve the entire issue of testing large-file I/O in this PR, but I might suggest that we take this patch and fix a popular format in a bugfix release.

@IAlibay (Member) commented Mar 1, 2023

> I'm not sure it's a good idea to run a large-file test for every format on every run. They're slow to create, for one (2 GB of I/O), and it's probably not necessary.
>
> I'm (we're) not going to have time to solve the entire issue of testing large-file I/O in this PR, but I might suggest that we take this patch and fix a popular format in a bugfix release.

fair, do you want to just raise an issue with the current state of things?

@orbeckst (Member) commented Mar 1, 2023

I don't suggest that this PR should solve testing big files for every format.

But I think the PR should make some changes to the CI files to ensure that this test is run, either in at least one runner or, at an absolute minimum, in the cron. @IAlibay might have a better idea of when we should run it. But I'd want to avoid seeing such a bad regression again.

@IAlibay (Member) commented Mar 1, 2023

May I counter @orbeckst and ask that we don't deal with CI here? This is a nicely cherry-pickable PR; add CI changes and it's going to be a pain (CI changes a decent chunk between releases).

@IAlibay (Member) commented Mar 1, 2023

I'm happy to take on the responsibility of fixing up a CI entry for this if @richardjgowers would prefer not to open a second PR.

@IAlibay (Member) commented Mar 1, 2023

Also note that I approve but am leaving the review red so that I can fix things up here directly and we don't need a second pre-2.4.3 PR.

@hmacdope (Member) commented Mar 1, 2023

> I'm not sure it's a good idea to run a large-file test for every format on every run. They're slow to create, for one (2 GB of I/O), and it's probably not necessary.
>
> I'm (we're) not going to have time to solve the entire issue of testing large-file I/O in this PR, but I might suggest that we take this patch and fix a popular format in a bugfix release.

I agree; I was more thinking that this is a potential class of bug we should investigate, especially with the XDR reader that I Cythonised in #3892. We (I) can raise an issue and work from there?

@hmacdope (Member) left a review:

Thanks @richardjgowers!


@orbeckst (Member) left a review:

Given that @IAlibay prefers the PR in this form for easier handling, and given that he has also committed to getting the test to run on CI somehow, I have no further objections.

@orbeckst mentioned this pull request on Mar 9, 2023
@orbeckst (Member) commented Mar 9, 2023

@IAlibay, can we merge this into develop and you cherry-pick once you are ready for the 2.4.2 hotfix (#4061)?

I'd be keen to close #4048 and #4039.

@IAlibay (Member) commented Mar 9, 2023

> @IAlibay, can we merge this into develop and you cherry-pick once you are ready for the 2.4.2 hotfix (#4061)?
>
> I'd be keen to close #4048 and #4039.

I still need to make a couple of changes; I'll try to deal with this tomorrow. Sorry about the delay.

@orbeckst (Member):

Progress?

@IAlibay (Member) commented Mar 23, 2023

> Progress?

Sorry, I've been swamped lately, and I mostly need a good empty half-day to do the release. Depending on how metrics generation and fixing the darker lint goes, I'll try to do it today or tomorrow.

@orbeckst (Member):

I understand that the release requires a solid chunk of time. My (poorly worded) question was more along the lines of what needs to be done so the PR can be merged into develop, given that you said:

> I still need to make a couple of changes

Once it's merged, we can at least offer a working development version, and you can cherry-pick from develop whenever you can fit it in. At least that's how I understood your comment on the process.

@IAlibay (Member) commented Mar 23, 2023

> Once it's merged, we can at least offer a working development version, and you can cherry-pick from develop whenever you can fit it in. At least that's how I understood your comment on the process.

Yeah, sorry, there's a significant element of "I don't fully remember what I need fixed to cherry-pick easily" (I have the medium-term memory of a goldfish lately...), so I was trying to get a bit of time to review what I needed before making a mess of this.

I've booked off the evening for this, so let's try to get this done now.

@IAlibay dismissed their stale review on March 29, 2023 at 14:24

completed

@IAlibay merged commit 628e0f7 into develop on Mar 29, 2023
@IAlibay deleted the issue-4039_large_DCD branch on March 29, 2023 at 16:24
IAlibay pushed a commit that referenced this pull request Mar 29, 2023
Fixes #4039
* Fixes DCD seeking for large (2Gb+) files.
@IAlibay added the defect label on Sep 21, 2023
Linked issue closed by this pull request: Error when loading multiple large DCD trajectories (#4039)