-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add lazy PMap
reader
#917
base: master
Are you sure you want to change the base?
Add lazy PMap
reader
#917
Conversation
49efdfb
to
3e69a4a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've read through the PR, very interesting work! My comments at the moment are mostly cosmetic/about tests.
I have read all the new lazy functions, but my knowledge of generators is a bit murky so I need to take more time to understand exactly how they work here in the pmap loader.
invisible_cities/io/pmaps_io_test.py
Outdated
|
||
assert isinstance(lazy, Generator) | ||
element = next(lazy) | ||
assert len(element ) == 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this spacing after element
is the way it is? It doesn't appear to align with anything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a leftover from a previous version. Will fix.
invisible_cities/io/pmaps_io_test.py
Outdated
assert_PMap_equality(pmap_lazy, pmaps_eager[evt]) | ||
|
||
|
||
def test_load_pmaps_eager_lazy(KrMC_pmaps_filename): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test appears to be ensuring that the content from load_pmaps
is as expected (type and length-wise), but the test name is a big vague in that sense. Especially since the previous test (test_load_pmaps_lazy
) is almost the same name, but accomplishes a different task. Maybe this should be changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm struggling to find a good name. Do you have any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess test_load_pmaps_returns_correct_format
or ...correct_type_and_shape
? The lazy
and eager
are both tested so I don't think they need to be specified distinctly.
invisible_cities/io/pmaps_io_test.py
Outdated
@@ -203,7 +233,7 @@ def test_load_pmaps_as_df_without_ipmt(KrMC_pmaps_without_ipmt_filename, KrMC_pm | |||
|
|||
|
|||
@given(stg.data()) | |||
def test_load_pmaps(output_tmpdir, data): | |||
def test_load_pmaps_eager(output_tmpdir, data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise in naming convention for this test, its a bit vague since it matches the other related tests in naming convention but differs in content.
It appears to be writing a pmap, then reading it in and comparing it to itself 'pre-writing'. Is that correct? In that case, is there a reason that the pmap is generated here explicitly but in the other tests KrMC_pmaps_filename
is used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I didn't read the test, actually, I just renamed it and refactored it to make it work with the new version. But I don't see a reason why it shouldn't follow the same logic as the other ones, so I will change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that some of other tests are wrong, as they compare the output of load_pmaps_as_df_eager
with the output of a fixture, which only reads pmaps using that very function. So it's testing nothing essentially. I will refactor this test to make it simpler (but keep the logic) and fix the other ones.
3e69a4a
to
310e378
Compare
... and related tests
This now includes an explicit pmap coded in a fixture instead of using the composite implemented in a different file. Hopefully this makes the test more clear.
This test tested nothing. Now at least it makes sense.
This might come back in the future, but it will have to be refactored anyway. Moreover, this test was not doing anything sensible. so it will be removed
310e378
to
dc94d68
Compare
Cities that read
PMap
s consume a large amount of RAM because all events are kept in memory simultaneously. This PR adds new functions to read thePMap
s on an event-by-event basis, in a lazy manner. The old readers have been renamed using the suffix_eager
for clarity, while the new ones contain the suffix_lazy
. The names of the old readers have been reused in the interfacing functions that allow the user to switch between the lazy and the eager readers. The readers' default to the eager mode for backward compatibility. However, the cities will read them lazily to avoid large memory allocations.This PR is currently being tested in production mode.