EDFIO: Alleviate EDF single handle problem #1584
Sort of annoying, but I think I would prefer symmetry. Either delete the `close` and only have `_close_reader`, or make an `open` function that calls the `_open_reader` internally. Does that make sense?
I would rather remove `close`, but it is a public method that I did not add, and removing it would break backwards compatibility (see python-neo/neo/test/rawiotest/test_edfrawio.py, lines 30 to 41 in ade13db).
If you are OK with that, I can do it. I think that is a good thing: once we fix this, the `close` method will stop making sense. For that reason, I also think that adding an `open` function would be a bad idea. That said, I changed the private method to be
But I would like to deprecate the `close` method in another PR. I don't want this to derail the conversation when Sam reviews it.
That all works for me; we can think about that later.
Co-authored-by: Zach McKenzie <92116279+zm711@users.noreply.github.com>
I just changed the title: since I'm compiling release notes, it helps me quickly see which IO it is. This isn't an official repo rule, just a help for me :)
self._t_stop = self.edf_reader.datarecord_duration * self.edf_reader.datarecords_in_file
# use sample count of first signal in stream
self._stream_index_samples = {stream_index : self.edf_reader.getNSamples()[chidx][0] for stream_index, chidx in self.stream_idx_to_chidx.items()}
Where is `[0]` coming from? This whole block seems like a great idea, but I don't see the extra `[0]` in the previous `_get_signal_size`?
def _get_signal_size(self, block_index, seg_index, stream_index):
    chidx = self.stream_idx_to_chidx[stream_index][0]
@zm711 here is the zero. The idea is to use the first channel of the stream to get the samples.
OK for me.
cool me too.
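For anyone reading this thread later, here is a minimal sketch of the "first channel of the stream" idea discussed above. The mapping and the sample counts are made-up illustrative values, and a plain list stands in for the array returned by `getNSamples()`, so `n_samples_per_channel[chidx[0]]` plays the role of `getNSamples()[chidx][0]`. It assumes that all channels grouped into one stream share the same sample count.

```python
# Hypothetical mapping: stream index -> indices of the channels in that stream.
stream_idx_to_chidx = {0: [0, 1, 2], 1: [3, 4]}

# Hypothetical per-channel sample counts (a stand-in for edf_reader.getNSamples()).
n_samples_per_channel = [1000, 1000, 1000, 250, 250]

# Cache one sample count per stream, taken from the first channel of the stream.
stream_index_samples = {
    stream_index: n_samples_per_channel[chidx[0]]
    for stream_index, chidx in stream_idx_to_chidx.items()
}
```

With a cache like this, a size lookup becomes a dictionary access and does not need the reader to be open.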
This should close #1557
This PR moves the logic of opening the EDF reader into the function that extracts the data (`_get_analogsignal_chunk`). This avoids the problem of someone initializing the IO (in a notebook, for example) and then failing to initialize it again because the handle was left dangling. Moreover, it should allow parallel processing to work better (though not perfectly, since concurrent access might still crash).
The performance cost is minimal in this case because the IO is really slow (I think this can be fixed, but that's a matter for another PR). For example, I have a mid-sized file: 7000 MiB, 276 channels, and around 20 minutes long. Reading the whole data takes 10.8 seconds on master. With the new changes it takes ... 11.0 seconds:
The moral is that when the data extraction itself takes most of the time, the cost of the `io.open` instruction does not count. As mentioned in #1557, the best way would be to handle this with the `mne` library, but this should be an improvement over the current state of affairs, where this IO does not play nicely with a lot of downstream usage patterns (SpikeInterface re-opens the IO to get streams, for example, which precludes getting the streams and then opening the reader again for the extractor).
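As a rough illustration of the pattern described here (open the handle only inside the data-extraction call and release it right after), the sketch below uses a hypothetical `LazyHandleReader` class over a plain binary file. It is not the neo or pyedflib code, and plain files do not have the single-handle limitation; it only shows the open/read/close lifecycle. The method names `_open_reader` and `_close_reader` mirror the ones mentioned in the review, everything else is made up for the example.

```python
import os
import tempfile


class LazyHandleReader:
    """Hypothetical reader that never keeps a file handle open between calls."""

    def __init__(self, path):
        self.path = path
        self._handle = None  # nothing is opened at construction time

    def _open_reader(self):
        self._handle = open(self.path, "rb")

    def _close_reader(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None

    def get_chunk(self, start, stop):
        # Open just in time, read the requested bytes, and always release the handle.
        self._open_reader()
        try:
            self._handle.seek(start)
            return self._handle.read(stop - start)
        finally:
            self._close_reader()


# Small demo on a temporary file holding the bytes 0..9.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(10)))
    demo_path = tmp.name

reader_a = LazyHandleReader(demo_path)
reader_b = LazyHandleReader(demo_path)  # a second instance; no handle is held by the first
chunk_a = reader_a.get_chunk(2, 5)
chunk_b = reader_b.get_chunk(0, 3)
os.remove(demo_path)
```

The trade-off is one open/close per read. As the timings above suggest, that overhead is negligible when the read itself dominates.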