Return a structured array for the spikes in python #285
Thanks @jorblancoa. I think the interface is really nice. However, I'm not very happy with the convoluted path of the data, which I know is not from this PR, but I would expect better from an "optimized library".
I expect there needs to be a new C++ method that builds these separate vectors, which is what I was getting at here: https://github.com/BlueBrain/neurodamus/pull/8/files#discussion_r1273085740
That copy should be fixed. The filtering, however, is important.
On the C++ side, I think it's fine; keeping that API stable is important. On the Python side, a list of tuples is not great, but API stability is more important. Once we have the new Python methods in place, we can start issuing
Shall we merge this so that we can create a tag together with the LFP changes? @mgeplf
I think @ferdonline's point ("I'm however not very happy with the convoluted path of the data, which I know it's not from this PR, but I would expect better from an 'optimized library'") still stands. I think we should do it with a less convoluted data path before it gets committed.
Does that belong in this PR?
I would say so; the extra array created by https://github.com/BlueBrain/libsonata/pull/285/files#diff-cc07100b7c7235ddf263fda0d515a7b40ed1fe5b871714e0ab5c1ec5298ddb0dR1193 can be avoided if some internal methods are added to get the ids and timestamps separately.
Makes sense, but then the best option would be to have them as members in report_reader.h, right? Otherwise spikes_ needs to be used anyway to filter, and the result then has to be converted to the 2 arrays.
Yeah, I think I understand what you mean and that would make sense.
Since we can't reuse the existing code for filtering and Fernando's use case only needs the 2 raw arrays, would you be OK with that? @ferdonline @mgeplf
Don't we still have the
@mgeplf IIUC at line 162 there's the fast path. If the client wants to filter data then the copy happens to
But we want to be able to filter things; that's an important use case. It seems weird to me to have users of a library have to know that there is a
On the neurodamus side, isn't it useful to load only the spikes that are going to be used in the simulation rather than loading all of them? For instance, what if the simulation is going to run for 1 s, but the spikes file contains 10 s worth of spikes? What about the case where the simulation only covers a subset of node_ids? Wouldn't filtering the data before it all gets loaded be valuable?
These are the sorts of optimizations that can be done here, and they would benefit everyone using the library. They're also useful right now: when people do analysis, they usually look at a subset of the data.
That makes sense. I wanted to reuse code, but it's true that when calling getArrays() it doesn't make sense to call get() to build the pairs only for the filtering. I pushed some changes to filter directly in the getArrays() method.
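To illustrate the idea of filtering the two arrays directly, here is a minimal sketch in Python. It is not libsonata's implementation (which is C++): the function name `filter_spike_arrays` and the parameters `selected_ids`, `tstart` and `tstop` are hypothetical, and treating the time bounds as inclusive is an assumption. It shows a fast path that returns the stored arrays untouched when no filter is requested, and a single pass over the parallel arrays otherwise, without materializing an intermediate list of (node_id, timestamp) pairs.

```python
from typing import Iterable, List, Optional, Tuple


def filter_spike_arrays(
    node_ids: List[int],
    timestamps: List[float],
    selected_ids: Optional[Iterable[int]] = None,
    tstart: Optional[float] = None,
    tstop: Optional[float] = None,
) -> Tuple[List[int], List[float]]:
    """Hypothetical helper: filter two parallel spike arrays in one pass."""
    # Fast path: no filter requested, hand back the stored arrays without copying.
    if selected_ids is None and tstart is None and tstop is None:
        return node_ids, timestamps

    wanted = None if selected_ids is None else set(selected_ids)
    out_ids: List[int] = []
    out_times: List[float] = []
    for nid, t in zip(node_ids, timestamps):
        if wanted is not None and nid not in wanted:
            continue
        # Assumption: both time bounds are inclusive.
        if tstart is not None and t < tstart:
            continue
        if tstop is not None and t > tstop:
            continue
        out_ids.append(nid)
        out_times.append(t)
    return out_ids, out_times


ids = [3, 1, 2, 1]
times = [0.1, 0.5, 1.2, 9.0]
# Keep only nodes 1 and 2, and only spikes up to t = 1.0.
print(filter_spike_arrays(ids, times, selected_ids=[1, 2], tstop=1.0))
```

The fast path returns the very same objects it was given, so callers that never filter pay no copy; callers that do filter pay for one pass and two output arrays only.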
In neurodamus we can't always filter ahead of time because of setups like coreneuron+save-restore, so it was OK. But I get that for other uses we may want filters more often.
- Modify Spike object from std::pair to struct
Comparing the
Gives:
I'm not a big fan of different access mechanisms having very different execution profiles, because that means library consumers have to benchmark everything to know what to use, which isn't very ergonomic. In this case, we'll have to fix it later.
I should also add that the fast path (i.e.:
- Move createSpikes to private scope
- Use a struct instead of a pair for the SpikeTimes
- Return a const ref when getting the raw arrays
Nice, fixing the regression makes a big difference:
Thanks @jorblancoa!
## Context
Use libsonata instead of h5py in order to read the spikes. The new 'get_dict()' method is used to retrieve the 'node_ids' and the 'timestamps' of a spikes report. (BlueBrain/libsonata#285)

## Review
* [x] PR description is complete
* [x] Coding style (imports, function length, New functions, classes or files) are good
* [x] Unit/Scientific test added
* [ ] Updated Readme, in-code, developer documentation
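As a rough illustration of how a caller might consume such a result, the sketch below groups spikes into per-node spike trains. It assumes a plain dict with the two keys named above, 'node_ids' and 'timestamps', holding parallel sequences; the dict here is built by hand rather than read from a report, and the helper name `spikes_by_node` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def spikes_by_node(spikes: Dict[str, Sequence]) -> Dict[int, List[float]]:
    """Hypothetical helper: group parallel id/time arrays into per-node trains."""
    trains: Dict[int, List[float]] = defaultdict(list)
    # The two sequences are assumed to be the same length and index-aligned.
    for nid, t in zip(spikes["node_ids"], spikes["timestamps"]):
        trains[nid].append(t)
    return dict(trains)


# Hand-built stand-in for the dict shape assumed above.
example = {"node_ids": [3, 1, 3], "timestamps": [0.1, 0.5, 1.2]}
print(spikes_by_node(example))
```

Keeping ids and timestamps as two parallel arrays means this kind of grouping needs no unpacking of per-spike tuples first.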
Following the discussions in [BBPBGLIB-1044]