Feature Extraction: implement data fetching from KG #334

Open
6 tasks
alex4200 opened this issue Jul 31, 2018 · 23 comments
Labels: Topic_Trace_FeatureExtraction (issues related to feature extraction), Type_Feature (new/changed features for an existing use case)

Comments

@alex4200
Contributor

alex4200 commented Jul 31, 2018

Use Case Update

| Aspect | Detail |
|---|---|
| Summary | The current feature extraction web app will be modified to fetch the electrophysiological data (from which the features are extracted) from the HBP datasets (container/KG). At the moment the data are stored in, and fetched from, the web-app backend storage. |
| Expert | @lbologna |
| Deadline | Start expected in May 2019. Expected date of the first release of the update: TBD. It depends on the curation process, on the implementation of the new API, and on the format used to store the data in HBP containers: formats that the BluePyEfe tool cannot currently read will require modifying the tool. |
| Dependencies | |
| Duration | 2-3 weeks of full-time work to implement this feature |

List of additional/changed features

Data filtering and selection for feature extraction will be performed on data stored in the HBP datasets through the KG.

Tasks

  • TODO: Need to check what information is in the KG (stimuli?)
  • TODO: How to fetch the data from KG?
  • Link the data filtering/selection to the KG search engine - mandatory
  • Fetch data from HBP container through the KG - mandatory
  • Integrate HBP dataset format to BluePyEfe/WebApp (when needed) - mandatory
  • Extract features from HBP datasets - mandatory

Acceptance Criteria

  • Data are correctly fetched from HBP datasets through the KG and features are correctly extracted
  • First you search the KG, then you fetch the data
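
The "search first, then fetch" order in the acceptance criteria can be sketched as a two-step pipeline. This is only an illustration under stated assumptions: `search_then_fetch`, `fake_search`, `fake_fetch`, and the in-memory index and store are hypothetical stand-ins, not part of the web app, BluePyEfe, or any KG client.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical record type: one search hit, reduced to plain string fields.
Hit = Dict[str, str]


def search_then_fetch(
    search: Callable[[str], Iterable[Hit]],
    fetch: Callable[[str], bytes],
    brain_region: str,
) -> Dict[str, bytes]:
    """Run the metadata search first, then download only the matching traces."""
    traces: Dict[str, bytes] = {}
    for hit in search(brain_region):
        traces[hit["name"]] = fetch(hit["url"])
    return traces


# In-memory stand-ins for the KG search engine and the data container.
FAKE_INDEX: List[Hit] = [
    {"name": "cell_a", "region": "hippocampus CA1", "url": "mem://cell_a.abf"},
    {"name": "cell_b", "region": "cerebellum", "url": "mem://cell_b.abf"},
]
FAKE_STORE = {"mem://cell_a.abf": b"trace-a", "mem://cell_b.abf": b"trace-b"}


def fake_search(region: str) -> List[Hit]:
    """Return the index entries recorded in the given brain region."""
    return [hit for hit in FAKE_INDEX if hit["region"] == region]


def fake_fetch(url: str) -> bytes:
    """Return the stored bytes for a URL from the in-memory store."""
    return FAKE_STORE[url]


if __name__ == "__main__":
    print(search_then_fetch(fake_search, fake_fetch, "hippocampus CA1"))
```

Passing the search and fetch steps in as callables keeps the ordering explicit and lets each step be replaced by a real KG query or container download once those are available.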

Extra Requirements

System

BluePyEfe will need to be modified to read data formats it does not currently support

Performance

Additional performance requirements (if any) will be clarified after the first load tests

@alex4200 alex4200 added the Type_Feature Label to identify new/changed features for an existing use case label Jul 31, 2018
@alex4200 alex4200 added this to the Issues for 2018-12 milestone Jul 31, 2018
@alex4200 alex4200 modified the milestones: Issues for 2018-12, M24: March 2020 Aug 20, 2018
@antonelepfl
Collaborator

Working on adapting the WebUI so that it can read the KG format

@lbologna
Collaborator

lbologna commented Feb 4, 2019

The BluePyEfe software, used for the feature extraction, reads the following formats: .abf, .smr, .igor, .txt (with metadata).
At the moment, only a limited number of .abf traces are available in the KG (no .smr, no .igor; .txt still to be checked). These traces still have to be tested, and they differ from the ones currently available on the backend server.
For the moment we will leave things as they are (data on the backend).

By M12 we will implement a first (dev) version of the interface needed to fetch a subset of the .abf files available on the KG, provided that these files meet the requirements (step current stimuli, .abf v2 files).
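
The ".abf v2" requirement could be checked before a file is handed to the extraction step. The sketch below relies on the ABF file signature (the first four bytes are `ABF ` for version 1 and `ABF2` for version 2); the function names are hypothetical, and it does not check the stimulus type, which would require reading the protocol stored in the file.

```python
import tempfile
from pathlib import Path


def abf_major_version(path):
    """Return 1 or 2 based on the 4-byte file signature, or None if not ABF."""
    with open(path, "rb") as handle:
        signature = handle.read(4)
    if signature == b"ABF2":
        return 2
    if signature == b"ABF ":
        return 1
    return None


def is_abf2(path):
    """True for files with an .abf extension and an ABF version 2 signature."""
    path = Path(path)
    return path.suffix.lower() == ".abf" and abf_major_version(path) == 2


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.abf"
        # Fake header for demonstration only, not a real recording.
        sample.write_bytes(b"ABF2" + bytes(12))
        print(is_abf2(sample))
```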

@alex4200
Contributor Author

This ticket will probably be postponed to M18

@lbologna
Collaborator

lbologna commented Mar 4, 2019

Postponed to M18

@alex4200
Contributor Author

@lbologna Please do not forget to update this task with some more details

@lbologna lbologna changed the title from "Feature extraction GUI update" to "Feature Extraction: implement data fetching from KG" May 17, 2019
@lbologna
Collaborator

lbologna commented May 17, 2019

> @lbologna Please do not forget to update this task with some more details

That's done (in the main issue description).

@apdavison
Collaborator

Are you aware that we are developing in SP5 a Python library for querying and fetching data from the KG? I suggest we have a video meeting to discuss the requirements.

@lbologna
Collaborator

> Are you aware that we are developing in SP5 a Python library for querying and fetching data from the KG? I suggest we have a video meeting to discuss the requirements.

@apdavison no, I was not aware you were working on it. Yes, a call would certainly be very useful. What about Thursday (23 May) or Friday (24 May), any time before 13:00? Thanks

@apdavison
Collaborator

How about Friday 24th at 12 noon?

@lbologna
Collaborator

Fine with us! Thanks.

@apdavison
Collaborator

Update: the NAR ("Neural Activity Resource") library has now been released under the name "fairgraph" - https://github.com/HumanBrainProject/fairgraph (also available with pip install fairgraph). This can be used to retrieve models and data from the KG - see tutorial notebook at https://collab.humanbrainproject.eu/#/collab/1635/nav/380436

@alex4200
Contributor Author

alex4200 commented Jul 1, 2019

@apdavison I have run the example (command line and notebook), but the section "An activity dataset with minimal metadata" does not find any dataset.

Can you look into that and either fix it, or update README.md and the notebook?

@clupascu
Collaborator

@apdavison For the 26 cells retrieved based on the brain region ("hippocampus CA1") with
cells_in_ca1 = PatchedCell.list(client, brain_region=BrainRegion("hippocampus CA1"))
is there a way to apply an additional filter in order to retrieve one specific cell, instead of indexing the list as in
example_cell = cells_in_ca1[3] ?

I see no other metadata apart from the CellType and BrainRegion:

PatchedCell('hbp00011_Sub2_Samp4__ExpE8', CellType('hippocampus CA1 pyramidal cell', 'http://uri.neuinfo.org/nif/nifstd/sao830368389'), BrainRegion('hippocampus CA1', 'http://purl.obolibrary.org/obo/UBERON_0003881'), KGQuery(<class 'fairgraph.electrophysiology.PatchedCellCollection'>, {'path': 'prov:hadMember', 'op': 'in', 'value': ['https://nexus.humanbrainproject.org/v0/data/neuralactivity/experiment/patchedcell/v0.1.0/75c761ba-ce25-4659-aa2d-c4b6e2a417fc']}), https://nexus.humanbrainproject.org/v0/data/neuralactivity/experiment/patchedcell/v0.1.0/75c761ba-ce25-4659-aa2d-c4b6e2a417fc)

Thanks.

@clupascu
Collaborator

> @apdavison I have run the example (command line and notebook), but in the section An activity dataset with minimal metadata it does not find any dataset.
>
> Can you look into that and either fix it, or update README.md and the notebook?

@alex4200 in the section "An activity dataset with minimal metadata" I get 29 datasets.

@apdavison
Collaborator

> is there any possibility to use one other filter in order to retrieve one specific cell

If you know the cell id (either UUID or full URI) or the cell name, you can retrieve it with PatchedCell.from_uuid(uuid), PatchedCell.from_uri(uri) or PatchedCell.by_name(name).

Using PatchedCell.list(), only one filter at a time is currently allowed. I've created a ticket to fix this: HumanBrainProject/fairgraph#9

Can you tell me exactly what properties you need to filter on?

> I see no other metadata apart from the CellType and BrainRegion

There are many other metadata attributes, see https://github.com/HumanBrainProject/fairgraph/blob/master/fairgraph/electrophysiology.py#L136

If information you need is missing, please let me know so we can add it!
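
Until several filters can be combined in one query, one possible workaround is to query with a single filter and apply the remaining conditions on the client side. The sketch below uses a stand-in `Cell` dataclass instead of fairgraph objects; its field names, the second cell, and the `filter_cells` helper are assumptions made for illustration and are not part of the fairgraph API.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Cell:
    """Stand-in for a KG cell record; the field names are assumptions."""
    name: str
    cell_type: str
    brain_region: str


def filter_cells(cells: Iterable[Cell], **criteria: str) -> List[Cell]:
    """Keep the cells whose attributes equal every given criterion."""
    return [
        cell
        for cell in cells
        if all(getattr(cell, key, None) == value for key, value in criteria.items())
    ]


# Pretend this list came back from a single-filter query on the brain region.
cells_in_ca1 = [
    Cell("hbp00011_Sub2_Samp4__ExpE8", "hippocampus CA1 pyramidal cell", "hippocampus CA1"),
    Cell("hypothetical_cell_2", "hypothetical interneuron", "hippocampus CA1"),
]

if __name__ == "__main__":
    # Second filter applied locally instead of picking a list index by hand.
    print(filter_cells(cells_in_ca1, cell_type="hippocampus CA1 pyramidal cell"))
```

Filtering locally costs one larger download, which should be acceptable for result sets of a few dozen cells such as the 26 mentioned above.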

@apdavison
Collaborator

@alex4200 do you still get an empty list? If so, it may be a permissions problem. I presume you have an SGA2 accreditation? (https://collab.humanbrainproject.eu/#/me/accreditations)

@clupascu
Collaborator

clupascu commented Jul 12, 2019

> is there any possibility to use one other filter in order to retrieve one specific cell
>
> If you know the cell id (either UUID or full URI) or the cell name, you can retrieve it with PatchedCell.from_uuid(uuid), PatchedCell.from_uri(uri) or PatchedCell.by_name(name).
>
> Using PatchedCell.list(), only one filter at a time is currently allowed. I've created a ticket to fix this: HumanBrainProject/fairgraph#9

More than one filter would be needed. Thanks.

> Can you tell me exactly what properties you need to filter on?
>
> I see no other metadata apart from the CellType and BrainRegion
>
> There are many other metadata attributes, see https://github.com/HumanBrainProject/fairgraph/blob/master/fairgraph/electrophysiology.py#L136
>
> If information you need is missing, please let me know so we can add it!

Can I retrieve cells based on the contributor name or the lab for example?

@lbologna
Collaborator

Hi,
@apdavison I am experiencing the same problem @alex4200 has: empty lists for all searches (based on brain region, based on species, based on cell type). I have the SGA2 accreditation.

@alex4200
Contributor Author

@apdavison I have the SGA1 and SGA2 accreditations. I just tried the query again in a notebook and from the command line, but I still get only an empty list.

@apdavison
Collaborator

@lbologna and @alex4200 I've changed the permissions, please could you try again?

@lbologna
Collaborator

@apdavison thank you. It works for me, but only up to the "Search based on cell type" cell.
The "An activity dataset with minimal metadata" and "An activity dataset with extended metadata" sections seem to fetch nothing and give the errors below:

  • Error for "An activity dataset with minimal metadata"

IndexError Traceback (most recent call last)

----> 1 dataset = activity_datasets[-1]
2 dataset.owners[0].resolve(client)

IndexError: list index out of range


  • Error for "An activity dataset with extended metadata"

AttributeError Traceback (most recent call last)
5 "path": "nsg:partOf",
6 "op": "eq",
----> 7 "value": dataset.id
8 }
9 context = {

AttributeError: 'NoneType' object has no attribute 'id'
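
Both tracebacks follow from an empty query result: indexing an empty list raises the `IndexError`, and the resulting missing dataset then causes the `AttributeError`. A defensive sketch is shown below; `last_or_none` and `build_part_of_filter` are hypothetical helpers, not notebook or fairgraph functions, and only the filter keys are taken from the traceback.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def last_or_none(items: Sequence[T]) -> Optional[T]:
    """Return the last element, or None when the sequence is empty."""
    return items[-1] if items else None


def build_part_of_filter(dataset) -> dict:
    """Build the 'nsg:partOf' filter, failing early if no dataset was found."""
    if dataset is None:
        raise ValueError("no dataset found; check the query and the KG permissions")
    return {"path": "nsg:partOf", "op": "eq", "value": dataset.id}


if __name__ == "__main__":
    activity_datasets = []  # what an empty search result looks like
    dataset = last_or_none(activity_datasets)
    try:
        build_part_of_filter(dataset)
    except ValueError as err:
        print(f"Skipping extended-metadata query: {err}")
```

Failing with an explicit message at the first missing result makes it easier to tell a permissions or query problem apart from a bug in the later cells.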

@alex4200
Contributor Author

@apdavison Have you looked into this issue yet?

@apdavison
Collaborator

I'm looking into it; waiting for some feedback from the KG team.


5 participants