Convert Spectra to array of python Spectrum objects #3

jorainer opened this issue Feb 25, 2022 · 10 comments

Comments

@jorainer
Member

I assume the Spectrum class is defined in matchms? Is this a general object that is also used in other packages? @hechth, can you maybe provide more info here?

Also, is there documentation on the additional attributes and the expected data types for this type of spectrum?

@michaelwitting
Collaborator

The documentation of the Spectrum class can be found here:
https://matchms.readthedocs.io/en/latest/api/matchms.Spectrum.html

@hechth
Collaborator

hechth commented Feb 25, 2022

The additional attributes are just stored in a dict in the Metadata class. I'm not sure how dicts are managed in reticulate, but the individual attributes can be accessed.

Maybe we can schedule an initial meeting to discuss the aims of this project? I think that might be more efficient :)

@michaelwitting
Collaborator

A Python dict corresponds to an R named list.

@hechth
Collaborator

hechth commented Feb 25, 2022

@jorainer The Spectrum object is likely used in Spec2Vec, MS2DeepScore and also in most of our Python-based tools. The actual peaks are stored in a separate object.

So you want something that takes a Spectra R object and gives you a list of Spectrum python objects?

@jorainer
Member Author

Exactly, @hechth! Ideally with an option to carry over all, or only a reduced set of, the additional spectra variables we have in Spectra.

And then also the way back would be cool (list of Spectrum -> Spectra).

@michaelwitting
Collaborator

Super simple solution, without carrying any metadata for the moment:

library(Spectra)
library(reticulate)
library(magrittr)  # provides %>%

# Convert a single-spectrum Spectra object `x` to a Python matchms Spectrum.
# Only the peak data (m/z and intensity) is carried over, no metadata.
rspect_to_pyspec <- function(x, reference = import("matchms")) {
  reference$Spectrum(mz = np_array(x$mz[[1]]),
                     intensities = np_array(x$intensity[[1]]))
}

# Convert each spectrum in `sps` to a Python Spectrum and drop the list names
spectrum1 <- spectrum2 <- spectrapply(sps, rspect_to_pyspec) %>% unname()

Does any function within matchms, Spec2Vec, MS2DeepScore, etc. use any other information, e.g. the precursor m/z? I guess this has to go into the Metadata, right?

@hechth
Collaborator

hechth commented Feb 25, 2022

I don't think Spec2Vec and MS2DeepScore do, but the MetadataMatch class does.

@jorainer
Member Author

jorainer commented Mar 2, 2022

So, maybe a good solution would be a converter function with a parameter that defines which (if any) spectra variables should be converted to Python. I would assume that converting just the peaks matrix is faster than also converting metadata, and if matchms does not use metadata anyway, we should allow skipping this additional conversion (for performance reasons).
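
To make the proposed parameter concrete, here is a minimal Python-side sketch of such a converter. It is only an illustration under stated assumptions: `SimpleSpectrum` is a hypothetical stand-in that mirrors the mz / intensities / metadata layout discussed above (it is not the matchms class), and `records_to_spectra` and its `keep` argument are made-up names. With `keep=None` no metadata is copied at all, which is the fast path.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Optional, Sequence, Tuple


@dataclass
class SimpleSpectrum:
    """Hypothetical stand-in for a Python spectrum object (not matchms.Spectrum)."""
    mz: List[float]
    intensities: List[float]
    metadata: Dict[str, Any] = field(default_factory=dict)


def records_to_spectra(
    peaks: Sequence[Tuple[Sequence[float], Sequence[float]]],
    variables: Sequence[Dict[str, Any]],
    keep: Optional[Iterable[str]] = None,
) -> List[SimpleSpectrum]:
    """Build one spectrum per (mz, intensity) pair.

    `variables` holds one dict of spectra variables per spectrum.
    Only the names listed in `keep` are copied into the metadata;
    `keep=None` skips the metadata conversion entirely.
    """
    wanted = list(keep) if keep is not None else []
    spectra = []
    for (mz, intensity), row in zip(peaks, variables):
        # Copy only the requested variables that are actually present.
        metadata = {name: row[name] for name in wanted if name in row}
        spectra.append(SimpleSpectrum(list(mz), list(intensity), metadata))
    return spectra


# Example input: two spectra with two spectra variables each.
peaks = [([100.0, 150.5], [10.0, 5.0]), ([200.0], [1.0])]
variables = [
    {"precursorMz": 151.5, "msLevel": 2},
    {"precursorMz": 201.0, "msLevel": 2},
]

peaks_only = records_to_spectra(peaks, variables)
with_precursor = records_to_spectra(peaks, variables, keep=["precursorMz"])
```

The reverse direction (list of spectrum objects back to Spectra) could follow the same idea, reading back only the selected metadata keys.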

@michaelwitting
Collaborator

I already have some functions in my branch. I can modify them so that a selection of metadata is carried along.

@hechth
Collaborator

hechth commented Mar 2, 2022

matchms does use metadata: it can be used for filters. Since Metadata just holds a dict, any kind of metadata can be stored.
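
As a rough illustration of why a dict-based metadata store is enough for filtering, the sketch below selects spectra by a single metadata key. It works on plain dicts and does not use the matchms filter API; `select_by_metadata` and the example keys are hypothetical names chosen for this example.

```python
from typing import Any, Callable, Dict, List, Sequence


def select_by_metadata(
    metadata_list: Sequence[Dict[str, Any]],
    key: str,
    predicate: Callable[[Any], bool],
) -> List[int]:
    """Return the indices of entries whose `key` exists and satisfies `predicate`."""
    return [
        index
        for index, metadata in enumerate(metadata_list)
        if key in metadata and predicate(metadata[key])
    ]


# Example: one metadata dict per spectrum; keys can be arbitrary.
metadata_list = [
    {"precursor_mz": 151.5},
    {"ionmode": "positive"},  # no precursor_mz, so never selected below
    {"precursor_mz": 500.0},
]

low_mass = select_by_metadata(metadata_list, "precursor_mz", lambda value: value < 200)
```

Entries that lack the key are simply skipped, so only the variables a filter needs have to be converted from R.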
