Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MsBackendMetaboLights #3

Merged
merged 5 commits into from
Sep 7, 2024
Merged

Add MsBackendMetaboLights #3

merged 5 commits into from
Sep 7, 2024

Conversation

jorainer
Copy link
Member

@jorainer jorainer commented Sep 5, 2024

Add the MsBackendMetaboLights backend.

- Add functionality to download and cache MS data files from MetaboLights.
@jorainer jorainer changed the title Add vignette Add MsBackendMetaboLights Sep 5, 2024
Copy link
Collaborator

@philouail philouail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry my comments might be a bit confusing.
Last one on the overall structure. Some helper functions are in Metabolights.R why not just putting everything in one file, MsBackendMetabolights.R also has helper functions there. Would kinda make it easier to read overall to have either everything together or completely separate function and helper functions.

I can also come back to you once I try to combine that with the loading of the MsExperiment code. But thanks i'm super excited for people to be able to use Metabolights datasets so easily.

setClass("MsBackendMetaboLights",
contains = "MsBackendMzR",
slots = c(mtblsId = "character",
assays = "character"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm a bit confused, where is "assays" defined ? shouldn't there be a @slot for it in the documentation ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

valid point. I on purpose did not include a documentation of the MsBackendMetaboLights class because I don't want users to access the internal information in the class. Both the @mtblsId and @assays slots might actually be useless, because we're adding the respective information as spectra variable. So, yes, I could drop these slots alltogether because we don't need/use them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think that would make it clearer for sure ! thanks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the slots now. Thanks for pointing me to that.

R/MsBackendMetaboLights.R Show resolved Hide resolved
#' This `data.frame` has one row per data file with columns:
#' - `"rid"`: the BiocFileCache ID of each file.
#' - `"mtbls_id"`: the MTBLS ID
#' - `"mtbls_assay_name"`: the name of the assay file for each data file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe assayId shoud be renamed to assayName ? especially if we expect character() input ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, indeed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed with the last commit.

@@ -0,0 +1,186 @@
---
title: "Seamless Integration of Mass Spectrometry Data from MetaboLights"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think the backend helps to reach a "seamless integration" but by itself not really.
Together with re-building the full MsExperiment() with sample data then starting the preprocessing. That is seamless integration i think no ? And that is really really cool.
Kind of continuing on this, does it make sense to separate this backend in one package and then the MsExperiment() object creation in another ?.. I don't know what to think. I just find it a bit convoluted... wouldn't it make sense to regroup them somehow ? Just raising the idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be a title Using Mass Spectrometry Data From MetaboLights be a better title?

For where to put the whole functionality - I also thought on that. I would not like to put the MsBackendMetaboLights backend into MsIO - we usually have backends in separate packages to clearly separate any potential dependencies and have easier maintenance. I would like to keep the MsBackendMetaboLights package independent of MsIO (and all eventual dependencies from there).

Same thing with putting the code for MsExperiment into MsBackendMetaboLights: I think it fits better into MsIO (for now?) because the readMsObject method is defined there. The backend classes/packages should only depend/import the Spectra package but ideally not any package that imports/depends also Spectra (this could lead to cyclic dependencies).

Or you have a better suggestion where to put the code?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or 'importing', 'loading',..? but yes otherwise that fits better i think.
No no i agree completely with you. But i'll think about if there is a possible better option. It's just a lot of packages overall and i wonder if we are not complexifying the ecosystem, but it is the best option for now. We can discuss it further later on :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, we had discussions on whether more or less packages would be better. I'm on the more than less packages side - so only those users using one certain type of data (like MetaboLights) need to install/load a specific package. But yes, we should/could discuss this further.

The MsIO package is currently more a playground to develop the import/export functions independently (without affecting e.g. the Spectra or MsExperiment packages) - in future we might to move the functionality into the respective packages... but still unsure about that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes maybe once we have it (MsIO) more developed we can try to "map" out how everything could be distributed essentially. I do like the idea of the import/export in one package though, it's also much easier to explain to people

vignettes/MsBackendMetaboLights.Rmd Show resolved Hide resolved
@jorainer
Copy link
Member Author

jorainer commented Sep 5, 2024

The reason that I had two R files was that I thought I will have more general (exported) functions to work with MetaboLights. But I agree, it's a bit odd to have helpers in both functions.

Copy link
Collaborator

@philouail philouail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this is super nice !! I'll integrate it in MsIO

@jorainer jorainer merged commit 881e2d7 into main Sep 7, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants