From 1efb9093dfcbdc1b21af0acfe62df924547e347c Mon Sep 17 00:00:00 2001 From: Johannes Rainer Date: Thu, 3 Oct 2024 15:21:09 +0200 Subject: [PATCH] Add MsBackendMetaboLights to the list of backends --- DESCRIPTION | 2 +- NEWS.md | 4 +++ README.md | 73 ++++++++++++++++++++++++++++--------------- vignettes/Spectra.Rmd | 51 +++++++++++++++++++----------- 4 files changed, 85 insertions(+), 45 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index 91db6af4..1892f972 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: Spectra Title: Spectra Infrastructure for Mass Spectrometry Data -Version: 1.15.10 +Version: 1.15.11 Description: The Spectra package defines an efficient infrastructure for storing and handling mass spectrometry spectra and functionality to subset, process, visualize and compare spectra data. It provides different diff --git a/NEWS.md b/NEWS.md index 3cc44fb5..c3cf888c 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,5 +1,9 @@ # Spectra 1.15 +## Changes in 1.15.11 + +- Add reference to `MsBackendMetaboLights`. + ## Changes in 1.15.10 - Add new `extractSpectra()` generic and implementation for `MsBackend`. Fixes diff --git a/README.md b/README.md index be839639..78d7efb9 100644 --- a/README.md +++ b/README.md @@ -19,58 +19,81 @@ footprint. A (possibly incomplete) list of available backends (along with a link to the R package providing it) is shown below: -- `MsBackendMemory` (package: *Spectra*): *default* backend which keeps all data - in memory. Optimized for fast processing. +- `MsBackendCompDb` (package + [*CompoundDb*](https://github.com/rformassspectrometry/CompoundDb): provides + access to spectra data (spectra and peaks variables) from a *CompDb* + database. Has a small memory footprint because all data (except precursor m/z + values) are retrieved on-the-fly from the database. + - `MsBackendDataFrame` (package: *Spectra*): alternative to the `MsBackendMemory` also keeping all data in memory, but supporting `S4` objects as spectra variables because the data is stored internally in a `DataFrame`. -- `MsBackendMzR` (package: *Spectra*): by using the `mzR` package it supports - import of MS data from mzML, mzXML and CDF files. This backend keeps only - general spectra variables in memory and retrieves the peaks data (m/z and - intensity values) on-the-fly from the original data files. The backend has - thus a smaller memory footprint compared to in-memory backends. + - `MsBackendHdf5Peaks` (package: *Spectra*): on-disk backend similar to `MsBackendMzR`, but the peaks data is stored in HDF5 files (general spectra variables are kept in memory). -- `MsBackendMgf` (package - [*MsBackendMgf*](https://github.com/rformassspectrometry/MsBackendMgf): allows - to import/export data in mascot generic format (MGF). Extends the - `MsBackendDataFrame` and keeps thus all data, after import, in memory. -- `MsBackendMsp` (package - [*MsbackendMsp*](https://github.com/rformassspectrometry/MsBackendMsp): allows - to import/export data in NIST MSP format. Extends the `MsBackendDataFrame` and - keeps thus all data, after import, in memory. + +- `MsBackendHmdbXml` (package + [*MsbackendHmdb*](https://github.com/rformassspectrometry/MsBackendHmdb)): + allows import of MS data from xml files of the Human Metabolome Database + (HMDB). Extends the `MsBackendDataFrame` and keeps thus all data, after + import, in memory. + - `MsBackendMassbank` (package [*MsBackendMassbank*](https://github.com/rformassspectrometry/MsBackendMassbank)): allows to import/export data in MassBank text file format. Extends the `MsBackendDataFrame` and keeps thus all data, after import, in memory. + - `MsBackendMassbankSql` (package [*MsBackendMassbank*](https://github.com/rformassspectrometry/MsBackendMassbank)): allows to directly connect to a MassBank SQL database to retrieve all MS data and variables. Has a minimal memory footprint because all data is retrieved on-the-fly from the SQL database. + +- `MsBackendMemory` (package: *Spectra*): *default* backend which keeps all data + in memory. Optimized for fast processing. + +- `MsBackendMetaboLights` (package + [*MsBackendMetaboLights*](https://github.com/rformassspectrometry/MsBackendMetaboLights)): + retrieves and caches MS data files from MetaboLights. + +- `MsBackendMgf` (package + [*MsBackendMgf*](https://github.com/rformassspectrometry/MsBackendMgf)): allows + to import/export data in mascot generic format (MGF). Extends the + `MsBackendDataFrame` and keeps thus all data, after import, in memory. + +- `MsBackendMsp` (package + [*MsbackendMsp*](https://github.com/rformassspectrometry/MsBackendMsp)): allows + to import/export data in NIST MSP format. Extends the `MsBackendDataFrame` and + keeps thus all data, after import, in memory. + +- `MsBackendMzR` (package: *Spectra*): by using the `mzR` package it supports + import of MS data from mzML, mzXML and CDF files. This backend keeps only + general spectra variables in memory and retrieves the peaks data (m/z and + intensity values) on-the-fly from the original data files. The backend has + thus a smaller memory footprint compared to in-memory backends. + +- `MsBackendOfflineSql` (package + [*MsBackendSql*](https://github.com/rformassspectrometry/MsBackendSql)): + stores all MS data in a SQL database and has thus a minimal memory footprint. + Does, in contrast to `MsBackendSql`, not keep an active SQL database + connection and can thus support parallel processing. + - `MsBackendRawFileReader` (package [*MsBackendRawFileReader*](https://github.com/fgcz/MsBackendRawFileReader)): implements a backend for reading MS data from Thermo Fisher Scientific's raw data files using the manufacturer's NewRawFileReader .Net libraries. The package generalizes the functionality introduced by the `rawrr` package. -- `MsBackendHmdbXml` (package - [*MsbackendHmdb*](https://github.com/rformassspectrometry/MsBackendHmdb)): - allows import of MS data from xml files of the Human Metabolome Database - (HMDB). Extends the `MsBackendDataFrame` and keeps thus all data, after - import, in memory. + - `MsBackendSql` (package [*MsBackendSql*](https://github.com/rformassspectrometry/MsBackendSql)): stores all MS data in a SQL database and has thus a minimal memory footprint. -- `MsBackendCompDb` (package - [*CompoundDb*](https://github.com/rformassspectrometry/CompoundDb): provides - access to spectra data (spectra and peaks variables) from a *CompDb* - database. Has a small memory footprint because all data (except precursor m/z - values) are retrieved on-the-fly from the database. + - `MsBackendTimsTof` (package [*MsBackendTimsTof*](https://github.com/rformassspectrometry/MsBackendTimsTof): allows import of data from Bruker TimsTOF raw data files (using the `opentimsr` R package). + - `MsBackendWeizMass` (package [*MsBackendWeizMass*](https://github.com/rformassspectrometry/MsBackendWeizMass): allows to access MS data from WeizMass MS/MS spectral databases. diff --git a/vignettes/Spectra.Rmd b/vignettes/Spectra.Rmd index 8d383700..35e0dfbb 100644 --- a/vignettes/Spectra.Rmd +++ b/vignettes/Spectra.Rmd @@ -1244,38 +1244,51 @@ head(basename(dataStorage(sps_tmt))) A (possibly incomplete) list of R packages providing additional backends that add support for additional data types or storage options is provided below: -- `r BiocStyle::Biocpkg("MsBackendMgf")`: support for import/export of mass - spectrometry files in mascot generic format (MGF). -- `r BiocStyle::Biocpkg("MsBackendMsp")`: allows to import/export data in NIST - MSP format. Extends the `MsBackendDataFrame` and keeps thus all data, after - import, in memory. -- `MsBackendMassbank` (package `r BiocStyle::Biocpkg("MsBackendMassbank")`): - allows to import/export data in MassBank text file format. Extends the - `MsBackendDataFrame` and keeps thus all data, after import, in memory. -- `MsBackendMassbankSql` (package `r BiocStyle::Biocpkg("MsBackendMassbank")`): - allows to directly connect to a MassBank SQL database to retrieve all MS data - and variables. Has a minimal memory footprint because all data is retrieved - on-the-fly from the SQL database. -- `r BiocStyle::Biocpkg("MsBackendSql")`: stores all MS data in a SQL database - and has thus a minimal memory footprint. - `MsBackendCompDb` (package `r BiocStyle::Biocpkg("CompoundDb")`): provides access to spectra data (spectra and peaks variables) from a *CompDb* database. Has a small memory footprint because all data (except precursor m/z values) are retrieved on-the-fly from the database. -- `r Biocpkg("MsBackendRawFileReader")`: implements a backend for reading MS - data from Thermo Fisher Scientific's raw data files using the manufacturer's - NewRawFileReader .Net libraries. The package generalizes the functionality - introduced by the `r Biocpkg("rawrr")` package, see also - [@kockmann_rawrr_2021]. + - `MsBackendHmdbXml` (package [`MsbackendHmdb`](https://github.com/rformassspectrometry/MsBackendHmdb)): allows import of MS data from xml files of the Human Metabolome Database (HMDB). Extends the `MsBackendDataFrame` and keeps thus all data, after import, in memory. + +- `MsBackendMassbank` (package `r BiocStyle::Biocpkg("MsBackendMassbank")`): + allows to import/export data in MassBank text file format. Extends the + `MsBackendDataFrame` and keeps thus all data, after import, in memory. + +- `MsBackendMassbankSql` (package `r BiocStyle::Biocpkg("MsBackendMassbank")`): + allows to directly connect to a MassBank SQL database to retrieve all MS data + and variables. Has a minimal memory footprint because all data is retrieved + on-the-fly from the SQL database. + +- `MsBackendMetaboLights` (package `r + BiocStyle::Biocpkg("MsBackendMetaboLights")`): retrieves and caches MS data + files from the MetaboLights repository. + +- `MsBackendMgf`: (package `r BiocStyle::Biocpkg("MsBackendMgf")`): support for + import/export of mass spectrometry files in mascot generic format (MGF). + +- `MsBackendMsp`: (package `r BiocStyle::Biocpkg("MsBackendMsp")`): allows to + import/export data in NIST MSP format. Extends the `MsBackendDataFrame` and + keeps thus all data, after import, in memory. + +- `MsBackendRawFileReader` (package `r Biocpkg("MsBackendRawFileReader")`): + implements a backend for reading MS data from Thermo Fisher Scientific's raw + data files using the manufacturer's NewRawFileReader .Net libraries. The + package generalizes the functionality introduced by the `r Biocpkg("rawrr")` + package, see also [@kockmann_rawrr_2021]. + +- `MsBackendSql` (package `r BiocStyle::Biocpkg("MsBackendSql")`): stores all MS + data in a SQL database and has thus a minimal memory footprint. + - `MsBackendTimsTof` (package [`MsBackendTimsTof`](https://github.com/rformassspectrometry/MsBackendTimsTof): allows import of data from Bruker TimsTOF raw data files (using the `opentimsr` R package). + - `MsBackendWeizMass` (package [`MsBackendWeizMass`](https://github.com/rformassspectrometry/MsBackendWeizMass): allows to access MS data from WeizMass MS/MS spectral databases.