Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add function to coerce from Spectra to MSpectra #605

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 4 additions & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Package: MSnbase
Title: Base Functions and Classes for Mass Spectrometry and Proteomics
Version: 2.31.1
Version: 2.31.2
Description: MSnbase provides infrastructure for manipulation,
processing and visualisation of mass spectrometry and
proteomics data, ranging from raw to quantitative and
Expand Down Expand Up @@ -121,7 +121,8 @@ Suggests:
XML,
shiny,
magrittr,
SummarizedExperiment
SummarizedExperiment,
Spectra
LinkingTo: Rcpp
License: Artistic-2.0
LazyData: yes
Expand All @@ -132,7 +133,6 @@ URL: https://lgatto.github.io/MSnbase
biocViews: ImmunoOncology, Infrastructure, Proteomics, MassSpectrometry,
QualityControl, DataImport
Roxygen: list(markdown=TRUE)
RoxygenNote: 7.3.1
Collate:
'AllClassUnions.R'
'AllGenerics.R'
Expand Down Expand Up @@ -222,3 +222,4 @@ Collate:
'utils.R'
'writeMSData.R'
'zzz.R'
RoxygenNote: 7.2.3
6 changes: 5 additions & 1 deletion NEWS.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,15 @@
# MSnbase 2.31

## MSnbase 2.31.2

- Add functionality to convert a `Spectra` object to a `MSpectra`.

## MSnbase 2.31.1

- Disable nested parallel processing for `chromatogram()` method.
- Fix Rd notes.

# MSnbase 2.31.0
## MSnbase 2.31.0

- New Bioconductor devel.

Expand Down
16 changes: 16 additions & 0 deletions R/DataClasses.R
Original file line number Diff line number Diff line change
Expand Up @@ -503,6 +503,22 @@ setClass("MChromatograms",
#' `S4Vectors` package. This includes `lapply` and other data manipulation
#' and subsetting operations.
#'
#' @note
#'
#' Note that the [Spectra](https://bioconductor.org/packages/Spectra) package
#' provides a more robust and efficient infrastructure for mass spectrometry
#' data handling and analysis. So, wherever possible, the newer *Spectra*
#' package should be used instead of the *MSnbase*.
#'
#' For backward compatibility, it is however possible to convert between the
#' `MSpectra` and the newer `Spectra` objects:
#'
#' - A `Spectra` object can be coerced to a `MSpectra` using
#' `as(sps, "MSpectra")` where `sps` is a `Spectra` object.
#' - The [extractSpectraData()] function can be used to extract the data from
#' a `MSpectra` as a `DataFrame`, which can then be used to create a
#' `Spectra` object.
#'
#' @param object For all functions: a `MSpectra` object.
#'
#' @param x For all functions: a `MSpectra` object.
Expand Down
97 changes: 92 additions & 5 deletions R/functions-MSpectra.R
Original file line number Diff line number Diff line change
Expand Up @@ -92,10 +92,20 @@ MSpectra <- function(..., elementMetadata = NULL) {
res
}

#' @title Extract data from MSnbase objects for use in Spectra
#' @title Conversion between objects from the Spectra and MSnbase packages
#'
#' @name extractSpectraData
#'
#' @aliases coerce,Spectra,MSpectra-method
#'
#' @description
#'
#' The [Spectra](https://bioconductor.org/packages/Spectra) package
#' provides a more robust and efficient infrastructure for mass spectrometry
#' data handling and analysis. So, wherever possible, the newer *Spectra*
#' package should be used instead of the *MSnbase*. The functions listed here
#' allow to convert between objects from the *MSnbase* and *Spectra* packages.
#'
#' `extractSpectraData` extracts the spectra data (m/z and intensity values
#' including metadata) from [MSnExp-class], [OnDiskMSnExp-class],
#' [Spectrum1-class], [Spectrum2-class] objects (or `list` of such objects) and
Expand All @@ -104,11 +114,25 @@ MSpectra <- function(..., elementMetadata = NULL) {
#' to convert data from the *old* `MSnbase` package to the newer `Spectra`
#' package.
#'
#' To convert a `Spectra` object to a `MSpectra` object use
#' `as(sps, "MSpectra")` where `sps` is a `Spectra` object.
#'
#' @note
#'
#' Coercion from `Spectra` to a `MSpectra` will only assign values to the
#' contained `Spectrum1` and `Spectrum2` objects, but will not add all
#' eventually spectra variables present in `Spectra`.
#'
#' @param x a `list` of [Spectrum-class] objects or an object extending
#' [MSnExp-class] or a [MSpectra-class] object.
#'
#' @return [DataFrame()] with the full spectrum data that can be passed to the
#' [Spectra::Spectra()] function to create a `Spectra` object.
#' @return
#'
#' - `extracSpectraData()` returns a [DataFrame()] with the full spectrum data
#' that can be passed to the [Spectra::Spectra()] function to create a
#' `Spectra` object.
#' - `as(x, "MSpectra")` returns a `MSpectra` object with the content of the
#' `Spectra` object `x`.
#'
#' @author Johannes Rainer
#'
Expand All @@ -125,10 +149,14 @@ MSpectra <- function(..., elementMetadata = NULL) {
#' res <- extractSpectraData(data)
#' res
#'
#' library(Spectra)
#' ## This can be used as an input for the Spectra constructor of the
#' ## Spectra package:
#' ## sps <- Spectra::Spectra(res)
#' ## sps
#' sps <- Spectra::Spectra(res)
#' sps
#'
#' ## A Spectra object can be coerced to a MSnbase MSpectra object using
#' msps <- as(sps, "MSpectra")
extractSpectraData <- function(x) {
if (inherits(x, "MSpectra")) {
df <- DataFrame(do.call(rbind, lapply(x, .spectrum_header)))
Expand All @@ -150,3 +178,62 @@ extractSpectraData <- function(x) {
colnames(df)[colnames(df) == "seqNum"] <- "scanIndex"
df
}

#' Coercion method from `Spectra` to `MSnbase::MSpectra`.
#'
#' @noRd
.spectra_to_spectrum_list <- function(x, chunkSize = 100) {
requireNamespace("Spectra", quietly = TRUE)
Spectra::spectrapply(x, function(z) {
msl <- Spectra::msLevel(z)
r <- vector("list", length = length(msl))
i <- which(msl == 1L)
j <- which(msl > 1L)
if (length(i)) {
z_1 <- z[i]
mzs <- Spectra::mz(z_1)
ints <- Spectra::intensity(z_1)
l <- lengths(mzs)
r[i] <- Spectra1_mz_sorted(
peaksCount = l,
rt = Spectra::rtime(z_1),
acquisitionNum = Spectra::acquisitionNum(z_1),
scanIndex = Spectra::scanIndex(z_1),
tic = sum(ints),
mz = unlist(mzs),
intensity = unlist(ints),
fromFile = rep(NA_integer_, length(i)),
centroided = Spectra::centroided(z_1),
smoothed = Spectra::smoothed(z_1),
polarity = Spectra::polarity(z_1),
nvalues = l)
}
if (length(j)) {
z_2 <- z[j]
mzs <- Spectra::mz(z_2)
ints <- Spectra::intensity(z_2)
l <- lengths(mzs)
r[j] <- Spectra2_mz_sorted(
msLevel = msl[j],
peaksCount = l,
rt = Spectra::rtime(z_2),
acquisitionNum = Spectra::acquisitionNum(z_2),
scanIndex = Spectra::scanIndex(z_2),
tic = sum(ints),
mz = unlist(mzs),
intensity = unlist(ints),
fromFile = rep(NA_integer_, length(j)),
centroided = Spectra::centroided(z_2),
smoothed = Spectra::smoothed(z_2),
polarity = Spectra::polarity(z_2),
merged = rep(1, length(j)),
precScanNum = Spectra::precScanNum(z_2),
precursorMz = Spectra::precursorMz(z_2),
precursorIntensity = Spectra::precursorIntensity(z_2),
precursorCharge = Spectra::precursorCharge(z_2),
collisionEnergy = Spectra::collisionEnergy(z_2),
nvalues = l)
}
r
}, chunkSize = chunkSize)
}
4 changes: 4 additions & 0 deletions R/methods-MSpectra.R
Original file line number Diff line number Diff line change
Expand Up @@ -601,6 +601,10 @@ setAs("MSpectra", "MSnExp", function(from) {
processingData = process)
})

setAs("Spectra", "MSpectra", function(from) {
MSpectra(.spectra_to_spectrum_list(from, chunkSize = 1000))
})

#' @rdname MSpectra
#'
#' @examples
Expand Down
58 changes: 45 additions & 13 deletions R/utils.R
Original file line number Diff line number Diff line change
@@ -1,9 +1,11 @@
##' @title Format Retention Time
##'
##' @description
##'
##' This function is used to convert retention times. Conversion is
##' seconds to/from the more human friendly format "mm:sec". The
##' implementation is from [MsCoreUtils::formatRt()].
##'
##' @title Format Retention Time
##'
##' @param rt retention time in seconds (`numeric`) or "mm:sec"
##' (`character`).
##'
Expand Down Expand Up @@ -272,9 +274,13 @@ fillUp <- function(x) {
return(x)
}


##' @title Return a variable name
##'
##' @description
##'
##' Return the name of variable \code{varname} in call \code{match_call}.
##'
##' @title Return a variable name
##' @param match_call An object of class \code{call}, as returned by \code{match.call}.
##' @param varname An \code{character} of length 1 which is looked up in \code{match_call}.
##' @return A \code{character} with the name of the variable passed as parameter
Expand Down Expand Up @@ -598,6 +604,10 @@ logging <- function(object, msg, date. = TRUE) {
return(object)
}

##' @title Returns the matching column names of indices.
##'
##' @description
##'
##' Given a text spread sheet \code{f} and a \code{pattern} to
##' be matched to its header (first line in the file), the function
##' returns the matching columns names or indices of the
Expand All @@ -616,7 +626,6 @@ logging <- function(object, msg, date. = TRUE) {
##' These functions are useful to check the parameters to be provided to
##' \code{\link{readMSnSet2}}.
##'
##' @title Returns the matching column names of indices.
##' @param f A connection object or a \code{character} string to be
##' read in with \code{readLines(f, n = 1)}.
##' @param pattern A \code{character} string containing a regular
Expand Down Expand Up @@ -884,10 +893,13 @@ utils.removeNoIdAndMultipleAssignments <-
return(object)
}

##' @title Tests equality of list elements class
##'
##' @description
##'
##' Compares equality of all members of a list.
##'
##' @title Tests equality of list elements class
##' @param x A code{list}.
##' @param x A \code{list}.
##' @param class A \code{character} defining the expected class.
##' @param valid A \code{logical} defining if all elements should be
##' tested for validity. Default is \code{TRUE}.
Expand All @@ -905,6 +917,11 @@ listOf <- function(x, class, valid = TRUE) {
cla & val
}


##' @title Non-parametric coefficient of variation
##'
##' @description
##'
##' Calculates a non-parametric version of the coefficient of
##' variation where the standard deviation is replaced by the median
##' absolute deviations (see \code{\link{mad}} for details) and
Expand All @@ -913,8 +930,6 @@ listOf <- function(x, class, valid = TRUE) {
##' Note that the \code{mad} of a single value is 0 (as opposed to
##' \code{NA} for the standard deviation, see example below).
##'
##'
##' @title Non-parametric coefficient of variation
##' @param x A \code{numeric}.
##' @param na.rm A \code{logical} (default is \code{TRUE} indicating
##' whether \code{NA} values should be stripped before the computation
Expand All @@ -933,10 +948,13 @@ npcv <- function(x, na.rm = TRUE) {
mdx/abs(mean(x, na.rm = na.rm))
}

##' @title Compare two MSnSets
##'
##' @description
##'
##' Compares two \code{\linkS4class{MSnSet}} instances. The
##' \code{qual} and \code{processingData} slots are generally omitted.
##'
##' @title Compare two MSnSets
##' @param x First MSnSet
##' @param y Second MSnSet
##' @param qual Should the \code{qual} slots be compared? Default is
Expand Down Expand Up @@ -1115,11 +1133,14 @@ countAndPrint <- function(x) {
}


##' @title Converts factors to strings
##'
##' @description
##'
##' This function produces the opposite as the \code{stringsAsFactors}
##' argument in the \code{data.frame} or \code{read.table} functions;
##' it converts \code{factors} columns to \code{characters}.
##'
##' @title Converts factors to strings
##' @param x A \code{data.frame}
##' @return A \code{data.frame} where \code{factors} are converted to
##' \code{characters}.
Expand All @@ -1137,10 +1158,13 @@ factorsAsStrings <- function(x) {
data.frame(x, stringsAsFactors = FALSE)
}

##' @title Convert to camel case by replacing dots by captial letters
##'
##' @description
##'
##' Convert a \code{vector} of characters to camel case by replacing
##' dots by captial letters.
##'
##' @title Convert to camel case by replacing dots by captial letters
##' @param x A \code{vector} to be transformed to camel case.
##' @param prefix An optional \code{character} of length one. Any
##' additional elements are ignores.
Expand All @@ -1157,6 +1181,10 @@ makeCamelCase <- function(x, prefix) {
}


##' @title Reduce a data.frame
##'
##' @description
##'
##' Reduce a data.frame so that the (primary) key column contains only
##' unique entries and other columns pertaining to that entry are
##' combined into semicolon-separated values into a single
Expand All @@ -1167,7 +1195,6 @@ makeCamelCase <- function(x, prefix) {
##' are collapsed to a semi-column separated value (even if only one
##' value is present) as soon as one observation of transformed.
##'
##' @title Reduce a data.frame
##' @param x A \code{data.frame}.
##' @param key The column name (currenly only one is supported) to be
##' used as primary key.
Expand Down Expand Up @@ -1325,10 +1352,13 @@ windowIndices <- function(i, hws, n) {
brks
}

##' @title Checks if raw data files have any spectra or chromatograms
##'
##' @description
##'
##' Helper functions to check whether raw files contain spectra or
##' chromatograms.
##'
##' @title Checks if raw data files have any spectra or chromatograms
##' @param files A `character()` with raw data filenames.
##' @return A `logical(n)` where `n == length(x)` with `TRUE` if that
##' files contains at least one spectrum, `FALSE` otherwise.
Expand All @@ -1350,6 +1380,8 @@ hasChromatograms <- function(files) {

#' @title Get the index of the particular element for each level
#'
#' @description
#'
#' `levelIndex` returns the index of the first, middle or last element for
#' each level of a factor within the factor.
#'
Expand Down
Loading
Loading