Merge pull request #270 from jpquast/developer
Developer
jpquast authored Oct 22, 2024
2 parents d5f7503 + fb0dd77 commit c252354
Showing 23 changed files with 136 additions and 91 deletions.
8 changes: 5 additions & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Package: protti
Title: Bottom-Up Proteomics and LiP-MS Quality Control and Data Analysis Tools
Version: 0.9.0
Version: 0.9.1
Authors@R:
c(person(given = "Jan-Philipp",
family = "Quast",
@@ -43,7 +43,7 @@ Imports:
methods,
R.utils,
stats
RoxygenNote: 7.3.1
RoxygenNote: 7.3.2
Suggests:
testthat,
covr,
@@ -67,7 +67,9 @@ Suggests:
iq,
scales,
farver,
ggforce
ggforce,
xml2,
jsonlite
Depends:
R (>= 4.0)
URL: https://github.com/jpquast/protti, https://jpquast.github.io/protti/
1 change: 1 addition & 0 deletions NAMESPACE
@@ -134,6 +134,7 @@ importFrom(purrr,pluck)
importFrom(purrr,pmap)
importFrom(purrr,reduce)
importFrom(purrr,set_names)
importFrom(readr,read_csv)
importFrom(readr,read_tsv)
importFrom(readr,write_csv)
importFrom(readr,write_tsv)
5 changes: 5 additions & 0 deletions NEWS.md
@@ -1,3 +1,8 @@
# protti 0.9.1

## Bug fixes
* `try_query()` now correctly handles errors that don't return a response object. Gzip decompression is also handled more robustly, since compressed responses from some databases were not decompressed correctly.

# protti 0.9.0

## New features
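The first part of this bug fix can be illustrated with a minimal sketch in base R. It assumes, as the `R/try_query.R` diff in this commit suggests, that a failed request surfaces as a character message rather than an httr response object. `safe_request()` and `describe_result()` are hypothetical names used only for illustration and are not part of protti; the response object is simulated.

```r
# Hypothetical helper (not part of protti): run a request function and return
# the error message as a character string if the call itself fails.
safe_request <- function(request_fun) {
  tryCatch(request_fun(), error = function(e) conditionMessage(e))
}

# Simulated outcomes. A real httr::GET() result has class "response";
# here that class is set by hand so the sketch needs no network access.
ok <- safe_request(function() structure(list(status_code = 200L), class = "response"))
failed <- safe_request(function() stop("Timeout was reached"))

# Only a response object can be inspected for an HTTP status,
# so the class is checked before anything else is done with the result.
describe_result <- function(x) {
  if (inherits(x, "response")) {
    "response object"
  } else if (inherits(x, "character")) {
    paste("error message:", x)
  } else {
    "unknown"
  }
}

describe_result(ok)
describe_result(failed)
```

Checking the class first avoids calling a status accessor on a plain string, which is the situation the changelog entry describes.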
7 changes: 3 additions & 4 deletions R/calculate_protein_abundance.R
@@ -18,12 +18,11 @@
#' for a protein to be included in the analysis. The default value is 3, which means
#' proteins with fewer than three unique peptides will be excluded from the analysis.
#' @param method a character value specifying with which method protein quantities should be
#' calculated. Possible options include \code{"sum"}, which takes the sum of all precursor
#' intensities as the protein abundance. Another option is \code{"iq"}, which performs protein
#' calculated. Possible options include `"sum"`, which takes the sum of all precursor
#' intensities as the protein abundance. Another option is `"iq"`, which performs protein
#' quantification based on a maximal peptide ratio extraction algorithm that is adapted from the
#' MaxLFQ algorithm of the MaxQuant software. Functions from the
#' \href{https://academic.oup.com/bioinformatics/article/36/8/2611/5697917}{\code{iq}} package are
#' used. Default is \code{"iq"}.
#' `iq` package (\doi{10.1093/bioinformatics/btz961}) are used. Default is `"iq"`.
#' @param for_plot a logical value indicating whether the result should be only protein intensities
#' or protein intensities together with precursor intensities that can be used for plotting using
#' \code{peptide_profile_plot()}. Default is \code{FALSE}.
8 changes: 4 additions & 4 deletions R/data.R
@@ -33,7 +33,7 @@
#' @format A data frame containing peptide level data from a Spectronaut report.
#' @source Piazza, I., Beaton, N., Bruderer, R. et al. A machine learning-based chemoproteomic
#' approach to identify drug targets and binding sites in complex proteomes. Nat Commun 11, 4200
#' (2020). https://doi.org/10.1038/s41467-020-18071-x
#' (2020). \doi{10.1038/s41467-020-18071-x}
"rapamycin_10uM"

#' Rapamycin dose response example data
@@ -47,13 +47,13 @@
#' @format A data frame containing peptide level data from a Spectronaut report.
#' @source Piazza, I., Beaton, N., Bruderer, R. et al. A machine learning-based chemoproteomic
#' approach to identify drug targets and binding sites in complex proteomes. Nat Commun 11, 4200
#' (2020). https://doi.org/10.1038/s41467-020-18071-x
#' (2020). \doi{10.1038/s41467-020-18071-x}
"rapamycin_dose_response"

#' Structural analysis example data
#'
#' Example data used for the vignette about structural analysis. The data was obtained from
#' \href{https://www.sciencedirect.com/science/article/pii/S0092867420316913}{Cappelletti 2021}
#' Cappelletti et al. 2021 (\doi{10.1016/j.cell.2020.12.021})
#' and corresponds to two separate experiments. Both experiments were limited proteolysis coupled to
#' mass spectrometry (LiP-MS) experiments conducted on purified proteins. The first protein is
#' phosphoglycerate kinase 1 (pgk) and it was treated with 25mM 3-phosphoglyceric acid (3PG).
@@ -69,7 +69,7 @@
#' @source Cappelletti V, Hauser T, Piazza I, Pepelnjak M, Malinovska L, Fuhrer T, Li Y, Dörig C,
#' Boersema P, Gillet L, Grossbach J, Dugourd A, Saez-Rodriguez J, Beyer A, Zamboni N, Caflisch A,
#' de Souza N, Picotti P. Dynamic 3D proteomes reveal protein functional alterations at high
#' resolution in situ. Cell. 2021 Jan 21;184(2):545-559.e22. doi: 10.1016/j.cell.2020.12.021.
#' resolution in situ. Cell. 2021 Jan 21;184(2):545-559.e22. \doi{10.1016/j.cell.2020.12.021}.
#' Epub 2020 Dec 23. PMID: 33357446; PMCID: PMC7836100.
"ptsi_pgk"

3 changes: 1 addition & 2 deletions R/fetch_eco.R
@@ -18,8 +18,7 @@
#' essential to navigating the ever-growing (in size and complexity) corpus of scientific
#' information."
#'
#' More information can be found in their
#' \href{https://academic.oup.com/nar/article/47/D1/D1186/5165344?login=true}{publication}.
#' More information can be found in their publication (\doi{10.1093/nar/gky1036}).
#'
#' @param return_relation a logical value that indicates if relational information should be returned instead
#' of the main descriptive information. This data can be used to check the relations of ECO terms to each other.
2 changes: 1 addition & 1 deletion R/fetch_mobidb.R
@@ -17,7 +17,7 @@
#' @return A data frame that contains start and end positions for disordered and flexible protein
#' regions. The \code{feature} column contains information on the source of this
#' annotation. More information on the source can be found
#' \href{https://mobidb.bio.unipd.it/about/mobidb}{here}.
#' \href{https://mobidb.org/about/mobidb}{here}.
#' @import progress
#' @importFrom rlang .data
#' @importFrom purrr map_dfr keep
44 changes: 41 additions & 3 deletions R/try_query.R
@@ -13,7 +13,6 @@
#' @param type a character value that specifies the type of data at the target URL. Options are
#' all options that can be supplied to httr::content, these include e.g.
#' "text/tab-separated-values", "application/json" and "txt/csv". Default is "text/tab-separated-values".
#' Default is "tab-separated-values".
#' @param timeout a numeric value that specifies the maximum request time. Default is 60 seconds.
#' @param accept a character value that specifies the type of data that should be sent by the API if
#' it uses content negotiation. The default is NULL and it should only be set for APIs that use
@@ -22,6 +21,7 @@
#'
#' @importFrom curl has_internet
#' @importFrom httr GET timeout http_error message_for_status http_status content accept
#' @importFrom readr read_tsv read_csv
#'
#' @return A data frame that contains the table from the url.
try_query <-
@@ -77,18 +77,56 @@ try_query <-
return(invisible("No internet connection"))
}

if (httr::http_error(query_result)) {
# If response was an error return that error message
if (inherits(query_result, "response") && httr::http_error(query_result)) {
if (!silent) httr::message_for_status(query_result)
return(invisible(httr::http_status(query_result)$message))
}

# Handle other types of errors separately from query errors
if (inherits(query_result, "character")) {
if (!silent) message(query_result)
return(invisible(query_result))
}

# Record readr progress variable to set back later
readr_show_progress <- getOption("readr.show_progress")
on.exit(options(readr.show_progress = readr_show_progress))
# Change variable to not show progress if readr is used
options(readr.show_progress = FALSE)

result <- suppressMessages(httr::content(query_result, type = type, encoding = "UTF-8", ...))
# Retrieve the content as raw bytes using httr::content
raw_content <- httr::content(query_result, type = "raw")
# Check for gzip magic number (1f 8b) before decompression
compressed <- length(raw_content) >= 2 && raw_content[1] == as.raw(0x1f) && raw_content[2] == as.raw(0x8b)

# Check if the content is gzip compressed
if (!is.null(query_result$headers[["content-encoding"]]) && query_result$headers[["content-encoding"]] == "gzip" && compressed) {
# Decompress the raw content using base R's `memDecompress`
decompressed_content <- memDecompress(raw_content, type = "gzip")

# Convert the raw bytes to a character string
text_content <- rawToChar(decompressed_content)

# Read the decompressed content based on the specified type
if (type == "text/tab-separated-values") {
result <- readr::read_tsv(text_content, ...)
} else if (type == "text/html") {
result <- xml2::read_html(text_content, ...)
} else if (type == "text/xml") {
result <- xml2::read_xml(text_content, ...)
} else if (type == "text/csv" || type == "txt/csv") {
result <- readr::read_csv(text_content, ...)
} else if (type == "application/json") {
result <- jsonlite::fromJSON(text_content, ...) # Using jsonlite for JSON parsing
} else if (type == "text") {
result <- text_content # Return raw text as-is
} else {
stop("Unsupported content type: ", type)
}
} else {
result <- suppressMessages(httr::content(query_result, type = type, encoding = "UTF-8", ...))
}

return(result)
}
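The gzip check added in this hunk can be tried in isolation. The following is a minimal sketch in base R, not the package's code: `is_gzip()` is a hypothetical helper name, the payload is generated locally instead of coming from a web request, and it assumes an R version in which `memDecompress(type = "gzip")` accepts input with a gzip header, which the diff above also relies on.

```r
# Hypothetical helper (not part of protti): TRUE if the raw vector starts with
# the gzip magic number 1f 8b, mirroring the check added in the diff above.
is_gzip <- function(x) {
  length(x) >= 2 && x[1] == as.raw(0x1f) && x[2] == as.raw(0x8b)
}

# Build a gzip payload by writing through gzfile(), which adds the gzip header.
tmp <- tempfile(fileext = ".gz")
con <- gzfile(tmp, "wb")
writeLines(c("id\tvalue", "P12345\t1"), con)
close(con)
payload <- readBin(tmp, what = "raw", n = file.size(tmp))
unlink(tmp)

# The same table as uncompressed bytes, for comparison.
plain <- charToRaw("id\tvalue\nP12345\t1\n")

is_gzip(payload) # TRUE
is_gzip(plain) # FALSE

# Decompress only when the magic number is present; otherwise use the bytes as-is.
text <- if (is_gzip(payload)) {
  rawToChar(memDecompress(payload, type = "gzip"))
} else {
  rawToChar(payload)
}
```

Testing the leading bytes in addition to the `content-encoding` header guards against responses that are labelled as gzip but have already been decompressed by the HTTP client.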
2 changes: 1 addition & 1 deletion README.Rmd
@@ -26,7 +26,7 @@ knitr::opts_chunk$set(

The goal of **protti** is to provide flexible functions and workflows for proteomics quality control and data analysis, within a single, user-friendly package. It can be used for label-free DDA, DIA and SRM data generated with search tools and software such as Spectronaut, MaxQuant, Proteome Discoverer and Skyline. Both limited proteolysis mass spectrometry (LiP-MS) and regular bottom-up proteomics experiments can be analysed.

**protti** is developed and maintained by members of the lab of Paola Picotti at ETH Zurich. Our lab is focused on protein structural changes that occur in response to perturbations such as metabolite, drug and protein binding-events, as well as protein aggregation and enzyme activation ([Piazza 2018](https://www.sciencedirect.com/science/article/pii/S0092867417314484), [Piazza 2020](https://www.nature.com/articles/s41467-020-18071-x#additional-information), [Cappelletti, Hauser & Piazza 2021](https://www.sciencedirect.com/science/article/pii/S0092867420316913)). We have devoloped mass spectrometry-based structural and chemical proteomic methods aimed at monitoring protein conformational changes in the complex cellular milieu ([Feng 2014](https://www.nature.com/articles/nbt.2999)).
**protti** is developed and maintained by members of the lab of Paola Picotti at ETH Zurich. Our lab is focused on protein structural changes that occur in response to perturbations such as metabolite, drug and protein binding-events, as well as protein aggregation and enzyme activation ([Piazza 2018](https://doi.org/10.1016/j.cell.2017.12.006), [Piazza 2020](https://doi.org/10.1038/s41467-020-18071-x), [Cappelletti, Hauser & Piazza 2021](https://doi.org/10.1016/j.cell.2020.12.021)). We have developed mass spectrometry-based structural and chemical proteomic methods aimed at monitoring protein conformational changes in the complex cellular milieu ([Feng 2014](https://doi.org/10.1038/nbt.2999)).

There is a wide range of functions **protti** provides to the user. The main areas of application are:

110 changes: 58 additions & 52 deletions README.md
@@ -27,16 +27,12 @@ be analysed.
Picotti at ETH Zurich. Our lab is focused on protein structural changes
that occur in response to perturbations such as metabolite, drug and
protein binding-events, as well as protein aggregation and enzyme
activation ([Piazza
2018](https://www.sciencedirect.com/science/article/pii/S0092867417314484),
[Piazza
2020](https://www.nature.com/articles/s41467-020-18071-x#additional-information),
[Cappelletti, Hauser & Piazza
2021](https://www.sciencedirect.com/science/article/pii/S0092867420316913)).
We have devoloped mass spectrometry-based structural and chemical
proteomic methods aimed at monitoring protein conformational changes in
the complex cellular milieu ([Feng
2014](https://www.nature.com/articles/nbt.2999)).
activation ([Piazza 2018](https://doi.org/10.1016/j.cell.2017.12.006),
[Piazza 2020](https://doi.org/10.1038/s41467-020-18071-x), [Cappelletti,
Hauser & Piazza 2021](https://doi.org/10.1016/j.cell.2020.12.021)). We
have developed mass spectrometry-based structural and chemical proteomic
methods aimed at monitoring protein conformational changes in the
complex cellular milieu ([Feng 2014](https://doi.org/10.1038/nbt.2999)).

There is a wide range of functions **protti** provides to the user. The
main areas of application are:
@@ -201,15 +197,17 @@ protein intensities.
set.seed(42) # Makes example reproducible

# Create synthetic data
data <- create_synthetic_data(n_proteins = 100,
frac_change = 0.05,
n_replicates = 4,
n_conditions = 2,
method = "effect_random",
additional_metadata = FALSE)

# The method "effect_random" as opposed to "dose-response" just randomly samples
# the extend of the change of significantly changing peptides for each condition.
data <- create_synthetic_data(
  n_proteins = 100,
  frac_change = 0.05,
  n_replicates = 4,
  n_conditions = 2,
  method = "effect_random",
  additional_metadata = FALSE
)

# The method "effect_random" as opposed to "dose-response" just randomly samples
# the extent of the change of significantly changing peptides for each condition.
# They do not follow any trend and can go in any direction.
```

@@ -252,10 +250,12 @@ contains the normalised intensities.
normalise it another time.*

``` r
normalised_data <- data %>%
normalise(sample = sample,
intensity_log2 = peptide_intensity_missing,
method = "median")
normalised_data <- data %>%
  normalise(
    sample = sample,
    intensity_log2 = peptide_intensity_missing,
    method = "median"
  )
```

#### Assign Missingness
@@ -284,16 +284,18 @@ thresholds if you want to be more or less conservative with how many
data points to retain.

``` r
data_missing <- normalised_data %>%
assign_missingness(sample = sample,
condition = condition,
grouping = peptide,
intensity = normalised_intensity_log2,
ref_condition = "condition_1",
retain_columns = c(protein, change_peptide))

# Next to the columns it generates, assign_missingness only contains the columns
# you provide as input in its output. If you want to retain additional columns you
data_missing <- normalised_data %>%
  assign_missingness(
    sample = sample,
    condition = condition,
    grouping = peptide,
    intensity = normalised_intensity_log2,
    ref_condition = "condition_1",
    retain_columns = c(protein, change_peptide)
  )

# Next to the columns it generates, assign_missingness only contains the columns
# you provide as input in its output. If you want to retain additional columns you
# can provide them in the retain_columns argument.
```

@@ -317,16 +319,18 @@ missingness cutoffs also in order to define which comparisons are too
incomplete to be trustworthy even if significant.

``` r
result <- data_missing %>%
calculate_diff_abundance(sample = sample,
condition = condition,
grouping = peptide,
intensity_log2 = normalised_intensity_log2,
missingness = missingness,
comparison = comparison,
filter_NA_missingness = TRUE,
method = "moderated_t-test",
retain_columns = c(protein, change_peptide))
result <- data_missing %>%
  calculate_diff_abundance(
    sample = sample,
    condition = condition,
    grouping = peptide,
    intensity_log2 = normalised_intensity_log2,
    missingness = missingness,
    comparison = comparison,
    filter_NA_missingness = TRUE,
    method = "moderated_t-test",
    retain_columns = c(protein, change_peptide)
  )
```

Next we can use a Volcano plot to visualize significantly changing
@@ -335,15 +339,17 @@ interactive plot with the `interactive` argument. Please note that this
is not recommended for large datasets.

``` r
result %>%
volcano_plot(grouping = peptide,
log2FC = diff,
significance = pval,
method = "target",
target_column = change_peptide,
target = TRUE,
legend_label = "Ground Truth",
significance_cutoff = c(0.05, "adj_pval"))
result %>%
  volcano_plot(
    grouping = peptide,
    log2FC = diff,
    significance = pval,
    method = "target",
    target_column = change_peptide,
    target = TRUE,
    legend_label = "Ground Truth",
    significance_cutoff = c(0.05, "adj_pval")
  )
```

<img src="man/figures/README-volcano-1.png" width="100%" />
4 changes: 1 addition & 3 deletions cran-comments.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,7 @@
## Submission

* We specifically addressed and fixed the issue raised by Prof. Brian Ripley:
* The `analyse_functional_network()` function did not fail gracefully.
* We implemented a `try_catch()` that specifically rescues the cases in which the `STRINGdb` package does not fail gracefully. This fixes the issue.
* Additionally we added new features and fixed bugs.
* We updated `try_query()` to also gracefully handle errors that are unrelated to the request.

## Test environments
* macOS-latest (on GitHub actions), R 4.4.1
3 changes: 1 addition & 2 deletions man/calculate_protein_abundance.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 1 addition & 2 deletions man/fetch_eco.Rd


2 changes: 1 addition & 1 deletion man/fetch_mobidb.Rd


Binary file modified man/figures/README-volcano-1.png