HistonePTM

09 December 2024

Overview
Installation
Contributing
Getting help
Workflows
1. PTMs

Overview

The goal of histonePTM is to make histone PTM analysis less tedious. It offers either a complete analysis workflow or individual functions that help you build your own workflow, whatever software you are using.

Other functions retrieve data from the internet, manipulate MGF files, visualize results, and run some quality-control assessments.

Some functions rely heavily on other functions from well-established packages.

Installation

You can install the development version of histonePTM from GitHub with:

# install.packages("remotes")
# Run this once to install the package; afterwards, load it with `library(histonePTM)`.
remotes::install_github("HijaziHassan/histonePTM")

Contributing

Contributions are very welcome. The first version is best adapted to Proline software output, but each function was written to be generic and flexible enough to apply to the output of other software.

Getting help

If you encounter a bug, a problem, or weird behavior, or have a feature request, please open an issue.

If you would like to discuss questions related to histone analysis using mass spectrometry, please open a discussion.

Workflows

Analysis of DDA results using `analyzeHistone()`

If you use Proline software to validate identifications (IDs) resulting from search engines such as Mascot, the function analyzeHistone() can:

Isolate histone peptides based on user-defined histone protein(s).
Normalize intensities to the total area or intensity within peptide families or across all filtered peptides.
Abbreviate histone peptides.
Rename PTM strings into ProForma, Brno nomenclature, or another, more simplistic representation.
Calculate mean, standard deviation, and coefficient of variation for each ID in each condition.
Remove and store duplications.
Mark with (*) and store IDs where the software assigns the same peak apex (i.e. identical intensities) to isobaric positional isomers (e.g. K18acK23un and K18unK23ac), which nearly co-elute.
Filter out unwanted IDs and/or IDs that are, more often than not, false positives (e.g. H3 K37mod).
Filter out IDs that are not quantified in a user-defined number of samples.
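The normalization and summary-statistics steps listed above can be pictured with a few lines of base R. This is only a sketch on made-up intensities, not the code inside analyzeHistone(); the data frame, its column names, and the peptide family labels are assumptions chosen for illustration.

```r
# Toy data: two peptide families measured in two replicates (hypothetical values).
ids <- data.frame(
  family = c("H3_K9K14", "H3_K9K14", "H3_K18K23", "H3_K18K23"),
  id     = c("K9me1-K14un", "K9un-K14ac", "K18ac-K23un", "K18un-K23un"),
  rep1   = c(20, 80, 30, 70),
  rep2   = c(30, 70, 50, 50),
  stringsAsFactors = FALSE
)
samples <- c("rep1", "rep2")

# Family-wise normalization: divide each intensity by the summed intensity
# of its peptide family in the same sample.
norm_family <- ids
for (s in samples) {
  family_total <- ave(ids[[s]], ids$family,
                      FUN = function(x) sum(x, na.rm = TRUE))
  norm_family[[s]] <- ids[[s]] / family_total
}

# Total normalization: divide by the sum of all retained peptides in the sample.
norm_total <- ids
for (s in samples) {
  norm_total[[s]] <- ids[[s]] / sum(ids[[s]], na.rm = TRUE)
}

# Mean, standard deviation and coefficient of variation (in percent) per ID,
# treating the two replicates as one condition.
vals <- as.matrix(norm_family[samples])
norm_family$mean <- rowMeans(vals, na.rm = TRUE)
norm_family$sd   <- apply(vals, 1, sd, na.rm = TRUE)
norm_family$cv   <- 100 * norm_family$sd / norm_family$mean

norm_family
```

With family-wise normalization the IDs of one peptide sum to 1 in every sample, so the values read directly as relative abundances of each modified form.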

This results in three Excel files:

File 1: Raw data in several sheets. Sheet 1 contains the raw data of the isolated histone peptides without any transformation. The remaining sheets contain data filtered from the first sheet, e.g. only N-terminally labeled peptides.
File(s) 2: Peptide-centric. One Excel file per histone protein, with each sheet containing IDs from the same peptide.
File 3: PTM-centric. An Excel file summarizing PTMs, with each sheet containing IDs that carry a specific PTM.

All this with the flexibility to:

analyze (and output results for) only a user-defined histone protein (e.g. only H3).
filter IDs with a cut-off threshold for missing values.
output File(s) 2 after removing all unlabeled me1, K37mod (for the H3K27R40 peptide), or both.
group File(s) 2 into one file or save each protein's results in a separate file.
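The missing-value cut-off can be illustrated as follows. This is a minimal base-R sketch, not the package's code: the reading of the threshold as "maximum number of missing values allowed per ID", the variable max_missing, and the intensity table are all assumptions. Check ?analyzeHistone for how NA_threshold is actually interpreted.

```r
# Hypothetical intensity table: one row per ID, one column per sample.
intens <- data.frame(
  id = c("K9me2-K14ac", "K9me3-K14un", "K9ac-K14ac"),
  s1 = c(1.2e6, NA,    3.0e5),
  s2 = c(1.1e6, NA,    NA),
  s3 = c(1.3e6, 2.0e4, NA),
  s4 = c(1.0e6, NA,    2.8e5),
  stringsAsFactors = FALSE
)
sample_cols <- c("s1", "s2", "s3", "s4")

# Assumed meaning of the cut-off: keep an ID only if it has at most
# `max_missing` missing intensities across the samples.
max_missing <- 2
n_missing <- rowSums(is.na(intens[, sample_cols]))

kept    <- intens[n_missing <= max_missing, ]
removed <- intens[n_missing >  max_missing, ]  # stored rather than discarded

kept$id
removed$id
```

Keeping the removed rows in their own table mirrors the workflow's habit of storing, rather than silently dropping, whatever it filters out.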

Pre-requisites

A Proline Excel output file containing the sheets:
Best PSM from protein sets, which includes IDs and their intensities in each sample. This assumes that IDs with multiple charge states have already been summed using the post-processing functionality inside Proline.
Search settings and infos, which includes the names of the RAW files and of their corresponding search result files.
An Excel file containing at least three columns:
- SampleName: custom sample names.
- file: names of RAW files.
- Condition: concentration, WT vs disease … Other recognized optional columns: BioReplicate and/or TechReplicate, depending on the experimental design.
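As an illustration of the metadata layout described above, the sketch below builds such a table in R and checks it before it is saved to Excel. The sample names, file names (including whether the extension belongs in the file column), and conditions are invented for the example; only the column names come from the list above.

```r
# Hypothetical metadata table with the three required columns
# plus the optional BioReplicate column.
meta <- data.frame(
  SampleName   = c("WT_1", "WT_2", "KO_1", "KO_2"),
  file         = c("run01.raw", "run02.raw", "run03.raw", "run04.raw"),
  Condition    = c("WT", "WT", "KO", "KO"),
  BioReplicate = c(1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Check that the required columns are present.
required <- c("SampleName", "file", "Condition")
missing_cols <- setdiff(required, names(meta))
if (length(missing_cols) > 0) {
  stop("Missing metadata column(s): ", paste(missing_cols, collapse = ", "))
}

# Each RAW file should be listed only once.
n_dup <- sum(duplicated(meta$file))
if (n_dup > 0) stop(n_dup, " duplicated RAW file name(s) in metadata")

# Number of samples per condition.
n_per_condition <- table(meta$Condition)
n_per_condition

# To save the table as an Excel file, one option (assuming the writexl
# package is installed) is: writexl::write_xlsx(meta, "metadata.xlsx")
```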

For further details on this function and others, type ? followed by the function name without parentheses in the R console to get the full documentation (e.g. ?analyzeHistone).

library(histonePTM)

# analyzeHistone(analysisfile, # file name
#                 metafile, # metafile name
#                 hist_prot = c('All', 'H3', 'H4', 'H2A', 'H2B'), # choose one of these options
#                 labeling = c("PA", "TMA", "PIC_PA", "none"), # allows reversing the labeling when renaming PTMs
#                 NA_threshold, # numeric, optional
#                 norm_method = c('peptide_family', 'peptide_total'),
#                 extra_filter = c("none", 'no_me1', "K37un", "no_me1_K37un"), # optional
#                 output_result = c('single', 'multiple')) # optional

Some of the functions used to build this workflow, among others, are shown below:

1. PTMs

Rename PTM strings from Proline or Skyline into a shorthand representation.

1.1 `ptm_beautify()`

Proline

#PTM from Proline export, from 'modifications' column of sheet 'Best PSM from protein sets'.
PTM_Proline <- 'Propionyl (Any N-term); Propionyl (K1); Butyryl (K10); Butyryl (K11)'

ptm_beautify(PTM_Proline, lookup = histptm_lookup, software = 'Proline', residue = 'keep')
#> [1] "prNt-K1pr-K10bu-K11bu"

 
ptm_beautify(PTM_Proline, lookup = histptm_lookup, software = 'Proline', residue = 'remove')
#> [1] "prNt-pr-bu-bu"

Skyline

Skyline PTMs are enclosed in square brackets (e.g. [+28.0313]) and are sometimes rounded (e.g. [+28]). Rounded numbers are not supported, since some PTMs, such as [Ac] and [3Me], round to the same number: +42. Use the ‘Peptide Modified Sequence Monoisotopic Masses’ column instead. The modified peptides in the ‘Comment’ column of Skyline's ‘isolation list’ output file also always contain the monoisotopic masses of the PTMs.

PTM_Skyline <- "K[+124.05243]SVPSTGGVK[+56.026215]K[+56.026215]PHR"
 

ptm_beautify(PTM_Skyline, lookup = shorthistptm_mass, software = 'Skyline', residue = 'keep')
#> [1] "prNt-KcrSVPSTGGVKprKprPHR"

ptm_beautify(PTM_Skyline, lookup = shorthistptm_mass, software = 'Skyline', residue = 'remove')
#> [1] "prNt-cr-pr-pr"
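To make the point about rounded masses concrete, the base-R sketch below pulls the bracketed mass shifts out of the Skyline string used above and shows that acetyl and trimethyl collapse to the same integer. It is an illustration only and does not reproduce how ptm_beautify() works internally; the variable names are made up, and the monoisotopic masses are the standard values for these modifications.

```r
peptide <- "K[+124.05243]SVPSTGGVK[+56.026215]K[+56.026215]PHR"

# Extract every bracketed mass shift, e.g. "[+56.026215]".
tags   <- regmatches(peptide, gregexpr("\\[[+-][0-9.]+\\]", peptide))[[1]]
masses <- as.numeric(gsub("\\[|\\]", "", tags))
masses

# Strip the tags to recover the bare peptide sequence.
bare <- gsub("\\[[^]]*\\]", "", peptide)
bare

# The first tag carries two modifications on the same residue:
# propionyl (N-terminus) + crotonyl (K), consistent with "prNt-Kcr" above.
propionyl <- 56.026215
crotonyl  <- 68.026215
isTRUE(all.equal(propionyl + crotonyl, masses[1]))

# Why rounded masses are ambiguous: acetyl and trimethyl differ by
# about 0.036 Da but both round to 42.
acetyl    <- 42.010565
trimethyl <- 42.046950
round(c(acetyl = acetyl, trimethyl = trimethyl))
```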

1.2 `misc_clearLabeling()`

Remove chemical labeling, such as propionyl (PA) or TMA, which is not biologically relevant.

misc_clearLabeling("prNt-cr-pr-pr", labeling = "PA")
#> [1] "cr-un-un"
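The idea behind this step can be sketched in a few lines of base R. The helper below, clear_pa_sketch(), is hypothetical and is not the package's implementation: it only handles propionyl labeling on residue-stripped strings like the one above, dropping the N-terminal label and turning labeled positions into unmodified ones.

```r
# Hypothetical helper: remove propionyl (PA) labeling from a shorthand
# PTM string whose parts are separated by "-".
clear_pa_sketch <- function(ptm) {
  parts <- strsplit(ptm, "-", fixed = TRUE)[[1]]
  parts <- parts[parts != "prNt"]   # the N-terminal label carries no biology
  parts[parts == "pr"] <- "un"      # a labeled lysine was unmodified in vivo
  paste(parts, collapse = "-")
}

cleared <- clear_pa_sketch("prNt-cr-pr-pr")
cleared
#> [1] "cr-un-un"
```

Note that such a simple rule treats every propionyl as a chemical label, so it cannot distinguish endogenous propionylation from derivatization.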

1.3 `ptm_toProForma()`

Convert a PTM string to ProForma (Proteoform and Peptidoform Notation).

histonePTM::ptm_toProForma(seq = "KSAPATGGVKKPHR",
                mod = "Propionyl (Any N-term); Lactyl (K1); Dimethyl (K10); Propionyl (K11)")
#> [1] "[UNIMOD:58]-K[UNIMOD:2114]SAPATGGVK[UNIMOD:36]K[UNIMOD:58]PHR"

ptm_toProForma(seq = "KSAPATGGVKKPHR",
               mod = "TMAyl_correct (Any N-term); Butyryl (K1); Trimethyl (K10); Propionyl (K11)")
#> [1] "[TMAyl_correct]-K[UNIMOD:1289]SAPATGGVK[UNIMOD:37]K[UNIMOD:58]PHR"

ptm_toProForma(  seq = "KQLATKVAR",
                 mod = "Propionyl (Any N-term); Propionyl (K1); Propionyl (K6)")
#> [1] "[UNIMOD:58]-K[UNIMOD:58]QLATK[UNIMOD:58]VAR"

1.4 `ptm_labelingAssessment()`

Lysine derivatization can go rogue and label other residues such as S, T, and Y. When using propionic anhydride, this is called ‘overpropionylation’. Hydroxylamine is used to remove this adventitious labeling, so-called ‘reverse propionylation’. This function helps with a quick visual review to see whether overpropionylation is limited or extensive.

This of course assumes that the database search was run with Propionyl (STY), or any other labeling modification, as a variable modification.
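As a rough idea of what such an assessment measures, the sketch below counts how many identifications carry a propionyl group on S, T, or Y. It is a base-R illustration on invented modification strings written in the Proline style shown earlier (e.g. "Propionyl (K1)"); it is not the code of ptm_labelingAssessment(), and the variable names are assumptions.

```r
# Hypothetical modification strings, one per identification.
mods <- c(
  "Propionyl (Any N-term); Propionyl (K1); Propionyl (K6)",
  "Propionyl (Any N-term); Propionyl (K1); Propionyl (S3)",
  "Propionyl (Any N-term); Acetyl (K5); Propionyl (T8); Propionyl (Y10)",
  "Propionyl (Any N-term); Dimethyl (K10)"
)

# Pattern for a propionyl group sitting on serine, threonine or tyrosine.
over_pattern <- "Propionyl \\([STY][0-9]+\\)"

# Percentage of identifications with at least one overpropionylated residue.
is_over  <- grepl(over_pattern, mods)
pct_over <- 100 * mean(is_over)
pct_over

# Number of overpropionylated sites per residue type.
hits    <- unlist(regmatches(mods, gregexpr(over_pattern, mods)))
residue <- sub("Propionyl \\(([STY])[0-9]+\\)", "\\1", hits)
site_counts <- table(residue)
site_counts
```

A low percentage suggests the adventitious labeling is limited; a high one suggests that reverse propionylation with hydroxylamine may be worth adding to the protocol.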
