MetFrag Hands-on

Topic: Using MetFrag for compound identification with MS/MS data and additional information.

An online training presentation is available here.

In this hands-on session you will learn how to use MetFrag to annotate MS/MS spectra as a first step toward identifying a molecular structure from MS and MS/MS information. Furthermore, we will use additional experimental data and metadata to support a putative identification.

MetFrag webservice workflow

In this example we have extracted a feature from an LC-MS/MS measurement of a water (river) sample, with a precursor m/z of 230.1162 at a retention time of 10.1 minutes. The data were acquired on an LTQ Orbitrap XL with high mass accuracy (< 5 ppm) in positive ion mode. The adduct type of the selected precursor ion is known to be [M+H]⁺.

Please download the prepared data:

MS1: ms1_mz230.1162_rt10.1.txt
MS2: ms2_mz230.1162_rt10.1.txt

Step 1 - Run initial MetFrag processing

1 a) Retrieve Candidates from database

visit the MetFragWeb tool in your browser https://msbi.ipb-halle.de/MetFrag
define database settings to retrieve candidates given the MS1 information:
1. use the precursor m/z value and type to calculate the neutral monoisotopic mass
2. check mass accuracy
3. select PubChemLite in the "Local Databases" section as compound database
start a candidate retrieval by clicking "Retrieve Candidates"

Go to live demo

MetFrag searches candidates matching the information given by the "Database settings" (here: Neutral Mass and 5 ppm deviation)
after the retrieval you can download the candidate list as CSV or XLS to get a first overview of the retrieved data set
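The neutral mass and the search window entered in the database settings can be checked by hand. The following is a minimal sketch, assuming an [M+H]⁺ adduct and a proton mass of 1.007276 Da; the function names `neutral_mass` and `ppm_window` are illustrative helpers and not part of MetFrag.

```shell
#!/usr/bin/env bash
# Illustrative helpers (not part of MetFrag).

# neutral_mass <mz>: neutral monoisotopic mass for an [M+H]+ precursor,
# obtained by subtracting the proton mass (assumed 1.007276 Da).
neutral_mass() {
  awk -v mz="$1" 'BEGIN { printf "%.6f\n", mz - 1.007276 }'
}

# ppm_window <mass> <ppm>: lower and upper bound of the relative mass window.
ppm_window() {
  awk -v m="$1" -v ppm="$2" 'BEGIN {
    d = m * ppm / 1e6
    printf "%.6f %.6f\n", m - d, m + d
  }'
}

neutral_mass 230.1162        # prints 229.108924
ppm_window 229.108924 5      # prints 229.107778 229.110070
```

Candidates whose monoisotopic mass falls between the two printed bounds would pass a 5 ppm filter under these assumptions.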

1 b) Process candidates by performing in silico fragmentation and matching to MS/MS data

use the "Fragmentation settings" tab to add the given MS2 peak list
you can visualize the peak list by clicking on the "Show Spectrum" button
keep the settings for the in silico fragmentation and start the processing by clicking "Process Candidates"

MetFrag now generates fragments for each candidate up to the specified tree depth
the fragments are mapped to the MS/MS peak list (based on mass) which is used to calculate a score for each candidate
after the processing is finished you see the ranked candidates list in the "Results" tab
here you have several options:
- filter candidates by explained peaks
- investigate explained fragments and calculated scores for each candidate
- download the ranked candidate list as a CSV or XLS file
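The mass-based mapping of fragments to peaks described above can be sketched as a simple tolerance check. This is an illustration only: it assumes the absolute and the relative (ppm) deviation are added to form the tolerance, and the function name `match_peaks` as well as the peak values are hypothetical, not taken from the hands-on data or from MetFrag's code.

```shell
#!/usr/bin/env bash
# match_peaks <fragment_mz> <ppm> <abs> < peaklist
# Reads "mz intensity" lines and prints every peak whose m/z lies within
# the tolerance of the given fragment m/z.
match_peaks() {
  awk -v f="$1" -v ppm="$2" -v abs_dev="$3" '
    NF >= 2 {
      tol = abs_dev + $1 * ppm / 1e6   # assumed: absolute + relative deviation
      d = $1 - f
      if (d < 0) d = -d
      if (d <= tol) print $1, $2
    }'
}

# Hypothetical peak list and fragment m/z, for illustration only.
printf '%s\n' '104.0010 120' '174.0541 999' '230.1164 350' \
  | match_peaks 174.0546 5 0.001     # prints 174.0541 999
```

A candidate that explains more (and more intense) peaks in this way would be expected to receive a higher fragmenter score.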

Questions:

Q1: How many different molecular formulas are present?

Q2: What do you think is the correct molecular formula?

Q3: What else could you do to verify the molecular formula besides using the given MetFrag results?

--

Visit http://www.envipat.eawag.ch/index.php and verify your molecular formula.

Step 2 - Run MetFrag processing using molecular formula

2 a) Retrieve Candidates from database

use the same settings as in 1 a) but add the molecular formula
also select "Include references" when using PubChem

Go to live demo

2 b) Process candidates by performing in silico fragmentation and matching to MS/MS data

use the same settings as in 1 b) and process the candidates

Questions:

Q4: Looking at the results, what has changed compared to using the monoisotopic mass as candidate filter?

Q5: Is the molecular formula helpful here?

Step 3 - Run MetFrag adding additional experimental information

3 a) Add the retention time data model to the MetFragWeb tool

adding additional information available from the experimental context is often helpful to verify a putative identification
we want to add retention time as additional experimental information

MetFrag includes a retention time model
it is based on a linear correlation between the n-octanol/water partition coefficient (logP) and the retention time
the candidate logP is either the XLogP3 value retrieved from PubChem or is calculated with CDK's XLogP
rt_XlogP.csv contains a data set of measured retention times and XLogP3 values for 254 Eawag standards

upload the data set to the MetFragWeb tool in the "Candidate Filter & Score Settings" tab using the "Retention Time" panel on the right side (direct download: rt_XlogP.csv)

after the file upload, set the retention time of the precursor and select XLogP3 as the partition coefficient used for the correlation
this results in an additional scoring term in the scoring function of MetFrag
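The idea behind the retention time model can be sketched with an ordinary least-squares fit. The sketch below assumes a two-column "rt,logP" input with a header line and fits logP as a linear function of retention time; the column layout of rt_XlogP.csv, the direction of the regression, and the helper names `fit_rt_logp` and `predict_logp` are assumptions, and the numbers are toy values rather than the Eawag standards.

```shell
#!/usr/bin/env bash
# fit_rt_logp < "rt,logP" lines: least-squares fit logP = a + b * rt.
# Prints "a b". Non-numeric lines (such as a header) are skipped.
fit_rt_logp() {
  awk -F, '
    $1 ~ /^[0-9.]+$/ && NF >= 2 {
      n++; sx += $1; sy += $2; sxx += $1 * $1; sxy += $1 * $2
    }
    END {
      b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
      a = (sy - b * sx) / n
      printf "%.3f %.3f\n", a, b
    }'
}

# predict_logp <a> <b> <rt>: logP expected at a given retention time.
predict_logp() {
  awk -v a="$1" -v b="$2" -v rt="$3" 'BEGIN { printf "%.2f\n", a + b * rt }'
}

# Toy data lying exactly on logP = -1 + 0.5 * rt.
printf '%s\n' 'rt,logP' '2,0' '6,2' '10,4' '14,6' | fit_rt_logp   # prints -1.000 0.500
predict_logp -1 0.5 10.1                                          # prints 4.05
```

A candidate whose XLogP3 lies close to the value predicted for the measured retention time would then be favoured by the additional scoring term.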

3 b) Process candidates by performing in silico fragmentation and matching to MS/MS data

use the same settings as in 2 b) and process the candidates

Questions:

Q6: What has changed compared to the previous run?

Q7: Use the weight sliders in the "Results" tab. Does it change anything?

Q8: Is the retention time information helpful here?

Step 4 - Run MetFrag adding additional meta information

4 a) Add the additional scoring terms

meta information can help to verify putative identifications depending on the experimental context
however, be careful when using this information, because it is not related to your acquired data
in the "Candidate Filter & Score Settings" tab select the additional "Database Scoring Terms"
- PubChemNumberPubMedReferences
- PubChemNumberPatents
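How additional scoring terms can change the ranking is easiest to see with a weighted sum. The sketch below is a simplification, not MetFrag's implementation: it assumes each term is normalised by its maximum over all candidates and then combined with user-chosen weights, much like moving the weight sliders. The function name `combine_scores` and the candidate values are hypothetical.

```shell
#!/usr/bin/env bash
# combine_scores <w_frag> <w_meta> < "id frag_score meta_score" lines
# Normalises each column by its maximum, builds the weighted sum and
# prints the candidates ranked by the combined score.
combine_scores() {
  awk -v w1="$1" -v w2="$2" '
    {
      id[NR] = $1; f[NR] = $2 + 0; m[NR] = $3 + 0
      if (f[NR] > maxf) maxf = f[NR]
      if (m[NR] > maxm) maxm = m[NR]
    }
    END {
      for (i = 1; i <= NR; i++) {
        nf = (maxf > 0) ? f[i] / maxf : 0
        nm = (maxm > 0) ? m[i] / maxm : 0
        printf "%s %.3f\n", id[i], w1 * nf + w2 * nm
      }
    }' | sort -k2,2nr
}

# Hypothetical candidates: id, fragmenter score, number of patents.
cands='A 100 5
B 80 50
C 50 10'

echo "$cands" | combine_scores 1 0   # fragmenter score only: A ranks first
echo "$cands" | combine_scores 1 1   # with metadata: B ranks first
```

In this toy example the best-explained candidate A is overtaken by B once the metadata term is weighted in, which is why such terms should be used with care.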

4 b) Process candidates by performing in silico fragmentation and matching to MS/MS data

use the same settings as in 3 b) and process the candidates

Questions:

Q9: What has changed compared to the previous run?

Q10: Would the number of references and patents have helped for a metabolomics experiment?

Q11: Investigate the high intensity fragments of the first ranked candidate. Are they plausible compared to fragment structures of other candidates?

Step 5 - Search in spectral libraries

5 a) Investigate MS/MS peaks in MassBank

visit MassBank EU (https://massbank.eu)
select the "Peak Search" and add the most intense explained peaks

hit "Search" to find spectra with matching peaks in the database
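The most intense peaks for the Peak Search can be picked directly from a peak list. The following is a small sketch, assuming a whitespace-separated "mz intensity" file; the helper name `top_peaks` and the peak values are hypothetical. It only sorts by intensity and does not check whether a peak was explained by MetFrag.

```shell
#!/usr/bin/env bash
# top_peaks <n> < peaklist: prints the n most intense peaks,
# sorted by the second column (intensity) in descending order.
top_peaks() {
  sort -k2,2nr | head -n "$1"
}

# Hypothetical peak list, for illustration only.
printf '%s\n' '104.0010 120' '174.0541 999' '230.1164 350' | top_peaks 2
```

With the real data, the input would be the downloaded MS2 file, for example `top_peaks 3 < ms2_mz230.1162_rt10.1.txt`.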

Questions:

Q12: Investigate the results and compare them to your MetFrag result list. Any conclusions?

Step 6 - Combine spectral library search and MetFrag

6 a) Enable spectral similarity in MetFrag

in the "Candidate Filter & Score Settings" tab enable "Spectral Similarity"
MetFrag will now query the MS/MS peak list against a spectral library mirror to search for similar spectra of known compounds

6 b) Process candidates by performing in silico fragmentation and matching to MS/MS data

use the same settings as in 4 b) and process the candidates

Questions:

Q13: Discard the meta information scores to just use the results based on experimental data. Any conclusions?

Exercise

visit the CASMI contest site (http://www.casmi-contest.org/2017/challenges_1-45.shtml)
try to identify some of the compounds
check your results here

Advanced exercise: using MetFrag on command line

MetFrag can be used on the command line to process batches of annotation tasks; this version is called MetFragCLI
parameter files for MetFragCLI can be created with the web interface
get your copy of MetFragCLI from http://ipb-halle.github.io/MetFrag/
set up one example calculation to obtain a set of valid parameters

Prepare one directory with the required files for each annotation task


~/course$ ls *
MetFrag2.4.2-CL.jar  MetFragWeb_Parameters.zip

data:
challenge-001-msms.txt  challenge-003-msms.txt  challenge-005-msms.txt  challenge-007-msms.txt  challenge-009-msms.txt
challenge-001-ms.txt    challenge-003-ms.txt    challenge-005-ms.txt    challenge-007-ms.txt    challenge-009-ms.txt
challenge-002-msms.txt  challenge-004-msms.txt  challenge-006-msms.txt  challenge-008-msms.txt
challenge-002-ms.txt    challenge-004-ms.txt    challenge-006-ms.txt    challenge-008-ms.txt

MetFragWeb_Parameters:
MetFragWeb_Parameters.cfg  MetFragWeb_Peaklist.txt  README.txt

Prepare one directory with the required files for each annotation task

adjust MetFragWeb_Parameters.cfg slightly to use the ionized precursor mass (IonizedPrecursorMass) instead of the neutral mass (NeutralPrecursorMass)
this works well in conjunction with the "PrecursorIonMode" option


# 1 for M+H and -1 for M-H
PrecursorIonMode = 1 
FragmentPeakMatchRelativeMassDeviation = 5.0
SampleName = MetFragWeb_Sample
MetFragCandidateWriter = XLS
DatabaseSearchRelativeMassDeviation = 5.0
FragmentPeakMatchAbsoluteMassDeviation = 0.001
MetFragDatabaseType = PubChem
ResultsPath = .
#NeutralPrecursorMass = 272.068624
IonizedPrecursorMass = 272.068624
MetFragScoreTypes = FragmenterScore
MetFragScoreWeights = 1.0
MetFragPreProcessingCandidateFilter = UnconnectedCompoundFilter,IsotopeFilter
IsPositiveIonMode = true
MaximumTreeDepth = 2
NumberThreads = 1
UseSmiles = true
PeakListPath = MetFragWeb_Peaklist.txt

Prepare one directory with the required files for each annotation task

create the directories and populate with files


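# Create one directory per challenge (001 to 009) and populate it
for x in $(seq -f %03g 1 9); do
  mkdir -p "challenge-${x}"
  cp data/challenge-${x}* "challenge-${x}"
  cp MetFragWeb_Parameters/MetFragWeb_Parameters.cfg "challenge-${x}"
  # The link target is resolved relative to the challenge directory; -f allows re-running
  ln -sf "challenge-${x}-msms.txt" "challenge-${x}/MetFragWeb_Peaklist.txt"
done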

Prepare one directory with the required files for each annotation task

inject precursor mass


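for x in $(seq -f %03g 1 9); do
  # Take the first tab-separated field of the first line of the MS1 file as precursor m/z
  mass=$(head -n1 "challenge-${x}/challenge-${x}-ms.txt" | cut -f1)
  # Overwrite the placeholder value, keeping the "key = value" spacing of the file
  sed -i "s|^IonizedPrecursorMass =.*|IonizedPrecursorMass = ${mass}|" "challenge-${x}/MetFragWeb_Parameters.cfg"
done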
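Before starting the batch run it can help to verify that every task directory is complete. The following is a hedged sketch of such a check; the function name `check_task` is hypothetical and the check only covers the files and the one parameter prepared in the steps above.

```shell
#!/usr/bin/env bash
# check_task <dir>: succeeds if the directory holds a readable parameter file
# with a numeric IonizedPrecursorMass and a readable peak list
# (a dangling MetFragWeb_Peaklist.txt symlink is reported as missing).
check_task() {
  local dir="$1"
  [ -r "$dir/MetFragWeb_Parameters.cfg" ] \
    || { echo "$dir: missing parameter file" >&2; return 1; }
  [ -r "$dir/MetFragWeb_Peaklist.txt" ] \
    || { echo "$dir: missing peak list" >&2; return 1; }
  grep -Eq '^IonizedPrecursorMass *= *[0-9]+(\.[0-9]+)?[[:space:]]*$' \
       "$dir/MetFragWeb_Parameters.cfg" \
    || { echo "$dir: IonizedPrecursorMass is not a number" >&2; return 1; }
}

# Check all prepared task directories, if any exist.
for d in challenge-*/; do
  if [ -d "$d" ]; then
    check_task "${d%/}" || echo "fix ${d%/} before running MetFrag" >&2
  fi
done
```

Any directory reported here would most likely fail or give misleading results in the MetFrag run, so it is worth fixing first.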

Run all MetFrag processes


for x in $(seq -f %03g 1 9); do
  # Run in a subshell so that a failed cd cannot shift the working directory for later tasks
  ( cd "challenge-${x}" && java -jar ../MetFrag2.4.2-CL.jar MetFragWeb_Parameters.cfg )
done
