spec2vec_gnps_data_analysis

Analysis and benchmarking of mass spectral similarity measures using the GNPS dataset.

If you use spec2vec for your research, please cite the following references:

F Huber, L Ridder, S Verhoeven, JH Spaaks, F Diblen, S Rogers, JJJ van der Hooft, "Spec2Vec: Improved mass spectral similarity scoring through learning of structural relationships", bioRxiv, https://doi.org/10.1101/2020.08.11.245928

If you also use matchms, please cite: F. Huber, S. Verhoeven, C. Meijer, H. Spreeuw, E. M. Villanueva Castilla, C. Geng, J.J.J. van der Hooft, S. Rogers, A. Belloum, F. Diblen, J.H. Spaaks, (2020). matchms - processing and similarity evaluation of mass spectrometry data. Journal of Open Source Software, 5(52), 2411, https://doi.org/10.21105/joss.02411

Thanks!

Tutorial on matchms and Spec2Vec

Possibly the easiest way to learn how to run Spec2Vec is to follow our tutorial on matchms and Spec2Vec.

Create environment

The current version of spec2vec works with Python 3.7 and 3.8. It might also work with earlier versions, but we have not tested them.

conda create --name spec2vec_analysis python=3.7  # or 3.8 if you prefer
conda activate spec2vec_analysis
conda install --channel nlesc --channel bioconda --channel conda-forge spec2vec
pip install jupyter
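
After running the commands above, it can be useful to confirm that the environment is usable before opening the notebooks. The following is a minimal sketch using only the Python standard library. The package list is an assumption based on the install commands (matchms and gensim are expected to come in as dependencies of spec2vec, and `notebook` through `pip install jupyter`); adjust it if your setup differs.

```python
import importlib.util
import sys

# Assumed package names, derived from the install commands above.
REQUIRED_PACKAGES = ("spec2vec", "matchms", "gensim", "notebook")


def check_environment(packages=REQUIRED_PACKAGES):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # Report the interpreter version and which packages were found.
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for name, found in check_environment().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run the script inside the activated `spec2vec_analysis` environment. Any package reported as missing most likely means the conda or pip step did not complete.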

Clone this repository and run notebooks

git clone https://github.com/iomega/spec2vec_gnps_data_analysis
cd spec2vec_gnps_data_analysis
jupyter notebook

Download data

Original data was obtained from GNPS: https://gnps-external.ucsd.edu/gnpslibrary/ALL_GNPS.json
The cleaned and processed GNPS dataset for positive ionization mode spectra (raw data accessed on 2020-05-11) can be found on Zenodo: https://zenodo.org/record/3978072
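
If you prefer to fetch the raw data from a script rather than a browser, a small streaming download helper may be enough. This is a sketch using only the standard library. The function name `download_file` and the local path `data/ALL_GNPS.json` are illustrative choices, not part of this repository. For the Zenodo record, you would need to copy the direct file link from the record page yourself.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url, destination, chunk_size=1024 * 1024):
    """Stream `url` to `destination` in chunks; return the number of bytes written."""
    destination = Path(destination)
    # Create the target folder if it does not exist yet.
    destination.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        # copyfileobj reads chunk by chunk, so the file never has to fit in memory.
        shutil.copyfileobj(response, out, chunk_size)
    return destination.stat().st_size


# Example call (commented out because the GNPS library file is large):
# size = download_file(
#     "https://gnps-external.ucsd.edu/gnpslibrary/ALL_GNPS.json",
#     "data/ALL_GNPS.json",  # hypothetical local path
# )
# print(f"Downloaded {size / 1e6:.1f} MB")
```

Streaming in chunks is chosen here because the full GNPS library is a large file and reading it into memory in one call would be wasteful.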

Download pre-trained models

Pre-trained Word2Vec models to be used with Spec2Vec can be found on Zenodo.

Model trained on UniqueInchikey subset (12,797 spectra): https://zenodo.org/record/3978054
Model trained on AllPositive set of all positive ionization mode spectra (after filtering): https://zenodo.org/record/4173596
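
Once a model file has been downloaded, using it could look roughly like the sketch below. It assumes a spec2vec release in which `Spec2Vec` accepts the Word2Vec model directly and is passed to matchms `calculate_scores`; older releases used a different call pattern, so check the documentation of your installed version. The function name, the `model_path` argument, and the parameter values are illustrative, not taken from this repository.

```python
def score_with_pretrained_model(model_path, references, queries):
    """Compute Spec2Vec similarities between two lists of matchms Spectrum objects.

    Imports are kept local so this file can be loaded even where the
    scientific packages are not installed.
    """
    import gensim
    from matchms import calculate_scores
    from spec2vec import Spec2Vec

    # Load the pre-trained Word2Vec model downloaded from Zenodo.
    model = gensim.models.Word2Vec.load(str(model_path))

    # Illustrative settings (assumptions): square-root intensity weighting and
    # tolerance for a small share of peaks that the model has never seen.
    similarity = Spec2Vec(
        model=model,
        intensity_weighting_power=0.5,
        allowed_missing_percentage=5.0,
    )

    # Returns a matchms Scores object holding all reference/query pair scores.
    return calculate_scores(references, queries, similarity)


# Example call, with a hypothetical path and spectra loaded elsewhere:
# scores = score_with_pretrained_model("path/to/downloaded.model", refs, queries)
```

The spectra passed in should be processed in the same way as the spectra the model was trained on; otherwise many peaks may be unknown to the model and the scores become less reliable.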
