Our Python library specialises in applying machine learning and deep learning (ML/DL) to biospectroscopy. In recent years, most applications have relied on proprietary software, which stands in the way of the FAIR principles of open science. Together with strong scientific partners from the USA, England, Norway and Germany, we want to pave the way for open-source use of the common standards and the latest deep learning models in biospectroscopy.
- Manchester Institute of Biotechnology, The University of Manchester (https://www.research.manchester.ac.uk/portal/en/researchers/alex-henderson(662eea20-e79b-424a-ab85-336adc2d9eb0)/contact.html)
- Biospectroscopy and Data Modeling Group, Norwegian University of Life Sciences, Ås (https://www.nmbu.no/en/faculty/realtek/research/groups/biospectroscopy)
- Chemical Imaging and Structures Laboratory, University of Illinois at Urbana-Champaign (http://chemimage.illinois.edu/group.html)
- Leibniz IPHT, Jena (https://www.leibniz-ipht.de/en/departments/photonic-data-science-2/)
- Quasar (https://quasar.codes/)
- Biomedical Analysis Group, National Academy of Sciences of Belarus (https://image.org.by/)
OpenVibSpec lets you handle everything from the import of raw measurements to ML/DL-based data analysis in one ecosystem. We build on established groundwork in the Python ecosystem to ensure the best possible longevity.
You can find all pre-trained models in the wiki.
At this point, a distinction can be made between the established ML models and the new DL models. The former rely on model-based correction of Mie scattering. The Mie-correction-based workflow follows the fundamental paradigm of keeping the data interpretable for domain experts in spectroscopy. There are several options for correcting Mie scattering, all of which we would like to introduce. First, Fig. 1 gives an overview of the workflow.
In short:
- After the measurement, the raw data is imported into OpenVibSpec.
- The Mie scattering is corrected in OpenVibSpec.
- The training data are selected and the machine learning algorithms (mostly Random Forest) are trained in OpenVibSpec.
- The training data and the models are validated on independent data.
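The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the OpenVibSpec API: the spectra are synthetic, `make_spectra` and `placeholder_correction` are hypothetical helpers, and the "correction" is a simple straight-line baseline removal that merely stands in for a real Mie scattering correction. Only NumPy and scikit-learn are used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)


def make_spectra(n_per_class, n_wavenumbers=200):
    """Synthetic two-class 'spectra': one Gaussian band whose position depends on the class."""
    x = np.linspace(0.0, 1.0, n_wavenumbers)
    spectra, labels = [], []
    for label, centre in enumerate((0.35, 0.65)):
        band = np.exp(-((x - centre) ** 2) / 0.005)
        # A random sloping baseline, a crude stand-in for scattering effects
        slopes = rng.uniform(0.0, 1.0, size=(n_per_class, 1))
        noise = rng.normal(0.0, 0.05, size=(n_per_class, n_wavenumbers))
        spectra.append(band + slopes * x + noise)
        labels.append(np.full(n_per_class, label))
    return np.vstack(spectra), np.concatenate(labels)


def placeholder_correction(spectra):
    """NOT a Mie correction: subtracts the straight line joining each spectrum's end points."""
    ramp = np.linspace(0.0, 1.0, spectra.shape[1])
    first = spectra[:, :1]
    last = spectra[:, -1:]
    return spectra - (first + (last - first) * ramp)


# Steps 1 and 2: "import" raw data (synthetic here) and correct it
X_train, y_train = make_spectra(100)
X_test, y_test = make_spectra(50)  # independent data, drawn separately
X_train_corr = placeholder_correction(X_train)
X_test_corr = placeholder_correction(X_test)

# Step 3: train a Random Forest on the selected training data
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train_corr, y_train)

# Step 4: validate on the independent data
accuracy = accuracy_score(y_test, clf.predict(X_test_corr))
print(f"accuracy on independent data: {accuracy:.2f}")
```

In a real analysis the placeholder would be replaced by one of the Mie correction options, and the independent data would come from separate samples or measurements rather than from the same generator.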
The literature on Mie-corrected data analysis in biospectroscopy predominantly shows the use of Random Forest classifiers after correction and selection of the training data. Currently, data can be imported into OpenVibSpec via the HDF5 and Matlab interfaces. In addition, raw data import is currently available for the Agilent FTIR and the DRS Daylight Solutions QCL instruments.
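To illustrate what a Matlab-based import involves, here is a minimal sketch that uses `scipy.io` directly rather than OpenVibSpec's own interface. The file name, the variable name `cube` and the cube dimensions are assumptions made for the example. It writes a small hyperspectral cube to a `.mat` file, reads it back and flattens the spatial axes so that each row is one pixel spectrum, which is the layout most ML libraries expect.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Hypothetical hyperspectral cube: 4 x 5 pixels, 30 wavenumber channels
cube = np.random.default_rng(0).random((4, 5, 30))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example_cube.mat")
    # The variable name "cube" is an assumption; real files define their own names
    savemat(path, {"cube": cube})
    loaded = loadmat(path)["cube"]

# Flatten the two spatial axes: one row per pixel, one column per wavenumber
spectra = loaded.reshape(-1, loaded.shape[-1])
print(spectra.shape)  # (20, 30)
```

Note that `scipy.io.loadmat` does not read MATLAB v7.3 files; those are HDF5 containers and need an HDF5 reader instead.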
The second major area covers the latest deep learning methods. Here, the immense amount of available data can be used to build models and approaches that do not require direct correction of the IR spectroscopic data. This makes the model as a whole transferable and also makes the analysis significantly faster. This gives us two possible workflows for segmentation with DNNs in OpenVibSpec, as shown in Fig. 2.
Option A) covers FTIR spectra from FFPE-embedded colon tissue, which can be classified directly in this way.
Option B) shows the procedure for other cases, e.g. spectra from other entities or from other embeddings such as fresh frozen tissue. It adds a short transfer learning stage that adapts the models to your own data.
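The idea behind such a transfer learning stage can be sketched with NumPy alone. This is a toy illustration under stated assumptions, not the OpenVibSpec models or their training code: a fixed random projection (`W_frozen`, hypothetical) stands in for the pre-trained layers, the data and labels are synthetic, and only a new logistic-regression output layer is trained on the "new" data. In practice the same pattern is applied in a deep learning framework by freezing the pre-trained layers and retraining the final ones.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Stand-in for a pre-trained feature extractor: fixed weights, never updated
n_channels, n_features = 50, 16
W_frozen = rng.normal(0.0, 1.0 / np.sqrt(n_channels), size=(n_channels, n_features))
W_before = W_frozen.copy()


def extract_features(spectra):
    """Apply the frozen layer to raw spectra."""
    return np.tanh(spectra @ W_frozen)


# Small labelled data set from the "new" domain (synthetic)
n_samples = 200
X_new = rng.normal(size=(n_samples, n_channels))
true_direction = rng.normal(size=n_features)
y_new = (extract_features(X_new) @ true_direction > 0).astype(float)

# Transfer learning stage: train only the new output layer by gradient descent
features = extract_features(X_new)
w_head = np.zeros(n_features)
b_head = 0.0
learning_rate = 0.5
for _ in range(500):
    p = sigmoid(features @ w_head + b_head)
    grad = p - y_new  # gradient of the logistic loss w.r.t. the logits
    w_head -= learning_rate * features.T @ grad / n_samples
    b_head -= learning_rate * grad.mean()

predictions = sigmoid(features @ w_head + b_head) > 0.5
accuracy = np.mean(predictions == y_new)
print(f"training accuracy of the new head: {accuracy:.2f}")
```

Because only the small output layer is updated, this stage needs far less data and time than training a full network from scratch, which is why it is described above as short.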
What you are reading right now is the main README. We advise you to read the guides in the wiki section of this repository. The first article to start with is: https://github.com/RUB-Bioinf/OpenVibSpec/wiki/Getting-Started
Extensive example data is documented and available to download for free in the wiki. This includes real-world data used for training and prediction.
You can also download an image from DockerHub.
Miniconda Version | Anaconda3 Version |
---|---|