Authors: Jesse Zaneveld1, Nia Prabhu*1, Aziz Bajouri*1,2, Ayomikun Akinrinade*1,3, Dr. Mushtaq Bilal*4
* Chapter and Vignette authors contributed equally and are listed in chronological order of first contribution.
1 Division of Biological Sciences, School of STEM, University of Washington, Bothell, Washington, USA
2 Division of Computer and Software Systems, School of STEM, University of Washington, Bothell, Washington, USA
3 Division of Health Studies, School of Nursing and Health Studies, University of Washington, Bothell, Washington, USA
4
Full Spectrum Bioinformatics is a free online text designed to introduce key topics in Bioinformatics using the Python programming language. The text is written in interactive Jupyter Notebooks, which allow you to try out and modify example code and analyses.
In addition to explanations of concepts, Full Spectrum Bioinformatics also includes Bioinformatics Vignettes written by readers of the text. Each vignette focuses on a particular core concept and shows how readers have applied that concept to their research projects.
If you are already familiar with GitHub and Jupyter Notebooks, you can download the entire project and run it interactively, or click the 'Open in Colab' links to open interactive versions of each section in Google Colab (you will need to 'Save as' your own copy in order to change code).
If you would just like to read a chapter, you can also view a static version of each section using the nbviewer links. nbviewer stands for 'notebook viewer', so this is just a way to view chapters with code in them without actually running the code. This will generally be the best way to view the chapters non-interactively.
Finally, you can also use the direct GitHub links (the link that's the name of each chapter) to view any chapter. This shows the chapter on GitHub. It usually works well, but you may sometimes get a GitHub error message. Reloading the page or using the nbviewer link usually avoids this issue.
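The three kinds of links described above usually differ only in their prefix. As a rough sketch, the snippet below builds all three URLs for one notebook using the common public URL patterns for GitHub, nbviewer, and Colab. The function name `notebook_links` and the owner, repository, branch, and path values are hypothetical placeholders for illustration, not this project's actual location.

```python
from urllib.parse import quote


def notebook_links(owner, repo, branch, path):
    """Return GitHub, nbviewer, and Colab URLs for one notebook file."""
    # Percent-encode spaces and special characters; '/' separators are kept
    tail = f"{owner}/{repo}/blob/{branch}/{quote(path)}"
    return {
        "github": f"https://github.com/{tail}",
        "nbviewer": f"https://nbviewer.org/github/{tail}",
        "colab": f"https://colab.research.google.com/github/{tail}",
    }


# Hypothetical placeholder values (assumptions, not the real repository)
links = notebook_links(
    "example-user", "example-repo", "main", "content/01 intro/intro.ipynb"
)
for name, url in links.items():
    print(f"{name}: {url}")
```

If a GitHub link fails to render a notebook, swapping in the nbviewer form of the same path is often enough to view it.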
The text is currently in prototype status. Chapters with content you can preview are linked below:
- Chapter 1. Foreword
- Chapter 2. Introduction
  - The Many Paths to Bioinformatics
  - Speaking Each Other's Language
  - An Absurdly Brief Introduction to Biology
  - An Absurdly Brief Introduction to Computer Science
  - An Absurdly Brief Introduction to Statistics
- Chapter 3. The Command Line
- Chapter 4. Exploring Python
- Chapter 5. Project Design
- Chapter 6. Biological Sequences
  - An introduction to Biological Sequences
  - Representing and Manipulating Biological Sequences as Python Strings
  - Analyzing Biological Sequences with For Loops and If Statements
  - Reading and writing FASTA files using Python
  - Bioinformatics Vignette (Aziz Bajouri): Using set objects to find circular RNAs involved in multiple diseases
  - Exercise: Error Bingo
  - Error Messages in Python
  - Bioinformatics Vignette (Nia Prabhu): Using For Loops and Dictionaries to Compare Nucleotide Composition in Pandemic and Non-Pandemic Causing Influenza Strains
  - Capstone: testing for depletion of CG dinucleotides in the human genome
- Chapter 7. 'Omics
  - An Introduction to 'Omics
  - Working with Tabular 'Omic data in Python using Pandas
  - Joining and Filtering Pandas DataFrames
  - Analyzing Microbiome Alpha Diversity in Python
  - Analyzing Microbiome Beta Diversity in Python
  - Simulating the Effect of Sequencing Depth on Diversity Estimates
- Level Up: Taking Stock of your Project and Revising your Process
  - Reflecting on your Project so Far
  - Project Organization Strategies for Collaborative and Reproducible Research
  - Test Code: a powerful strategy for ensuring your results aren't lies.
- Chapter 8. Visualization
  - Graphs as a Visual Language
  - Exercise: Anger Tufte
  - Representing Correlation
  - Representing Distribution
- Chapter 9. Alignment and Phylogenetics
  - 9a. Alignment
    - Homology and Alignment
    - Local Alignment with the Smith-Waterman algorithm
    - BLAST and the k-mer trick
  - 9b. Phylogenetics
    - Tree thinking
    - Working with Traits on Trees
    - Maximum Parsimony Ancestral State Reconstruction
    - Phylogenetic Comparative Methods
    - Trait prediction
- Chapter 10. Simulation
  - Simulating Biological Networks
  - Simulating the Population Genetics of Natural Selection and Genetic Drift
  - Simulating the Evolution of Social Behavior
- Chapter 11. Statistics
  - Linear Models - a Statistical Swiss Army Knife
  - Monte Carlo simulation and the Fundamental Unity of Statistical Hypothesis Tests
  - Statistical Distributions and Parametric Tests
  - Rank Transformations
  - Monte Carlo simulation of Effect Size, Sample Size, and Significance
  - Dealing with Multiple Comparisons
  - Exercise: Revising your writing about statistical results
  - An Introduction to Maximum Likelihood optimization
  - The Best Model of A Cat is a Cat - model complexity, overfitting, and the AIC
  - An Introduction to Bayesian Approaches
- Chapter 12. Multivariate Statistics and Machine Learning
  - Unsupervised Classification: of ordination, clustering and fishtanks
  - Supervised Classification: from lines to trees to forests.
  - Bioinformatics Vignette (Ayomikun Akinrinade): Using K-Nearest Neighbors and Binary Decision Tree Algorithms to Predict Enzyme Function from Protein Sequences
- Chapter 13. Presenting Research
  - Presentations as Verbal Chess
- Chapter 14. Polishing and Publishing
  - Presenting Research
  - From Data to Conclusion: building a research manuscript brick by brick
  - Resistance is Futile: becoming a language Borg
  - Exercise: generating a targeted title using templating
  - The Inverted Pyramid: optimizing your text from a reader's perspective
- Chapter 15. Careers that draw on Bioinformatics
  - Fighting for an Inclusive Workplace
  - Examining Privilege and Identity
  - Making Your Science and Teaching Accessible and Inclusive
  - Campus and Local Activism
  - Improving University Policy
  - Happiness Matters
  - Radical Collaboration
  - Cognitive Bias and Networking
  - Open-source Science as Shield and Sword
  - Applying for Grants
  - Fighting for an Inclusive Workplace
- Appendices:
  - Appendix A - Data Sources for Bioinformatics Projects
  - Appendix B - Timesaving Starter Code
    - Template Script with Interface and Test Code
    - IUPAC codes in python
    - Standard Translation Tables in Python
  - Appendix C - Contributing a Community Example
  - Appendix D - Paper Formatting Kit
  - Appendix E - Project Specifications
This project is being developed with support from NSF Integrative and Organismal Systems award .
You can submit feedback about completed chapters at the following link