Home
Linux tutorial
Parts I and II of the Linux tutorial are for in-person participants only. Part III applies to anyone. After completing the Linux tutorial, you can make a directory in /afs/cas.unc.edu/users/y/o/yourname/public to hold any boot camp work you complete on a department Linux machine. Your initial disk space allocation is likely to be sufficient, but if it runs out, we can arrange additional space. You can work directly on any Linux machine in Hell's Kitchen (the astro computing lab). It's also fine to just work on your laptop, although it may be less powerful.
Linux bonus tracks
vi tutorial
Even if you prefer nano or emacs or another programming editor, you should learn the basics of vi, because you may sometimes find yourself inadvertently dumped into vi when using Git or Linux. Note that vi is installed by default on Linux/Mac and comes with Git Bash for Windows (see "Git prep" above under Basics).
optional: emacs installation
As a plain code editor, Emacs is comparable to nano, but it may be more useful to you down the line -- it has many powerful features after decades of development from the GNU community. Optionally install emacs for Windows (go into the directory with the latest version and download the appropriate installer) or Mac and run the built-in tutorial in the emacs help menu. (FWIW, your instructor uses emacs.)
Git and GitHub tutorial -- NOTE: this tutorial is best done with a partner.
- Visit this Python Basic Data Analysis Tutorial for instructions on installing Anaconda Python 3 [REQUIRED] and getting started on Python data analysis [as appropriate to your level]
- As appropriate to your level, work on (parts of) this Python Programming Tutorial.
- As appropriate to your level, complete this Jupyter Notebook Tutorial to get a simple introduction to Jupyter notebooks, which we will be using in several Boot Camp tutorials
- Follow along with these notes on vector math, broadcasting, and vectorizing your code
- Complete this Best Practices Tutorial to learn how to debug, speed up, and otherwise optimize code -- NOTE: this tutorial is best done in consultation with other Boot Campers and Boot Camp instructors.
- Optionally complete this Tutorial on Pandas, a powerful data analysis and manipulation package
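As a small illustration of the broadcasting and vectorizing item above, here is a minimal sketch (not taken from the linked notes) that computes all pairwise distances between a few points, first with explicit loops and then with NumPy broadcasting. The function names and the three sample points are made up for this example.

```python
import numpy as np

def pairwise_distances_loop(points):
    """Distances between all pairs of points, using explicit Python loops."""
    n = len(points)
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = np.sqrt(np.sum((points[i] - points[j]) ** 2))
    return out

def pairwise_distances_vectorized(points):
    """Same result via broadcasting: (n, 1, d) - (1, n, d) -> (n, n, d)."""
    diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

# Hypothetical sample points in 2D
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
dist_loop = pairwise_distances_loop(points)
dist_vec = pairwise_distances_vectorized(points)
print(dist_vec)
```

Both functions should give the same matrix; the vectorized version avoids the Python-level double loop, which usually matters once the number of points grows.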
Laws of Probability, Probability Distributions, Random Sampling, Uncertainties, and Confidence Intervals
- For background, look at these slides on basic statistics
- Complete this Monte Carlo Methods tutorial, including examples involving confidence intervals, determining areas, and inverse transform sampling
- Complete this Tutorial on Hypothesis Tests for Correlations and Distributions including an introduction to Kernel Density Estimation (KDE) as an alternative to histograms
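To give a flavor of two of the Monte Carlo ideas mentioned above (determining areas and inverse transform sampling), here is a minimal sketch using only the Python standard library. It is not the tutorial's own code; the function names, the seed, the sample sizes, and the choice of an exponential distribution are assumptions made for illustration.

```python
import math
import random

def estimate_pi(n_samples, rng):
    """Estimate pi from the fraction of uniform random points in the unit
    square that land inside the quarter circle of radius 1 (area pi/4)."""
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples

def sample_exponential(rate, n_samples, rng):
    """Inverse transform sampling for an exponential distribution.

    The CDF is F(x) = 1 - exp(-rate * x), so inverting u = F(x) for a
    uniform draw u gives x = -ln(1 - u) / rate.
    """
    return [-math.log(1.0 - rng.random()) / rate for _ in range(n_samples)]

rng = random.Random(42)  # fixed seed so the run is repeatable
pi_estimate = estimate_pi(100_000, rng)
draws = sample_exponential(rate=2.0, n_samples=100_000, rng=rng)
sample_mean = sum(draws) / len(draws)  # should be close to 1 / rate

print(pi_estimate, sample_mean)
```

The area estimate converges slowly (the error shrinks roughly as one over the square root of the number of samples), which is one reason the choice of sample size matters in Monte Carlo work.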
This is a complicated topic (!) and we'll take it one step at a time.
- Study these slides for a broad overview of the three topics in the tutorials.
- For a deeper understanding, first complete this Tutorial on Interpreting Chi-Squared, in which you will generate and fit fake data using the parameter-free function y=1/x
- Next complete this Tutorial on Frequentist Parameter Estimation (a.k.a., Parameter Estimation by Maximum Likelihood Model Fitting), in which you will generate and fit fake data using the function y = slope*x + intercept
- Next complete this Tutorial on Bayesian Parameter Estimation, which serves as a counterpoint to the frequentist tutorial above
- Finally, complete this Frequentist & Bayesian Model Selection Tutorial to go through an example of how to decide between two different models (in this case, first and second order polynomials) in the frequentist and Bayesian paradigms
- If you just want to do quick and dirty line fitting using frequentist methods, okay, but please be aware of these issues for Realistic Line Fitting in the Frequentist Paradigm
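For orientation before the frequentist tutorials above, here is a minimal sketch of generating fake data from y = slope*x + intercept, fitting it by weighted least squares, and computing the reduced chi-squared. It is an assumption-laden illustration, not the tutorials' code: the true parameter values, the noise level `sigma`, the seed, and the use of `np.polyfit` are all choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical "true" model and Gaussian measurement uncertainty
true_slope, true_intercept = 2.0, 1.0
sigma = 0.5

# Generate fake data: y = slope*x + intercept plus Gaussian noise
x = np.linspace(0.0, 10.0, 50)
y = true_slope * x + true_intercept + rng.normal(0.0, sigma, size=x.size)

# Weighted least-squares fit of a first-order polynomial.
# For Gaussian uncertainties, np.polyfit expects weights of 1/sigma.
weights = np.full(x.size, 1.0 / sigma)
slope, intercept = np.polyfit(x, y, 1, w=weights)

# Chi-squared of the best fit and its reduced value
residuals = (y - (slope * x + intercept)) / sigma
chi2 = np.sum(residuals ** 2)
dof = x.size - 2  # number of data points minus number of fitted parameters
reduced_chi2 = chi2 / dof

print(slope, intercept, reduced_chi2)
```

When the model is appropriate and the uncertainties are estimated correctly, the reduced chi-squared should come out near 1. With equal Gaussian uncertainties on every point, as assumed here, this least-squares fit coincides with the maximum likelihood fit.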
- Complete this Tutorial on Bootstrapping
- Repeated bootstrapping can get computationally demanding -- optionally take a look at this Un-Tutorial on Multiprocessing, which explores how to speed up such an embarrassingly parallel computing task [Warning: not updated since 2017, may contain Python 2.7 code.]
- If you are interested, check out this tutorial on Convolutional Neural Networks (CNNs).
- Check out this tutorial on a powerful classification algorithm called random forest.
- Here is a tutorial on different types of samplers (including MCMC samplers) that you can use in Python.
- While this boot camp mainly uses matplotlib to create plots, check out this link comparing the pros and cons of various plotting packages available in Python.
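As a quick illustration of the bootstrapping item in the list above, here is a minimal sketch (not taken from the linked tutorial) that estimates the uncertainty on a sample median by resampling with replacement. The fake data set, the seed, the number of resamples, and the choice of the 16th to 84th percentile interval are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable

# Hypothetical fake data set: 200 draws from a Gaussian
data = rng.normal(loc=10.0, scale=2.0, size=200)

# Resample the data with replacement many times, recording the median each time
n_boot = 2000
boot_medians = np.empty(n_boot)
for i in range(n_boot):
    resample = rng.choice(data, size=data.size, replace=True)
    boot_medians[i] = np.median(resample)

# Central ~68% interval of the bootstrap medians as an uncertainty estimate
lower, upper = np.percentile(boot_medians, [16, 84])

print(np.median(data), lower, upper)
```

Each resample is independent of the others, which is what makes the loop embarrassingly parallel and a natural candidate for multiprocessing when the number of resamples or the cost of the statistic grows.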