Computational Mechanics 02 - Analyze Data

Learning some statistics and data processing skills in Python

Welcome to Computational Mechanics Module #2 - Analyze Data

There are four modules and one final project. The modules will get us started on our exploration of computational mechanics using Python. The learning objectives are listed below each module.

01_Cheers_Stats_Beers

Read data from a csv file using pandas.
The concepts of Data Frame and Series in pandas.
Clean null (NaN) values from a Series using pandas.
Convert a pandas Series into a numpy array.
Compute maximum and minimum, and range.
Revise concept of mean value.
Compute the variance and standard deviation.
Use the mean and standard deviation to understand how the data is distributed.
Plot frequency distribution diagrams (histograms).
Normal distribution and 3-sigma rule.
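
As a rough illustration of the objectives above, here is a minimal sketch of the load, clean, and describe workflow. The inline csv text and the column name `abv` are hypothetical stand-ins for a real data file, so treat this as an outline under those assumptions rather than the module's own code.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical data standing in for a real csv file on disk;
# the column name 'abv' is an assumption made for this sketch.
csv_text = "name,abv\nA,0.05\nB,\nC,0.07\nD,0.06\n"
df = pd.read_csv(io.StringIO(csv_text))   # DataFrame: the whole table

abv_series = df["abv"]                    # Series: a single column
abv_clean = abv_series.dropna()           # remove null (NaN) entries
abv = abv_clean.to_numpy()                # convert the Series to a NumPy array

abv_min, abv_max = abv.min(), abv.max()
abv_range = abv_max - abv_min
abv_mean = abv.mean()
abv_var = abv.var(ddof=1)                 # sample variance (divides by N-1)
abv_std = abv.std(ddof=1)                 # sample standard deviation
```

With a real file, `pd.read_csv` would take the file path instead of the `io.StringIO` object.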

02_Seeing_Stats

You should always plot your data.
The concepts of quantitative and categorical data.
Plotting histograms directly on columns of dataframes, using pandas.
Computing variance and standard deviation using NumPy built-in functions.
The concept of median, and how to compute it with NumPy.
Making box plots using pyplot.
Five statistics of a box plot: the quartiles Q1, Q2 (median), and Q3 (and interquartile range Q3$-$Q1), and the upper and lower extremes.
Visualizing categorical data with bar plots.
Visualizing multiple data with scatter plots and bubble charts.
pandas is awesome!
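
The box-plot statistics listed above can be sketched with NumPy alone. The data array is made up for illustration, and the extremes are taken here as the plain minimum and maximum, which is an assumption that only matches a box plot's whiskers when the data has no outliers.

```python
import numpy as np

# Hypothetical data chosen so the quartiles are easy to check by hand
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)

median = np.median(data)                    # Q2
q1, q3 = np.percentile(data, [25, 75])      # first and third quartiles
iqr = q3 - q1                               # interquartile range Q3 - Q1

# Assumption: no outliers, so the extremes are simply min and max
lower_extreme = data.min()
upper_extreme = data.max()

variance = np.var(data, ddof=1)             # sample variance
std_dev = np.std(data, ddof=1)              # sample standard deviation
```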

03_Linear_Regression_with_Real_Data

Making our plots more beautiful
Defining and calling custom Python functions
Applying linear regression to data
NumPy built-ins for linear regression
The Earth is warming up!!!
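
A minimal sketch of the regression objectives above, assuming synthetic noise-free data in place of a real temperature record. The function name `linear_fit` is hypothetical; it implements the standard least-squares formulas and is compared against NumPy's built-in `np.polyfit`.

```python
import numpy as np

def linear_fit(x, y):
    """Return slope and intercept of the least-squares line through (x, y)."""
    x_mean, y_mean = np.mean(x), np.mean(y)
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Synthetic data (an assumption for this sketch): an exact line y = 2x + 1
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0

slope, intercept = linear_fit(x, y)            # custom function
slope_np, intercept_np = np.polyfit(x, y, 1)   # NumPy built-in, degree-1 fit
```

Both approaches should agree to within floating-point error, which makes the custom function a useful check on your understanding of what the built-in does.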

04_Stats_and_Montecarlo

How to generate "random" numbers in Python$^+$
The definition of a Monte Carlo model
How to calculate $\pi$ with Monte Carlo
How to model Brownian motion with Monte Carlo
How to propagate uncertainty in a model with Monte Carlo

$^+$ Remember, the computer only generates pseudo-random numbers. For further information and truly random numbers check www.random.org
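
To make the $\pi$ objective concrete, here is a minimal Monte Carlo sketch. It assumes the common quarter-circle approach: sample points uniformly in the unit square and count the fraction that land inside the unit circle. The seed and sample count are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)   # seeded pseudo-random generator (arbitrary seed)
n = 100_000                       # number of random points (arbitrary choice)

x = rng.random(n)                 # uniform samples in [0, 1)
y = rng.random(n)

# A point is inside the quarter circle of radius 1 if x^2 + y^2 < 1
inside = x**2 + y**2 < 1.0

# Area ratio (quarter circle / unit square) is pi/4
pi_estimate = 4 * inside.mean()
```

The estimate improves slowly as `n` grows, since the error of a Monte Carlo average shrinks roughly as $1/\sqrt{n}$.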

Project #02 - NYSE random walk predictor

In the Stats and Monte Carlo module, you created a Brownian motion model to predict the motion of particles in a fluid. The Monte Carlo model took steps in the x- and y-directions with random magnitudes.

This random walk can be used to predict stock prices. Let's take a look at some data from the New York Stock Exchange (NYSE) from 2010 through 2017.

Important Note: I am not a financial advisor and these models are purely for academic exercises. If you decide to use anything in these notebooks to make financial decisions, it is at your own risk. I am not an economist/financial advisor/etc., I am just a Professor who likes to learn and experiment.

Here, I will show an example workflow to analyze and predict the Google stock price [GOOGL] from 2010 to 2014. Then, you can choose your own stock price to evaluate and create a predictive model.

Explore data and select data of interest
Find statistical description of data: mean and standard deviation
Create random variables
Generate random walk for [GOOGL] stock opening price
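
The four steps above might look something like the following sketch. The opening prices are made-up numbers, not real [GOOGL] data, and drawing daily changes from a normal distribution is an assumption of this sketch rather than the project's prescribed model.

```python
import numpy as np

rng = np.random.default_rng(0)    # seeded pseudo-random generator (arbitrary seed)

# 1. Select data of interest: hypothetical opening prices, NOT real stock data
open_price = np.array([100.0, 101.5, 100.8, 102.2, 103.0, 102.5, 104.1])

# 2. Statistical description of the day-to-day price change
dprice = np.diff(open_price)
mean_dprice = dprice.mean()
std_dprice = dprice.std(ddof=1)

# 3. Create random variables: assumed normally distributed daily steps
n_walks, n_days = 5, 30
steps = rng.normal(loc=mean_dprice, scale=std_dprice, size=(n_walks, n_days))

# 4. Generate random walks starting from the last known opening price
walks = open_price[-1] + np.cumsum(steps, axis=1)
```

Each row of `walks` is one possible price path; plotting many rows together gives a sense of the spread the model predicts.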
