Commit 4015086

Me authored and committed
Initial mirror
0 parents  commit 4015086

9,660 files changed, +112982 -0 lines changed

01-intro.Rmd

+71
@@ -0,0 +1,71 @@
1+
---
2+
output:
3+
  html_document: default
4+
  pdf_document: default
5+
---
6+
# Introduction {#intro}
7+
8+
In the era of large-scale data collection, we are trying to interpret data in a meaningful way.
9+
10+
There are two ways to meaningfully interpret data:
11+
12+
1. Mechanistic or mathematical modeling based
13+
2. Descriptive or data-driven
14+
15+
We are here to discuss the latter approach, using machine learning (ML) methods.
16+
17+
## What is machine learning?
18+
19+
We use computers, or more precisely algorithms, to see patterns and learn concepts from data without being explicitly programmed.
20+
21+
For example
22+
23+
1. Google ranking web pages
24+
2. Facebook or Gmail classifying spam
25+
3. Our own biological research projects, in which we use ML approaches to interpret the effects of mutations in noncoding regions.
26+
27+
We are given a set of
28+
29+
1. Predictors
30+
2. Features or
31+
3. Inputs
32+
33+
that we call 'Explanatory Variables'
34+
35+
and we ask different statistical methods, such as
36+
37+
1. Linear Regression
38+
2. Logistic Regression
39+
3. Neural Networks
40+
41+
to formulate a hypothesis, i.e.
42+
43+
1. Describe associations
44+
2. Search for patterns
45+
3. Make predictions
46+
47+
for the Outcome Variables
48+
49+
A bit of background: ML grew out of AI and neural networks.
50+
51+
## Aspects of ML
52+
53+
There are two aspects of ML
54+
55+
1. Unsupervised learning
56+
2. Supervised learning
57+
58+
**Unsupervised learning**: We ask an algorithm to find patterns or structure in the data without any specific outcome variables, e.g. clustering. We have little or no idea what the results should look like.
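
As a rough illustration of clustering, the sketch below runs a minimal one-dimensional k-means in Python. The data points, the starting centroids, and the helper name `kmeans_1d` are all made up for this example; k-means is only one clustering method among many.

```python
# Minimal 1-D k-means sketch. Data, starting centroids and the function
# name are hypothetical and chosen only for illustration.
def kmeans_1d(points, centroids, n_iter=10):
    """Cluster 1-D points around the given starting centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# No outcome variable is supplied: the algorithm only sees the inputs.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 6.0])
print(centroids)
print(clusters)
```

Note that the number of clusters and the starting centroids are chosen by us, not learned from the data; different choices can give different groupings.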
59+
60+
**Supervised learning**: We give the algorithm both input and outcome variables and ask it to formulate a hypothesis that closely captures the relationship between them.
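
To make the idea of a hypothesis concrete, here is a minimal sketch that fits a straight line, h(x) = intercept + slope * x, to a few input/outcome pairs by ordinary least squares. The numbers and the names `fit_line` and `predict` are assumptions chosen for illustration only.

```python
# Minimal supervised-learning sketch: simple linear regression.
# The data and function names are hypothetical.
def fit_line(xs, ys):
    """Ordinary least squares for h(x) = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


xs = [1.0, 2.0, 3.0, 4.0]   # explanatory variable (input)
ys = [3.1, 4.9, 7.2, 8.8]   # outcome variable
intercept, slope = fit_line(xs, ys)


def predict(x):
    """Apply the fitted hypothesis to a new input."""
    return intercept + slope * x


print(intercept, slope, predict(5.0))
```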
61+
62+
## What actually happens under the hood
63+
The algorithm is fitted on a subset of observations called the training data and then tested on a different subset called the test data.
64+
65+
The difference between the predicted and the actual values of the outcome variable on the test data is the test error. The algorithm tunes the parameters of the hypothesis to minimise the error on the training data, with the aim of keeping the test error low as well.
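
The train/test procedure described above can be sketched as follows, assuming a straight-line hypothesis, synthetic data, and plain gradient descent on the mean squared error of the training set. All names (`mse`, `fit_gd`), the 40/10 split, the learning rate, and the number of steps are illustrative choices, not something prescribed by the text.

```python
import random


def mse(params, data):
    """Mean squared error of h(x) = a + b * x on (x, y) pairs."""
    a, b = params
    return sum((a + b * x - y) ** 2 for x, y in data) / len(data)


def fit_gd(train, lr=0.01, n_steps=5000):
    """Tune (a, b) by gradient descent on the training MSE."""
    a, b = 0.0, 0.0
    n = len(train)
    for _ in range(n_steps):
        # Gradients of the training MSE with respect to a and b.
        grad_a = sum(2 * (a + b * x - y) for x, y in train) / n
        grad_b = sum(2 * (a + b * x - y) * x for x, y in train) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Synthetic data (an assumption): y = 2 + 3x plus Gaussian noise.
rng = random.Random(0)
data = [(i / 10, 2.0 + 3.0 * (i / 10) + rng.gauss(0, 0.5)) for i in range(50)]

# Shuffle, then hold out 10 of the 50 observations as test data.
rng.shuffle(data)
train, test = data[:40], data[40:]

params = fit_gd(train)            # parameters are tuned on training data only
train_error = mse(params, train)
test_error = mse(params, test)    # estimate of performance on unseen data
print(params, train_error, test_error)
```

The parameters are tuned on the training data only; the test data is used solely to measure how well the fitted hypothesis generalises.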
66+
67+
Models that successfully capture these desired outcomes are further evaluated for **Bias** and **Variance** (underfitting and overfitting, respectively).
68+
69+
All the above concepts will be discussed in detail in the following lectures.
70+
71+

01-intro.utf8.md

+71
@@ -0,0 +1,71 @@
1+
---
2+
output:
3+
  html_document: default
4+
  pdf_document: default
5+
---
6+
# Introduction {#intro}
7+
8+
In the era of large-scale data collection, we are trying to interpret data in a meaningful way.
9+
10+
There are two ways to meaningfully interpret data:
11+
12+
1. Mechanistic or mathematical modeling based
13+
2. Descriptive or data-driven
14+
15+
We are here to discuss the latter approach, using machine learning (ML) methods.
16+
17+
## What is machine learning?
18+
19+
We use computers, or more precisely algorithms, to see patterns and learn concepts from data without being explicitly programmed.
20+
21+
For example
22+
23+
1. Google ranking web pages
24+
2. Facebook or Gmail classifying spam
25+
3. Our own biological research projects, in which we use ML approaches to interpret the effects of mutations in noncoding regions.
26+
27+
We are given a set of
28+
29+
1. Predictors
30+
2. Features or
31+
3. Inputs
32+
33+
that we call 'Explanatory Variables'
34+
35+
and we ask different statistical methods, such as
36+
37+
1. Linear Regression
38+
2. Logistic Regression
39+
3. Neural Networks
40+
41+
to formulate a hypothesis, i.e.
42+
43+
1. Describe associations
44+
2. Search for patterns
45+
3. Make predictions
46+
47+
for the Outcome Variables
48+
49+
A bit of background: ML grew out of AI and neural networks.
50+
51+
## Aspects of ML
52+
53+
There are two aspects of ML
54+
55+
1. Unsupervised learning
56+
2. Supervised learning
57+
58+
**Unsupervised learning**: We ask an algorithm to find patterns or structure in the data without any specific outcome variables, e.g. clustering. We have little or no idea what the results should look like.
59+
60+
**Supervised learning**: We give the algorithm both input and outcome variables and ask it to formulate a hypothesis that closely captures the relationship between them.
61+
62+
## What actually happens under the hood
63+
The algorithm is fitted on a subset of observations called the training data and then tested on a different subset called the test data.
64+
65+
The difference between the predicted and the actual values of the outcome variable on the test data is the test error. The algorithm tunes the parameters of the hypothesis to minimise the error on the training data, with the aim of keeping the test error low as well.
66+
67+
Models that successfully capture these desired outcomes are further evaluated for **Bias** and **Variance** (underfitting and overfitting, respectively).
68+
69+
All the above concepts will be discussed in detail in the following lectures.
70+
71+

02-dimensionality-reduction.knit.md

+214
Large diffs are not rendered by default.
