Added week 4
jtleek committed Feb 10, 2013
1 parent f7c80e4 commit a9367a7
Showing 4,012 changed files with 156,124 additions and 0 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
Binary file modified .DS_Store
Binary file added pdfs/basicLeastSquares.pdf
Binary file added pdfs/clusteringExample.pdf
Binary file added pdfs/factorVariables.pdf
Binary file added pdfs/inferenceBasics.pdf
Binary file added pdfs/multipleVariables.pdf
Binary file added pdfs/pValues.pdf
Binary file added pdfs/realData.pdf
14 changes: 14 additions & 0 deletions week4/001clusteringExample/assets/css/custom.css
@@ -0,0 +1,14 @@
strong, b {
font-weight: bolder;
}
em {
font-style: italic;
}

img.center {
display: block;
margin: auto auto;
}
redtext {
color: red;
}
50 changes: 50 additions & 0 deletions week4/001clusteringExample/assets/css/ribbons.css
@@ -0,0 +1,50 @@
/*Github Ribbon Test*/
/* Source: https://github.com/dciccale/css3-github-ribbon */
/* Define classes for example, definition, problem etc. */
/* Choose meaningful colors for background and text */

.example {
background-color: #121621;
top: 1.2em;
right: -3.2em;
-webkit-transform: rotate(45deg);
-moz-transform: rotate(45deg);
transform: rotate(45deg);
-webkit-box-shadow: 0 0 0 1px #1d212e inset,0 0 2px 1px #fff inset,0 0 1em #888;
-moz-box-shadow: 0 0 0 1px #1d212e inset,0 0 2px 1px #fff inset,0 0 1em #888;
box-shadow: 0 0 0 1px #1d212e inset,0 0 2px 1px #fff inset,0 0 1em #888;
color: #FF0;
display: block;
padding: .6em 3.5em;
position: absolute;
font: bold .82em sans-serif;
text-align: center;
text-decoration: none;
text-shadow: 1px -1px 8px rgba(0,0,0,0.60);
-webkit-user-select: none;
-moz-user-select: none;
user-select: none;
}

.definition {
background-color: #a00;
top: 1.2em;
right: -3.2em;
-webkit-transform: rotate(45deg);
-moz-transform: rotate(45deg);
transform: rotate(45deg);
-webkit-box-shadow: 0 0 0 1px #1d212e inset,0 0 2px 1px #fff inset,0 0 1em #888;
-moz-box-shadow: 0 0 0 1px #1d212e inset,0 0 2px 1px #fff inset,0 0 1em #888;
box-shadow: 0 0 0 1px #1d212e inset,0 0 2px 1px #fff inset,0 0 1em #888;
color: #FFF;
display: block;
padding: .6em 3.5em;
position: absolute;
font: bold .82em sans-serif;
text-align: center;
text-decoration: none;
text-shadow: 1px -1px 8px rgba(0,0,0,0.60);
-webkit-user-select: none;
-moz-user-select: none;
user-select: none;
}
Binary file added week4/001clusteringExample/data/face.rda
25 changes: 25 additions & 0 deletions week4/001clusteringExample/data/samsung.R
@@ -0,0 +1,25 @@

## This file must be run in the
## UCI HAR Dataset/ directory

xvals <- read.table("train/X_train.txt")
yvals <- read.table("train/Y_train.txt")
features <- read.table('features.txt')
subject <- read.table("train/subject_train.txt")


colnames(xvals) <- features[,2]
yvals <- yvals[,1]
yvals[yvals==1]="walk"
yvals[yvals==2]="walkup"
yvals[yvals==3]="walkdown"
yvals[yvals==4]="sitting"
yvals[yvals==5]="standing"
yvals[yvals==6]="laying"

xvals$subject <- subject[,1]
xvals$activity <- yvals

samsungData <- xvals

save(samsungData,file="samsungData.rda")
Binary file added week4/001clusteringExample/data/samsungData.rda
Binary file added week4/001clusteringExample/fig/oChunk.png
Binary file added week4/001clusteringExample/fig/processData.png
Binary file added week4/001clusteringExample/fig/randomData.png
Binary file added week4/001clusteringExample/fig/svdChunk.png
205 changes: 205 additions & 0 deletions week4/001clusteringExample/index.Rmd
@@ -0,0 +1,205 @@
---
title : Clustering example
subtitle :
author : Jeffrey Leek, Assistant Professor of Biostatistics
job : Johns Hopkins Bloomberg School of Public Health
framework : io2012 # {io2012, html5slides, shower, dzslides, ...}
highlighter : highlight.js # {highlight.js, prettify, highlight}
hitheme : tomorrow #
widgets : [mathjax] # {mathjax, quiz, bootstrap}
mode : selfcontained # {standalone, draft}
---


```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F}
# make this an external chunk that can be included in any file
library(knitr)  # provides opts_chunk and knit_hooks used below
options(width = 100)
opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache = T, cache.path = '.cache/', fig.path = 'fig/')
options(xtable.type = 'html')
knit_hooks$set(inline = function(x) {
if(is.numeric(x)) {
round(x, getOption('digits'))
} else {
paste(as.character(x), collapse = ', ')
}
})
knit_hooks$set(plot = knitr:::hook_plot_html)
```

## Samsung Galaxy S3

<img class=center src=assets/img/samsung.png height='80%'/>

[http://www.samsung.com/global/galaxys3/](http://www.samsung.com/global/galaxys3/)


---

## Samsung Data

<img class=center src=assets/img/ucisamsung.png height='60%'/>

[http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones](http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones)


---

## Slightly processed data

```{r loadData,cache=FALSE}
download.file("https://dl.dropbox.com/u/7710864/courseraPublic/samsungData.rda"
,destfile="./data/samsungData.rda",method="curl")
load("./data/samsungData.rda")
names(samsungData)[1:12]
table(samsungData$activity)
```

---

## Plotting average acceleration for first subject

```{r processData,dependson="loadData",fig.height=4.5,fig.width=8}
par(mfrow=c(1,2))
numericActivity <- as.numeric(as.factor(samsungData$activity))[samsungData$subject==1]
plot(samsungData[samsungData$subject==1,1],pch=19,col=numericActivity,ylab=names(samsungData)[1])
plot(samsungData[samsungData$subject==1,2],pch=19,col=numericActivity,ylab=names(samsungData)[2])
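# Take the labels from the same subject-1 rows as the colours so the two stay aligned
legend(150,-0.1,legend=unique(samsungData$activity[samsungData$subject==1]),col=unique(numericActivity),pch=19)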
```
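
The colour coding above relies on how R turns a character vector into integer codes. The following is a minimal sketch of that mapping on a made-up activity vector (the vector and the names `activityFactor`, `codes`, and `legendKey` are illustrative, not part of the course data): factor levels are sorted alphabetically, and `as.numeric()` returns each value's position among those levels.

```r
# Illustrative activity labels (not the Samsung data itself).
activity <- c("standing", "sitting", "laying", "walk", "walkdown", "walkup", "walk")

# factor() sorts the distinct labels alphabetically to form the levels.
activityFactor <- factor(activity)

# as.numeric() gives each observation's position among the levels,
# which plot() then interprets as a colour index.
codes <- as.numeric(activityFactor)

# A legend key that pairs every level with its colour index.
legendKey <- data.frame(activity = levels(activityFactor),
                        colour = seq_along(levels(activityFactor)))
print(legendKey)
```

Building the legend from `levels()` in this way keeps labels and colours paired regardless of the order in which activities appear in the data.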

---

## Clustering based just on average acceleration


```{r dependson="processData",fig.height=4,fig.width=4,cache=TRUE}
source("http://dl.dropbox.com/u/7710864/courseraPublic/myplclust.R")
distanceMatrix <- dist(samsungData[samsungData$subject==1,1:3])
hclustering <- hclust(distanceMatrix)
myplclust(hclustering,lab.col=numericActivity)
```
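
To judge a dendrogram numerically rather than by eye, one option is to cut it into groups and cross-tabulate them against the known labels. This is a hedged sketch on synthetic, well-separated data; the data and the names `x`, `labels`, `groups`, and `agreement` are assumptions made for illustration. It uses the same `dist()` and `hclust()` steps as the slide.

```r
# Synthetic stand-in for three features and three activities
# (not the Samsung data).
set.seed(1)
labels <- rep(c("a", "b", "c"), each = 20)
x <- rbind(
  matrix(rnorm(60, mean = 0,  sd = 0.3), ncol = 3),
  matrix(rnorm(60, mean = 5,  sd = 0.3), ncol = 3),
  matrix(rnorm(60, mean = 10, sd = 0.3), ncol = 3)
)

# Euclidean distances followed by complete-linkage clustering,
# the defaults of dist() and hclust().
hc <- hclust(dist(x))

# Cut the tree into three groups and compare them with the labels.
groups <- cutree(hc, k = 3)
agreement <- table(groups, labels)
print(agreement)
```

A table with one non-zero cell per row means the clusters line up with the labels. Rows spread across several columns indicate that the chosen features do not separate the activities.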


---

## Plotting max acceleration for the first subject

```{r ,dependson="processData",fig.height=4,fig.width=8}
par(mfrow=c(1,2))
plot(samsungData[samsungData$subject==1,10],pch=19,col=numericActivity,ylab=names(samsungData)[10])
plot(samsungData[samsungData$subject==1,11],pch=19,col=numericActivity,ylab=names(samsungData)[11])
```

---

## Clustering based on maximum acceleration

```{r dependson="processData",fig.height=4,fig.width=4,cache=TRUE}
source("http://dl.dropbox.com/u/7710864/courseraPublic/myplclust.R")
distanceMatrix <- dist(samsungData[samsungData$subject==1,10:12])
hclustering <- hclust(distanceMatrix)
myplclust(hclustering,lab.col=numericActivity)
```



---

## Singular value decomposition

```{r svdChunk,dependson="processData",fig.height=4,fig.width=8,cache=TRUE}
svd1 = svd(scale(samsungData[samsungData$subject==1,-c(562,563)]))
par(mfrow=c(1,2))
plot(svd1$u[,1],col=numericActivity,pch=19)
plot(svd1$u[,2],col=numericActivity,pch=19)
```
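
As a rough illustration of why the left singular vectors can separate activities, the sketch below runs `svd()` on a small synthetic matrix in which half the rows are shifted in five columns. The data and the names `group`, `varExplained`, and `groupMeans` are hypothetical. It also computes the share of variance carried by each component, a common companion summary that the slide does not show.

```r
# Synthetic matrix: 40 rows, 10 columns; rows in group 2 are shifted
# in the first five columns (not the Samsung data).
set.seed(2)
n <- 40
group <- rep(c(1, 2), each = n / 2)
x <- matrix(rnorm(n * 10), nrow = n, ncol = 10)
x[group == 2, 1:5] <- x[group == 2, 1:5] + 3

# Centre and scale the columns, then decompose, as on the slide.
s <- svd(scale(x))

# Share of total variance carried by each singular vector.
varExplained <- s$d^2 / sum(s$d^2)
print(round(varExplained, 3))

# Average score on the first left singular vector within each group.
groupMeans <- tapply(s$u[, 1], group, mean)
print(groupMeans)
```

The two group means take opposite signs, which is the pattern the coloured scatter plots of `svd1$u` make visible.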

---

## Find maximum contributor

```{r dependson="svdChunk",fig.height=4,fig.width=4,cache=TRUE}
plot(svd1$v[,2],pch=19)
```


---

## New clustering with maximum contributor

```{r dependson="svdChunk",fig.height=4.5,fig.width=4.5,cache=TRUE}
maxContrib <- which.max(svd1$v[,2])
distanceMatrix <- dist(samsungData[samsungData$subject==1,c(10:12,maxContrib)])
hclustering <- hclust(distanceMatrix)
myplclust(hclustering,lab.col=numericActivity)
```
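
One caveat worth keeping in mind: the sign of a singular vector is arbitrary, so `which.max()` on the raw loadings picks the largest positive entry, which need not be the largest in magnitude. The sketch below is an alternative for comparison, not the slide's method. It uses `which.max(abs(...))` on synthetic data, and the feature names and the variables `signal`, `largestPositive`, `largestMagnitude`, and `flipped` are all illustrative.

```r
# Synthetic data: feature1 and feature2 share a strong common signal,
# the remaining columns are noise (not the Samsung data).
set.seed(3)
signal <- rep(c(-1, 1), each = 20)
x <- matrix(rnorm(40 * 5), nrow = 40, ncol = 5)
colnames(x) <- paste0("feature", 1:5)
x[, 1] <- x[, 1] + 3 * signal
x[, 2] <- x[, 2] + 3 * signal

s <- svd(scale(x))
v1 <- s$v[, 1]   # loadings of the first right singular vector

largestPositive  <- which.max(v1)       # depends on the sign of v1
largestMagnitude <- which.max(abs(v1))  # unchanged if v1 is negated
flipped          <- which.max(-v1)      # what which.max() returns after a sign flip

print(colnames(x)[largestMagnitude])
```

Here `largestPositive` and `flipped` point at different columns, while the magnitude-based choice stays the same under a sign flip.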


---

## New clustering with maximum contributor

```{r dependson="svdChunk",fig.height=4.5,fig.width=4.5,cache=TRUE}
names(samsungData)[maxContrib]
```

---

## K-means clustering (nstart=1, first try)

```{r kmeans1,dependson="processData",fig.height=4,fig.width=4}
kClust <- kmeans(samsungData[samsungData$subject==1,-c(562,563)],centers=6)
table(kClust$cluster,samsungData$activity[samsungData$subject==1])
```



---

## K-means clustering (nstart=1, second try)

```{r dependson="kmeans1",fig.height=4,fig.width=4,cache=TRUE}
kClust <- kmeans(samsungData[samsungData$subject==1,-c(562,563)],centers=6,nstart=1)
table(kClust$cluster,samsungData$activity[samsungData$subject==1])
```


---

## K-means clustering (nstart=100, first try)

```{r dependson="kmeans1",fig.height=4,fig.width=4,cache=TRUE}
kClust <- kmeans(samsungData[samsungData$subject==1,-c(562,563)],centers=6,nstart=100)
table(kClust$cluster,samsungData$activity[samsungData$subject==1])
```



---

## K-means clustering (nstart=100, second try)

```{r kmeans100,dependson="kmeans1",fig.height=4,fig.width=4,cache=TRUE}
kClust <- kmeans(samsungData[samsungData$subject==1,-c(562,563)],centers=6,nstart=100)
table(kClust$cluster,samsungData$activity[samsungData$subject==1])
```
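
The effect of `nstart` can be checked directly by comparing the total within-cluster sum of squares of the two fits. The following is a sketch on synthetic data with six well-separated groups; the data and the names `trueMeans`, `singleStart`, `manyStarts`, and `fit` are assumptions for illustration. With `nstart=100`, `kmeans()` keeps the best of 100 random initialisations.

```r
# Synthetic data: six well-separated groups of 50 points each
# (a stand-in for the Samsung features, not the real data).
set.seed(4)
trueMeans <- rep(c(0, 4, 8, 12, 16, 20), each = 50)
x <- cbind(rnorm(300, mean = trueMeans, sd = 0.5),
           rnorm(300, mean = 0, sd = 0.5))

# Reset the seed before each call so both runs begin from the same random state.
set.seed(10)
singleStart <- kmeans(x, centers = 6, nstart = 1)
set.seed(10)
manyStarts <- kmeans(x, centers = 6, nstart = 100)

# Total within-cluster sum of squares: lower means tighter clusters.
fit <- c(single = singleStart$tot.withinss, many = manyStarts$tot.withinss)
print(fit)
print(table(manyStarts$cluster))
```

Cluster numbers are arbitrary labels and can change between runs, so the comparison should rest on `tot.withinss` and the cross-tabulation rather than on which cluster is called 1 or 2.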

---

## Cluster 1 Variable Centers (Laying)

```{r dependson="kmeans100",fig.height=4,fig.width=8,cache=FALSE}
plot(kClust$centers[1,1:10],pch=19,ylab="Cluster Center",xlab="")
```


---

## Cluster 2 Variable Centers (Walking)

```{r dependson="kmeans100",fig.height=4,fig.width=8,cache=FALSE}
plot(kClust$centers[6,1:10],pch=19,ylab="Cluster Center",xlab="")
```

