\documentclass[12pt,letterpaper]{article}
\usepackage{amsmath} % just math
\usepackage{amssymb} % allow blackboard bold (aka N,R,Q sets)
\usepackage{ulem}
\usepackage{graphicx}
\usepackage{float}
\linespread{1.6} % double spaces lines
\usepackage[left=1in,top=1in,right=1in,bottom=1in,nohead]{geometry}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{floatrow}
\usepackage{blindtext}
\begin{document}
\setcounter{subsection}{2}
\begin{flushright}
\end{flushright}
\begin{flushleft}
\textbf{Eric Zounes, Ian Fridge} \\
\today \\
CS434: Assignment 3
\end{flushleft}
\section[1]{Implementation Assignment}
\begin{enumerate}
\item[1.] Implementing K-Means
\begin{enumerate}
\item Run your kmeans algorithm with $k = 3$. To verify that your algorithm actually converges, please plot the objective of the kmeans algorithm (SSE) as a function of the iterations. From one run to another run, this curve may look different. Just present the results of a typical run. \\
\includegraphics[width=6in]{SSEk3.eps} \\[15mm]
\item Plot the scatter plot of the given data, and inspect the scatter plot visually. How many clusters do you see in this data? \\
\includegraphics[width=6in]{kmeans_scatter.eps} \\[15mm]
I see 4 clusters. \\[10mm]
\includegraphics[width=6in]{kmeans_scatter_color.eps} \\[15mm]
The graph above shows the kmeans clustering produced by our algorithm. Each color corresponds to one cluster. \\
\item Now apply your kmeans implementation to this data with different values of $k$ $(2, 3, \ldots, 6)$. For each value of $k$, please run your algorithm 10 times, each time with a different random initialization, and record the lowest SSE value achieved in these 10 repeats for each value of $k$. Plot the recorded SSE values against the changing $k$ value. Does the curve confirm your belief about the $k$ value? Why or why not? \\
The curve shows that the SSE decreases as we increase the value of $k$. This is because the clusters become smaller, so the error within each cluster shrinks; larger clusters have a larger average distance from the cluster center. Each point on the plot is the minimum SSE obtained for that value of $k$: we ran the algorithm 10 times at each value of $k$, computed the summed squared error of each run, and plotted the minimum on the y axis against the value of $k$ on the x axis.\\
\\
\includegraphics[width=6in]{minsse.eps} \\[15mm]
\end{enumerate}
\end{enumerate}
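
\noindent For reference, the SSE objective plotted above can be written as follows. This is only a sketch in our own notation: the symbols $x_i$, $C_j$, and $\mu_j$ are introduced here and are not taken from the assignment statement, and we assume no cluster is empty.
\[
\mathrm{SSE} = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i ,
\]
where $C_j$ is the set of points assigned to cluster $j$ and $\mu_j$ is its center. Reassigning each point to its nearest center cannot increase any term of the sum, and recomputing each $\mu_j$ as the cluster mean minimizes the inner sum for a fixed assignment, so the SSE should not increase from one iteration to the next. Similarly, the best achievable SSE should not increase as $k$ grows, since an extra center can be placed on an existing data point without raising the error.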
\end{document}