diff --git a/learning.ipynb b/learning.ipynb
index 16bb4bd6b..0e4d97934 100644
--- a/learning.ipynb
+++ b/learning.ipynb
@@ -11,7 +11,7 @@
 },
 {
 "cell_type": "code",
- "execution_count": 1,
+ "execution_count": 2,
 "metadata": {
 "collapsed": true
 },
@@ -1778,6 +1778,275 @@
 "source": [
 "The Perceptron didn't fare very well mainly because the dataset is not linearly separated. On simpler datasets the algorithm performs much better, but unfortunately such datasets are rare in real life scenarios."
 ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## AdaBoost\n",
+ "\n",
+ "### Overview\n",
+ "\n",
+ "**AdaBoost** is an algorithm which uses **ensemble learning**. In ensemble learning, the hypotheses in the collection, or ensemble, vote on what the output should be, and the output with the majority of votes is selected as the final answer.\n",
+ "\n",
+ "The AdaBoost algorithm, as described in the book, works with a **weighted training set** and **weak learners** (classifiers whose accuracy is about 50% + epsilon, i.e. only slightly better than random guessing). It manipulates the weights attached to the examples that are shown to it: examples with higher weights are given more importance.\n",
+ "\n",
+ "All the examples start with equal weights and a hypothesis is generated using these examples. The weights of incorrectly classified examples are increased so that the next hypothesis is more likely to classify them correctly, while the weights of correctly classified examples are reduced. This process is repeated *K* times (*K* is an input to the algorithm), so *K* hypotheses are generated.\n",
+ "\n",
+ "These *K* hypotheses are also assigned weights according to their performance on the weighted training set. The final ensemble hypothesis is the weighted-majority combination of these *K* hypotheses.\n",
+ "\n",
+ "The strength of AdaBoost is that, with weak learners and a sufficiently large *K*, it can learn a highly accurate classifier irrespective of the complexity of the function being learned or how inexpressive the original hypothesis space is."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Implementation\n",
+ "\n",
+ "As seen in the previous section, the `PerceptronLearner` does not perform that well on the iris dataset. We'll use the perceptron as the weak learner for the AdaBoost algorithm and try to increase the accuracy.\n",
+ "\n",
+ "Let's first see what AdaBoost is exactly:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "\n",
+ "\n",
\n", + "def AdaBoost(L, K):\n",
+ " """[Figure 18.34]"""\n",
+ " def train(dataset):\n",
+ " examples, target = dataset.examples, dataset.target\n",
+ " N = len(examples)\n",
+ " epsilon = 1. / (2 * N)\n",
+ " w = [1. / N] * N\n",
+ " h, z = [], []\n",
+ " for k in range(K):\n",
+ " h_k = L(dataset, w)\n",
+ " h.append(h_k)\n",
+ " error = sum(weight for example, weight in zip(examples, w)\n",
+ " if example[target] != h_k(example))\n",
+ " # Avoid divide-by-0 from either 0% or 100% error rates:\n",
+ " error = clip(error, epsilon, 1 - epsilon)\n",
+ " for j, example in enumerate(examples):\n",
+ " if example[target] == h_k(example):\n",
+ " w[j] *= error / (1. - error)\n",
+ " w = normalize(w)\n",
+ " z.append(math.log((1. - error) / error))\n",
+ " return WeightedMajority(h, z)\n",
+ " return train\n",
+ "