We implement five classification models: logistic regression, decision trees, support vector machines (SVM), AdaBoost, and random forest, and apply them to two text-based datasets. We compare the accuracy and efficiency of all five models on both datasets. The results show that the tree-based models were less efficient, and sometimes less accurate, than the linear models. We believe this results from the large number of branches that tree-based models grow as part of how they classify.
Using the two datasets, we were able to see how the different classification methods compare in efficiency and accuracy. On dataset 1, the linear models tended to be not only more efficient but, in most cases, also more accurate. For example, logistic regression and SVM were both more accurate and more efficient than random forest and decision trees. AdaBoost, however, was the least accurate of all the models, even though we grouped it with the linear models (strictly speaking, AdaBoost is a boosting ensemble of weak learners rather than a linear model). It was more efficient than both tree models, but efficiency is worth little if the model lacks accuracy.
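A comparison of this kind could be set up roughly as in the sketch below. It is a minimal illustration, not the code used for the experiments: the use of scikit-learn, the TF-IDF features, the hyperparameters, the helper names (`make_toy_corpus`, `compare_classifiers`), and the tiny toy corpus are all assumptions. Efficiency is taken here to mean wall-clock time for fitting plus prediction, which may differ from the measure used in this work.

```python
import time

from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier


def make_toy_corpus():
    """Hypothetical stand-in for a labeled text dataset (not the paper's data)."""
    positive = ["great movie", "loved the acting", "wonderful plot", "really enjoyable film"]
    negative = ["terrible movie", "hated the acting", "boring plot", "really awful film"]
    texts = positive * 10 + negative * 10
    labels = [1] * (len(positive) * 10) + [0] * (len(negative) * 10)
    return texts, labels


def compare_classifiers(texts, labels, seed=0):
    """Return {model name: (test accuracy, fit-plus-predict seconds)}."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, random_state=seed, stratify=labels
    )
    # Fit the vectorizer on the training text only, so no test vocabulary leaks in.
    vectorizer = TfidfVectorizer()
    x_train_vec = vectorizer.fit_transform(x_train)
    x_test_vec = vectorizer.transform(x_test)

    # Hyperparameters are illustrative assumptions, not tuned values.
    models = {
        "logistic regression": LogisticRegression(max_iter=1000),
        "linear SVM": LinearSVC(),
        "decision tree": DecisionTreeClassifier(random_state=seed),
        "random forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=seed),
    }

    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(x_train_vec, y_train)
        predictions = model.predict(x_test_vec)
        elapsed = time.perf_counter() - start
        results[name] = (accuracy_score(y_test, predictions), elapsed)
    return results


if __name__ == "__main__":
    texts, labels = make_toy_corpus()
    for name, (accuracy, seconds) in compare_classifiers(texts, labels).items():
        print(f"{name:20s} accuracy={accuracy:.3f} time={seconds:.4f}s")
```

On a corpus this small the timings and accuracies say nothing about the models themselves. To get meaningful numbers, replace the toy corpus with a real dataset and average the timings over several runs.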