WELCOME!
This GitHub repository is home to the report:
Natural Language Processing
Toxic Comment Classification on a Broad Data Set
ABSTRACT:
A Random Forest Classifier is used to identify different types of toxic comments in a data set. In addition, the report describes how parameter optimization improves the performance of this algorithm. The optimized Random Forest Classifier is then tested to determine whether it can serve as a proper and useful toxic comment filter for day-to-day users. The test results were inconclusive, and it was determined that this model cannot serve as a proper toxic comment filter for the specified user group.
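For readers who want a feel for the approach before opening the notebook, here is a minimal sketch of a Random Forest text classifier tuned with a parameter search. It is not the project's code: the use of scikit-learn, the TF-IDF features, the toy comments, the single binary label, and the parameter grid are all assumptions made for illustration.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy, made-up comments (hypothetical; NOT the data set used in the report).
comments = [
    "you are an idiot",
    "I hate you, loser",
    "shut up, stupid fool",
    "what a stupid idiot",
    "thanks for the helpful edit",
    "great article, well written",
    "I agree with this change",
    "nice work on the sources",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = toxic, 0 = not toxic (assumed labelling)

# Turn text into TF-IDF features, then classify with a random forest.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("forest", RandomForestClassifier(random_state=0)),
])

# Illustrative grid only; the values tuned in the report may differ.
param_grid = {
    "forest__n_estimators": [10, 50],
    "forest__max_depth": [None, 5],
}

# Try every parameter combination with 2-fold cross-validation.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(comments, labels)

best_params = search.best_params_
prediction = search.predict(["you stupid idiot"])
print(best_params, prediction)
```

With a data set this small the scores are not meaningful; the sketch only shows how the classifier and the parameter search fit together.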
To view the project report, please open: NLP_Project_HILDERS_i6169337_TEEUWEN_i6169583.pdf
To see the code and additional information about it, please open: NotebookOfProject.ipynb
If you would like to see the Python files of the project, they are also included in this repository. However, they might not be completely up to date.
Thank you and enjoy!
Made by: Martijn Hilders and Cas Teeuwen