fake-news-classification-app

Fake news classifier web application now available! Click here!

Introduction

This is an end-to-end data science/machine learning project that explores a fake news dataset through exploratory data analysis, then uses NLP tools and machine learning to classify news as fake or genuine. The model is used in a Django web application: the user enters a news article URL, and the application predicts whether the article is genuine or fake.

Folders

"0. fake-news-analysis-training" contains exploratory data analysis of the data, model training and evaluation.

"1. hyperparameter-tuning" contains a notebook examining how to tune the hyperparameters of a Random Forest model. Because the dataset is large, only a sample (2000 examples for each class) is used for this investigation. The current model is NOT tuned; it is up to the user whether to go down this route.
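
The per-class sampling described above could be sketched as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code; the function name sample_per_class, the (text, label) row layout, the toy data, and the fixed seed are all assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_per_class(rows, n_per_class, seed=42):
    """Return up to n_per_class randomly chosen rows for each label.

    rows is an iterable of (text, label) pairs (a hypothetical layout).
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_label = defaultdict(list)
    for text, label in rows:
        by_label[label].append((text, label))

    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        k = min(n_per_class, len(group))  # never ask for more rows than exist
        sample.extend(rng.sample(group, k))
    rng.shuffle(sample)  # mix the classes before training
    return sample


# Toy data standing in for the real dataset
rows = [(f"fake article {i}", "fake") for i in range(50)]
rows += [(f"genuine article {i}", "genuine") for i in range(40)]
subset = sample_per_class(rows, n_per_class=10)
print(len(subset))
```

With the real data, n_per_class would be 2000 to match the sample size mentioned above.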

"2. app" contains the web application, using Django and Heroku. Using a virtual environment is highly advisable. See "How to load virtual environment" below for details.

Data

Data used for this project can be found on Kaggle. Credit goes to Clément Bisaillon for creating the dataset.

The data contains over 23k examples of fake news and over 21k examples of genuine news.

How to load virtual environment

When working with web applications, it is important to work within a virtual environment. A project often requires specific versions of certain modules/libraries, and a virtual environment keeps those versions boxed up inside the project so they do not affect the versions installed elsewhere on your computer.

If virtualenv is not yet installed, run the following command:
pip install virtualenv

Next, in the directory where you are working, create a virtual environment:
virtualenv <ENVIRONMENT_NAME>

Once created, activate it from the command line in the root directory. For Windows:
.\<ENVIRONMENT_NAME>\Scripts\activate

and for Mac/Linux:
source <ENVIRONMENT_NAME>/bin/activate

You can tell you're in the virtual environment when the environment name appears in brackets at the beginning of the command prompt: (ENVIRONMENT_NAME)
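
The same check can be made from inside Python. The sketch below is an illustration rather than part of the project; the function name in_virtual_environment is made up for the example. It relies on sys.prefix and sys.base_prefix, which differ when the interpreter runs inside an environment created by venv or a recent virtualenv, and falls back to the sys.real_prefix attribute that older virtualenv releases set.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Environments made by venv or recent virtualenv releases point sys.prefix
    # at the environment, while sys.base_prefix keeps the original install.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


print("Inside a virtual environment:", in_virtual_environment())
```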

Once you have the virtual environment up and running, you can go ahead and install the dependencies. This is done by running the following command: pip install -r requirements.txt

You can see what's installed by running pip freeze or pip list.

When you finish working within the environment, you can deactivate it by entering deactivate in the command line.

Running the Django application

In the root of the application directory where manage.py is located, run the following in the command line (and while in the virtual environment): python manage.py runserver

This will run the Django application, which you can view by entering localhost:8000 in the address bar of a web browser.

Room for improvement

There is room for improvement in the application. The model is by no means perfect and could be retrained on a new dataset with current news. The application requires a valid news URL and breaks if a non-URL is entered, so further testing needs to be written to cover invalid input and other what-ifs.
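
One possible guard against non-URL input is sketched below. It is not the application's current behaviour; the function name looks_like_url and the rule applied (an http or https scheme plus a host name) are assumptions chosen for the example. It uses only urlparse from the Python standard library.

```python
from urllib.parse import urlparse


def looks_like_url(value):
    """Return True if value parses as an http(s) URL with a host name."""
    if not isinstance(value, str):
        return False
    try:
        parsed = urlparse(value.strip())
    except ValueError:  # raised for some malformed inputs, e.g. bad IPv6 hosts
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


for candidate in ["https://example.com/news/story", "not a url", ""]:
    print(repr(candidate), looks_like_url(candidate))
```

A check like this could run before the article is fetched, so that invalid input produces an error message instead of a crash. It only confirms that the input is shaped like a URL, not that the page exists or contains a news article.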

Currently, the model is trained only on English-language articles, so additional models may be required for other languages.

Updates

2020/10/02

Fake news classifier web application now available! Updated Django files and the necessary Heroku files are available in the "2. app" folder.

2020/09/20

Added fake news notebook from Kaggle containing exploratory data analysis and machine learning model training, plus the saved model pkl file.

2020/09/19

Added Random Forest hyperparameter tuning notebook. Contains RandomizedSearchCV, GridSearchCV, training with best hyperparameters, and comparison of best to base model.
