This project implements a full ETL pipeline, from raw data through to a finished interactive web app. The app analyses text input from users and classifies it into disaster-response categories, helping identify which kinds of response are relevant for areas hit by natural disasters. It does this using a RandomForest multi-output classifier model trained through the ETL pipeline.
Testing has been completed on Python 3.6.3 and 3.6.8. Some testing has been done on Python 3.7, but I cannot attest to its reliability at this point. Version numbers in brackets next to the libraries below indicate fully tested versions.
- sys
- pandas (0.23.3)
- numpy (1.12.1)
- sqlalchemy (1.2.18)
- nltk (3.2.5)
- plotly (2.0.15)
- sklearn (0.19.1, also tested 0.20.3)
- pickle
Run the following commands in the project's root directory to set up your database and model.
- To run the ETL pipeline that cleans the data and stores it in the database:
python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
- To run the ML pipeline that trains the classifier and saves it:
python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
Run the following command in the app's directory to run your web app.
python run.py
Go to http://0.0.0.0:3001/. If this does not work and you are running on Windows, please follow the instructions found here: https://stackoverflow.com/questions/30554702/cant-connect-to-flask-web-service-connection-refused
This module takes two input CSV files, disaster_messages (the raw text of the messages) and disaster_categories (a list of classifications for those messages, referenced by ID, indicating which "response" categories apply to each message). It merges the two datasets, removes duplicates, one-hot encodes the categorical variables and outputs a single cleaned database for later use.
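As a rough illustration of that cleaning step, the sketch below merges the two CSVs and expands the categories column into one binary flag per category. The column names ("id", "categories"), the "name-0/1" value format and the output table name are assumptions based on the typical disaster dataset layout; the actual process_data.py may differ.

```python
import sys

import pandas as pd
from sqlalchemy import create_engine


def load_and_clean(messages_path, categories_path):
    """Merge the two CSVs on id and expand 'categories' into one 0/1 column per category."""
    messages = pd.read_csv(messages_path)
    categories = pd.read_csv(categories_path)
    df = messages.merge(categories, on="id")

    # Split the single 'categories' string (e.g. "related-1;request-0;...")
    # into one column per category, keeping just the trailing 0/1 flag.
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = [value.split("-")[0] for value in cats.iloc[0]]
    for col in cats.columns:
        cats[col] = cats[col].str[-1].astype(int)

    df = pd.concat([df.drop(columns="categories"), cats], axis=1)
    return df.drop_duplicates()


if __name__ == "__main__":
    messages_path, categories_path, db_path = sys.argv[1:4]
    df = load_and_clean(messages_path, categories_path)
    engine = create_engine("sqlite:///" + db_path)
    df.to_sql("messages", engine, index=False, if_exists="replace")
```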
This module uses the database created by process_data to train a RandomForest multi-output classifier model for later use in classifying text inputs into the same categories used in the disaster_categories.csv file.
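A minimal sketch of what that training pipeline can look like in scikit-learn is shown below: bag-of-words features, TF-IDF weighting, then a multi-output random forest. The table name ("messages") and the non-category column names ("id", "message", "original", "genre") are assumptions; the real train_classifier.py also evaluates the model and may tune hyperparameters.

```python
import pickle

import pandas as pd
from sqlalchemy import create_engine
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Load the cleaned table written by process_data (table name is an assumption).
engine = create_engine("sqlite:///data/DisasterResponse.db")
df = pd.read_sql_table("messages", engine)

X = df["message"]
Y = df.drop(columns=["id", "message", "original", "genre"])  # assumed non-category columns

# Text features -> TF-IDF weighting -> one random forest per output category.
pipeline = Pipeline([
    ("vect", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", MultiOutputClassifier(RandomForestClassifier())),
])

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)
pipeline.fit(X_train, Y_train)

with open("models/classifier.pkl", "wb") as f:
    pickle.dump(pipeline, f)
```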
Runs a web app that lets the user input text and shows its classification under the disaster_categories framework. The web app also displays some basic statistics when it first loads.
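For context, a stripped-down version of such an app, assuming Flask and the file locations used in the commands above, might look like the sketch below. The route name, query parameter and relative paths are illustrative, not the actual run.py.

```python
import pickle

import pandas as pd
from flask import Flask, jsonify, request
from sqlalchemy import create_engine

app = Flask(__name__)

# Load the cleaned data and the trained pipeline (paths and table name are assumptions).
engine = create_engine("sqlite:///data/DisasterResponse.db")
df = pd.read_sql_table("messages", engine)
category_names = df.columns[4:]  # assumes the first four columns are id/message/original/genre

with open("models/classifier.pkl", "rb") as f:
    model = pickle.load(f)


@app.route("/classify")
def classify():
    """Return the predicted category flags for a user-supplied message."""
    query = request.args.get("query", "")
    labels = model.predict([query])[0]
    return jsonify(dict(zip(category_names, labels.tolist())))


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=3001)
```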