kumarvinit15/Data-Science-Capstone
Data Science-Capstone Project

Real or Not? NLP with Disaster Tweets (Kaggle Challenge)

Table of Contents:

1. Description

2. Installation

3. File Descriptions

4. Dataset

5. Summary Blogpost

6. Licensing, authors and acknowledgement

1. Description:

Twitter has become an important communication channel in times of emergency. The ubiquity of smartphones enables people to announce an emergency they’re observing in real time. Because of this, more agencies (such as disaster relief organizations and news agencies) are interested in programmatically monitoring Twitter.

The main goal of this project is to build a machine learning model that predicts which tweets are about real disasters and which ones aren’t.
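
To make the classification task concrete, here is a minimal sketch of a bag-of-words Naive Bayes classifier written with the standard library only. It is not the model used in the notebook; the function names (`tokenize`, `train_naive_bayes`, `predict`) are hypothetical, and the example tweets are made up for illustration rather than taken from the Kaggle dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(texts, labels):
    """Count word frequencies per class (1 = disaster, 0 = not a disaster)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for text, label in zip(texts, labels):
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the class with the higher log-probability, using Laplace smoothing."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in (0, 1):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up illustrative tweets, NOT taken from the Kaggle dataset.
toy_texts = [
    "Forest fire spreading near the highway, evacuation ordered",
    "Earthquake shakes the city, buildings damaged",
    "Flood waters rising, residents evacuated",
    "This new album is fire, love it",
    "I am so hungry I could eat a horse",
    "What a lovely sunny day at the beach",
]
toy_labels = [1, 1, 1, 0, 0, 0]

model = train_naive_bayes(toy_texts, toy_labels)
print(predict(model, "Evacuation ordered after earthquake damaged buildings"))
print(predict(model, "Lovely day, love this album"))
```

The word "fire" appears in both classes of the toy data, which hints at why the task is hard: the same word can be literal or figurative, so a model has to weigh the surrounding context.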

2. Installation:

The Anaconda Python distribution was used to create the Jupyter notebook for this project. No additional libraries were installed in support of this project.

The version of the notebook server is: 5.7.4.

The version of Python is: Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)].

3. File Descriptions:

The following files are uploaded to the repository:

  1. DS Capstone.ipynb: contains all the analysis and modeling of the disaster tweets dataset
  2. train.csv: the training set
  3. test.csv: the test set
  4. sample_submission.csv: a sample submission file in the correct format
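
As a rough illustration of how these files fit together, the sketch below reads test rows and writes a submission file using only the standard library. The column names (`id`, `keyword`, `location`, `text` for the test set and `id`, `target` for the submission) are an assumption based on the Kaggle competition's data page and are not stated in this README. The in-memory CSV content and the helper names `read_rows` and `write_submission` are hypothetical stand-ins; in practice you would open `test.csv` and write to a real file.

```python
import csv
import io

# Tiny in-memory stand-in for test.csv; column names are an assumption.
test_csv = io.StringIO(
    "id,keyword,location,text\n"
    "0,,,Example tweet one\n"
    "2,fire,,Example tweet two\n"
)


def read_rows(handle):
    """Read a CSV file handle into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def write_submission(rows, predictions, handle):
    """Write one 'id,target' line per row, in the sample-submission layout."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["id", "target"])
    for row, pred in zip(rows, predictions):
        writer.writerow([row["id"], pred])


rows = read_rows(test_csv)
predictions = [0 for _ in rows]  # placeholder predictions, not real model output

out = io.StringIO()
write_submission(rows, predictions, out)
submission_text = out.getvalue()
print(submission_text)
```

To use real files, replace the `io.StringIO` objects with `open("test.csv", newline="", encoding="utf-8")` and `open("submission.csv", "w", newline="", encoding="utf-8")`.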

4. Dataset:

The dataset is provided by Kaggle and can be found at the link below:

https://www.kaggle.com/c/nlp-getting-started/data

5. Summary Blogpost:

A summary of the data analysis and results can be found in the Medium post linked below:

https://medium.com/real-or-not-nlp-with-disaster-tweets/real-or-not-nlp-with-disaster-tweets-a-data-science-capstone-project-fafa6c35c16f

6. Licensing, authors and acknowledgement:

This dataset was created by the company Figure Eight and originally shared on their ‘Data For Everyone’ website. Kaggle hosted a challenge to develop machine learning models that classify tweets as being about a real disaster or not.

Disclaimer: The dataset for this competition contains text that may be considered profane, vulgar, or offensive.
