hritikchaturvedi11/EDA_Project

EDA_Project

  • The sinking of the RMS Titanic in the early morning of 15 April 1912, four days into the ship's maiden voyage from Southampton to New York City, was one of the deadliest peacetime maritime disasters in history, killing more than 1,500 people.

  • The largest passenger liner in service at the time, Titanic had an estimated 2,224 people on board when she struck an iceberg in the North Atlantic.

  • The ship had received six warnings of sea ice but was travelling at near maximum speed when the lookouts sighted the iceberg.

  • Unable to turn quickly enough, the ship suffered a glancing blow that buckled the starboard (right) side and opened five of sixteen compartments to the sea.

  • The disaster caused widespread outrage over the lack of lifeboats, lax regulations, and the unequal treatment of the three passenger classes during the evacuation.

  • Inquiries recommended sweeping changes to maritime regulations, leading to the International Convention for the Safety of Life at Sea (1914), which continues to govern maritime safety.

  • The titanic.csv file contains data for 891 of the real Titanic passengers.

    • Each row represents one person.

    • The columns describe attributes of each person, including whether they survived, their age, their ticket class, their sex, and the fare they paid.


  • The goal of this project is to explore the dataset and answer questions about it using data visualization and statistical methods.

  • In particular, the analysis aims to answer the following questions:

    • What is the demographic structure of the passengers in terms of the available attributes?

    • What is the overall passenger survival ratio?

    • Which groups have higher chances for survival?
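
A minimal sketch of how the first two questions could be approached with pandas. The column names `Survived`, `Pclass`, and `Sex` are assumptions (this README only names Age, Embarked, Fare, SibSp, and Parch), and the four rows are invented stand-ins so the snippet runs without the real file; in practice `load_passengers` would be given the path to titanic.csv.

```python
import io

import pandas as pd

# Invented stand-in rows for illustration only; these are not real passenger records.
SAMPLE_CSV = """Survived,Pclass,Sex,Age,Fare
0,3,male,22,7.25
1,1,female,38,71.28
1,3,female,,7.92
0,2,male,35,13.00
"""


def load_passengers(source):
    """Read a Titanic-style CSV from a file path or a file-like object."""
    return pd.read_csv(source)


def missing_counts(df):
    """Count missing values in each column."""
    return df.isna().sum()


def overall_survival_ratio(df):
    """Share of rows with Survived == 1 (assumes a 0/1 'Survived' column)."""
    return float(df["Survived"].mean())


# In practice: passengers = load_passengers("titanic.csv")
passengers = load_passengers(io.StringIO(SAMPLE_CSV))

missing = missing_counts(passengers)
sex_share = passengers["Sex"].value_counts(normalize=True)  # demographic split by sex
ratio = overall_survival_ratio(passengers)

print(passengers.shape)
print(missing)
print(sex_share)
print(f"Overall survival ratio: {ratio:.2f}")
```

Because `Survived` is coded as 0/1 under this assumption, its mean is the survival ratio directly, which avoids a separate count-and-divide step.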

Problem Statement

  • On April 15, 1912, the Titanic, the largest passenger liner in service at the time, collided with an iceberg during her maiden voyage.

  • The sinking killed 1,502 of the 2,224 passengers and crew on board.

  • This sensational tragedy shocked the international community and led to better safety regulations for ships.

  • One of the reasons that the shipwreck resulted in such loss of life was that there were not enough lifeboats for the passengers and crew.

  • Although there was some element of luck involved in surviving the sinking, some groups of people, such as women, children, and the upper class, were more likely to survive than others.
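
The claim about group differences can be checked by comparing survival rates across groups. The sketch below assumes 0/1 `Survived`, `Sex`, `Pclass`, and `Age` columns; the rows are made up for illustration, and the age cutoff of 16 for "child" is an arbitrary assumption rather than a definition taken from this project.

```python
import pandas as pd

# Made-up rows for illustration only; not real passenger records.
sample = pd.DataFrame({
    "Survived": [1, 0, 1, 0, 0, 1],
    "Sex": ["female", "male", "female", "male", "male", "male"],
    "Pclass": [1, 3, 2, 3, 2, 1],
    "Age": [30, 40, 25, 19, 50, 8],
})


def survival_rate_by(df, column):
    """Mean of the 0/1 'Survived' column per group, highest rate first."""
    return df.groupby(column)["Survived"].mean().sort_values(ascending=False)


# Hypothetical helper column: label passengers under 16 as children.
labelled = sample.assign(
    AgeGroup=sample["Age"].lt(16).map({True: "child", False: "adult"})
)

by_sex = survival_rate_by(labelled, "Sex")
by_class = survival_rate_by(labelled, "Pclass")
by_age_group = survival_rate_by(labelled, "AgeGroup")

print(by_sex)
print(by_class)
print(by_age_group)
```

On the real data, the same three groupings would show whether women, children, and first-class passengers did in fact survive at higher rates.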


Conclusion

  • With the help of this notebook we learnt how exploratory data analysis can be carried out using pandas plotting.
  • We also used packages such as matplotlib and seaborn to develop better insights into the data.
  • We saw how pre-processing helps in dealing with missing values and irregularities in the data, and how to create new features that can help predict survival.
  • We used the pandas profiling feature to generate an HTML report describing each feature in the dataset.
  • We examined the impact of columns such as Age, Embarked, Fare, SibSp and Parch on the rate of survival.
  • The most important result of this analysis is identifying which features are strongly positively or negatively correlated with survival.
  • This analysis will help us choose which machine learning model to apply to predict survival on the test dataset.
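
The pre-processing, feature creation, and correlation steps summarised above might look roughly like the following. This is a sketch under assumptions, not the notebook's actual code: median imputation for Age and the `FamilySize` feature (SibSp + Parch + 1) are common choices picked here for illustration, and the rows are invented.

```python
import pandas as pd

# Invented rows for illustration only; not real passenger records.
sample = pd.DataFrame({
    "Survived": [1, 0, 1, 0, 1],
    "Age": [29.0, None, 4.0, 40.0, None],
    "SibSp": [0, 1, 1, 0, 0],
    "Parch": [0, 0, 2, 0, 1],
    "Fare": [80.0, 7.9, 30.0, 8.1, 15.0],
})


def preprocess(df):
    """Return a copy with missing ages filled and a family-size feature added."""
    out = df.copy()
    # Assumption: fill missing ages with the median of the observed ages.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    # Hypothetical new feature: siblings/spouses + parents/children + the passenger.
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1
    return out


def correlation_with_survival(df):
    """Pearson correlation of each numeric column with 'Survived', sorted."""
    numeric = df.select_dtypes(include="number")
    return numeric.corr()["Survived"].drop("Survived").sort_values(ascending=False)


clean = preprocess(sample)
ranking = correlation_with_survival(clean)

print(clean)
print(ranking)
```

Values near +1 or -1 in `ranking` mark features that move strongly with or against survival; categorical columns such as Embarked would need to be encoded numerically before they could appear in this ranking.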
