This repository is for the Kaggle competition "Titanic: Machine Learning from Disaster".
The competition provides a labeled dataset (train.csv) that indicates which passengers survived, and you must predict, for each person in a second dataset (test.csv), whether they survived or not.
You can check my analysis in the Titanic Jupyter notebook.
Some features are a little tricky to deal with (some of them have more than 77% missing values), but the notebook shows the whole process of filling in these missing values.
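As a rough illustration of this kind of step, the sketch below measures the fraction of missing values per column and applies a very simple fill. It is not necessarily what the notebook does: the function names, the `HasCabin` flag and the tiny hand-made table are assumptions for illustration only, while `Age`, `Cabin` and `Embarked` are column names from the competition's train.csv.

```python
import pandas as pd


def missing_fraction(df: pd.DataFrame) -> pd.Series:
    """Return the fraction of missing values per column, highest first."""
    return df.isna().mean().sort_values(ascending=False)


def fill_missing_simple(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative imputation only (an assumption, not the notebook's method)."""
    out = df.copy()
    # Numeric column: fill with the median of the observed values.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    # Categorical column: fill with the most frequent observed value.
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    # Mostly-missing column: keep only a "was a cabin recorded?" flag.
    out["HasCabin"] = out["Cabin"].notna().astype(int)
    return out.drop(columns=["Cabin"])


# Tiny made-up table standing in for train.csv (not real passenger data).
toy = pd.DataFrame(
    {
        "Age": [22.0, None, 30.0, 40.0],
        "Cabin": [None, "C85", None, None],
        "Embarked": ["S", "C", None, "S"],
    }
)

fractions = missing_fraction(toy)
filled = fill_missing_simple(toy)
print(fractions)
print(filled)
```

With the real data you would pass `pd.read_csv("train.csv")` instead of the toy table.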
This Titanic Jupyter notebook gave me a score of 0.79425 (which means that it predicts 79.425% of the scored test dataset correctly) and put me in the top 19%.
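To make the score concrete: it is an accuracy, the number of correct predictions divided by the number of scored rows, so 0.79425 corresponds to 79.425%. The sketch below shows that arithmetic. The `accuracy` helper is hypothetical, and the row count of 418 is an assumption (the size of the competition's test.csv, supposing the whole file is scored).

```python
def accuracy(predicted, actual):
    """Fraction of positions where the prediction matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up labels: 3 of the 4 predictions match.
toy_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])

score = 0.79425                   # leaderboard score reported above
percent = round(score * 100, 3)   # the same score expressed as a percentage

TEST_ROWS = 418                   # assumption: every row of test.csv is scored
approx_correct = round(score * TEST_ROWS)

print(toy_accuracy, percent, approx_correct)
```

Under that assumption, the score works out to roughly 332 correct predictions out of 418.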
It is a pretty decent score if you consider that there are a lot of cheaters: some of them get a 100% score, which is very unlikely for an event like this, because random circumstances influenced who survived. Also, you can find the complete datasets with a quick internet search.
I'm not sharing my final file (submission.csv), so that nobody can just take it and submit it to Kaggle, but you can easily generate it by running the code.
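For reference, a submission file for this competition has two columns, `PassengerId` and `Survived`. The sketch below writes such a file with the standard library. The `write_submission` helper and the three predictions are made up for illustration and are not the notebook's actual code or output.

```python
import csv
import os
import tempfile


def write_submission(path, passenger_ids, predictions):
    """Write a two-column CSV: PassengerId, Survived (0 or 1)."""
    if len(passenger_ids) != len(predictions):
        raise ValueError("need exactly one prediction per passenger")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["PassengerId", "Survived"])
        for passenger_id, survived in zip(passenger_ids, predictions):
            writer.writerow([passenger_id, int(survived)])


# Illustrative values only; a real run would use the model's predictions
# for every passenger listed in test.csv.
demo_path = os.path.join(tempfile.mkdtemp(), "submission.csv")
write_submission(demo_path, [892, 893, 894], [0, 1, 0])

# Read the file back to show what was written.
with open(demo_path, newline="") as handle:
    rows = list(csv.reader(handle))
print(rows)
```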