This project studies the global refugee crisis, using United Nations data on migration, refugee populations, and demographics.
The focus is on data exploration: migratory analysis based on population and demographics, including refugee status by country of residence and country of origin, and comparison of total refugee populations. The ETL is written in plain Python using core data structures such as lists, dictionaries, nested dictionaries, tuples, and namedtuples. For data visualization, the Python packages Matplotlib, Basemap, Pandas, and NumPy, along with a RESTful API, are used to illustrate key insights.
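As an illustration of the ETL style described above, here is a minimal sketch that reads CSV rows into namedtuples and aggregates them into a nested dictionary. The column names, the `Record` type, the function names, and the sample rows are all assumptions made up for this example; they are not taken from the project's code or from real UNHCR figures.

```python
import csv
import io
from collections import defaultdict, namedtuple

# Hypothetical record layout; the field names are assumptions for this sketch.
Record = namedtuple(
    "Record", ["year", "asylum_country", "origin", "population_type", "count"]
)

# Made-up placeholder rows used only to exercise the code (not real data).
SAMPLE_CSV = """year,asylum_country,origin,population_type,count
2015,Country A,Country X,Refugees,100
2015,Country B,Country X,Refugees,50
2016,Country A,Country Y,Asylum-seekers,*
2016,Country A,Country X,Refugees,25
"""


def extract(handle):
    """Yield one dict per CSV row."""
    yield from csv.DictReader(handle)


def transform(rows):
    """Convert raw rows to Record tuples, skipping non-numeric counts."""
    for row in rows:
        raw = row["count"].strip()
        if not raw.isdigit():
            continue  # assumed: non-numeric values mark withheld or missing counts
        yield Record(
            int(row["year"]),
            row["asylum_country"],
            row["origin"],
            row["population_type"],
            int(raw),
        )


def load(records):
    """Build a nested dictionary: origin -> year -> total population count."""
    totals = defaultdict(lambda: defaultdict(int))
    for rec in records:
        totals[rec.origin][rec.year] += rec.count
    return {origin: dict(by_year) for origin, by_year in totals.items()}


totals_by_origin = load(transform(extract(io.StringIO(SAMPLE_CSV))))
print(totals_by_origin)
```

Using generators for the extract and transform steps keeps only one row in memory at a time, while the nested dictionary gives direct lookups such as `totals_by_origin["Country X"][2015]`.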
Source: UNHCR (The UN Refugee Agency), http://popstats.unhcr.org/en/time_series. The data consists of year, population type, population count, origin, and country of asylum. This study focuses on the 10-year span from 2007 to 2016. The exception is the two categories “asylum-seeking” and “refugee (incl. refugee-like situations)”, for which data is available only for the last 3 years, so the data from 2014 to 2016 is analyzed in more depth.
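The year windows described above could be expressed as a small filter. This is only a sketch: the category labels are copied from the text, but how they are spelled in the actual export is an assumption, and `in_scope` is a hypothetical helper name.

```python
# Year windows taken from the description above (end of range is exclusive).
FULL_SPAN = range(2007, 2017)    # 2007 to 2016
RECENT_SPAN = range(2014, 2017)  # 2014 to 2016

# Labels quoted from the text; their exact spelling in the data is an assumption.
RECENT_ONLY_TYPES = {
    "asylum-seeking",
    "refugee (incl. refugee-like situations)",
}


def in_scope(year, population_type):
    """Return True if a row falls inside the window studied for its category."""
    label = population_type.strip().lower()
    span = RECENT_SPAN if label in RECENT_ONLY_TYPES else FULL_SPAN
    return year in span
```

For example, `in_scope(2010, "asylum-seeking")` is false because that category is only considered from 2014 onward, while any other category is accepted for 2010.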
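To run any part of the code, you will need Python 3 or higher with the following packages installed: geopy, Basemap (imported as mpl_toolkits.basemap), matplotlib, numpy, and pandas. The collections and csv modules are also used; they ship with the Python standard library and need no separate installation.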
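A quick way to confirm the environment is ready is to check which of these modules can be located before running anything. The sketch below uses only the standard library; `missing_modules` and `REQUIRED` are hypothetical names introduced for this example.

```python
import importlib.util

# Import names for the dependencies listed above.
REQUIRED = [
    "geopy",
    "mpl_toolkits.basemap",
    "matplotlib",
    "numpy",
    "pandas",
    "collections",
    "csv",
]


def missing_modules(names):
    """Return the subset of module names that cannot be found."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. mpl_toolkits) is absent
            # or a module has no usable spec.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_modules(REQUIRED)
    print("Missing modules:", ", ".join(absent) if absent else "none")
```

Any names it reports are the ones that still need to be installed.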
This project will continue to evolve, with improved data visualizations and better code modularity, in the hope of using data science and data engineering to raise awareness of the current refugee crisis among a wider global audience.