Skip to content

Illustrates use of basic Funtions inside the Pandas Library

Notifications You must be signed in to change notification settings

revantg/basic_pandas

Repository files navigation

Pandas Tutorial

This repository makes use of most of the basic functions of pandas library. Nothing Fancy.

What does it do?

  • Can analyse 1082 indicators for almost all of the countries in existance. For eg
In [21]: temp_df.sample(10)
Out[21]:
                                                     0                     1
964  Tariff rate, applied, weighted mean, manufactu...  TM_TAX_MANF_WM_AR_ZS
816                Primary education, duration (years)           SE_PRM_DURS
68        Average precipitation in depth (mm per year)        AG_LND_PRCP_MM
634  Mortality rate, adult, male (per 1,000 male ad...        SP_DYN_AMRT_MA
584     Manufacturing, value added (constant 2000 US$)        NV_IND_MANF_KD
840            Pump price for gasoline (US$ per liter)        EP_PMP_SGAS_CD
254   Employees, services, male (% of male employment)     SL_SRV_EMPL_MA_ZS
920  Secondary education, vocational pupils (% female)  SE_SEC_ENRL_VO_FE_ZS
733  Nitrous oxide emissions in industrial and ener...     EN_ATM_NOXE_EI_ZS
100      Chemicals (% of value added in manufacturing)     NV_MNF_CHEM_ZS_UN
In [22]: len(indicators)
Out[22]: 1082
  • To narrow down the datatset and the declutter the graph, only asian countries have been analysed for the parameter(indicators).
In [23]: df_temp = pd.read_html('Country_Codes2.html')
    ...: df = df_temp[0]
    ...: a_c_codes = [df['alpha-3'][i] for i in range(len(df['country-code'])) if df['region'][i] == 'Asia' ]
    ...: print(len(a_c_codes))
Out[23] : 51
  • Built the DataFrame from the paramters for various countries and Pickle it for further use to save time
  • Calculate Percentage Change in the data from the previous value.
  • Calculate Percentage Change in the data from the starting (initial) value.
  • Plot both of the above DataFrames.
  • Pickle both the DataFrames for quick access.
  • Plot the correlations between different countries
In [24]: growth_correlations.describe()['IND']
Out[24]:
count    48.000000
mean      0.900618
std       0.079899
min       0.710692
25%       0.868522
50%       0.928836
75%       0.957534
max       1.000000
Name: IND, dtype: float64
In [25]: growth_correlations['IND'].head()
Out[25]:
AFG    0.726658
ARM    0.953290
AZE    0.932335
BHR    0.956719
BGD    0.767517
Name: IND, dtype: float64
  • Resample the data for every 2 years.
  • Observe the Standard Deviation between rolling values of every 2 years wrt to the original data. alt text
  • Compared Correlation of India-Japan and India-China for the selected Parameter alt text

Dependencies

  • [Pandas] - Bummer !
  • [Numpy] - For using arrays in Python
  • [Quandl] - For building Data-Sets
  • [Pickle] - For quick reuse of the DataFrame later
  • [Matplotlib] - For plotting the graphs

Thanks for your Time, Consider it giving a star?

About

Illustrates use of basic Funtions inside the Pandas Library

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published