Identify health-related data sets for inclusion #150

cliftonmcintosh · 2017-02-20T19:56:25Z

Open Nepal has many health-related data sets. Identify ones that might be good to include. Create an issue for each one and link the issue to the actual data set. Data sets that work well:

have data for each district. Even better if they have data for each VDC.
have counts for each district.

Examples of potentially good data sets include:

immunization rates (if available)
communicable disease rates
deaths by communicable and immunizable diseases
malaria infection and death rates

These are potential samples. Once data sets have been identified and issues have been created, then the team can prioritize which issues would or would not be valuable to include.

amitness · 2017-02-24T16:42:13Z

I'm participating in "Open Data Day Hackathon 2017" and we've been provided a few datasets. One of them is an immunization dataset by district. Would this be helpful? @cliftonmcintosh

Immunization dataset by district last 2 year.csv.zip

cliftonmcintosh · 2017-02-24T19:14:23Z

I took a quick look, and it looks promising. We would have to understand what all the columns mean in order to make sense of it. I am not sure how we map to overall population numbers. We can probably use data like this even if we can't match to population numbers.
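One way the mapping to population numbers could work is sketched below. This is only an illustration: the district names, counts, and population figures are placeholders, not values from the attached file, and `rates_per_population` is a hypothetical helper name. Districts that have no population match are reported separately, so the raw counts can still be used on their own.

```python
def normalize(name):
    """Make district names comparable across sources (case and spacing)."""
    return " ".join(name.split()).casefold()


def rates_per_population(counts, population, per=1000):
    """Return (rates, unmatched).

    rates maps each district to its count per `per` residents;
    unmatched lists districts with no usable population figure.
    """
    pop_by_key = {normalize(name): value for name, value in population.items()}
    rates = {}
    unmatched = []
    for district, count in counts.items():
        pop = pop_by_key.get(normalize(district))
        if not pop:
            unmatched.append(district)
            continue
        rates[district] = count * per / pop
    return rates, unmatched


# Placeholder figures for illustration only.
counts = {"District A": 1200, "district b": 300, "District C": 50}
population = {"District A": 60000, "District B": 20000}

rates, unmatched = rates_per_population(counts, population)
for district, rate in rates.items():
    print(f"{district}: {rate:.1f} per 1000")
print("No population match:", unmatched)
```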


cliftonmcintosh · 2017-05-14T15:39:39Z

@ravinepal

As per our discussion via email, I have been looking at health data more closely. OpenNepal has quite a few data sets, but one of my concerns is that some of them are several years old. I have been looking at the most recent Annual Report from the Department of Health Services (available here as a PDF). This is a more recent version of the data sets that OpenNepal has digested. I believe we can extract the data from the tables in that report using Tabula and this will provide us with more recent data. I have extracted a few data sets this way. They need manipulating to convert them into a usable format, but I believe it will be worth the effort. Right now I have done the preliminary extraction for several of the tables in the "Safe Motherhood" section. These include data on:

antenatal maternity care
delivery methods and locations (home versus a health facility)
postnatal infant and mother care
newborn and maternal deaths
abortion care
nutrition in the first two years of a child's life

The data sets need more processing, and it may be that not all of them are valuable, but I think there is a lot we can mine from the document.
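As a rough idea of the kind of processing involved, here is a minimal sketch that turns a Tabula-style CSV export into one row per district and indicator. The column names and values are hypothetical placeholders, not taken from the report, and real exports may need more cleanup (for example, headers split across rows).

```python
import csv
import io


def parse_count(cell):
    """Convert a cell such as '1,234' to an int; return None for blanks or dashes."""
    text = cell.strip().replace(",", "")
    if text in ("", "-"):
        return None
    return int(text)


def tidy_rows(raw_csv, value_columns):
    """Yield one (district, indicator, value) tuple per non-empty cell."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    for row in reader:
        district = row["District"].strip()
        if not district:
            continue  # skip spacer rows left over from the PDF layout
        for column in value_columns:
            value = parse_count(row[column])
            if value is not None:
                yield district, column, value


# Placeholder export; column names and numbers are made up for illustration.
RAW = '''District,ANC first visit,Institutional deliveries
District A,"1,234",987
,,
District B,-,45
'''

rows = list(tidy_rows(RAW, ["ANC first visit", "Institutional deliveries"]))
for row in rows:
    print(row)
```

The long format (district, indicator, value) is one possible target; whatever shape NepalMap actually needs may differ.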

It would be nice if team members could look through those tables and see whether they spot any data points that might be important to show.

ravinepal · 2017-05-14T16:43:54Z

thanks, @cliftonmcintosh! should i reach out to the open nepal team to see if they can extract these datasets? (responded to your email as well.)

cliftonmcintosh · 2017-05-14T17:32:12Z

@ravinepal

Thanks for offering to reach out to Open Nepal for extracting the data, but I would like to try my hand at it for a couple of datasets first. This will allow me to convert the data in a way that is useful for NepalMap. Moving to a format that is useful for us from the format delivered by the Tabula PDF converter is likely to be no more difficult than moving from the way OpenNepal presents the data.

ravinepal · 2017-05-14T17:46:28Z

sounds good, @cliftonmcintosh! @amitness has extracted some of the census data in the past, so looping him in to see if he can advise/help as well

amitness · 2017-05-15T11:39:37Z

@ravinepal @cliftonmcintosh Tabula is the best way to go. There is a useful Python wrapper for Tabula called tabula-py. Also, here is an example of using it.
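For reference, a minimal sketch of what using tabula-py might look like. It assumes tabula-py and a Java runtime are installed; the helper names `extract_tables` and `save_tables`, the file name, and the page range in the usage note are hypothetical, not taken from the linked example.

```python
def extract_tables(pdf_path, pages="all"):
    """Return a list of pandas DataFrames, one per table Tabula detects."""
    # Imported lazily so this module loads even where tabula-py
    # (and the Java runtime it depends on) is not installed.
    import tabula

    return tabula.read_pdf(pdf_path, pages=pages, multiple_tables=True)


def save_tables(tables, prefix):
    """Write each table to '<prefix>_<n>.csv' and return the file names."""
    names = []
    for number, table in enumerate(tables, start=1):
        name = f"{prefix}_{number}.csv"
        table.to_csv(name, index=False)
        names.append(name)
    return names
```

Usage would be along the lines of `save_tables(extract_tables("annual_report.pdf", pages="10-12"), "safe_motherhood")`, where the PDF name and page range are placeholders for the actual report and table locations.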

cliftonmcintosh · 2017-05-15T13:00:03Z

@amitness Thanks for the tips

cliftonmcintosh added health Research labels Feb 20, 2017
