- Find 5 different sources of publicly available biomedical or public health data. Briefly describe each source and its data (in one or two sentences) and provide a link to the data source. Hint: We may have used data from some of these sources in previous assignments.
- Pick one of these sources and download a dataset from that source.
- Load the dataset into R and calculate some simple summary statistics (e.g., mean, median, min, max, and variance, if the variable is continuous) for 5 variables.
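
As a rough illustration of the load-and-summarize step, here is a minimal sketch in base R. It is not a model answer. To stay runnable anywhere, it assumes a stand-in file: R's built-in `airquality` data written to a temporary CSV. The names `data_path`, `vars`, `summarize_var`, and `summary_table` are hypothetical choices for this example; in your own script, `data_path` would point to the file you downloaded and `vars` would list variables from your dataset.

```r
# Stand-in for a downloaded file (assumption): write R's built-in
# `airquality` data to a temporary CSV so this example runs anywhere.
# In your own script, set `data_path` to the file you downloaded.
data_path <- tempfile(fileext = ".csv")
write.csv(airquality, data_path, row.names = FALSE)

# Load the dataset into R as a data frame.
dat <- read.csv(data_path)

# Five variables to summarize (hypothetical choice for this example).
# Note: `Day` is an integer calendar day, included only to reach five.
vars <- c("Ozone", "Solar.R", "Wind", "Temp", "Day")

# Compute simple summary statistics for one numeric vector,
# ignoring missing values (NA).
summarize_var <- function(x) {
  c(mean     = mean(x, na.rm = TRUE),
    median   = median(x, na.rm = TRUE),
    min      = min(x, na.rm = TRUE),
    max      = max(x, na.rm = TRUE),
    variance = var(x, na.rm = TRUE))
}

# Apply the function to each selected column. The result is a matrix
# with one row per statistic and one column per variable.
summary_table <- sapply(dat[vars], summarize_var)
print(round(summary_table, 2))
```

Setting `na.rm = TRUE` matters for real-world data: without it, a single missing value makes `mean()`, `median()`, `min()`, `max()`, and `var()` all return `NA`.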
Develop a report for your problem set (I recommend a Word or other text-editor document) that includes answers to all of the questions posed above, showing plots where appropriate. Also include a functional R script or a screenshot of the R commands used to load the dataset into R and calculate the summary statistics. Be sure to include your data file for validation, and comment the R script extensively to document what each line of code does.
Save your report as a PDF file and submit your report through the course 2GW site. Clean up your code and submit it as a supplementary file, along with your main report.
Friday, Week 6