Guy Yollin edited this page Mar 1, 2016 · 3 revisions

Background

WRDS (https://wrds-web.wharton.upenn.edu/wrds/) is a web-based business data research service from The Wharton School at the University of Pennsylvania. It is a common portal for accessing the Compustat database of corporate fundamental data and the CRSP database of security prices and returns. These data are typically downloaded as large flat files that require ECTL (extract, clean, transform, load) operations before they can be used for research and modeling. The wrds package is intended to automate and simplify this ECTL process, significantly reducing the time and effort required to begin research using CRSP/Compustat Merged (CCM) data.

Related work

Work on the wrds package began in 2015 as a GSoC project (https://github.com/wthielen/wrds), which implemented basic functionality for downloading data from the WRDS website via R.

Details of your coding project

  • Function(s) for downloading CCM data from WRDS
  • Function(s) for inserting downloaded CCM data into a local SQLite database
  • Function(s) for extracting CCM data from the local SQLite database based on asset class, index constituent, etc.
  • Function(s) for aggregating/disaggregating and aligning data to different frequencies (e.g. quarterly, monthly, weekly)
  • Function(s) for interpolating missing data
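
As a rough illustration of the last two items, the sketch below fills gaps in a monthly series by linear interpolation and then aligns it to a quarterly frequency using only base R. The function names `interpolate_missing` and `to_quarterly`, the column names, and the sample values are all hypothetical and made up for illustration; they are not part of the existing wrds package, and the real functions may well be designed differently.

```r
# Hypothetical monthly series with two missing observations (values invented).
monthly <- data.frame(
  date  = seq(as.Date("2015-01-01"), by = "month", length.out = 6),
  value = c(10, NA, 14, 16, NA, 22)
)

# Fill interior gaps by linear interpolation along the date axis.
# Leading or trailing NAs are left as NA (the default behaviour of approx).
interpolate_missing <- function(df) {
  ok <- !is.na(df$value)
  df$value <- approx(
    x    = as.numeric(df$date[ok]),
    y    = df$value[ok],
    xout = as.numeric(df$date)
  )$y
  df
}

# Align a date-sorted monthly series to quarterly frequency by keeping
# the last observation in each calendar quarter.
to_quarterly <- function(df) {
  qtr <- paste(format(df$date, "%Y"), quarters(df$date), sep = "-")
  last_obs <- tapply(df$value, qtr, function(v) v[length(v)])
  data.frame(
    quarter = names(last_obs),
    value   = as.numeric(last_obs),
    stringsAsFactors = FALSE
  )
}

filled    <- interpolate_missing(monthly)
quarterly <- to_quarterly(filled)
print(filled)
print(quarterly)
```

Taking the last observation per quarter is only one possible alignment rule; a mean or sum may be more appropriate for flow variables, and the choice would likely need to be a parameter in a real implementation.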

Expected impact

Mentors, please explain how this project will produce a useful package for the R community.

Mentors

Each project needs 2 mentors. Ideally one should be an expert R programmer with previous package development experience, and the other can be a domain expert in some other field or application area (optimization, bioinformatics, machine learning, data viz, etc).

Tests

Several tests that potential students can do to demonstrate their capabilities for this particular project. Please modify the suggestions below to make them specific for your project.

  • Easy: something that any useR should be able to do, e.g. download some existing package listed in the Related Work, and run it on some example data.
  • Medium: something a bit more complicated. You can encourage students to write a script or some functions that show their R coding abilities.
  • Hard: Can the student write a package with Rd files, tests, and vignettes? If your package interfaces with non-R code, can the student write in that other language?

Solutions of tests

Students, please post a link to your test results here.
