This repository has been archived by the owner on Apr 25, 2020. It is now read-only.

outreachy-datascience-2019

This folder contains the files dataset.csv and data_description.txt. Please refer to the latter for a description of the data contained in dataset.csv. To contribute, please complete the following tasks:

Tasks

  • Load the data into R or Python
    • Notebooks (e.g. Jupyter) make us smile, but scripts work too.
  • Perform exploratory data analysis.
    • Be verbose as to what you are looking at and why.
    • Keep in mind the primary task below.
  • Perform data cleaning, if necessary.
    • Again, explain your methodology and reasoning behind it.
  • Primary Task - Answer the following questions:
    • Which single field in dataset.csv best describes the SalePrice field?
    • Why did you choose this field? Please thoroughly explain your reasoning.
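
As a starting point for the primary task, the sketch below ranks the numeric fields by the absolute value of their Pearson correlation with SalePrice. It is only an illustration under stated assumptions: the helper names (`load_rows`, `to_float`, `pearson`, `rank_fields`) are hypothetical, the only column name taken from this README is `SalePrice`, and fields that are mostly non-numeric or missing are skipped. Correlation is one possible lens, not the expected answer, and it does not replace the exploratory analysis and written reasoning requested above.

```python
import csv
import math


def load_rows(path):
    """Read a CSV file into a list of dicts, one per row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def to_float(value):
    """Return value as a float, or None if it is missing or non-numeric."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def rank_fields(rows, target="SalePrice"):
    """Rank numeric fields by absolute correlation with the target field."""
    if not rows:
        return []
    results = []
    for field in rows[0]:
        if field == target:
            continue
        pairs = [(to_float(r.get(field)), to_float(r.get(target))) for r in rows]
        pairs = [(x, y) for x, y in pairs if x is not None and y is not None]
        # Assumption: skip fields where fewer than half the rows are numeric.
        if len(pairs) < len(rows) / 2:
            continue
        r = pearson([x for x, _ in pairs], [y for _, y in pairs])
        if r is not None:
            results.append((field, r))
    return sorted(results, key=lambda item: abs(item[1]), reverse=True)
```

Calling `rank_fields(load_rows("dataset.csv"))[:5]` would list the five most strongly correlated numeric fields. Categorical fields are ignored here, so they need separate treatment (for example, comparing SalePrice distributions per category) before settling on a single field.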

Submission

  • Please submit your final notebook/script by email to one of the mentors, and state which project you are interested in working on ("Improve understanding of Firefox growth metrics" or "Finding Representative Users of Prerelease Firefox").

Additional Information:

  • Visualizations make the world a better place! Use them liberally.
  • Show your code as much as possible.
  • Please also explain your code and your thinking process thoroughly and articulately.
  • Write in English as well as in code.
  • Third-party libraries are fine. Just make sure to describe why and how you are using them.