This folder contains the files dataset.csv and data_description.txt. Please refer to the latter file for a description of the data contained in dataset.csv.

To contribute, please perform the following:
- Load the data into R or Python.
  - Notebooks (e.g. Jupyter) make us smile, but scripts work too.
- Perform exploratory data analysis.
  - Be verbose as to what you are looking at and why.
  - Keep in mind the primary task below.
- Perform data cleaning, if necessary.
  - Again, explain your methodology and reasoning behind it.
- Primary Task - Answer the following questions:
  - Which single field in dataset.csv best describes the SalePrice field?
  - Why did you choose this field? Please thoroughly explain your reasoning.
- Please submit your final notebook/script by email to one of the mentors, and include which project you are interested in working on ("Improve understanding of Firefox growth metrics" or "Finding Representative Users of Prerelease Firefox")
- Visualizations make the world a better place! Use them liberally.
- Show your code as much as possible.
- Please also explain your code and your thinking process thoroughly and articulately.
- Write in English as well as in code.
- Third-party libraries are fine. Just make sure to describe why and how you are using them.
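
As a possible starting point for the loading and primary-task steps above, here is a minimal sketch that uses only the Python standard library. It assumes dataset.csv has a header row containing a SalePrice column, and it treats "best describes" as "largest absolute Pearson correlation", which is only one of several reasonable interpretations. The helper names (load_rows, to_float, pearson, rank_fields) are hypothetical and chosen for illustration. Categorical fields are skipped entirely, so this is not a complete answer to the task.

```python
import csv
import math


def load_rows(path):
    """Read a CSV file into a list of dicts, one dict per row."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def to_float(value):
    """Return value as a float, or None if it is missing or non-numeric."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_fields(rows, target="SalePrice", min_pairs=3):
    """Rank numeric fields by |Pearson r| against the target field."""
    scores = {}
    for field in rows[0]:
        if field == target:
            continue
        # Keep only rows where both the field and the target are numeric.
        pairs = [(to_float(r[field]), to_float(r[target])) for r in rows]
        pairs = [(x, y) for x, y in pairs if x is not None and y is not None]
        if len(pairs) < min_pairs:
            continue  # too few numeric values: treat the field as non-numeric
        xs, ys = zip(*pairs)
        scores[field] = pearson(xs, ys)
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
```

Calling rank_fields(load_rows("dataset.csv")) returns a list of (field, r) pairs, strongest linear relationship first. A ranking like this is a prompt for further exploration rather than a conclusion: it says nothing about non-linear relationships, outliers, or categorical fields, all of which deserve plots and a written explanation.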