Issue 65: Chapter 2 adjustments #73

Open
wants to merge 4 commits into
base: master
2 changes: 1 addition & 1 deletion _sources/Introduction/introduction.rst
Original file line number Diff line number Diff line change
Expand Up @@ -261,7 +261,7 @@ discover your ability to perform complex analyses to solve real-world problem.

There is another definition of the learning zone that is related to what we
have been talking about. In this amazing
`TED talk: How to get better at the things you care about <https://www.ted.com/talks/eduardo_briceno_how_to_get_better_at_the_things_you_care_about>`_,
TED talk: How to get better at the things you care about,
Eduardo Briceño talks about the "performance zone" versus the "learning zone."
Please watch it.

Binary file added _sources/Statistics/Figures/crossreference1.PNG
Binary file added _sources/Statistics/Figures/crossreference2.PNG
Binary file added _sources/Statistics/Figures/crossreference3.PNG
Binary file added _sources/Statistics/Figures/crossreference4.PNG
Binary file added _sources/Statistics/Figures/datatypes.png
29 changes: 24 additions & 5 deletions _sources/Statistics/cs1_exploring_happiness.rst
Expand Up @@ -24,10 +24,21 @@ factors may contribute to the happiness of a country, and we will use spreadshee
explore and analyze what factors may be most
important in determining a country's happiness.

We will start by loading the
`happiness_2017.csv <../_static/happiness_2017.csv>`_ file into Google Sheets.
The list below gives a bit of detail about each of the columns on the
spreadsheet.
We will use Google Sheets rather than Microsoft Excel because it makes it easy to share a link and to work on the
same dataset in real time with a team. The same Solver tool is also available in Microsoft Excel.

.. _googlesheet_setup:

We will start by loading the `happiness_2017.csv <../_static/happiness_2017.csv>`_ file into Google Sheets.

1. Go to `Google Sheets <https://www.google.com/sheets/about/>`_.

2. Click "Go to Sheets".

3. Open a blank spreadsheet, then click File at the top and choose Import to load the `happiness_2017.csv <../_static/happiness_2017.csv>`_ file.

The list below gives a bit of detail about each of the columns on the spreadsheet.

The following definitions are reproduced from
`World Happiness Report 2018 <http://worldhappiness.report/ed/2018/>`_.
Expand All @@ -43,6 +54,7 @@ The following definitions are reproduced from
per capita, as this form fits the data significantly better than GDP per
capita.


2. The time series of healthy life expectancy at birth are constructed based on
data from the World Health Organization (WHO) and WDI. WHO publishes the data
on healthy life expectancy for the year 2012. The time series of life
Expand All @@ -53,33 +65,40 @@ The following definitions are reproduced from
country-specific ratios to other years to generate the healthy life
expectancy data.


3. Social support is the national average of the binary responses (either 0 or
1) to the Gallup World Poll (GWP) question "If you were in trouble, do you
have relatives or friends you can count on to help you whenever you need
them, or not?"


4. Freedom to make life choices is the national average of binary responses to
the GWP question "Are you satisfied or dissatisfied with your freedom to
choose what you do with your life?"


5. Generosity is a function of the national average of GWP responses to the
question "Have you donated money to a charity in the past month?" on GDP per
capita.


6. Perceptions of corruption are the average of binary answers to two GWP
questions: "Is corruption widespread throughout the government or not?" and
"Is corruption widespread within businesses or not?". Where data for
government corruption are missing, the perception of business corruption is
used as the overall corruption-perception measure.


7. Positive affect is defined as the average of previous-day affect measures for
happiness, laughter, and enjoyment for GWP waves 3-7 (years 2008 to 2012, and
some in 2013). It is defined as the average of laughter and enjoyment for
other waves where the happiness question was not asked.


8. Negative affect is defined as the average of previous-day affect measures for
worry, sadness, and anger for all waves.
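
Several of the measures above (social support, freedom to make life choices, perceptions of corruption) are national averages of binary responses. As a minimal sketch, using made-up responses rather than real GWP data, the average of 0/1 answers is simply the share of respondents who answered 1. The name ``binary_average`` is hypothetical and only for illustration.

```python
from statistics import mean

# Hypothetical survey responses: 1 = "yes", 0 = "no". Not real GWP data.
responses = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]


def binary_average(values):
    """Return the average of 0/1 responses, which equals the share of 1s."""
    return mean(values)


share_yes = binary_average(responses)
print(share_yes)
```

In a spreadsheet, the same calculation is an ``AVERAGE`` over the column of 0/1 responses.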


In this first part, we will review and practice some spreadsheet calculations by
doing some exploratory data analysis. If you have never used a spreadsheet
before, don't worry, you will catch on quickly. Remember that we are just exploring at this
Expand Down Expand Up @@ -206,7 +225,7 @@ Summary Statistics
.. fillintheblank:: fb_avghappiness

Calculating the average happiness score. You should include three
digits to the right of the decimal point.|blank|
digits to the right of the decimal point.

- :5.399: Is the correct answer
:5.398: 5.3989 should be rounded up to 5.399
178 changes: 124 additions & 54 deletions _sources/Statistics/cs2_exploring_business_data.rst

Large diffs are not rendered by default.

10 changes: 9 additions & 1 deletion _sources/Statistics/glossary.rst
Expand Up @@ -19,11 +19,19 @@ Definitions

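**Histogram:** Is a graph that displays the distribution of numerical data by grouping values into bins and showing how many values fall in each bin.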

**Mean:** Is the average of a set of values: their sum divided by how many values there are.

**Median:** Is the middle value of the dataset when the values are sorted.

**Mode:** Is the most frequently occurring value in the dataset.

**Pearson correlation:** Is a measure of the strength and direction of a linear relationship between two variables. It ranges from -1 to +1: a value of -1 indicates a perfect negative linear relationship, +1 a perfect positive linear relationship, and 0 no linear relationship.

**Pivot table:** Is a function used in Google Sheets to summarize, organize, sort, and perform other operations on data sets.

**Standard Deviation:** Is used to measure the degree of variation of a set of values.
**Range:** Is the difference between the lowest and highest values of the dataset.

**Standard Deviation:** Is a measure of how much the values in a set vary. It shows how far the values typically lie from the mean, and therefore how spread out the data are.
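
The definitions above can be checked on a small example. The sketch below uses Python's standard ``statistics`` module on made-up numbers (not taken from the happiness dataset). The helpers ``value_range`` and ``pearson`` are hypothetical names written out for illustration, and the sample standard deviation (dividing by n - 1) is assumed here; the population version divides by n instead.

```python
from math import sqrt
from statistics import mean, median, mode, stdev

# Made-up values for illustration only.
x = [2, 4, 4, 5, 10]
y = [1, 3, 2, 5, 9]


def value_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    # Sum of products of deviations from each mean.
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    # Sums of squared deviations for each sequence.
    ss_a = sum((ai - mean_a) ** 2 for ai in a)
    ss_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(ss_a * ss_b)


summary = {
    "mean": mean(x),
    "median": median(x),
    "mode": mode(x),
    "range": value_range(x),
    "standard deviation": stdev(x),  # sample standard deviation
    "pearson(x, y)": pearson(x, y),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The spreadsheet functions ``AVERAGE``, ``MEDIAN``, ``MODE``, ``STDEV``, and ``CORREL`` compute the corresponding quantities in Google Sheets; the range can be obtained as ``MAX`` minus ``MIN``.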

Keywords
--------