
Comparing changes

base repository: barryclark/jekyll-now
base: master
head repository: moniquewong/data-decisions
compare: master
Able to merge. These branches can be automatically merged.
  • 12 commits
  • 7 files changed
  • 2 contributors

Commits on Sep 16, 2019

  1. Updated _config.yml
     wongmon committed Sep 16, 2019 · 2c7f150 (unverified: no user is associated with the committer email)
  2. Added blog post
     wongmon committed Sep 16, 2019 · 0de8fcb
  3. Added base URL to _config.yml
     wongmon committed Sep 16, 2019 · c9bea2a
  4. ffccd64

Commits on Sep 28, 2019

  1. Add text of blog post
     moniquewong committed Sep 28, 2019 · d13cc14
  2. Added some formatting
     moniquewong committed Sep 28, 2019 · cf92649
  3. Added Zuon images
     moniquewong committed Sep 28, 2019 · cddef69
  4. 67feecb
  5. Tried fixing image link
     moniquewong committed Sep 28, 2019 · 2404013
  6. Added all image links
     moniquewong committed Sep 28, 2019 · e24d402
  7. Fixed headings
     moniquewong committed Sep 28, 2019 · 50ca6a0
  8. Fix spacing after images
     moniquewong committed Sep 28, 2019 · 9de7289

12 changes: 6 additions & 6 deletions _config.yml
@@ -3,13 +3,13 @@
 #

 # Name of your site (displayed in the header)
-name: Your Name
+name: Data and Decisions

 # Short bio or description (displayed in the header)
-description: Web Developer from Somewhere
+description: Monique Wong - management consultant learning data science

 # URL of your avatar or profile pic (you could use your GitHub profile pic)
-avatar: https://raw.githubusercontent.com/barryclark/jekyll-now/master/images/jekyll-logo.png
+avatar: https://scontent-sea1-1.xx.fbcdn.net/v/t1.0-1/p320x320/13015567_10204961184062778_1794221855304279572_n.jpg?_nc_cat=111&_nc_oc=AQlqv3m1UyE-DNw9POqfVKHD38LG1QBsPT2UMvKEFnz5yi-_A3_-pUkGjAtD5zyzi6g&_nc_ht=scontent-sea1-1.xx&oh=4873bd33d1036a5ed51415786cb6147f&oe=5DF69AF2

 #
 # Flags below are optional
@@ -21,12 +21,12 @@ footer-links:
   email:
   facebook:
   flickr:
-  github: barryclark/jekyll-now
+  github: moniquewong/data-decisions
   instagram:
   linkedin:
   pinterest:
   rss: # just type anything here for a working RSS icon
-  twitter: jekyllrb
+  twitter:
   stackoverflow: # your stackoverflow profile, e.g. "users/50476/bart-kiers"
   youtube: # channel/<your_long_string> or user/<user-name>
   googleplus: # anything in your profile username that comes after plus.google.com/
@@ -47,7 +47,7 @@ url:
 # (http://yourusername.github.io/repository-name)
 # and NOT your User repository (http://yourusername.github.io)
 # then add in the baseurl here, like this: "/repository-name"
-baseurl: ""
+baseurl: "https://moniquewong.github.io/data-decisions/"

 #
 # !! You don't need to change any of the configuration flags below !!
10 changes: 0 additions & 10 deletions _posts/2014-3-3-Hello-World.md

This file was deleted.

14 changes: 14 additions & 0 deletions _posts/2019-09-16-Intro.md
@@ -0,0 +1,14 @@
---
layout: post
title: Data Science and Decision Making - "Day 1" Aspirations
---

Over the next year, I am embarking on a Master of Data Science degree. What is the long-term aspiration? I want to combine my experience as a management consultant with data science skills to help organizations formalize and institutionalize data-driven decision-making that is clear, principles-based and thoughtful. I hope this will lead to happier, more engaged individuals who feel more purposeful in the place they call work.

The majority of the workforce is unhappy and disengaged: the surveys show it, and I have witnessed it firsthand. Interacting with individuals at the clients I have worked with, I cannot help but believe that this negative energy is brought home, affecting our loved ones and the way we raise our children, which in turn shapes the worldview of the next generation. What I have also witnessed is that clarity of direction, shared values and good decision-making make a difference. This applies to C-suite decisions as well as the day-to-day choices that front-line staff make. My experience has also taught me that these decisions are nearly always made imperfectly, even given the time and resources (including data) available to us.

Take, for example, the strategy formation process. What continues to surprise me is that we expect to make good decisions from a process designed to freeze our view of the world at an arbitrary point in time (whenever we happen to run the process) and then make all the important decisions for the next 3-5 years of an organization’s lifetime. This process seems archaic for two reasons:
1) The world now changes at a rapid pace. To see that, just ask yourself which technologies you use today that didn’t exist five years ago. Instead of asking how we best position ourselves for the most likely future, we have to ask which possible futures could come true and, given that, which principles we should use to make future decisions. This means making decisions not at one point in time but as the right information becomes available.
2) Data is ubiquitous and our ability to process it has vastly improved. Nimble decision-making is possible because deep, varied insights can be presented to us continuously.

I believe agile organizations with clear decision-making principles make for high-performing teams and happier individuals. People know they are part of something greater because they understand the organization’s direction, why it may be changing course and how they fit into the bigger picture. Whatever path I pursue after completing this Data Science program, I hope to contribute to better decision-making practices.
70 changes: 70 additions & 0 deletions _posts/2019-09-28-Discriminatory-Bias.md
@@ -0,0 +1,70 @@
---
layout: post
title: Is predicting better predicting fairly?

---
### Describing Discriminatory Bias in Machine Learning
Author: Monique Wong

Most of us have an idea of what discriminatory bias means. It’s when characteristics such as gender, age, race, religious beliefs or sexual orientation change the way we make decisions about a person.

Machine learning may seem like a daunting concept. For now, let’s think of it as a way for a computer to recognize patterns in a lot of data in order to help us make decisions.
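
To make that concrete, here is a minimal sketch in Python using scikit-learn. Everything in it is invented for illustration: a handful of made-up loan applicants, a pattern found in their histories, and a suggested decision for a new case.

```python
# A toy illustration: the computer "learns" a pattern from past examples
# and uses it to suggest a decision for a new case. All data is made up.
from sklearn.tree import DecisionTreeClassifier

# Each row is a past loan applicant: [income in $1000s, years employed]
past_applicants = [[30, 1], [80, 10], [45, 3], [90, 12], [25, 0], [70, 8]]
repaid_loan     = [0,       1,        0,       1,        0,       1]   # 1 = repaid

model = DecisionTreeClassifier(random_state=0)
model.fit(past_applicants, repaid_loan)   # find patterns in the past data

new_applicant = [[60, 5]]
print(model.predict(new_applicant))       # suggested decision for the new case
```

Real systems differ mainly in scale: far more examples, far more features, and decisions with much higher stakes.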

At first blush, discriminatory bias seems like the domain of lawyers and social justice activists, while machine learning belongs in the world of statistics and computer science. How these two concepts come together is not obvious. I assure you, though, that the relationship between them has important implications for whether organizations should embrace machine learning.

## A fictional example

Let’s imagine that in the fictional town of Zuon, there are two dominant population groups: Purple Rectangles and Orange Circles.

Purple Rectangles live in relative comfort. They live in beautiful homes nestled in safe communities. Purple Rectangle children grow up happy and healthy, and they become productive members of society.

Orange Circles live on the outskirts of Zuon. These neighborhoods are not as friendly. Orange Circle children grow up afraid, untrusting and unsure of how they fit into society. Some learn to steal to provide food and shelter for themselves. Often, they don’t finish high school. Many fall into a career of crime.

<img src="../images/zuon-1.png" width="300" height="300">

The mayor of Zuon believes that safe communities are the key to a higher quality of life. He begins to police the outskirts of Zuon more heavily. Children are arrested for stealing to deter early criminal tendencies. Not seeing results, he hires a team of data scientists to predict the likelihood that a convicted criminal will re-offend. This way, he thinks, he can do a better job of keeping potentially violent lawbreakers off the streets.

The data scientists work away and come up with a solution. To make its predictions, the solution takes into account a dizzying number of factors, from the age at which the convicted person committed their first crime to the criminal history of their parents. The tool is then used to help Zuon’s judges make decisions such as the length of jail time.
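
To give a flavour of what such a tool might look like under the hood, here is a hedged sketch in Python. The Zuon records, the feature names and the choice of a logistic regression are all assumptions invented for this post, not a description of any real system.

```python
# Hypothetical sketch of Zuon's risk tool: a classifier trained on past
# records to score how likely a convicted person is to re-offend.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Invented historical records (a real tool would use many more factors).
records = pd.DataFrame({
    "age_at_first_offense": [14, 32, 16, 28, 15, 40, 17, 35],
    "parent_has_record":    [1,  0,  1,  0,  1,  0,  1,  0],
    "prior_offenses":       [3,  0,  2,  1,  4,  0,  3,  0],
    "reoffended":           [1,  0,  1,  0,  0,  0,  1,  1],  # outcome to predict
})

X = records.drop(columns="reoffended")
y = records["reoffended"]
risk_model = LogisticRegression().fit(X, y)

# A judge would see a "risk score": the predicted probability of re-offending.
new_defendant = pd.DataFrame({
    "age_at_first_offense": [16],
    "parent_has_record":    [1],
    "prior_offenses":       [2],
})
print(risk_model.predict_proba(new_defendant)[:, 1])
```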

<img src="../images/zuon-2.png" width="300" height="300">

Over the years, more and more Orange Circles end up in prison for ever longer periods of time. The system is objective (it’s based on math and data!) and well-intentioned (who doesn’t want more criminals off the streets?). What went wrong?

<img src="../images/zuon-3.png" width="300" height="300">

It turns out that discrimination doesn’t have to be ill-intentioned. No one in Zuon set out to discriminate by shape or colour. With machine learning, we accidentally created a predictive model that is discriminatory: it produces very different outcomes for Purple Rectangles than for Orange Circles.

## How did this happen?

There are several causes of discriminatory bias in machine learning algorithms.

1) Data scientists are not telling the machine to optimize for fairness: When we ask a computer to do everything it can to accomplish a goal, it does exactly that and nothing more. So, when we ask it to assess the likelihood of a person re-offending while minimizing the errors it makes, it does only that. It gives no consideration to whether it is classifying one group as significantly more prone to crime than another. No “fairness” objective is programmed into the machine, even though it is possible to include one.

2) The data that data scientists use reflects a biased world: The data we give a computer to learn from is drawn from a reality that can be discriminatory towards certain groups. In Zuon, the neighborhoods where Orange Circles live are more heavily policed, so if a Purple Rectangle and an Orange Circle each commit a crime, the Orange Circle is more likely to be caught. The data therefore shows higher incarceration rates for Orange Circles. The algorithm learns society’s biases and perpetuates them in its predictions.

3) Data scientists select features that reduce error, not bias: A data scientist chooses which data an algorithm uses based on the performance of the model, and the effect of feature selection on bias is often unclear. Zuon’s predictive model doesn’t explicitly label defendants as Purple Rectangles or Orange Circles, but it does use family criminal history and age of first arrest as inputs. Given Zuon’s societal inequities, more Orange Circles come from families with criminal records and were first arrested at an early age compared to Purple Rectangles. Indirectly, the selected features create a model that is biased against Orange Circles, as the sketch below illustrates.

In the real world, these practices can lead to a self-learning algorithm whose outcomes treat people unequally on the basis of gender, age, race, religion or sexual orientation.
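
Here is a small synthetic demonstration of the third point (and of the biased data behind the second): even when the model is never shown the group label, a feature correlated with it carries the bias through. All numbers below are invented to describe Zuon, not any real dataset.

```python
# Synthetic demo: the model never sees "group", yet a proxy feature that is
# correlated with group membership lets the bias through anyway.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, n)            # 0 = Purple Rectangle, 1 = Orange Circle

# In biased Zuon, Orange Circles tend to be arrested younger (a proxy for group).
age_at_first_arrest = np.where(group == 1,
                               rng.normal(16, 2, n),
                               rng.normal(26, 2, n))

# Recorded re-offense labels also reflect heavier policing of Orange Circles.
recorded_reoffense = (rng.random(n) < np.where(group == 1, 0.6, 0.3)).astype(int)

X = age_at_first_arrest.reshape(-1, 1)   # note: group itself is NOT a feature
model = LogisticRegression().fit(X, recorded_reoffense)
risk = model.predict_proba(X)[:, 1]

print("Mean predicted risk, Purple Rectangles:", round(float(risk[group == 0].mean()), 2))
print("Mean predicted risk, Orange Circles:  ", round(float(risk[group == 1].mean()), 2))
```

Dropping the group label is clearly not enough: the bias travels through whatever remains correlated with it.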

## Real-life examples

The story of Zuon is not as fictional as it seems. A similar predictive model is being used across several states in the U.S. COMPAS, which stands for Correctional Offender Management Profiling for Alternative Sanctions, is an algorithm used to assess a criminal defendant’s likelihood of becoming a recidivist, that is, someone who will re-offend.

ProPublica, a non-profit investigative journalism organization, analyzed COMPAS for discriminatory bias. Comparing actual recidivism rates to COMPAS predictions for 10,000 criminal defendants, ProPublica found the algorithm was about as likely to correctly predict recidivism for black defendants as for white defendants. However, black defendants who did not re-offend were more often wrongly predicted to be at high risk of recidivism, whereas white defendants who did re-offend were more often wrongly predicted to be at low risk. COMPAS continues to be used in courts today to help judges determine sentencing.
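
The disparity ProPublica described is about who the errors fall on, not about overall accuracy. The sketch below shows the kind of group-by-group error check involved, run on a handful of invented rows rather than the real COMPAS data.

```python
# Sketch of a group-wise error-rate check, on invented data (not COMPAS data).
# "False positive" here means: predicted high risk, but the person did not re-offend.
import pandas as pd

df = pd.DataFrame({
    "group":          ["black"] * 4 + ["white"] * 4,
    "predicted_high": [1, 1, 1, 0,    1, 0, 0, 0],
    "reoffended":     [1, 0, 0, 0,    1, 1, 0, 0],
})

for name, g in df.groupby("group"):
    non_reoffenders = g[g.reoffended == 0]
    reoffenders = g[g.reoffended == 1]
    fpr = (non_reoffenders.predicted_high == 1).mean()  # flagged high risk, didn't re-offend
    fnr = (reoffenders.predicted_high == 0).mean()      # rated low risk, did re-offend
    print(f"{name}: false positive rate {fpr:.0%}, false negative rate {fnr:.0%}")
```

In ProPublica’s analysis the overall accuracy was similar across groups, which is why a check like this one, broken out by group and by error type, was needed to surface the disparity.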

COMPAS is not the only discriminatory application of machine learning. Examples of discriminatory bias have also been found in Amazon’s recruiting system, in Facebook’s advertising and in mortgage lending.

## Last thoughts

As data scientists and decision makers, we have a responsibility to use technology wisely. While machine learning can certainly improve performance and quality of life, we need to be aware of and plan for its flaws. Christian Lous Lange, a Nobel Peace Prize winner, once said: “Technology is a useful servant but a dangerous master.” Let us be thoughtful and use machine learning to move society forward instead of further entrenching society’s inequities.

## Sources of inspiration:

- https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
- https://towardsdatascience.com/is-your-machine-learning-model-biased-94f9ee176b67
- https://pair-code.github.io/what-if-tool/ai-fairness.html
- https://www.technologyreview.com/s/612876/this-is-how-ai-bias-really-happensand-why-its-so-hard-to-fix/
- https://medium.com/@ericakochi/how-to-prevent-discriminatory-outcomes-in-machine-learning-3380ffb4f8b3
- https://towardsdatascience.com/machine-learning-and-discrimination-2ed1a8b01038
- https://towardsdatascience.com/understanding-and-reducing-bias-in-machine-learning-6565e23900ac
- https://www.theguardian.com/technology/2017/apr/13/ai-programs-exhibit-racist-and-sexist-biases-research-reveals
Binary file added images/zuon-1.png
Binary file added images/zuon-2.png
Binary file added images/zuon-3.png