This repository has been archived by the owner on Nov 10, 2020. It is now read-only.

Properly identify withheld values in BEA self-employment data #1870

Closed
shawnbot opened this issue Sep 8, 2016 · 2 comments
Labels
data-mgmt: Use this label for adding and/or updating data on the site. Also can be used for data management too
p2: High: This doesn’t prevent the site from being used, but needs to be addressed in the near term.

Comments

@shawnbot
Contributor

shawnbot commented Sep 8, 2016

Expected behavior

We should have null values in our self-employment YAML files so that we can correctly identify them as withheld, per BEA's descriptions:

(D) Not shown to avoid disclosure of confidential information, but the estimates for this item
    are included in the total.
(L) Less than 10 jobs, but the estimates for this item are included in the total.

Actual behavior

We're parsing BEA's (D) and (L) values as zero, as mentioned in this comment.

This is kind of tricky to debug, because in order to get the self-employment counts we're subtracting the number of wage and salary "mining" (NAICS code 21) jobs from the total number of mining jobs. That happens in our get-bea-data.js script, and I've confirmed that it's producing some of the right numbers by spot-checking the values on the BEA site.

I think that the fix, though, is to treat the self-employment number as withheld if either the total or the wage and salary value is listed as (D) or (L). What do you think, @meiqimichelle?
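To make the proposal concrete, here is a minimal sketch of the "withheld if either input is withheld" rule. The helper names (`parseBeaValue`, `selfEmployment`) are hypothetical and not taken from get-bea-data.js, and treating empty or unparseable cells as withheld is an assumption of this sketch, not something BEA specifies.

```javascript
// Hypothetical helpers; not taken from get-bea-data.js.
const WITHHELD_FLAGS = new Set(['(D)', '(L)']);

// Convert a raw BEA cell to a number, or null when the value is withheld.
// Assumption: empty or unparseable cells are also treated as withheld.
function parseBeaValue(raw) {
  const text = String(raw).trim();
  if (text === '' || WITHHELD_FLAGS.has(text)) {
    return null;
  }
  const value = Number(text.replace(/,/g, ''));
  return Number.isNaN(value) ? null : value;
}

// Self-employment = total jobs - wage and salary jobs.
// The result is withheld (null) if either input is withheld.
function selfEmployment(totalRaw, wageSalaryRaw) {
  const total = parseBeaValue(totalRaw);
  const wageSalary = parseBeaValue(wageSalaryRaw);
  if (total === null || wageSalary === null) {
    return null;
  }
  return total - wageSalary;
}

console.log(selfEmployment('181', '16'));  // 165
console.log(selfEmployment('177', '(L)')); // null
console.log(selfEmployment('(D)', '0'));   // null
```

Returning `null` rather than `0` keeps a genuine zero (as in DC's wage and salary counts) distinguishable from a withheld value when the result is written out to YAML.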

We'll know we're done when...

  • Delaware should have just 165 (181 - 16) jobs in 2006, something representing the 177 - L value (where L means "less [sic] than 10 jobs") in 2007, and withheld values instead of zero for 2008-2014. Note: Delaware in 2007 is the only instance of the L, so we might be able to get away with just subtracting 10 from 177 and calling it a day.
  • Maine should have withheld values in 2008, 2009, and 2012.
  • Rhode Island should have withheld values in 2010, 2011, 2013, and 2014.
  • DC should have values for all years, because it has non-zero values for all employment and all zeroes for wage and salary jobs.
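
For the Delaware "177 - L" case, one way to represent the value is as a range rather than a single number. The sketch below is only an illustration under two assumptions: job counts are whole numbers, so (L) covers 0 through 9, and the function names are hypothetical. It also shows the flat "subtract 10" shortcut from the note in the first bullet for comparison.

```javascript
// Hypothetical sketch: "(L)" means fewer than 10 jobs, so assume 0..9.
const L_MIN = 0;
const L_MAX = 9;

// Bounds on (total - L) when the wage and salary count is "(L)".
function selfEmploymentRangeForL(total) {
  return { min: total - L_MAX, max: total - L_MIN };
}

// The shortcut from the note above: subtract a flat 10.
function selfEmploymentShortcutForL(total) {
  return total - 10;
}

console.log(selfEmploymentRangeForL(177));    // { min: 168, max: 177 }
console.log(selfEmploymentShortcutForL(177)); // 167
```

Under these assumptions the shortcut lands one below the lower bound of the range, which may be close enough for a single data point but is worth noting if the value is displayed.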
@shawnbot shawnbot added data-mgmt Use this label for adding and/or updating data on the site. Also can be used for data management too size:hours labels Sep 8, 2016
@shawnbot shawnbot self-assigned this Sep 8, 2016
@shawnbot shawnbot added this to the Sprint-SingingSwan milestone Sep 8, 2016
@shawnbot shawnbot changed the title Properly identify withheld values in BEA (self-employment) data Properly identify withheld values in BEA self-employment data Sep 12, 2016
@shawnbot shawnbot removed this from the Sprint-SingingSwan milestone Sep 21, 2016
@coreycaitlin
Contributor

Let's try to address this along with #2587

@jennmalcolm jennmalcolm added the p2: High This doesn’t prevent the site from being used, but needs to be addressed in the near term. label May 3, 2018
@jennmalcolm jennmalcolm added this to the Sprint-BenevolentBats milestone May 8, 2018
@jennmalcolm jennmalcolm removed this from the Sprint-BenevolentBats milestone May 14, 2018
@jennmalcolm
Contributor

Through content strategy #2799, we have determined we will not maintain this data in the future. So, I'm closing this issue.
