trholmes/climate_survey

To run the analysis, run plotData.py. The bottom of that file contains several example calls to the main functions.

When starting with a new dataset, you'll first need to populate selections.json with some metadata about that dataset. In particular, any demographic question that you plan to use for comparative plots needs to be included by keyword here. Typically I include gender, race, year of entry or expected graduation year (depending on whether the survey is for grads or undergrads), LGBTQIA+ status, and whether the respondent completed the majority of their education in the US. You'll also add the file path to the CSV you download from the Google Form here. A quick way to get the data you need to populate this file is to turn on the dump_questions option in plotData.py and run it.
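As a rough illustration of the kind of metadata described above, here is a minimal sketch that builds one dataset entry and serializes it as JSON. The dataset name, the key names ("file", "questions"), the question numbers, and the file path are all placeholders I made up for this example; check the existing selections.json and plotData.py for the layout they actually expect.

```python
import json

# Hypothetical layout: every key, number, and path below is a placeholder,
# not the schema that plotData.py actually reads.
selections = {
    "example_dataset": {
        # Path to the CSV exported from the Google Form (placeholder)
        "file": "data/example_responses.csv",
        # Demographic keyword -> question number (placeholder numbers)
        "questions": {
            "gender": 2,
            "race": 3,
            "year": 4,
            "lgbtqia": 5,
            "us_education": 6,
        },
    }
}


def selections_to_json(data):
    """Serialize the metadata dictionary as indented JSON text."""
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    # Print the JSON so it can be compared against the real selections.json
    print(selections_to_json(selections))
```

The question numbers would come from the dump_questions output rather than being typed in by guesswork.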

In addition to populating these question numbers, you also need to decide how you'll build groups large enough to be non-identifying when you plot. getSelections.py is an attempt at streamlining this (though it could be streamlined much further). It gathers up all the race/gender/LGBTQIA+ responses and puts them into simpler groups, essentially whether or not you are part of the "dominant" physics group (white, male, cis-het). If we ever had enough statistics it would be great to separate further, but for now I have just been using two categories in each of these areas. You can take the output of this script and paste it into the top of plotData.py, where simplified_bins is defined. This is sloppy and could obviously be done in a more automated way. To run with these simplified bins rather than plotting every category separately, set simplified=True when calling a function in plotData.py.
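The two-category grouping described above can be sketched roughly as follows. This is not the code in getSelections.py; the function names, the group labels, the sample responses, and the minimum group size of 5 are assumptions chosen only to show the idea.

```python
from collections import Counter


def build_simplified_bins(responses, dominant_answers, dominant_label, other_label):
    """Map every distinct response to one of two group labels."""
    return {
        answer: dominant_label if answer in dominant_answers else other_label
        for answer in set(responses)
    }


def groups_below_threshold(responses, bins, min_size=5):
    """Return the group labels whose respondent count is below min_size."""
    sizes = Counter(bins[answer] for answer in responses)
    return sorted(label for label, n in sizes.items() if n < min_size)


if __name__ == "__main__":
    # Made-up responses, only for illustration
    gender = ["Man", "Woman", "Man", "Non-binary", "Man", "Woman"]
    bins = build_simplified_bins(gender, {"Man"}, "men", "not men")
    print(bins)
    # Any label printed here is too small to plot without risking identification
    print(groups_below_threshold(gender, bins, min_size=5))
```

A check like groups_below_threshold is one way to catch a group that is still too small after simplifying, before anything is plotted.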

You may also want to update the years that go into the simplified bins so that they cover the full range in your data. The range is different for grads and undergrads. The years are listed explicitly to make sure they show up in this order and without gaps. You could also comment that line out entirely to let the bins follow the data.
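To show why an explicit list of years helps, here is a small sketch (not taken from plotData.py; the function name and the sample years are invented) that counts responses against a fixed, ordered list of bins, so that years with no responses still appear as zero instead of disappearing.

```python
from collections import Counter


def counts_in_order(values, ordered_bins):
    """Count values per bin, keeping the given order and filling gaps with 0."""
    counts = Counter(values)
    return [(b, counts.get(b, 0)) for b in ordered_bins]


if __name__ == "__main__":
    entry_years = [2021, 2019, 2021]  # made-up responses
    # 2020 has no responses but still gets its own (empty) bin
    print(counts_in_order(entry_years, [2019, 2020, 2021]))
```

If the bins were derived from the data alone, the empty year would simply be missing and the axis order would depend on the order of the responses.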

Once all of this is set up for a dataset, the main functions you'll want to use are at the bottom of plotData.py. To plot one year only, set the years variable to a list containing just that year. To compare multiple years, put them all in that list, but don't try to do other demographic group comparisons at the same time. The examples at the very bottom of the script show how you might track a given population's experiences over time using the selections option. This is a little bit untested, so please verify that the results are right if you use it.
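One way to verify the results when tracking a population over time is to count by hand how many respondents match the selection in each year and compare that with what the plots show. The sketch below is independent of plotData.py: the row format, the column names, and the selection format (column name mapped to a set of allowed answers) are assumptions for illustration and may not match how the selections option is actually structured.

```python
def matches(row, selection):
    """True if the row satisfies every column -> allowed-answers entry."""
    return all(row.get(col) in allowed for col, allowed in selection.items())


def count_by_year(rows, selection, years, year_key="survey_year"):
    """Number of rows matching the selection, for each requested survey year."""
    return {
        year: sum(1 for r in rows if r.get(year_key) == year and matches(r, selection))
        for year in years
    }


if __name__ == "__main__":
    # Made-up rows; real data would come from the survey CSV files
    rows = [
        {"survey_year": 2021, "gender": "Woman", "entry_year": 2019},
        {"survey_year": 2021, "gender": "Man", "entry_year": 2019},
        {"survey_year": 2022, "gender": "Woman", "entry_year": 2019},
        {"survey_year": 2022, "gender": "Woman", "entry_year": 2020},
    ]
    selection = {"gender": {"Woman"}, "entry_year": {2019}}
    print(count_by_year(rows, selection, [2021, 2022]))
```

If the counts from a check like this disagree with the totals behind a plot, the selection is probably not being applied the way you intended.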

About

Analyze the results of the climate survey
