Data and analysis supporting BuzzFeed News' reporting on Harvey-related industrial emissions in Texas.

BuzzFeedNews/2017-09-harvey-emissions-update

Analysis of Harvey-related TCEQ emissions reports — September update

This repository contains data and Python code used to analyze emissions reports submitted by industrial facilities to the Texas Commission on Environmental Quality's Air Emission Event Reporting Database. It updates, and slightly expands, a similar analysis from late August.

Please see this related article for additional context.

Data

Inputs

The main inputs are the TCEQ emissions reports, scraped from the commission's database. To collect recent reports, we started at report number 265500 and incremented the number until no further reports could be found, as of the morning of Sept. 14, 2017. The raw report pages are available as HTML files in the inputs/scraped-reports folder.
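The incrementing approach described above can be sketched roughly as follows. This is an illustration, not the scraper used for this repository: `fetch_report` is a hypothetical callable standing in for whatever retrieves a report page, and the `max_misses` stopping rule is an assumption, since the text says only that scraping continued until no more reports could be found.

```python
from typing import Callable, Dict, Optional


def scrape_sequential(
    fetch_report: Callable[[int], Optional[str]],
    start: int = 265500,
    max_misses: int = 10,
) -> Dict[int, str]:
    """Collect report pages by incrementing the report number from `start`.

    `fetch_report` (hypothetical) returns a report's HTML, or None when no
    report exists for that number. Scraping stops after `max_misses`
    consecutive missing reports (an assumed stopping rule).
    """
    reports: Dict[int, str] = {}
    misses = 0
    number = start
    while misses < max_misses:
        html = fetch_report(number)
        if html is None:
            misses += 1
        else:
            reports[number] = html
            misses = 0  # a found report resets the run of misses
        number += 1
    return reports
```

Tolerating a few consecutive misses, rather than stopping at the first one, guards against gaps in the numbering, if any exist.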

For the 48 facilities we identified as reporting Harvey-related emissions, we also scraped their reports of emissions events that began or ended in 2015 or 2016, using TCEQ's Central Registry (e.g.). Those historical emissions reports can be found in the inputs/scraped-reports-historical folder.

We also created a text file, disaster-declaration-counties.txt, listing the 54 counties that Gov. Greg Abbott included on the State Disaster Declaration through Aug. 27.

Finally, the file reports-to-ignore.txt includes emissions reports that, based on reporting, either appear to be duplicative of subsequent reports or appear to be unrelated to Harvey.
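As a rough sketch, both text files could be loaded into sets for later lookups. The one-entry-per-line layout is an assumption about the files' format, and `parse_entries` is a hypothetical helper, not a function from the notebooks.

```python
from typing import Set


def parse_entries(text: str) -> Set[str]:
    """Return the non-blank lines of `text`, stripped of surrounding whitespace.

    Assumes one entry (county name or report number) per line.
    """
    return {line.strip() for line in text.splitlines() if line.strip()}
```

For example, `parse_entries(open(path).read())` would give a set of county names, where `path` points to wherever disaster-declaration-counties.txt lives in your checkout.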

Outputs

In the 00-parse-reports notebook, we extract structured data from the raw HTML reports, and save it to two files.

In the 01-analyze-reports notebook, we analyze the data extracted from the reports, limiting the findings to reports (a) in the 54 counties above, (b) indicating an event-beginning date of August 23 or later, and (c) of the type "AIR SHUTDOWN", "AIR STARTUP", or "EMISSIONS EVENT". The main results can be found in two files.
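The three criteria can be expressed as a simple filter. The sketch below is illustrative only: the record keys (`county`, `began`, `event_type`) are hypothetical, not the column names used in the notebooks, and the year 2017 for the August 23 cutoff is inferred from the scrape date mentioned above.

```python
from datetime import date
from typing import Dict, Iterable, List, Set

# Report types named in the analysis description.
EVENT_TYPES = {"AIR SHUTDOWN", "AIR STARTUP", "EMISSIONS EVENT"}
# Cutoff for the event-beginning date; the year is an assumption.
CUTOFF = date(2017, 8, 23)


def filter_reports(reports: Iterable[Dict], counties: Set[str]) -> List[Dict]:
    """Keep reports that meet all three criteria (keys are hypothetical)."""
    return [
        r
        for r in reports
        if r["county"] in counties          # (a) in a disaster-declaration county
        and r["began"] >= CUTOFF            # (b) event began Aug. 23 or later
        and r["event_type"] in EVENT_TYPES  # (c) one of the three report types
    ]
```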

Reproducibility

To reproduce the findings, you'll need to do the following:

  • Ensure that you have installed Jupyter, Python, and the Python libraries listed in requirements.txt
  • Clear the output/ directory. (Shortcut: run make clear.)
  • Use Jupyter to run each notebook in the notebooks/ directory consecutively. (Shortcut: run make reproduce; requires Python 3.)

Feedback / Questions?

Contact Jeremy Singer-Vine at jeremy.singer-vine@buzzfeed.com.

Looking for more from BuzzFeed News? Click here for a list of our open-sourced projects, data, and code.
