Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added descriptions of the EIA861 data #3808

Merged
97 changes: 83 additions & 14 deletions docs/templates/eia861_child.rst.jinja
Original file line number Diff line number Diff line change
@@ -1,11 +1,21 @@
{% extends "data_source_parent.rst.jinja" %}

{% block background %}
EIA Form EIA-861 (and the short form EIA-861S) make up the Annual Electric Power
Industry Report, collecting data from distribution utilities and power marketers.
It is a census of all United States electric utilities. As of 2023 PUDL
only incorporates the annual data, but there's also a less detailed monthly `Form
EIA-861M <https: //www.eia.gov/electricity/data/eia861m />`__.
EIA Form EIA-861 (and the short form EIA-861S) make up the Annual Electric Power Industry Report.
This survey is a census of all U.S. electric utilities and collects information about retail
sales of electricity and associated revenue on an annual basis.


As of 2023, the EIA-861 Form is organized into the following schedules:

* **Schedule 1:** Identification
* **Schedule 2:** Energy Sources Data
* **Schedule 3:** Distribution Information
* **Schedule 4:** Sales Data
* **Schedule 5:** Mergers and/or Acquisition Information
* **Schedule 6:** Demand-Side Management Information
* **Schedule 7:** Net-Metering Programs Information
* **Schedule 8:** Service Territory Information

There are more than 20 different spreadsheets included in the annual zipfiles. For
details, see the `official EIA-861 page <{{ source.path }}>`__
Nancy9ice marked this conversation as resolved.
Show resolved Hide resolved
Expand All @@ -18,22 +28,81 @@ details, see the `official EIA-861 page <{{ source.path }}>`__
{% endblock %}

{% block availability %}
PUDL incorporates all annual EIA-861 data.
Earlier data (1990-2000) exists but has not been incorporated into PUDL yet.
Nancy9ice marked this conversation as resolved.
Show resolved Hide resolved
There is also a less detailed monthly
`Form EIA-861M <https: //www.eia.gov/electricity/data/eia861m />`__
that is not incorporated into PUDL.
{% endblock %}

{% block respondents %}
The Form EIA-861 must be completed by electric power industry entities such as, but not limited to,
electric utilities, all Demand Side Management (DSM) Program Managers (entities responsible for
conducting or administering a DSM program), wholesale power marketers, energy service providers,
electric power producers, transmission owners, transmission operators, and Third Party Owners of
solar PV (TPO). Responses are collected at the operating company level (not at the holding company level).

EIA-861S is intended for smaller bundled-service utilities and requires less detailed responses.
Respondents respond to either Form EIA-861 or Form EIA-861S.
However, to maintain data quality, respondents reporting to EIA-861S are required to complete
the full form every eight years.
{% endblock %}

{% block original_data %}
EIA typically publishes 861 data from the prior year in two rounds: an early release in the summer and a final release in the fall.
The data are published on the EIA website and distributed as a collection of spreadsheets.
The content of the spreadsheets varies from year to year as the questions in the form are updated.
EIA also periodically changes the naming and structure of the spreadsheets without warning.
Older “final” data may also be revised several years after it was published.
To ensure reproducible analyses, we archive `versioned snapshots of the EIA-861 data on Zenodo <https://zenodo.org/records/10204708>`__.
These archives are periodically refreshed with new data from the `EIA website <{{ source.path }}>`__.

To understand the details of how the form and data have evolved over time,
we recommend reading the Form Instructions from different years, linked above.
Nancy9ice marked this conversation as resolved.
Show resolved Hide resolved
{% endblock %}

{% block notable_irregularities %}
* In 2012-2013 the EIA-861 was overhauled. Several tables were discontinued, and many
more were added. Similar kinds of information reported in the two regimes may not be
directly comparable.
* Most of the EIA-861 data that's available in the PUDL DB has not yet been fully
normalized, meaning it contains many duplicate copies of the same information, which
may not always be internally consistent.
* We also have not yet integrated the Balancing Authority and Utility IDs reported in
the EIA-861 into our entity tables, so for the moment we don't have any foreign key
constraints enabled on the EIA-861 tables.

Non-unique utility IDs
----------------------
The Freedom of Information Act (FOIA), 5 U.S.C. §552, the Department of Energy (DOE) regulations,
10 C.F.R. §1004.11, implementing the FOIA, and the Trade Secrets Act, 18 U.S.C. §1905 allows
qualifying respondents to restrict access to their data. According to sources at the EIA,
approximately 3 respondents have used this as a means to keep their utility-level data proprietary.
These entries appear under the utility id ``88888`` in the data.

The EIA also performs state-level imputations and adjustments for more accurate state-level analysis.
These entries appear under the utility id ``99999`` and are not intended for use in utility-level
aggregations.

Form changes and table discrepancies
------------------------------------
Not all of the EIA-861 tables exist for all of the reporting years. Both the Form and its output files
have changed over time; most notably between 2012 and 2013. Beginning in 2013, the EIA split its
Demand Side Management spreadsheet into two: Demand Response and Energy Efficiency in order to reduce
ambiguity and ensure data reliability. While there are many similarities between the two years, EIA
officials recommend against combining data from these three files, citing subtle differences in Form
questions. For these files, the post-2012 DR and EE spreadsheets are considered more accurate.

Cost data rounded to the thousands
----------------------------------
It’s important to note that all costs are reported to EIA-861 in thousands of dollars. PUDL transforms
these values to dollars, but they will still reflect the values rounded to the thousands.

Not yet fully normalized
------------------------
Most of the EIA-861 data that's available in the PUDL DB has not yet been fully
normalized, meaning it contains many duplicate copies of the same information, which
may not always be internally consistent.

We also have not yet integrated the Balancing Authority and Utility IDs reported in
the EIA-861 into our entity tables, so for the moment we don't have any foreign key
constraints enabled on the EIA-861 tables.

2019 short form
---------------
In 2019 the data collected from the short form was incorporated into the other published
tables rather than as a standalone Short_Form_####.xlsx (where #### is the year) as in
other years. We haven't addressed this discrepancy yet; see details in issue :issue:`3654`.

{% endblock %}
Loading