First draft of plant_in_service_ferc1 DBF + XBRL transform #2025

Merged
zaneselvans merged 23 commits into xbrl_integration from plant-in-service-xbrl
Nov 11, 2022

zaneselvans
Member

@zaneselvans zaneselvans commented Oct 26, 2022

  • This PR moves the processing of the plant_in_service_ferc1 table into the new TableTransformer system, and integrates both DBF + XBRL data.
  • The reshaping operations aren't generalized/parameterized yet, since we need more context on what kinds of data they'll need to apply to, and where in the overall pipeline they'll need to be applied. That generalization work has been pushed into the mini-epic "Create a general row-to-column table transformer" (#2012).
  • This PR also doesn't provide the metadata necessary for aggregating the plant in service table. I have another branch (off of this one) that does that, which I will make another PR from once this one is merged in.
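
To make the reshaping concrete, here is a minimal sketch of a wide-to-long conversion in plain Python. It is not the code from this PR: the column names (`utility_id_ferc1`, `report_year`, `land`, `structures`) and the output names (`ferc_account_label`, `ending_balance`) are assumptions chosen for illustration. In pandas the same step would typically be a `DataFrame.melt` call.

```python
# Hypothetical wide-form records: one row per utility and year,
# with one column per asset category (column names are assumptions).
wide_rows = [
    {"utility_id_ferc1": 1, "report_year": 2020, "land": 100.0, "structures": 250.0},
    {"utility_id_ferc1": 2, "report_year": 2020, "land": 40.0, "structures": None},
]
ID_COLS = ("utility_id_ferc1", "report_year")


def wide_to_long(rows, id_cols, var_name="ferc_account_label", value_name="ending_balance"):
    """Turn each non-ID column of a wide record into its own long-form record."""
    long_rows = []
    for row in rows:
        ids = {col: row[col] for col in id_cols}
        for col, value in row.items():
            if col in id_cols:
                continue
            # One output record per (ID columns, former column name) pair.
            long_rows.append({**ids, var_name: col, value_name: value})
    return long_rows


long_rows = wide_to_long(wide_rows, ID_COLS)
```

Each wide record with two value columns becomes two long records, so the former column name becomes part of the natural key of the long table.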

@zaneselvans zaneselvans added the ferc1 (Anything having to do with FERC Form 1), rmi, and xbrl (Related to the FERC XBRL transition) labels Oct 26, 2022
@zaneselvans zaneselvans self-assigned this Oct 26, 2022
@codecov

codecov bot commented Oct 26, 2022

Codecov Report

Base: 84.0% // Head: 84.2% // Increases project coverage by +0.2% 🎉

Coverage data is based on head (1c01f7d) compared to base (bfff651).
Patch coverage: 80.5% of modified lines in pull request are covered.

Additional details and impacted files
@@                Coverage Diff                 @@
##           xbrl_integration   #2025     +/-   ##
==================================================
+ Coverage              84.0%   84.2%   +0.2%     
==================================================
  Files                    72      72             
  Lines                  7935    7972     +37     
==================================================
+ Hits                   6668    6717     +49     
+ Misses                 1267    1255     -12     
| Impacted Files | Coverage Δ |
| --- | --- |
| src/pudl/extract/ferc1.py | 87.3% <ø> (ø) |
| src/pudl/metadata/fields.py | 100.0% <ø> (ø) |
| src/pudl/metadata/resources/ferc1.py | 100.0% <ø> (ø) |
| src/pudl/transform/params/ferc1.py | 100.0% <ø> (ø) |
| src/pudl/transform/ferc1.py | 81.1% <80.5%> (+6.7%) ⬆️ |


@zaneselvans zaneselvans linked an issue Oct 26, 2022 that may be closed by this pull request
A prototype reshaping table transformer, which processes the electric plant in service
table.

* Added a new mapping that links historical DBF row numbers to XBRL column labels.
* Swapped old `plant_in_service` field and resource definitions for the new ones. This
  was a big change since the DB table is now long rather than wide form.
* Still need to generalize and parametrize the reshaping transform, but that's going
  to require working with some additional tables to get more familiar with what kinds
  of things need to be done.
Progress on #2012 #2014
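
As a rough illustration of how a mapping from DBF row numbers to XBRL column labels could be applied, the sketch below looks up each DBF record by report year and row number. Everything here is hypothetical: the `ROW_MAP` contents, the label values, and the `label_dbf_rows` helper are assumptions, not the mapping or functions added in this PR.

```python
# Hypothetical row map: (report_year, row_number) -> XBRL column label.
# The year is part of the key because a row number can mean different
# things in different years of the DBF data.
ROW_MAP = {
    (2019, 2): "organization",
    (2019, 3): "franchises_and_consents",
    (2020, 2): "organization",
    (2020, 3): "franchises_and_consents",
}

dbf_rows = [
    {"report_year": 2019, "row_number": 2, "ending_balance": 10.0},
    {"report_year": 2020, "row_number": 3, "ending_balance": 20.0},
    {"report_year": 2020, "row_number": 99, "ending_balance": 5.0},  # not in the map
]


def label_dbf_rows(rows, row_map):
    """Attach an XBRL label to each DBF row; collect rows with no mapping."""
    labeled, unmapped = [], []
    for row in rows:
        label = row_map.get((row["report_year"], row["row_number"]))
        if label is None:
            unmapped.append(row)
        else:
            labeled.append({**row, "xbrl_label": label})
    return labeled, unmapped


labeled, unmapped = label_dbf_rows(dbf_rows, ROW_MAP)
```

Returning the unmapped rows separately, rather than silently dropping them, makes it easy to see which row numbers still need an entry in the map.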

* Fixed a bug in how the DBF row numbers that need to be mapped are
  identified. Now it looks for any time the row_literal associated with
  a row number has changed from one year to the next, rather than
  selecting the first instance of each distinct combination of
  row_literal and row_number.
* Also discovered that there's an obscure row_status field that
  differentiates between annual (A) and quarterly (Q) row literals, and
  is part of the f1_row_lit_tbl primary key, but it only shows up in
  association with the f1_schedules_list table. I integrated it but...
  maybe that table should just be excluded from the row mapping
  template?
* Added some (janky) helper functions to pudl.transform.ferc1 to manage
  the generation of the row maps. This location is temporary. They
  should probably become methods of a Ferc1 abstract transformer class
  for reshaped tables, or maybe end up in a different module. Not sure
  how they'll end up getting used yet though.
* Updated the dbf_to_xbrl.csv file to include all of the possible rows
  that could need mapping (4270 in total).
* Removed the XBRL specific metadata fields from the dbf_to_xbrl.csv
  file, since they should (hopefully) be available programmatically from
  the metadata @zschira is extracting from the XBRL taxonomies, and can
  be joined to this table based on the xbrl_column_stem.
* Updated the plant_in_service transform to use the new row map. Need to
  test on all of the years.
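
The difference between the two ways of identifying rows that need mapping can be sketched in plain Python. This is an illustration under assumptions, not the helper functions from `pudl.transform.ferc1`: the records are invented, `row_status` is left out for brevity, and the function names are hypothetical. It shows one case where the approaches diverge: a row literal that changes and later reverts.

```python
# Hypothetical records: (report_year, row_number, row_literal).
records = [
    (2001, 5, "Land"),
    (2002, 5, "Land"),
    (2003, 5, "Structures"),
    (2004, 5, "Land"),  # reverts to an earlier literal
    (2001, 6, "Total"),
    (2002, 6, "Total"),
]


def by_row_then_year(record):
    """Sort key: group by row number, then order chronologically."""
    year, row_number, _ = record
    return (row_number, year)


def first_distinct(records):
    """Old approach: first instance of each (row_number, row_literal) pair."""
    seen, out = set(), []
    for year, row_number, literal in sorted(records, key=by_row_then_year):
        if (row_number, literal) not in seen:
            seen.add((row_number, literal))
            out.append((year, row_number, literal))
    return out


def year_to_year_changes(records):
    """New approach: any year where a row's literal differs from the prior year."""
    previous, out = {}, []
    for year, row_number, literal in sorted(records, key=by_row_then_year):
        if previous.get(row_number) != literal:
            out.append((year, row_number, literal))
        previous[row_number] = literal
    return out


old_result = first_distinct(records)
new_result = year_to_year_changes(records)
```

In this invented data the 2004 reversion of row 5 back to "Land" appears only in `new_result`, because the pair (5, "Land") had already been seen in 2001.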
Member Author

@zaneselvans zaneselvans left a comment

Comments for @cmgosnell and @TrentonBush in anticipation of their reviews!

src/pudl/metadata/fields.py
src/pudl/package_data/ferc1/dbf_to_xbrl.csv (outdated)
src/pudl/package_data/ferc1/dbf_to_xbrl.csv (outdated)
src/pudl/transform/ferc1.py (outdated)
src/pudl/transform/ferc1.py
src/pudl/transform/ferc1.py (outdated)
src/pudl/transform/ferc1.py (outdated)
src/pudl/transform/ferc1.py (outdated)
src/pudl/transform/ferc1.py
src/pudl/transform/ferc1.py
@zaneselvans zaneselvans marked this pull request as ready for review November 4, 2022 01:23
Member

@cmgosnell cmgosnell left a comment

🎉 i'm excited about theeeees. but lots of lil questions/suggestions peppered around

src/pudl/transform/ferc1.py
src/pudl/transform/ferc1.py (outdated)
src/pudl/transform/ferc1.py
src/pudl/transform/ferc1.py
src/pudl/transform/ferc1.py
src/pudl/transform/ferc1.py (outdated)
src/pudl/transform/ferc1.py (outdated)
src/pudl/transform/ferc1.py (outdated)
src/pudl/transform/ferc1.py (outdated)
src/pudl/transform/ferc1.py (outdated)
@zaneselvans zaneselvans changed the title Integrate plant_in_service_ferc1 DBF + XBRL First draft of plant_in_service_ferc1 DBF + XBRL transform Nov 11, 2022
@zaneselvans zaneselvans merged commit c6c4866 into xbrl_integration Nov 11, 2022
@zaneselvans zaneselvans deleted the plant-in-service-xbrl branch November 11, 2022 03:30
Labels
ferc1 (Anything having to do with FERC Form 1), rmi, xbrl (Related to the FERC XBRL transition)
Projects
No open projects
Status: 👀 In review
Development

Successfully merging this pull request may close these issues.

Transform plant_in_srvce xbrl + dbf
2 participants