Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfixes for states=[ALL] and SQLite DB clobber check #890

Merged
merged 6 commits into from
Jan 20, 2021

Conversation

zaneselvans
Copy link
Member

Attempts to address issue #870 which notes we're checking for the
existence of an SQLAlchemy database URL, not a file path, when looking
to see if we're clobbering an existing PUDL database.

Also fixes a bug in which the special case of doing "ALL" states in the
EPA CEMS processing was failing, since they were listed as dictionary
keys, instead of a list or tuple.

Attempts to address issue #870 which notes we're checking for the
existence of an SQLAlchemy database URL, not a file path, when looking
to see if we're clobbering an existing PUDL database.

Also fixes a bug in which the special case of doing "ALL" states in the
EPA CEMS processing was failing, since they were listed as dictionary
keys, instead of a list or tuple.
@codecov
Copy link

codecov bot commented Jan 19, 2021

Codecov Report

Merging #890 (fda09b9) into sprint29 (7bc7e19) will increase coverage by 2.83%.
The diff coverage is 62.50%.

Impacted file tree graph

@@             Coverage Diff              @@
##           sprint29     #890      +/-   ##
============================================
+ Coverage     74.57%   77.40%   +2.83%     
============================================
  Files            44       44              
  Lines          5722     5722              
============================================
+ Hits           4267     4429     +162     
+ Misses         1455     1293     -162     
Impacted Files Coverage Δ
src/pudl/convert/datapkg_to_sqlite.py 40.28% <0.00%> (ø)
src/pudl/output/ferc714.py 95.14% <ø> (+79.86%) ⬆️
src/pudl/output/pudltabl.py 69.88% <66.67%> (+11.24%) ⬆️
src/pudl/constants.py 100.00% <100.00%> (ø)
src/pudl/analysis/demand_mapping.py 12.61% <0.00%> (+3.08%) ⬆️
src/pudl/extract/ferc714.py 95.00% <0.00%> (+5.00%) ⬆️
src/pudl/analysis/service_territory.py 39.17% <0.00%> (+5.83%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7bc7e19...efedcf5. Read the comment docs.

The pudl_out object wasn't working with the new datastore arrangement
since the call signatures had changed. Updated it to use the new
datastore interface. Also moved the eia861 and ferc714 output / interim
ETL tests into the fast_output_test module, since they only take a
couple of minutes to run, and this will test more code -- including the
stuff that was broken here.
The interim ETL functions for eia861 and ferc714 that are (temporarily,
hackishly) embedded within the PUDL output object need to know where to
look for the input data, which is a different place in CI than
(typically) on our local machines. This commit tells that the fast_out
object which is used for output testing (including running the interim
ETL functions for eia861 and ferc714) to look at whatever datastore is
within the pudl_datastore_fixture. Which still might not work because of
the hardcoded PUDL_YML in pudl.workspace.datastore but we'll see.
src/pudl/convert/datapkg_to_sqlite.py Show resolved Hide resolved
src/pudl/output/pudltabl.py Outdated Show resolved Hide resolved
test/interim_etl_test.py Show resolved Hide resolved
For incompletely integrated data sources like ferc714 and eia861, we
need to have access to a datastore within the PUDL output objects, so we
can read the raw input files and run the interim ETL process. With this
commit you can either pass in a Datastore object to use, or you leave it
set to None, the __init__ method will use the default Datastore
indicated by the user's .pudl.yml PUDL_IN path.
@zaneselvans zaneselvans merged commit 793419d into sprint29 Jan 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants