Add csv-demand parser #995

ekatef · 2024-03-21T16:06:06Z

The aim of the PR is to provide an option to use csv files as demand inputs.

Checklist

I consent to the release of this PR's code under the AGPLv3 license and non-code contributions under CC0-1.0 and CC-BY-4.0.
I tested my contribution locally and it seems to work fine.
Code and workflow changes are sufficiently documented.
Newly introduced dependencies are added to envs/environment.yaml and doc/requirements.txt.
Changes in configuration options are added in all of config.default.yaml and config.tutorial.yaml.
Add a test config or line additions to test/ (note tests are changing the config.tutorial.yaml)
Changes in configuration options are also documented in doc/configtables/*.csv and line references are adjusted in doc/configuration.rst and doc/tutorial.rst.
A note for the release notes doc/release_notes.rst is amended in the format of previous release notes, including reference to the requested PR.

ekatef · 2024-03-22T21:49:35Z

CI currently fails while building the test configuration, when get_load_paths_gegis is called from Snakemake. It looks like there may be some difference in naming conventions.

ekatef · 2024-03-30T21:13:14Z

The tutorial workflow in CI fails because no power plants are matched in the requested region. I'm not sure how this is connected to the changes introduced by this PR, and I can't reproduce it locally yet.

ekatef · 2024-04-07T08:35:09Z

@davide-f could you please have a look? CI is failing due to some trouble finding power plants for the region, which I can't reproduce locally... Any ideas on what may be going wrong would be very welcome!

davide-f · 2024-04-16T13:57:34Z

scripts/build_demand_profiles.py

+            load_path = os.path.join(load_dir, continent + ext)
+            if os.path.exists(load_path):
+                load_paths.append(load_path)
+            break


Please move the break inside the if; otherwise the .csv case is never reached.

Done! Thanks :)
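For readers following the thread, here is a minimal sketch of why the position of the break matters. It is not the PR's code: the helper names find_load_path_buggy and find_load_path_fixed and the region name "Africa" are made up for illustration, and only the loop structure mirrors the reviewed diff.

```python
import os
import tempfile

EXTENSIONS = [".nc", ".csv"]  # preference order taken from the reviewed diff


def find_load_path_buggy(load_dir, continent):
    """`break` sits outside the `if`, so only the first extension is ever tried."""
    found = []
    for ext in EXTENSIONS:
        load_path = os.path.join(load_dir, continent + ext)
        if os.path.exists(load_path):
            found.append(load_path)
        break  # leaves the loop after ".nc" whether or not the file exists
    return found


def find_load_path_fixed(load_dir, continent):
    """`break` sits inside the `if`, so the loop stops at the first existing file."""
    found = []
    for ext in EXTENSIONS:
        load_path = os.path.join(load_dir, continent + ext)
        if os.path.exists(load_path):
            found.append(load_path)
            break
    return found


with tempfile.TemporaryDirectory() as load_dir:
    # Only a csv file is available for this (hypothetical) region.
    open(os.path.join(load_dir, "Africa.csv"), "w").close()
    buggy = [os.path.basename(p) for p in find_load_path_buggy(load_dir, "Africa")]
    fixed = [os.path.basename(p) for p in find_load_path_fixed(load_dir, "Africa")]
```

With only a csv file on disk, the first variant returns nothing, while the second one picks up the csv file.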

ekatef · 2024-05-07T22:43:05Z

The changes in this PR should be aligned with #1017.

davide-f

Great @ekatef :D
The functionality is absolutely there!

Need to check why the CI is not working though

As minor comments:

please revise the release_notes
it may be good to add something to the documentation to clarify how to feed in the custom data and take advantage of this amazing contribution. I understand this is time-consuming. Not sure if you have the time to craft something very basic to be improved later, or whether we leave it completely for a follow-up PR.

What do you think?

ekatef · 2024-05-29T21:06:15Z

Thanks a lot for reviewing @davide-f! 😄 I can confirm that the functionality has been tested successfully :) 🎉

The reason for the CI failure is that Snakemake parses all the inputs before it starts executing anything. So placing get_load_paths_gegis() in the inputs of build_demand_profiles didn't help much: when running the tutorial case, Snakemake still executed get_load_paths_gegis() before the databundle was loaded. That makes the workflow fail with the new version of get_load_paths_gegis, because it captures only the paths that really exist on the drive.

To fix this, I had to restore the composition of load_data_paths, but with a condition that treats the paths differently depending on whether the load data is actually present. I'm not sure it's the most elegant approach, but it seems to be an effective one.

The release note is done :D I have also drafted documentation on using custom load profiles. Could you please check whether this addition is clear enough to be used?

PS Can't help but say that it's amazing to have a documentation build as a part of CI 💚 Thanks a lot to you and @GbotemiB for adding this feature!
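The timing issue described in this comment can be illustrated without Snakemake at all. The sketch below is a plain-Python simplification, not the actual workflow: existing_paths and the stand-in retrieve_databundle function are hypothetical helpers, and the file name "Africa.nc" is only an example.

```python
import os
import tempfile


def existing_paths(load_dir, names):
    """Return only those candidate paths that exist right now."""
    candidates = [os.path.join(load_dir, name) for name in names]
    return [path for path in candidates if os.path.exists(path)]


def retrieve_databundle(load_dir, names):
    """Stand-in for the retrieval rule: creates the (empty) demand files."""
    for name in names:
        open(os.path.join(load_dir, name), "w").close()


with tempfile.TemporaryDirectory() as load_dir:
    names = ["Africa.nc"]

    # "Parse time": inputs are collected before any rule has run,
    # so a fresh working directory contains no demand files yet.
    inputs_at_parse_time = existing_paths(load_dir, names)

    # "Run time": the retrieval step executes only afterwards.
    retrieve_databundle(load_dir, names)
    inputs_at_run_time = existing_paths(load_dir, names)

    parse_count = len(inputs_at_parse_time)
    run_count = len(inputs_at_run_time)
```

A path list filtered by existence is empty when it is computed up front, even though the same files are present once the retrieval step has run.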

davide-f

Cool, the CI is running :D so closer to merge :)

davide-f · 2024-06-03T12:07:05Z

Snakefile

+if os.path.exists(os.path.join("data", config["load_options"]["ssp"])):
+    load_data_paths = get_load_paths_gegis("data", config, check_existence=True)
+else:
+    # demand profiles are not available yet -> no opportunity to use custom load files
+    load_data_paths = get_load_paths_gegis("data", config, check_existence=False)


Mmmm I think check_existence can have misleading effects because of when it is evaluated.
I think we should drop this option and keep the previous behavior, maybe?

The reason is that the file may not exist when the function is triggered.
For example, in a fresh run the file does not exist at the beginning, because it is moved here by retrieve_databundle.

Maybe the check on whether the file is missing should be moved to where the files are actually read, but if a file is missing the workflow breaks anyway, so it may not be needed.
The exception is the check on whether all selected countries are found; I expected that check to be there already, but it may be good to cross-check if you wanted to address that here.

Thanks for double-checking @davide-f 🙂 This file-existence issue is quite a tricky part, in fact.

The point is that Snakemake parses all the input paths before building the DAG and starting to execute anything. This means that load_data_paths is evaluated before any of the rules run, including retrieve_databundle. So it's possible that load_data_paths is evaluated when the demand files are not loaded yet. That was perfectly fine before, when we did not care whether the demand files exist. But the demand parser must check exactly which inputs are present, and it breaks if there are no inputs yet.

If we restored the original get_load_paths_gegis and removed the check-existence condition, that would lead to an error in a fresh run. That was the reason CI was unhappy previously, which I have tried to fix. I'm not sure the current implementation is the most elegant one, but it works and hopefully doesn't introduce breaking changes into the workflow.

I agree that it can also work if we check whether the data folder exists directly in load_data_paths (an example of a similar solution). I have implemented and tested the improvement and am happy to have your opinion on it 🙂

Ahhhh right, good point! I caught the issue but not all of its side effects; great explanation.
I'm not a great fan of that long if case. I've got another idea: I'm adding a review comment, and let's see what you think.
This is already good :)
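One option raised in this thread is to check for missing files at the moment they are read, rather than when the paths are composed. A minimal sketch of that idea follows; read_demand_files is a hypothetical helper, not a function from the PR, and the csv header "time,load" is an invented placeholder rather than the real demand file layout.

```python
import os
import tempfile


def read_demand_files(load_paths):
    """Check existence at read time and report every missing file at once."""
    missing = [path for path in load_paths if not os.path.exists(path)]
    if missing:
        raise FileNotFoundError("Missing demand files: " + "; ".join(missing))
    contents = {}
    for path in load_paths:
        with open(path) as handle:
            contents[os.path.basename(path)] = handle.read()
    return contents


with tempfile.TemporaryDirectory() as load_dir:
    present = os.path.join(load_dir, "Africa.csv")
    absent = os.path.join(load_dir, "Asia.csv")
    with open(present, "w") as handle:
        handle.write("time,load\n")  # placeholder content only

    loaded = read_demand_files([present])
    try:
        read_demand_files([present, absent])
        error_message = ""
    except FileNotFoundError as err:
        error_message = str(err)
```

Because the check runs inside the reading step, it sees the file system as it is after any retrieval has finished, which sidesteps the parse-time ordering problem.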

davide-f · 2024-06-05T23:12:03Z

scripts/build_demand_profiles.py

+        for continent in region_load:
+            for ext in [".nc", ".csv"]:
+                load_path = os.path.join(str(load_dir), str(continent) + str(ext))
+                if os.path.exists(load_path):
+                    load_paths.append(load_path)
+                    break
+
+        avail_regions = [
+            os.path.split(os.path.abspath(pth))[1].split(".nc")[0] for pth in load_paths
+        ]
+
+        logger.info(f" An assumed load folder {load_dir}, load path is {load_paths}.")
+
+        if len(load_paths) == 0:
+            logger.warning(
+                f"No demand data file for {set(region_load).difference(avail_regions)}. An assumed load folder {load_dir}."
+            )


What if we drop the existence check on the scenario path and do this instead:

Suggested change

-        for continent in region_load:
-            for ext in [".nc", ".csv"]:
-                load_path = os.path.join(str(load_dir), str(continent) + str(ext))
-                if os.path.exists(load_path):
-                    load_paths.append(load_path)
-                    break
-
-        avail_regions = [
-            os.path.split(os.path.abspath(pth))[1].split(".nc")[0] for pth in load_paths
-        ]
-
-        logger.info(f" An assumed load folder {load_dir}, load path is {load_paths}.")
-
-        if len(load_paths) == 0:
-            logger.warning(
-                f"No demand data file for {set(region_load).difference(avail_regions)}. An assumed load folder {load_dir}."
-            )
+        regions_found = []
+        file_names = []
+        for continent in region_load:
+            sel_ext = ".nc"
+            for ext in [".nc", ".csv"]:
+                load_path = os.path.join(str(load_dir), str(continent) + str(ext))
+                if os.path.exists(load_path):
+                    sel_ext = ext
+                    regions_found.append(continent)
+                    break
+            file_name = str(continent) + str(sel_ext)
+            load_path = os.path.join(str(load_dir), file_name)
+            load_paths.append(load_path)
+            file_names.append(file_name)
+
+        logger.info(
+            f"Demand data folder: {load_dir}, load path is {load_paths}.\n" +
+            f"Expected files: " + "; ".join(file_names)
+        )

If the file does not exist, this solution should fall back to the .nc case.
I'm not sure how much information to disclose;
I'm unsure whether it is good to report if the demand data files exist or not: for the standard user that may be confusing, because of the order in which Snakemake makes this evaluation.

What do you think?

I love the solution! 😄 Implemented. Thanks a lot for that.

Agree that the no-demand warning feels redundant. Moreover, it's also not needed: if no demand data is provided and retrieve_databundle: false is set, Snakemake just throws an error even before running anything. So, we can simply remove this check.
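As a self-contained illustration of the fallback behaviour agreed on here, the sketch below wraps the same selection logic in a function and runs it against a temporary folder. The function name resolve_load_paths, its extensions parameter, and the region names are assumptions made for this example; the merged code may differ in detail.

```python
import os
import tempfile


def resolve_load_paths(load_dir, region_load, extensions=(".nc", ".csv")):
    """Pick the first existing extension per region, defaulting to the first one."""
    load_paths = []
    regions_found = []
    for continent in region_load:
        sel_ext = extensions[0]  # fallback when no file exists yet
        for ext in extensions:
            if os.path.exists(os.path.join(load_dir, continent + ext)):
                sel_ext = ext
                regions_found.append(continent)
                break
        load_paths.append(os.path.join(load_dir, continent + sel_ext))
    return load_paths, regions_found


with tempfile.TemporaryDirectory() as load_dir:
    # Only one of the two requested regions has a file, and it is a csv.
    open(os.path.join(load_dir, "Asia.csv"), "w").close()
    paths, found = resolve_load_paths(load_dir, ["Africa", "Asia"])
    file_names = [os.path.basename(p) for p in paths]
```

Every requested region gets exactly one expected path: the existing csv file where there is one, and the default .nc name otherwise, so the list is never empty even before the databundle has been retrieved.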

ekatef · 2024-06-08T18:47:59Z

@davide-f thanks :D I have implemented your suggestions, and the changes now look quite consistent with the original version, which feels really great 🙃 Thanks a lot for reviewing!

Both local and CI testing are successful now.

Could you please check whether everything looks good in the revised version?

ekatef added 2 commits March 21, 2024 01:48

Account for the file extension

0f5f09a

Add import

849c092

ekatef marked this pull request as draft March 21, 2024 16:45

ekatef added 4 commits March 22, 2024 23:54

Fix format of the message

cbe0dce

Add csv loading

a3b2f8c

Refactor path generation

6470034

Merge branch 'main' into add_demand_parser

0ff2282

ekatef added 3 commits March 23, 2024 12:36

Fix the check of demand availability

7eb36b4

Merge branch 'add_demand_parser' of https://github.com/ekatef/pypsa-e…

1d9a250

…arth into add_demand_parser

Comment-out the load path check

5504b79

davide-f reviewed Apr 7, 2024

ekatef and others added 7 commits April 12, 2024 00:32

Merge branch 'pypsa-meets-earth:main' into add_demand_parser

e12853e

Implement Davide's suggestion

c37bf1b

Co-authored-by: Davide Fioriti <67809479+davide-f@users.noreply.github.com>

Add a TODO comment

397a6d9

Put loading csv-s into a function

061c985

Use na-safe csv loading

9464d55

Fix the function output

8c27cd5

Add an option to combine nc and csv inputs

59fe870

davide-f reviewed Apr 16, 2024

ekatef added 3 commits April 30, 2024 00:47

Merge branch 'main' into add_demand_parser

a83693c

Fix breaking the loop

94378e5

Merge branch 'main' into add_demand_parser

bc66e5d

ekatef force-pushed the add_demand_parser branch 2 times, most recently from a159418 to d9276c6 Compare May 7, 2024 22:38

Put the path check back

87fb099

ekatef force-pushed the add_demand_parser branch from 4884d06 to 87fb099 Compare May 7, 2024 22:47

davide-f reviewed May 27, 2024

ekatef added 11 commits May 28, 2024 23:32

Remove a redundant definition

5b488b2

Add type conversion

083a1ac

Add a (temporary) warning

b71eeea

Merge branch 'main' into add_demand_parser

eaf8552

Add a diagnostic logger info

31eec33

Revert "Remove a redundant definition"

9482e28

This reverts commit 5b488b2.

Account for different states of the demand inputs

18df6d3

Modify handling an empty demand case

049aaa1

Add release note

8e34213

Remove temporary comments

511140d

Add documentation

25451d7

davide-f reviewed Jun 3, 2024

ekatef added 4 commits June 4, 2024 12:40

Merge branch 'main' into add_demand_parser

c1c80b4

Move existence check into the demand parser

0584f17

Clean-up implementation

00b8d60

Remove an outdate commit

5c8ce7f

davide-f reviewed Jun 5, 2024

ekatef and others added 5 commits June 6, 2024 11:31

Implement Davide's suggestion

4fadf74

Co-authored-by: Davide Fioriti <67809479+davide-f@users.noreply.github.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

d21bbe6

for more information, see https://pre-commit.ci

Fix formatting

c6aa7f1

Remove an outdated check

14f4bca

Remove a not-used variable

ef4dc69

ekatef mentioned this pull request Jun 12, 2024

Support input demand data with csv format #987

Closed

ekatef linked an issue Jun 12, 2024 that may be closed by this pull request

Support input demand data with csv format #987

Closed

davide-f approved these changes Jun 17, 2024

davide-f merged commit 2c33699 into pypsa-meets-earth:main Jun 17, 2024
5 checks passed

ekatef deleted the add_demand_parser branch December 2, 2024 17:03
