Problem with build_powerplants #358

Closed
2 tasks done
hazemakhalek opened this issue May 31, 2022 · 23 comments
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@hazemakhalek
Collaborator

Checklist

  • I am using the current main branch or the latest release. Please indicate.
  • I am running on an up-to-date pypsa-africa environment. Update via conda env update -f envs/environment.yaml.

Describe the Bug

The issue appears even with a new, clean repo and environment.

Error Message

If applicable, paste any terminal output to help illustrating your problem.
In some cases it may also be useful to share your list of installed packages: conda list.

<rule build_powerplants:
    input: networks/base.nc, configs/powerplantmatching_config.yaml, data/custom_powerplants.csv, data/clean/africa_all_generators.csv
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log
    jobid: 16
    reason: Missing output files: resources/powerplants.csv
    resources: tmpdir=/tmp, mem=500
INFO:snakemake.logging:rule build_powerplants:
    input: networks/base.nc, configs/powerplantmatching_config.yaml, data/custom_powerplants.csv, data/clean/africa_all_generators.csv
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log
    jobid: 16
    reason: Missing output files: resources/powerplants.csv
    resources: tmpdir=/tmp, mem=500

INFO:snakemake.logging:
INFO:pypsa.io:Imported network base.nc has buses, lines
INFO:powerplantmatching.collection:Create combined dataset for GEO, GPD
Traceback (most recent call last):
  File "/home/user/PyPSA_models/pypsa-africa/.snakemake/scripts/tmpnwsb7y6r.build_powerplants.py", line 251, in <module>
    pm.powerplants(from_url=False, update=True, config=config)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/collection.py", line 223, in matched_data
    matched = collect(matching_sources, config=config, **collection_kwargs)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/collection.py", line 96, in collect
    dfs = parmap(df_by_name, datasets)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/utils.py", line 378, in parmap
    return list(map(f, arg_list))
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/collection.py", line 71, in df_by_name
    df = get_df(config=config)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/data.py", line 297, in GEO
    res = scale_to_net_capacities(res)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/heuristics.py", line 586, in scale_to_net_capacities
    factors = gross_to_net_factors()
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/heuristics.py", line 557, in gross_to_net_factors
    df.energy_source_level_2.fillna(value=df.energy_source, inplace=True)
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/site-packages/pandas/core/generic.py", line 5575, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'energy_source'
[Tue May 31 15:49:13 2022]
INFO:snakemake.logging:[Tue May 31 15:49:13 2022]
Error in rule build_powerplants:
    jobid: 16
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log (check log file(s) for error message)

ERROR:snakemake.logging:Error in rule build_powerplants:
    jobid: 16
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log (check log file(s) for error message)

RuleException:
CalledProcessError in line 329 of /home/user/PyPSA_models/pypsa-africa/Snakefile:
Command 'set -euo pipefail;  /home/user/anaconda3/envs/pypsa-africa/bin/python3.10 /home/user/PyPSA_models/pypsa-africa/.snakemake/scripts/tmpnwsb7y6r.build_powerplants.py' returned non-zero exit status 1.
  File "/home/user/PyPSA_models/pypsa-africa/Snakefile", line 329, in __rule_build_powerplants
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/concurrent/futures/thread.py", line 58, in run
ERROR:snakemake.logging:RuleException:
CalledProcessError in line 329 of /home/user/PyPSA_models/pypsa-africa/Snakefile:
Command 'set -euo pipefail;  /home/user/anaconda3/envs/pypsa-africa/bin/python3.10 /home/user/PyPSA_models/pypsa-africa/.snakemake/scripts/tmpnwsb7y6r.build_powerplants.py' returned non-zero exit status 1.
  File "/home/user/PyPSA_models/pypsa-africa/Snakefile", line 329, in __rule_build_powerplants
  File "/home/user/anaconda3/envs/pypsa-africa/lib/python3.10/concurrent/futures/thread.py", line 58, in run
Removing output files of failed job build_powerplants since they might be corrupted:
resources/powerplants_osm2pm.csv
WARNING:snakemake.logging:Removing output files of failed job build_powerplants since they might be corrupted:
resources/powerplants_osm2pm.csv
Shutting down, this might take some time.
WARNING:snakemake.logging:Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
ERROR:snakemake.logging:Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-05-31T154906.277052.snakemake.log
WARNING:snakemake.logging:Complete log: .snakemake/log/2022-05-31T154906.277052.snakemake.log>
@hazemakhalek hazemakhalek added bug Something isn't working help wanted Extra attention is needed labels May 31, 2022
@hazemakhalek
Collaborator Author

@energyLS @davide-f

@davide-f
Member

davide-f commented May 31, 2022

@Hazem-IEG That's the same issue experienced in #349; which country are you testing?
PR #359 solves the issue for DRC (the country @ekatef was testing), yet I'm not sure whether the problem (and hence the needed fix) is the same.

@pz-max
Member

pz-max commented Jun 3, 2022

@Hazem-IEG thanks, it has now also fixed my new environment :)

Just one note: energy_source is indeed still in the newest powerplantmatching version: https://github.com/FRESNA/powerplantmatching/blob/b0e5a05773b88d40e99f73fd28606cdc6ea3b240/powerplantmatching/heuristics.py#L543-L545

As a reminder, we actually work with a fork from @davide-f (https://github.com/davide-f/powerplantmatching/tree/new_pypsa_africa). I am adding a fix there in the meantime.

@davide-f
Member

davide-f commented Jun 3, 2022

I merged the issue, but have you tested that it works with that fix?
Why did the CI work while your case did not? Have you cross-checked that?

I'll work a bit on that as well.

Update: Moving energy_source to energy_source_level1 may lead to unexpected behavior. In my debugging, both energy_source and energy_source_level1 are available, so I reverted the PR.

@davide-f
Member

davide-f commented Jun 3, 2022

I've uninstalled the environment and reinstalled it.
Unfortunately, I cannot reproduce the error on the tutorial: could you explain in more detail how to reproduce it?

@pz-max
Member

pz-max commented Jun 5, 2022 via email

@davide-f
Member

davide-f commented Jun 5, 2022

Max, I reverted the issue because it needs to be investigated better.
I debugged on my local repo with a working workflow: both the energy_source and energy_source_level_1 columns are present, and the contents of the two columns do not match.
Hence we cannot use the latter column instead of the former.
The issue must be somewhere else. I'd like to track it down, but unfortunately I need the error to be reproducible. I now have the example I asked Carlos for, so I can maybe try to debug it :)
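
The check described above (are both columns present, and do their contents agree?) can be sketched with the standard library alone. This is only an illustration: the helper name `compare_columns` and the sample rows are hypothetical and are not taken from powerplantmatching or its data files.

```python
import csv
import io


def compare_columns(csv_text, col_a, col_b):
    """Report which of the two columns are missing and how many rows differ."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [c for c in (col_a, col_b) if c not in header]
    if missing:
        # Without both columns there is nothing to compare.
        return {"missing": missing, "differing_rows": None}
    differing = sum(1 for row in reader if row[col_a] != row[col_b])
    return {"missing": [], "differing_rows": differing}


# Hypothetical sample data, made up for illustration only.
sample = (
    "energy_source,energy_source_level_1\n"
    "Coal,Fossil fuels\n"
    "Hydro,Renewable energy\n"
)

print(compare_columns(sample, "energy_source", "energy_source_level_1"))
```

If the two columns differ on any row, one cannot simply be substituted for the other, which is the argument made in this comment.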

@davide-f
Member

davide-f commented Jun 5, 2022

P.S. having the datasources down may play a role as well...

@pz-max
Member

pz-max commented Jun 5, 2022 via email

@davide-f
Member

davide-f commented Jun 5, 2022

This is the link that is crashing: https://vfs.fias.science/f/3f4cc3876f/?raw=1

@davide-f
Member

davide-f commented Jun 7, 2022

The host of GEO and GPD is back online now :) We can go back to the normal CI.

@davide-f
Member

davide-f commented Jun 7, 2022

@Hazem-IEG and @carlosfv92 I still cannot reproduce the issue, unfortunately. I am unsure whether it is due to corrupted input files stored by powerplantmatching.

To rule that out, I'd recommend the following procedure:

  1. [just to be sure] Clean and update the powerplantmatching installation:
     pip uninstall powerplantmatching
     pip install git+https://github.com/davide-f/powerplantmatching.git@new_pypsa_africa#egg=powerplantmatching
  2. Manually reset the data files stored by powerplantmatching.
     To do so, look for the powerplantmatching data folder and delete it manually.
     On Linux, you may try `find /home/{username} -name global_energy_observatory_power_plants.csv`
     In my case, the path is `/home/davidef/.local/share/powerplantmatching/data/in/global_energy_observatory_power_plants.csv`.
     Once you have found the path, delete the entire powerplantmatching folder; in my case that would be `rm -r /home/davidef/.local/share/powerplantmatching`

Then, please try to run the workflow again and report here if anything changes.

Over the last few days, the server where the GEO and GPD data are stored was offline; I am not sure whether this has somehow led to issues.
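
The manual reset of the data folder can also be scripted. The following is a minimal sketch, not part of powerplantmatching: the function names are hypothetical, and the candidate locations are assumptions based only on the paths reported in this thread (`~/.local/share/powerplantmatching` on Linux, and a folder under `%APPDATA%` reported by a Windows user). It defaults to a dry run, so nothing is deleted unless `dry_run=False` is passed.

```python
import os
import shutil
from pathlib import Path


def candidate_data_dirs(home=None, appdata=None):
    """Return the folders where powerplantmatching data was reported in this thread."""
    home = Path(home) if home else Path.home()
    # Linux location reported in this thread (an assumption for other machines).
    candidates = [home / ".local" / "share" / "powerplantmatching"]
    # Windows location reported in this thread, only if APPDATA is known.
    appdata = appdata or os.environ.get("APPDATA")
    if appdata:
        candidates.append(Path(appdata) / "powerplantmatching")
    return candidates


def reset_data_dirs(dirs, dry_run=True):
    """Delete each existing folder; with dry_run=True only report what would go."""
    affected = []
    for folder in dirs:
        if folder.is_dir():
            if not dry_run:
                shutil.rmtree(folder)
            affected.append(folder)
    return affected


if __name__ == "__main__":
    for folder in reset_data_dirs(candidate_data_dirs(), dry_run=True):
        print("would remove:", folder)
```

Checking the dry-run output first is a deliberate choice: the folder may live elsewhere on a given machine, and `shutil.rmtree` is not reversible.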

@davide-f
Member

@Hazem-IEG @carlosfv92 , is this still an issue or can we close it?

@hazemakhalek
Collaborator Author

It's done for me

@davide-f
Member

@Hazem-IEG Super!
Did the fix above work? I'm just asking for validation so that, if it happens again, we can reference this issue.
I will close this issue after the answer.

@hazemakhalek
Collaborator Author

Works fine after I followed option 2 of @davide-f's procedure above: manually deleting the data folder stored by powerplantmatching.

@davide-f
Member

Thank you, Hazem, for the confirmation. I'll close the issue then.

@carlosfv92, if you still experience the same issue, I recommend following the procedure described above. If that doesn't solve it, please post again and we will reopen this issue.

@pz-max
Member

pz-max commented Jul 20, 2022

Welcome back, error. The suggestions from Hazem and @davide-f didn't help. I am running a fresh pypsa-africa installation with a brand-new environment and config.test1.yaml. I installed the environment with mamba because installing with conda itself was taking more than 60 min (I stopped it). It seems we have a general environment issue (too harsh environment constraints).

What is WEIRD is that the CI works. I will try running everything with miniconda instead of mamba.

INFO:snakemake.logging:
INFO:pypsa.io:Imported network base.nc has buses, lines, transformers
INFO:powerplantmatching.collection:Create combined dataset for GEO, GPD
INFO:powerplantmatching.core:Retrieving data from https://vfs.fias.science/f/b4607c76b4/?raw=1
Traceback (most recent call last):
  File "/home/max/OneDrive/PHD-Flexibility/07_pypsa-africa/0github/pypsa-africa/uncertainty-esm/pypsa-africa/.snakemake/scripts/tmpxe943z_n.build_powerplants.py", line 260, in <module>
    pm.powerplants(from_url=False, update=True, config=config)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/collection.py", line 223, in matched_data
    matched = collect(matching_sources, config=config, **collection_kwargs)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/collection.py", line 96, in collect
    dfs = parmap(df_by_name, datasets)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/utils.py", line 378, in parmap
    return list(map(f, arg_list))
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/collection.py", line 71, in df_by_name
    df = get_df(config=config)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/data.py", line 303, in GEO
    res = scale_to_net_capacities(res)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/heuristics.py", line 586, in scale_to_net_capacities
    factors = gross_to_net_factors()
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/powerplantmatching/heuristics.py", line 557, in gross_to_net_factors
    df.energy_source_level_2.fillna(value=df.energy_source, inplace=True)
  File "/home/max/anaconda3/envs/pypsa-africa/lib/python3.9/site-packages/pandas/core/generic.py", line 5575, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'energy_source'

@pz-max pz-max reopened this Jul 20, 2022
@davide-f
Member

davide-f commented Jul 20, 2022

@pz-max the environment has always taken long to install in my case, but I never measured it.
The error we are experiencing may be an environment problem, as the CI works.
powerplantmatching needs some input files; when using mamba, are you sure you deleted the right input files when applying the suggested fix?
If you have both miniconda and mamba installed, you may have multiple folders with such inputs [not sure though].

BTW, we need a reproducible procedure to be able to investigate it. Have you tested the mentioned procedure from a clean state and/or on a different PC?
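
To check whether several copies of the input files exist, one option is to search a directory tree for the GEO file name mentioned earlier in this thread. This is a minimal sketch with a hypothetical helper name; it is the Python counterpart of the `find` command suggested above and makes no assumption about where powerplantmatching actually stores its data.

```python
from pathlib import Path


def find_data_files(root, filename="global_energy_observatory_power_plants.csv"):
    """Return the sorted parent folders of every copy of `filename` below `root`."""
    hits = set()
    for path in Path(root).rglob(filename):
        if path.is_file():
            hits.add(path.parent)
    return sorted(hits)


if __name__ == "__main__":
    # Scanning the whole home directory can take a while.
    for folder in find_data_files(Path.home()):
        print(folder)
```

More than one hit would suggest that the folder deleted earlier was not the one the failing environment reads from.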

@pz-max
Member

pz-max commented Jul 21, 2022

It works now.

  • I manually deleted the entire ppm folder /home/max/.local/share/powerplantmatching, which apparently solved the issue.

I used mamba, which took just 10 min to install (a conda install takes at least a couple of hours):

  • Install mamba in the base environment: conda install mamba -n base -c conda-forge
  • Delete the previous pypsa-africa environment: conda env remove -n pypsa-africa
  • Install pypsa-africa: mamba env create -f envs/environment.yaml
  • Rerun the whole workflow from scratch with the above config: snakemake -j1 solve_all_networks

@pz-max pz-max closed this as completed Jul 21, 2022
@pypsa-meets-earth pypsa-meets-earth deleted a comment from carlosfv92 Nov 3, 2022
@pypsa-meets-earth pypsa-meets-earth deleted a comment from ekatef Nov 3, 2022
@pypsa-meets-earth pypsa-meets-earth deleted a comment from hazemakhalek Nov 3, 2022
@pz-max
Member

pz-max commented Nov 3, 2022

Deleted some responses to avoid wrong answers. Thanks @EmreYorat for reporting this confusion

@carlosfv92
Contributor

carlosfv92 commented Feb 14, 2023

Hi everyone! After a while this problem showed up again, so I thought I could share the "fix" I found:

  1. Change the environment.yaml file to install the most recent version of ppm: add the line "- git+https://github.com/pypsa/powerplantmatching@master" after the pip entry on line 78, and remove the powerplantmatching line after line 15.
  2. Then create the environment, find the local ppm folder created on your computer, and delete it. In my case, it was "C:\Users\Lenovo\AppData\Roaming\powerplantmatching".
  3. Force snakemake to run the entire workflow from the beginning using "snakemake -j 1 solve_all_networks".

@pz-max
Member

pz-max commented Feb 14, 2023

The problem was that we needed a new powerplantmatching release, since only the ppm master branch was working for us. Davide has just added a new release. We hope this issue is gone for a while. Thanks for reporting a solution, @carlosfv92. This will help anyone who experiences a similar issue in the future.
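
Since the fix depends on which powerplantmatching release is installed, it can help to print the installed version inside the active environment before rerunning the workflow. This is a small sketch using only the standard library; the helper name `installed_version` is hypothetical, and the thread does not state which release number contains the fix.

```python
from importlib import metadata


def installed_version(package="powerplantmatching"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("powerplantmatching version:", installed_version())
```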
