interpolate(inplace=True) doesn't work running after filter #806

Closed
gorkemgungormetu opened this issue Dec 27, 2023 · 3 comments

Comments

@gorkemgungormetu

Referring to #240, I managed to work around a similar problem with interpolate, as shown below.

df.filter(model="MENR*", variable='Primary Energy*').timeseries()
model scenario region variable unit 2020 2030 2040 2050
MENR (2022) CO2 Turkey Primary Energy-Coal EJ/yr 1.699841 2.005477 0.376812
MENR (2022) CO2 Turkey Primary Energy-Gas EJ/yr 1.666346 1.997104 1.226732
MENR (2022) CO2 Turkey Primary Energy-Nuclear EJ/yr NaN 0.334944 3.068924
MENR (2022) CO2 Turkey Primary Energy-Oil EJ/yr 1.766830 2.294366 0.586152
MENR (2022) CO2 Turkey Primary Energy-Renewables EJ/yr 1.029953 1.699841 5.233500

Interpolating the whole dataframe didn't work because the remaining data has missing values for the year 2050. So I needed to interpolate only this subset and merge it back into the original dataframe, which I managed in four lines.

df_inter = df.filter(model="MENR*", variable='Primary Energy*')
df_inter.interpolate(2040, inplace=True)
df = df.filter(model="MENR*", variable='Primary Energy*', keep=False)
df = pyam.concat([df, df_inter])
@danielhuppmann
Member

Thank you for starting this issue, but can you clarify why the "direct approach" did not work as expected?

df.interpolate(2040, inplace=True)

@gorkemgungormetu
Author

gorkemgungormetu commented Jan 3, 2024

Because my dataframe includes additional rows with missing values in the year 2050. I expected that df.filter(model="MENR*", variable='Primary Energy*').interpolate(2040, inplace=True) would apply the method to that subset only, similar to df.convert_unit('Mtoe/yr', to='EJ/yr', inplace=True), but it didn't put the interpolated data into the dataframe df.

@danielhuppmann
Member

Ok, I see - there are two issues.

First, you are doing a chained operation. If you spell this out explicitly, it should be clear why inplace does not have the expected effect.

x = df.filter(model="MENR*", variable='Primary Energy*')
x.interpolate(2040, inplace=True)

So the inplace works on x, not df.
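
To make the chained-operation point concrete, here is a minimal pure-Python sketch. `ToyFrame`, its `filter()` and its `fill()` method are hypothetical stand-ins invented for illustration (they are not pyam's implementation, and the numbers are made up); the only assumption carried over from the discussion is that filtering without `inplace` returns a new object.

```python
class ToyFrame:
    """Hypothetical stand-in for an IamDataFrame; NOT pyam's actual code."""

    def __init__(self, rows):
        # rows maps variable -> {year: value}; copy so objects never share state
        self.rows = {var: dict(series) for var, series in rows.items()}

    def filter(self, prefix):
        # Like a filter without inplace: returns a NEW object holding copies
        return ToyFrame({v: s for v, s in self.rows.items() if v.startswith(prefix)})

    def fill(self, year, value, inplace=False):
        # With inplace=True, mutate *this* object and return None
        target = self if inplace else ToyFrame(self.rows)
        for series in target.rows.values():
            series.setdefault(year, value)
        return None if inplace else target


# Illustrative values only
df = ToyFrame({
    "Primary Energy-Coal": {2030: 2.0, 2050: 1.0},
    "Emissions-CO2": {2030: 5.0},
})

# Chained call: the temporary returned by filter() is modified, then discarded
df.filter("Primary Energy").fill(2040, 1.5, inplace=True)
chained_changed_df = 2040 in df.rows["Primary Energy-Coal"]

# Spelled out: x is modified, df is still untouched
x = df.filter("Primary Energy")
x.fill(2040, 1.5, inplace=True)
x_changed = 2040 in x.rows["Primary Energy-Coal"]
df_changed = 2040 in df.rows["Primary Energy-Coal"]
```

In this sketch `x_changed` is true while `chained_changed_df` and `df_changed` stay false, which mirrors why the result has to be merged back explicitly.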

Second, yes, interpolate() raises an error if a timeseries does not have values both before and after the requested time. This is in line with the principle of "fail loud", but there may be better solutions or options.
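
The "fail loud" behaviour can be sketched in plain Python. The function `interpolate_at` below is a hypothetical illustration, not pyam's code: it assumes linear interpolation over a `{year: value}` dict and raises when the requested year is not bracketed by existing values. All numbers are made up.

```python
def interpolate_at(series, year):
    """Linearly interpolate a {year: value} dict at `year`.

    Hypothetical illustration of "fail loud"; NOT pyam's implementation.
    """
    if year in series:
        return series[year]
    before = [t for t in series if t < year]
    after = [t for t in series if t > year]
    if not before or not after:
        # Refuse to guess rather than silently extrapolate or skip the row
        raise ValueError(f"no values before and after {year}")
    t0, t1 = max(before), min(after)
    weight = (year - t0) / (t1 - t0)
    return series[t0] + weight * (series[t1] - series[t0])


# A series with values on both sides of 2040 can be interpolated
bracketed = {2030: 2.0, 2050: 1.0}
value_2040 = interpolate_at(bracketed, 2040)

# A series without a value after 2040 triggers the error
unbracketed = {2020: 1.0, 2030: 2.0}
try:
    interpolate_at(unbracketed, 2040)
    failed_loud = False
except ValueError:
    failed_loud = True
```

Under these assumptions, a single row lacking a 2050 value is enough to make an interpolation over the full dataframe fail, which is why interpolating only the filtered subset works as a short-term strategy.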

I guess that your solution is indeed the best short-term strategy. I'll start a new, targeted issue.
