Add production for AR regions #4894
base: master
Conversation
Thanks for the great work, looking forward to seeing this on the map!
Agree with everything in here, but I just wanted to add: since you are using a fork it is probably better to use
Compare 3ecbbb0 to bd1280a
@unitrium @VIKTORVAV99 Thanks for letting me know! This is our first time contributing to open-source so we're still trying to gain familiarity with workflows - your advice is much appreciated :) I've rebased the branch onto
It looks promising! A few comments about some improvements that could be done on the code.
Also, we have a few exchanges for Argentina, you will need to create the regional exchanges in the config as well.
So for instance we have AR_BR in config/exchanges; you will need to find out in which region on the Argentinian side this exchange lives and create the corresponding exchange config. It's probably something like AR-CEN_BR-S. Also note that Brazil is an aggregated country, so make sure to do it on the regional level there.
As I said, we will probably want to keep the AR zone without it becoming an aggregation as we have direct data for it. I will discuss it with the team to decide on how we want to tackle this.
We could probably just check if the country level zones have any parsers instead of adding additional config keys? That way we could also use the parser information to infer what parts should be aggregated or not.

parser:
  production: "example parser"

Here we have a country level config with a production parser, so we would only need to aggregate say consumption and price from the subzones (assuming they have it).
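The parser-based check described above could look something like this sketch; the config shape mirrors the zone YAML files, but the function name and metric list are illustrative, not actual project code:

```python
# Hypothetical sketch: infer which metrics must be aggregated from subzones
# by checking which parsers the country-level zone config already defines.

def metrics_to_aggregate(zone_config: dict,
                         metrics=("production", "consumption", "price")) -> list:
    """Return the metrics with no country-level parser, i.e. the ones
    that would have to be aggregated from the subzones."""
    parsers = zone_config.get("parser", {})
    return [m for m in metrics if m not in parsers]

# A country config with only a production parser: consumption and price
# would be aggregated from the subzones, production would not.
ar_config = {"parser": {"production": "example parser"}}
print(metrics_to_aggregate(ar_config))  # ['consumption', 'price']
```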
True, that would be a better way of having more granular control. I'm just afraid it would result in some hidden mechanics people are not aware of.
The code changes look mostly good, but I do have some questions about the parser itself.
First, it seems like the regional zones only return one data point at a time? Would it be possible to return a list like for the AR zone?
There also seem to be some availability issues. I ran through the zones and all of a sudden it stopped working. This would probably be resolved if it's possible to return the data as a list as well.
parsers/AR.py
Outdated
assert (
    regions_response.status_code == 200
), "Exception when fetching regions for AR: " "error when calling url={}".format(
    CAMMESA_REGIONS_ENDPOINT
)
I would prefer if this used f"string {object}"
instead; it's a little easier to work with and it's slightly faster from a performance point of view.
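For illustration, here is the implicit string concatenation with `str.format()` from the diff next to the equivalent f-string; the endpoint URL is a placeholder, not the real Cammesa URL:

```python
# Placeholder URL for demonstration only.
CAMMESA_REGIONS_ENDPOINT = "https://example.invalid/regions"

# Original style: two adjacent string literals, the second using .format().
old_msg = (
    "Exception when fetching regions for AR: "
    "error when calling url={}".format(CAMMESA_REGIONS_ENDPOINT)
)

# Suggested f-string style: the interpolation happens inline.
new_msg = (
    "Exception when fetching regions for AR: "
    f"error when calling url={CAMMESA_REGIONS_ENDPOINT}"
)

# Both produce the same message text.
assert old_msg == new_msg
```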
Looks better! I haven't been able to test it locally yet though.
If you are up for it, it would be great if you could write a unit test to demonstrate how it works. For this you could just add some mock data from the API and put it in the tests/mock folder. You can have a look at test/test_NTESMO.py for inspiration.
Otherwise we're just missing the exchange config and we're almost done!
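A minimal version of such a test could look like the sketch below; the payload shape, field names, and the parse function are illustrative stand-ins, not the actual parser or mock-folder layout:

```python
# Hypothetical unit-test sketch in the spirit of test_NTESMO.py: feed a mock
# API payload (as would live under tests/mock) to the parsing logic and
# assert on the result. All names here are illustrative.
MOCK_REGIONAL_PAYLOAD = [
    {"fecha": "2023-01-01T12:05:00-03:00", "hidraulico": 1200.0, "termico": 5400.0},
]

def parse_production(payload: list) -> list:
    """Stand-in for the parser logic under test."""
    return [
        {
            "datetime": row["fecha"],
            "production": {"hydro": row["hidraulico"], "unknown": row["termico"]},
        }
        for row in payload
    ]

def test_fetch_regional_production():
    result = parse_production(MOCK_REGIONAL_PAYLOAD)
    assert len(result) == 1
    assert result[0]["production"]["hydro"] == 1200.0

test_fetch_regional_production()
```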
    zone_key, current_session
)
non_renewables_production = non_renewables_production_mix(zone_key, current_session)
renewables_production = renewables_production_mix(zone_key, current_session)

full_production_list = [
    {
I'm just thinking that here we will miss datapoints which have only non_renewables.
renewables = {
'2022-12-22 01:00': {},
'2022-12-22 02:00': {}
}
non_renewables = {
'2022-12-22 01:00': {},
'2022-12-22 03:00': {}
}
Here the 03:00 timestamp would be ignored for non_renewable.
Maybe it would be better to rewrite this a bit to merge entries that exist with the same timestamp.
I would probably add unique keys for hydro (since it exists in both) and then just merge the dicts with the datetime as the key. That would ensure we get all data in one place grouped by the correct datetime, and then format the data in a separate function.
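The merge-by-datetime idea with distinct hydro keys could be sketched like this; the timestamps, values, and key names are illustrative only:

```python
# Sketch: keep distinct hydro keys so the two endpoints don't overwrite each
# other, then merge the dicts keyed by datetime.
renewables = {
    "2022-12-22 01:00": {"wind": 500.0, "hydroRenewable": 900.0},
    "2022-12-22 02:00": {"wind": 480.0, "hydroRenewable": 870.0},
}
non_renewables = {
    "2022-12-22 01:00": {"thermal": 7000.0, "hydroConventional": 300.0},
    "2022-12-22 03:00": {"thermal": 7100.0, "hydroConventional": 280.0},
}

# Take the union of timestamps so 02:00 (renewables only) and 03:00
# (non-renewables only) are both kept instead of being dropped.
merged = {
    ts: {**renewables.get(ts, {}), **non_renewables.get(ts, {})}
    for ts in sorted(renewables.keys() | non_renewables.keys())
}
```

Formatting `merged` into production events could then live in a separate function, as suggested.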
@unitrium please see my comment here, which addresses your concern:
#4894 (comment)
Like Robin, I am unsure how to best handle this (based on the above comment) as there seems to be no way to ensure the two APIs are in sync.
@VIKTORVAV99 Both of these stem from the same reason, which is that the renewables endpoint for regional zones only returns live data, i.e. data for the current time, which is only 1 data point. The non-renewables endpoint, which AR uses, returns data in 5-minute intervals over the entire day. I tried looking for renewables data across the entire day like the non-renewables data, but it does not seem to be available.

To merge the data from the two endpoints, I'm assigning the datetime for the renewables live data to the last 5-minute mark (e.g. 12:09 -> 12:05) and checking the non-renewables data for that datetime. So, to answer your first question, that's why there's only one data point - we only have both renewables and non-renewables data for the last 5-minute mark.

To answer your second question, sometimes the non-renewables data takes a few minutes to update, so the last 5-minute mark might not yet be available (e.g. if it's 12:09, the datetime is set to 12:05, but the non-renewables data might have only updated up to 12:00). In this case, no data would be returned since the datetime cannot be found in the non-renewables data.

A solution could be to merge the live renewables data with the last updated non-renewables data whatever the time is, but this adds a dependency on the non-renewables API: if the non-renewables data takes longer than a few minutes to update (it could be the case that it hasn't been updated for 2+ hours), the data returned by the parser would be inaccurate/out of date.

I'm not sure what to do about the first problem of there being only 1 data point, aside from just returning all of the non-renewables data even if there is no corresponding renewables data for the older datetimes. But I'm not sure it is ideal/meaningful/accurate to have just non-renewables data for past times but both for live data. Any suggestions would be appreciated!
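The alignment to the last 5-minute mark described above can be done with a small helper; this is a sketch of the rounding step, not the exact parser code:

```python
from datetime import datetime

def floor_to_5min(dt: datetime) -> datetime:
    """Round a timestamp down to the previous 5-minute mark (12:09 -> 12:05)."""
    return dt.replace(minute=dt.minute - dt.minute % 5, second=0, microsecond=0)

print(floor_to_5min(datetime(2023, 1, 1, 12, 9, 30)))  # 2023-01-01 12:05:00
```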
@unitrium I've added the regional exchange under
config/exchanges/AR-NEA_BR-S.yaml
Outdated
lonlat:
  - -56.407453
  - -28.880006
parsers:
  exchange: BR.fetch_exchange
rotation: 110
The BR.py parser does not support the AR-NEA->BR-S exchange, so this will have to be implemented in the AR.py parser. I would suggest adding it in #4893 and then we can merge that first.
Added in: 44946b6
Once #4893 is merged we can proceed with this and create all the new exchange configs in this PR.
Since #4893 is merged now, we can proceed with this.
Yes and no; we need a review from @unitrium (as he requested changes) and then we need to determine if the parser is stable enough. I think my latest changes improved things a little (at least during my local testing), but we might have to rework things further.
Sorry for taking so much time to come back to this!
A few more things about the exchanges; they should be easy to fix, it's just removing code.
Regarding exchanges we should query them at the smallest granularity level, here the regions. We should maintain a single source of truth for the data. The aggregation pipeline will then create the data for the country derived from those.
- -58.192
- -32.456
parsers:
  exchange: UY.fetch_exchange
Actually, we won't need that, see my comments below :)
It's the same as for DK and DK-1/DK-2.
- -58.192
- -32.456
parsers:
  exchange: UY.fetch_exchange
Before this is merged, in the database we will remap all exchanges from AR->UY to AR-LIT->UY.
Regarding the conflicts that are on your branch, you probably need to:
@unitrium I've fixed the things you mentioned, please take another look :)
Hey @meiyiniu, one final request would be to add a bit of unit testing so we have proof that the parser works and is not regressing in the future. I know we are a bit behind in that area, but it's probably essential so we don't break stuff. So it would be great to finish the PR by adding a test case for fetching regional production data and exchanges.

While writing the unit test I found out that for the national data there seemed to be an error: the renewable production list was always empty, probably because it was missing this. Have a look at it and let me know if you agree.
Things look good from my side so there should be nothing blocking this from approval once the tests are in.
I also did a final cleanup as the fix for AR broke all the AR-* zones but now it should work for both.
@unitrium Thanks for making the fixes! I've added a test for regional production and for exchange. Please take a look and let me know what you think.
@unitrium what did you and @FelixDQ come up with during your discussion? Is it safe to merge this as is, assuming the tests check out (I'll give them a review later today), or do we need to make any other changes here or in the backend? (Like the retry function in the parser itself we discussed really quickly a while back?) It would be great to get this merged, as it seems we are way off on the carbon emissions as described in #5196, which this would take care of.
So, to put my discussion with Victor here as well:
This means that we can eliminate the flakiness coming from the fact that we have to merge two dictionaries from two endpoints; we just have to return all the production events separately. The problem in this case is that we get two hydro events: one probably big hydro coming from the renewables endpoint, and another coming from smaller hydro in the other endpoint. They should be summed, not averaged, so they would still need to be merged together somehow. I don't think there's an easy way out in the backend, so we would prefer having this issue solved here. I would suggest the following:
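The summing (not averaging) of the two hydro events could be sketched like this; the event shapes and values are illustrative only:

```python
# Sketch: when both endpoints report hydro for the same timestamp, the big
# hydro and small hydro values should be summed into one figure.
def combined_hydro(events: list) -> float:
    """Sum hydro across all events reported for one timestamp."""
    return sum(event.get("hydro", 0.0) for event in events)

events_at_timestamp = [
    {"source": "renewables-endpoint", "hydro": 900.0},      # big hydro
    {"source": "non-renewables-endpoint", "hydro": 300.0},  # small hydro
]
print(combined_hydro(events_at_timestamp))  # 1200.0
```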
I might be misunderstanding what you mean, but I don't think that is exactly what's happening. The time alignment would essentially decide that
Issue
None
Description
Adds regional production data for Argentina (real-time renewable and non-renewable), pulling from the Cammesa data source which is currently used for Argentina national data.
Preview