
Building a new parser_archive

Mathilde Daugy edited this page May 30, 2023 · 1 revision

A parser is a Python script that defines one or more functions that fetch data for a particular zone and/or exchange.

A parser can be built if there is a public URL that contains electricity generation data for a specific zone. See the Technical Requirements for Parsers to verify that the source you have found does indeed fit these requirements.

If you're looking to contribute, but don't have a specific zone in mind, you can take a look at the Missing Countries Overview which has information on where we think a parser might be buildable.

Return value signature

A parser should return a dictionary, or a list of dictionaries if multiple time values can be fetched. When a list is returned, the backend automatically updates past values.

Each dictionary represents a parser event, a data point that provides the value returned by the parser for a given datetime. The expected format for a parser event follows:

{
    'datetime': '2017-01-01T00:00:00Z',
    'source': 'mysource.com',
    ***values
}


For parsers that return data for a single zone (e.g. production data), a zone key identifier must also be provided:

{
    'zoneKey': 'DE',
    'datetime': '2017-01-01T00:00:00Z',
    'source': 'mysource.com',
    ***values
}

For exchange parsers, the identifier is instead the pair of zone keys, sorted alphabetically and joined with ->:

{
    'sortedZoneKeys': 'DE->FR',
    'datetime': '2017-01-01T00:00:00Z',
    'source': 'mysource.com',
    ***values
}
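
As a minimal sketch of the list form, the snippet below turns several (datetime, value) rows into parser events for one zone. The rows, the build_events helper and the use of a consumption value are assumptions made for illustration; a real parser would obtain its rows from the data source.

```python
from datetime import datetime, timezone

# Hypothetical rows as a data source might return them: (timestamp, MW).
RAW_ROWS = [
    (datetime(2017, 1, 1, 0, 0, tzinfo=timezone.utc), 51000.0),
    (datetime(2017, 1, 1, 1, 0, tzinfo=timezone.utc), 49500.0),
]


def build_events(zone_key: str, rows: list) -> list:
    """Turn (datetime, value) rows into a list of parser events."""
    return [
        {
            "zoneKey": zone_key,
            # Same UTC string format as the examples in this page.
            "datetime": dt.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "consumption": value,
            "source": "mysource.com",
        }
        for dt, value in rows
    ]


events = build_events("DE", RAW_ROWS)
```

Each element of events is one parser event, so a parser that can fetch several hours at once can return the whole list.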

Parser arguments

All parsers must accept the following arguments:

  • zone key information: zone_key if the parser only fetches data for a single zone, or zone_key1 and zone_key2 if the parser fetches data for an exchange.
  • session: a Python requests session that you can re-use to make HTTP requests.
  • target_datetime: used to fetch historical data when available.
  • logger: a logging.Logger whose output is publicly available for anyone to monitor correct functioning of the parsers.
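
The arguments above could be used roughly as follows. This is a sketch, not the project's reference implementation: DATA_URL and the response fields are hypothetical, the source is assumed to publish live data only, and session is typed as Any and made a required argument so the example runs without third-party packages (a real parser would default it to a requests Session).

```python
from datetime import datetime
from logging import Logger, getLogger
from typing import Any, Optional

# Hypothetical endpoint, used only for illustration.
DATA_URL = "https://mysource.com/api/consumption"


def fetch_consumption(
    zone_key: str,
    session: Any,  # a requests Session in a real parser
    target_datetime: Optional[datetime] = None,
    logger: Logger = getLogger(__name__),
) -> dict:
    if target_datetime is not None:
        # Assumption: this imaginary source has no historical data.
        raise NotImplementedError("This parser cannot fetch historical data")

    # Re-use the provided session for the HTTP request.
    payload = session.get(DATA_URL).json()
    value = payload.get("consumption")
    if value is None:
        logger.warning("%s: response contained no consumption value", zone_key)

    return {
        "zoneKey": zone_key,
        "datetime": payload["datetime"],
        "consumption": value,
        "source": "mysource.com",
    }
```

Raising NotImplementedError when target_datetime is set makes it explicit that the parser cannot serve past data, rather than silently returning the latest values.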

See below for the complete signatures.

Zone functions

fetch_consumption

Return the consumption at the current time, in the following format:

def fetch_consumption(
    zone_key: str,
    session: Session = Session(),
    target_datetime: Optional[datetime] = None,
    logger: Logger = getLogger(__name__),
) -> dict:
    return {
        'zoneKey': zone_key,
        'datetime': '2017-01-01T00:00:00Z',
        'consumption': 0.0,
        'source': 'mysource.com'
    }

The consumption values (MW) should never be negative.
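
One way to enforce this rule before returning is to filter the events. The helper below is a hypothetical illustration, not part of the project; it discards missing or negative values and reports them through the logger.

```python
from logging import Logger, getLogger


def drop_negative_consumption(
    events: list, logger: Logger = getLogger(__name__)
) -> list:
    """Keep only events whose consumption is a non-negative number."""
    valid = []
    for event in events:
        value = event.get("consumption")
        if value is None or value < 0:
            logger.warning(
                "Discarding consumption value %r for %s",
                value,
                event.get("zoneKey"),
            )
            continue
        valid.append(event)
    return valid
```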

fetch_price

Return the day-ahead price per MWh, in the following format:

def fetch_price(
    zone_key: str,
    session: Session = Session(),
    target_datetime: Optional[datetime] = None,
    logger: Logger = getLogger(__name__),
) -> dict:
    return {
        'zoneKey': zone_key,
        'datetime': '2017-01-01T00:00:00Z',
        'currency': 'EUR',
        'price': 0.0,
        'source': 'mysource.com'
    }

The currency value should be a three-letter string representing the currency of the price value. The valid codes are listed in the code-symbol mapping of the currency-symbol-map Node package.

The price value is the price per MWh and can be positive or negative. It should, when possible, represent the day-ahead price of the zone.
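
These two rules can be checked with a few lines of code. The validate_price_event helper is hypothetical, and the requirement that the code be upper case is an assumption based on the 'EUR' example; note that negative prices are deliberately accepted.

```python
def validate_price_event(event: dict) -> dict:
    """Raise ValueError if the currency or price looks malformed."""
    currency = event["currency"]
    if not (
        isinstance(currency, str)
        and len(currency) == 3
        and currency.isalpha()
        and currency.isupper()  # assumption: codes are upper case, like 'EUR'
    ):
        raise ValueError(f"currency must be a three-letter code, got {currency!r}")

    price = event["price"]
    if not isinstance(price, (int, float)):
        raise ValueError(f"price must be a number, got {price!r}")

    # No sign check: prices may legitimately be negative.
    return event
```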

fetch_production

Return the production mix at the current time. It can either be a dictionary with the below fields, or a list of dictionaries if multiple time values can be fetched.

def fetch_production(
    zone_key: str = "FR",
    session: Session = Session(),
    target_datetime: Optional[datetime] = None,
    logger: Logger = getLogger(__name__),
) -> dict:
    return {
      'zoneKey': zone_key,
      'datetime': '2017-01-01T00:00:00Z',
      'production': {
          'biomass': 0.0,
          'coal': 0.0,
          'gas': 0.0,
          'hydro': 0.0,
          'nuclear': None,
          'oil': 0.0,
          'solar': 0.0,
          'wind': 0.0,
          'geothermal': 0.0,
          'unknown': 0.0
      },
      'capacity': {
          'hydro': 0.0
      },
      'storage': {
          'hydro': 0.0,
      },
      'source': 'mysource.com'
    }

The production values (MW) should never be negative. Use None, or omit the key, if the value for a specific production mode is not known.

storage values can be either positive (when storing energy) or negative (when the storage is discharged).

capacity values represent the installed capacity (MW) for each production mode.
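
The conventions for production and storage values can be sketched as below. Both helpers are hypothetical, and the assumption that the source reports charging and discharging power separately is made only for the example.

```python
# Production modes taken from the example dictionary above.
PRODUCTION_MODES = (
    "biomass", "coal", "gas", "hydro", "nuclear",
    "oil", "solar", "wind", "geothermal", "unknown",
)


def build_production(readings: dict) -> dict:
    """Map raw readings (MW) to a production dict, None for unknown modes."""
    production = {}
    for mode in PRODUCTION_MODES:
        value = readings.get(mode)  # None when the source does not report it
        if value is not None and value < 0:
            raise ValueError(f"{mode} production cannot be negative: {value}")
        production[mode] = value
    return production


def net_storage(charging_mw: float, discharging_mw: float) -> float:
    """Positive when energy is being stored, negative when discharging."""
    return charging_mw - discharging_mw
```

For example, build_production({"wind": 120.0}) leaves every other mode as None instead of reporting a misleading 0.0.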

fetch_consumption_forecast

def fetch_consumption_forecast(
    zone_key: str,
    session: Session = Session(),
    target_datetime: Optional[datetime] = None,
    logger: Logger = getLogger(__name__),
) -> dict:
    return {
        'zoneKey': zone_key,
        'datetime': '2017-01-01T00:00:00Z',
        'value': 0.0,
        'source': 'mysource.com'
    }

fetch_generation_forecast

def fetch_generation_forecast(
    zone_key: str,
    session: Session = Session(),
    target_datetime: Optional[datetime] = None,
    logger: Logger = getLogger(__name__),
) -> dict:
    return {
        'zoneKey': zone_key,
        'datetime': '2017-01-01T00:00:00Z',
        'value': 0.0,
        'source': 'mysource.com'
    }

fetch_wind_solar_forecasts

def fetch_wind_solar_forecasts(
    zone_key: str,
    session: Session = Session(),
    target_datetime: Optional[datetime] = None,
    logger: Logger = getLogger(__name__),
) -> dict:
    return {
        'zoneKey': zone_key,
        'datetime': '2017-01-01T00:00:00Z',
        'production': {
            'solar': 0.0,
            'wind': 0.0,
        },
        'source': 'mysource.com'
    }

Exchange functions

fetch_exchange

Return the cross-border flow at the current time, in the following format:

def fetch_exchange(
    zone_key1: str,
    zone_key2: str,
    session: Session = Session(),
    target_datetime: Optional[datetime] = None,
    logger: Logger = getLogger(__name__),
) -> dict:
    sortedZoneKeys = '->'.join(sorted([zone_key1, zone_key2]))
    return {
        'sortedZoneKeys': sortedZoneKeys,
        'datetime': '2017-01-01T00:00:00Z',
        'netFlow': 0.0,
        'source': 'mysource.com'
    }

The sortedZoneKeys value should be a string in the format zone_keyA->zone_keyB, with the two zone keys placed in alphabetical order.

The netFlow value can be positive or negative, dictating the direction of the flow. Respecting the alphabetical sort of the zone keys, a positive value means the first zone is exporting to the second zone while a negative value means the first zone is importing from the second zone.
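
The sign convention is easy to get backwards, so here is a small sketch of it. The exchange_event helper and its exporter/importer arguments are hypothetical; it assumes the source reports a non-negative flow from one named zone to another.

```python
def exchange_event(exporter: str, importer: str, flow_mw: float) -> dict:
    """Build an exchange event from a flow going exporter -> importer."""
    first, second = sorted([exporter, importer])
    # Positive netFlow means the alphabetically first zone is exporting.
    net_flow = flow_mw if exporter == first else -flow_mw
    return {
        "sortedZoneKeys": f"{first}->{second}",
        "datetime": "2017-01-01T00:00:00Z",
        "netFlow": net_flow,
        "source": "mysource.com",
    }


# France exports 100 MW to Switzerland: the first key (CH) is importing,
# so netFlow is negative.
event = exchange_event("FR", "CH", 100.0)
```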

fetch_exchange_forecast

def fetch_exchange_forecast(
    zone_key1: str,
    zone_key2: str,
    session: Session = Session(),
    target_datetime: Optional[datetime] = None,
    logger: Logger = getLogger(__name__),
) -> dict:
    sortedZoneKeys = '->'.join(sorted([zone_key1, zone_key2]))
    return {
        'sortedZoneKeys': sortedZoneKeys,
        'datetime': '2017-01-01T00:00:00Z',
        'netFlow': 0.0,
        'source': 'mysource.com'
    }

Final steps

Once you're done, add your parser to the zones.json and exchanges.json configuration files. Finally, update the real-time sources.

After setting up your local development environment, you can run all of the parser tests with the following command from the root directory:

poetry run test

For more info, check out the example parser or browse existing parsers.

Test parsers locally

  1. Set up your local development environment

  2. Ensure dependencies are installed:

    poetry install -E parsers
    
  3. From the root folder, use the test_parser.py command line utility:

    poetry run test_parser FR price  # get latest price parser for France
    poetry run test_parser FR  # defaults to production if no data type is given
    
    # test a specific datetime (parser needs to be able to fetch past datetimes)
    poetry run test_parser DE --target_datetime 2018-01-01T08:00
    poetry run test_parser "CH->FR" exchange # get the exchange data between Switzerland & France

Many parsers require an API key from the data or web service provider. Without it, the test fails with an error message like:

Exception: No ENTSOE_TOKEN found! Please add it into secrets.env!

In such cases, visit the provider's website and request an API key. Once you have the key, set it as an environment variable (ENTSOE_TOKEN in the example above) to resolve the error.
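
For illustration, a parser could look up such a key along these lines. The get_token helper is hypothetical and is not the project's own implementation; it only shows how a missing environment variable might lead to an error like the one above.

```python
import os


def get_token(name: str) -> str:
    """Read an API key from the environment, failing loudly if it is absent."""
    token = os.environ.get(name)
    if not token:
        raise Exception(f"No {name} found! Please add it into secrets.env!")
    return token
```

With the variable set in your shell before running the test (for example, ENTSOE_TOKEN), get_token("ENTSOE_TOKEN") returns its value.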
